This document provides an overview of the HTTP protocol. It discusses HTTP architecture including clients, servers, proxies, and caches. It describes HTTP request and response messages, methods like GET and POST, status codes, and header fields. It also covers topics like URIs, MIME types, encodings, tunnels, cookies, and sessions.
9. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP
situated on the application layer
access control to the data transmission
medium (MAC – Medium Access Control)
network interconnection + data routing
(IP – Internet Protocol)
reliable transport via sockets
(TCP – Transmission Control Protocol)
hypertext/hypermedia transfer
(HTTP – HyperText Transfer Protocol)
14. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: architecture
Web Server
Apache, Internet Information Services, Lighttpd, Nginx,…
Web Client
Mosaic → Netscape → Mozilla → Firefox,
Internet Explorer, Chromium, wget, iTunes, Echofon, etc.
details in “Web browser architecture” presentation:
http://profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week2
21. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: concepts
Gateway
can assure:
traffic distribution across servers – load balancing
short-term data storage – caching
message or request translation (e.g., HTTPS ↔ HTTP)
other negotiation operations – role of mediator/broker
open source solutions: HAProxy, Squid, Varnish
cloud-based: Amazon ELB (Elastic Load Balancing)
advanced
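A minimal sketch of the load-balancing role listed above, assuming the simplest possible policy (round-robin). The backend names and the make_round_robin helper are hypothetical; real gateways such as HAProxy use their own configuration and richer policies.

```python
from itertools import cycle


def make_round_robin(backends):
    """Return a function that hands out backends in rotating order."""
    pool = cycle(backends)  # endless iterator over the backend list
    return lambda: next(pool)


# Hypothetical backend addresses, for illustration only.
pick_backend = make_round_robin(["app1:8080", "app2:8080", "app3:8080"])

for _ in range(4):
    # The fourth request wraps around to the first backend again.
    print(pick_backend())
```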
31. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: methods
GET
request – performed by a client – to access
a resource representation
HTML document, CSS stylesheet,
image in PNG format, vector illustration as SVG,
JavaScript program, Atom or RSS (XML) news feed,
PDF presentation, JSON data,…
38. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: methods
A method is considered idempotent when several identical
requests have the same effect on the server as a single one
(the responses themselves may differ, e.g., in status code)
GET, HEAD, PUT and DELETE are idempotent
POST is not idempotent
advanced
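The difference can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: ResourceStore, state_after and the "item-N" key scheme are hypothetical names, not part of HTTP or of any server.

```python
class ResourceStore:
    """Toy stand-in for server-side state (hypothetical, in-memory only)."""

    def __init__(self):
        self.items = {}
        self._next_id = 1

    def put(self, key, value):
        # PUT: create or replace the resource at a key chosen by the client.
        self.items[key] = value

    def delete(self, key):
        # DELETE: removing an already missing resource changes nothing.
        self.items.pop(key, None)

    def post(self, value):
        # POST: the server creates a new resource on every call.
        key = f"item-{self._next_id}"
        self._next_id += 1
        self.items[key] = value
        return key


def state_after(calls, operation):
    """Apply the same operation `calls` times to a fresh store."""
    store = ResourceStore()
    for _ in range(calls):
        operation(store)
    return store.items


# Repeating PUT leaves the same state as a single PUT: idempotent.
print(state_after(1, lambda s: s.put("a", "x")) == state_after(3, lambda s: s.put("a", "x")))  # True
# Repeating POST keeps adding resources: not idempotent.
print(state_after(1, lambda s: s.post("x")) == state_after(3, lambda s: s.post("x")))  # False
```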
44. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Content-Type
specified by Media Types – MIME
(Multipurpose Internet Mail Extensions)
denotes a set of primary content types
+ additional sub-types
initially, used in the e-mail context
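As a rough illustration of the "primary type + sub-type" structure, a Content-Type value can be split as below. The parse_content_type helper is a hypothetical, simplified parser (it does not handle quoted parameters containing ";"), not a complete implementation of the header grammar.

```python
def parse_content_type(header_value):
    """Split a Content-Type value into (primary type, sub-type, parameters)."""
    media_type, *raw_params = [part.strip() for part in header_value.split(";")]
    primary, _, subtype = media_type.partition("/")
    params = {}
    for raw in raw_params:
        if not raw:
            continue  # tolerate a trailing ';'
        name, _, value = raw.partition("=")
        params[name.strip().lower()] = value.strip().strip('"')
    # Type and sub-type names are case-insensitive, so normalize them.
    return primary.lower(), subtype.lower(), params


print(parse_content_type("text/html; charset=UTF-8"))
# ('text', 'html', {'charset': 'UTF-8'})
```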
47. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Primary types
audio denotes audio content
audio/mpeg – resource encoded in MP3 format
specification for audio data according to
the MPEG (Moving Picture Experts Group) standard
audio/ac3 – compressed audio resource
conforming to AC-3 standard
49. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Primary types
application signifies formats that can be
processed by applications on the client-side
application/javascript – JavaScript program
application/json – JSON (JavaScript Object Notation) data
application/octet-stream – stream of arbitrary bytes
50. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Primary types
multipart used to transfer composed data
multipart/mixed – mixed content
multipart/alternative – alternative contents
e.g., different qualities
of multimedia streams
51. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
N. Freed et al., Media Types (February 2017)
http://www.iana.org/assignments/media-types/media-types.xhtml
Name            Template                    Description
calendar+json   application/calendar+json   Calendar in JSON format
csv             text/csv                    CSV data
opus            audio/opus                  Opus audio resource
msword          application/msword          Word (MS Office) document
tiff            image/tiff                  Image in TIFF format
vnd.rar         application/vnd.rar         RAR archive
zip             application/zip             ZIP archive
52. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Location
Location ":" "http(s)://" authority [ ":" port ] [ abs_path ]
redirects the client to another resource representation
(HTTP redirect)
Location: http://somewhere.info:8080/moved.html
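A client following a redirect has to turn the Location value into the next URL to request. A minimal sketch using Python's standard urllib.parse.urljoin; the request URL http://somewhere.info:8080/old/page.html and the relative values are assumptions made up for the example.

```python
from urllib.parse import urljoin

# Hypothetical URL of the original request.
request_url = "http://somewhere.info:8080/old/page.html"

# An absolute Location value is used as-is.
print(urljoin(request_url, "http://somewhere.info:8080/moved.html"))
# http://somewhere.info:8080/moved.html

# Relative values are resolved against the URL of the original request.
print(urljoin(request_url, "/moved.html"))
# http://somewhere.info:8080/moved.html
print(urljoin(request_url, "other.html"))
# http://somewhere.info:8080/old/other.html
```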
53. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: header fields (attributes)
Referer
denotes the URI of a Web resource
that refers to the current resource
used to know the URI source of the requests
to a given document (i.e. back-links)
for analytics, logging, caching,…
72. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: Web server
Fulfills multiple requests from the clients
respecting the HTTP protocol
each request is considered independent from others,
although it was issued by the same Web client
connection state is not kept – stateless
76. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: Web server
Case study: Apache HTTP Server configuration
(from April 1996, the most popular Web server)
http://httpd.apache.org/
global configuration: httpd.conf file
6 httpd instances are created by default
a user specific configuration (per directory/URI) is defined
via .htaccess – see also https://github.com/phanan/htaccess
advanced
81. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
HTTP: Web server
Usually, the Web server architecture is modular
kernel (core)
+
modules implementing specific functionalities
examples (Apache): mod_auth_basic, mod_cache,
mod_deflate, mod_include, mod_proxy, mod_session, mod_ssl
advanced
94. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
cgi: variables
A CGI script has access to environment variables
associated with the request sent to the CGI program:
REQUEST_METHOD – HTTP method (GET, POST,…)
QUERY_STRING – data transmitted by the client in the URL (the part after "?")
REMOTE_HOST, REMOTE_ADDR – client address
CONTENT_TYPE – content type as MIME (Media Type)
CONTENT_LENGTH – content length in bytes
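The variables listed above can be read from the process environment. A minimal sketch of a CGI program in Python, assuming the Web server exports these variables as the CGI convention describes; the describe_request helper is a hypothetical name introduced for the example.

```python
#!/usr/bin/env python3
import os

# The CGI variables discussed above.
CGI_VARIABLES = (
    "REQUEST_METHOD",
    "QUERY_STRING",
    "REMOTE_ADDR",
    "CONTENT_TYPE",
    "CONTENT_LENGTH",
)


def describe_request(environ):
    """Build a plain-text CGI response listing selected request variables."""
    lines = [f"{name}={environ.get(name, '')}" for name in CGI_VARIABLES]
    # Header block, then an empty line, then the body.
    return "Content-Type: text/plain\n\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(describe_request(os.environ), end="")
```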
96. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
the result received by the Web client after
invoking, via GET, the variabile.cgi script
on the Web server
(the script needs read & execute rights)
#!/bin/bash
# Setting the content type
echo "Content-type: text/plain";
echo
# Executing 'set' command in Linux
# to show environment variables
set
97. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
/* hello.c
   (compile with: gcc hello.c -o hello.cgi) */
#include <stdio.h>

int main(void) {
    int msgs; /* number of messages */
    /* the header block must end with an empty line */
    printf("Content-type: text/html\n\n");
    for (msgs = 0; msgs < 10; msgs++) {
        printf("<p>Hello, world!</p>\n");
    }
    return 0;
}
#!/usr/bin/env python3
# hello.py.cgi
# print() appends a newline, which yields the empty line after the header
print("Content-type: text/html\n")
for messages in range(0, 10):
    print("<p>Hello, world!</p>")
#!/bin/bash
# hello.sh.cgi
echo "Content-type: text/html"
echo
MESSAGES=0
while [ "$MESSAGES" -lt 10 ]
do
    echo "<p>Hello, world!</p>"
    MESSAGES=$((MESSAGES + 1))
done
CGI programs written in C,
bash, Python generating
the same HTML content
advanced
106. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
cgi: invocation
Data processing – GET and/or POST
in case of application servers or frameworks,
data is encapsulated into specific structures/types
ASP.NET (C#) – HttpRequest class
PHP – associative arrays: $_GET[] $_POST[] $_REQUEST[]
Play (Java, Scala) – play.api.mvc.Request
Node.js (JavaScript) – http.IncomingMessage
advanced
107. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
GET vs. POST
GET method is used to generate
the representations of the requested resources
e.g., HTML documents, JPEG images,
Atom/RSS news feeds, ZIP archives, etc.
the server state should not be modified
108. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
GET vs. POST
GET method is used to generate
the representations of the requested resources
obtaining data with GET, the user can set a bookmark
for further accesses to the Web resource
(by using the URL of the generated representation)
e.g., https://duckduckgo.com/?q=web+programming&ia=videos
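Because all the data of such a GET request lives in the URL, a bookmark is enough to reproduce it. A small sketch with Python's standard urllib.parse, shown only to illustrate how the query part of the URL above decomposes into parameters:

```python
from urllib.parse import parse_qs, urlsplit

url = "https://duckduckgo.com/?q=web+programming&ia=videos"

parts = urlsplit(url)
print(parts.query)
# q=web+programming&ia=videos

# parse_qs decodes "+" as a space and groups the values per parameter name.
params = parse_qs(parts.query)
print(params)
# {'q': ['web programming'], 'ia': ['videos']}
```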
110. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
GET vs. POST
POST method is used when the data transmitted
to the server is large (e.g., upload of file content)
or sensitive – typically, passwords
(the data is kept out of the URL; confidentiality still requires HTTPS)
plus, when the script invocation
can produce a state change on the server:
adding a record, altering a file,...
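A hedged sketch of how a client might prepare such a POST request with Python's standard library. The URL https://example.org/login and the form field names are assumptions invented for the example; the request is only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, for illustration only.
form = {"user": "alice", "pass": "secret"}
body = urlencode(form).encode("ascii")

request = Request(
    "https://example.org/login",  # hypothetical endpoint
    data=body,
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(request.get_method())  # POST
print(request.full_url)      # the URL carries no form data
print(request.data)          # b'user=alice&pass=secret'
```

With GET, the same fields would be appended to the URL and could end up in bookmarks, browser history and server logs.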
112. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
cgi: ssi
CGI scripts can be directly invoked from
an HTML document via SSI (Server Side Includes)
http://www.ssi-developer.net/ssi/
Apache: http://httpd.apache.org/docs/trunk/howto/ssi.html
Nginx: http://nginx.org/en/docs/http/ngx_http_ssi_module.html
advanced
154. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
sessions
A session could be implicitly (automatically) or
explicitly (manually, by programmer) registered,
depending on the Web application server
or the default configuration
Web session info is persistently stored on the server by
using non-relational database systems – e.g., DynamoDB,
Memcached, Redis,… – or, in most cases, files
155. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
POST / HTTP/1.1
Accept: text/html,application/xhtml+xml,
application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate
Accept-Language: en,en-GB;q=0.5
Connection: keep-alive
Cookie: language=en_US
Host: mail.info.uaic.ro
Referer: http://mail.info.uaic.ro/?_task=login
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 … Gecko/20100101 Firefox/51.0
user authentication by using POST method
(already existing cookies are transmitted)
advanced
156. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
sessions: example
HTTP/1.1 302 Found
Cache-Control: private, no-cache, no-store, must-revalidate…
Connection: Keep-Alive
Content-Length: 0
Content-Type: text/html; charset=UTF-8
Date: Thu, 23 Feb 2017 10:25:44 GMT
Keep-Alive: timeout=5, max=100
Last-Modified: Thu, 23 Feb 2017 10:25:44 GMT
Location: ./?_task=mail&_token=cb1924…c9c97819
Server: Apache/2.4.6 (CentOS) mod_fcgid/2.3.9 PHP/5.4.16
Set-Cookie: roundcube_sessid=vnqrt4…2uv2; path=/; HttpOnly
Set-Cookie: roundcube_sessauth=S92ee64…2c71; path=/; HttpOnly
<!DOCTYPE html>
…
HTTP response
a Web session-related cookie is set
advanced
redirection after authentication
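The cookie exchange in this request/response pair can be sketched with Python's standard http.cookies module. The cookie name sessid and its value abc123 are hypothetical stand-ins; only the Cookie: language=en_US value comes from the request shown earlier.

```python
from http.cookies import SimpleCookie

# Server side: parse the Cookie header sent by the client.
incoming = SimpleCookie()
incoming.load("language=en_US")
language = incoming["language"].value
print(language)  # en_US

# Server side: prepare a session cookie for the response.
outgoing = SimpleCookie()
outgoing["sessid"] = "abc123"          # hypothetical session identifier
outgoing["sessid"]["path"] = "/"
outgoing["sessid"]["httponly"] = True  # not readable from JavaScript
header_line = outgoing.output()        # a complete "Set-Cookie: ..." line
print(header_line)
```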
158. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
sessions: programming
PHP – functions: session_start(), session_id(),
session_unset(), session_destroy()
(session_register() was removed in PHP 5.4; use $_SESSION instead)
<?php
session_start(); // creating (or resuming) a session
if (!isset($_SESSION['accesses'])) {
    $_SESSION['accesses'] = 0;
} else {
    $_SESSION['accesses']++;
}
?>
the accesses variable is attached to the session
details at http://php.net/manual/en/book.session.php
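For comparison, the same counter can be sketched without any framework. This is an assumption-laden toy: the sessions dictionary and handle_request function are hypothetical, state lives only in memory, and a real server would carry the session id in a cookie.

```python
import secrets

# Hypothetical in-memory session storage: session id -> session data.
sessions = {}


def handle_request(session_id=None):
    """Return (session id, access counter) for one simulated request."""
    if session_id not in sessions:
        # Unknown or missing id: start a new session with a random id.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {}
    data = sessions[session_id]
    if "accesses" not in data:
        data["accesses"] = 0
    else:
        data["accesses"] += 1
    return session_id, data["accesses"]


sid, count = handle_request()     # first visit: a session is created
print(count)                      # 0
sid, count = handle_request(sid)  # the client presents its session id again
print(count)                      # 1
```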
159. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
sessions: programming
By using an application server or framework,
the cookie and session management is simpler
various examples:
HttpSessionState class (ASP.NET), HttpSession interface (Java servlets),
HTTP::Session (Perl), session (Flask – Python framework), web.session
(web.py), HttpFoundation (component of Symfony – PHP framework),
SessionComponent class (CakePHP), session array (Ruby on Rails),
play.mvc.Http.Cookie (Play! for Java/Scala), sessions (Gorilla – Go),
cookie-parser and express-session (Node.js modules for Express)
advanced
160. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
alternatives
HTML5 provides Web Storage
W3C recommendation (2015)
browser-level storage for lists of key-value pairs
via sessionStorage and localStorage attributes
for details, study
profs.info.uaic.ro/~busaco/teach/courses/cliw/web-film.html#week11
advanced
162. Dr.SabinBuragaprofs.info.uaic.ro/~busaco/
next episode: Web programming
Web application servers, Web application architecture
[Diagram: browser, presentation, processing, data access; Web pages (HTML, CSS,…); "fat server, dumb client"; frontend vs. backend]