This is a tutorial for implementing application level traffic analyzer by using SF-TAP flow abstractor.
http://sf-tap.github.io/
https://github.com/SF-TAP/
https://github.com/SF-TAP/flow-abstractor
https://www.usenix.org/conference/lisa15/conference-program/presentation/takano
http://ytakano.github.io/
6. Download Source Code
and Compile It
6
$ git clone https://github.com/SF-TAP/flow-
abstractor.git
$ cd flow-abstractor
$ cmake -DCMAKE_BUILD_TYPE=Release CMakeLists.txt
$ make
7. Configuration File (cont.)
7
# global configuration
global:
home: /tmp/sf-tap # directory, on which UNIX domain files are placed
timeout: 600 # close long-lived (over 600[s]) but do-nothing connections
lru: yes # bring the least recently used pattern to front of list
cache: yes # use cache for regex
# loopback interface for injecting L7 traffic to the flow abstractor
loopback7:
if: loopback7
format: text
tcp_default:
if: default # for every flow that wasn't matched by any rules
proto: TCP
format: text
body: yes
udp_default:
if: default # for every flow that wasn't matched by any rules
proto: UDP
format: text
body: yes
8. Configuration File
8
http:
up: '^[-a-zA-Z]+ .+ HTTP/1.(0r?n|1r?n([-a-zA-Z]+: .+r?n)+)'
down: '^HTTP/1.[01] [1-9][0-9]{2} .+r?n'
proto: TCP # TCP or UDP
if: http # file name of UNIX domain socket
format: text # text or binary
body: yes # if specified 'no', only header is output
nice: 100 # the smaller a value is, the higher a priority is
# balance = 2 # flows are balanced by 2 interfaces
dns_udp:
proto: UDP
if: dns
port: 53 # port number
format: text
nice: 200
9. Run Flow Abstractor
9
$ sudo ./src/sftap_fabs -i en1 -c ./examples/fabs.yaml
run the fow abstractor
$ ls -R /tmp/sf-tap
loopback7= tcp/ udp/
/tmp/sf-tap/tcp:
default= http= smtp= torrent_tracker=
dns= http_proxy= ssh= websocket=
ftp= irc= ssl=
/tmp/sf-tap/udp:
default= dns= torrent_dht=
confirm that flow abstraction interfaces were created
10. Sniff HTTP Flows
10
$ sudo nc -U /tmp/sf-tap/tcp/http
$ curl http://www.google.com/
read the abstraction interface of HTTP
access some web sites
11. Protocol Format of Flow
Abstraction Interfaces
11
$ sudo nc -U /tmp/sf-tap/tcp/http
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=CREATED
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=DATA,from=2,match=down,len=494
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.co.jp/?gfe_rd=cr&ei=oVcLVvL7JsHD8AfZnYHQAQ
(omitted)
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=DATA,from=1,match=up,len=78
GET / HTTP/1.1
Host: www.google.com
User-Agent: curl/7.43.0
Accept: */*
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=DESTROYED
header
header
data
header
data
header
12. Header Format
CSV like key-value pairs.
Consisting of one line. (ended with n)
12
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,
l3=ipv4,l4=tcp,event=CREATED
{
“ip1”: “192.168.24.54”,
“ip2”: “216.58.221.196”,
“port1”: 59547,
“port2”: 80,
“hop”: 0,
“l3”: “ipv4”,
“l4”: “tcp”,
“event”: “CREATED”
}
equivalents for
13. Life Cycle of a Flow
13
CREATED DESTROYED
DATA
When TCP connection is established
(performed 3-way handshake),
CREATED event is invoked.
When TCP connection is destroyed
(received FIN/RST, or timeout),
DESTROYED event is invoked.
When arriving data, DATA event is invoked.
14. Protocols of UDP
UDP is not connection oriented.
Therefore, only DATA event is invoked.
14
15. Flow Identification
Each flow is identified by IP addresses,
Port numbers and hop count.
Flows are Identified by tuple of
(ip1, port1, ip2, port2, hop)
Hop filed indicates that how many times
the flow is re-injected to the L7 loopback
interface.
15
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,
hop=0,l3=ipv4,l4=tcp,event=CREATED
16. Origin of DATA
TCP is connection oriented.
Therefore, data is coming from 2 origins.
16
(ip1, port1) (ip2, port2)
data from host1
data from host2
host1 host2
ip1=192.168.24.54,ip2=216.58.221.196,po
rt1=59547,port2=80,hop=0,l3=ipv4,l4=tcp
,event=DATA,from=2,match=down,len=494
from field indicates the origin of data
17. Length of DATA
Len filed indicates the length of data.
17
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,
l3=ipv4,l4=tcp,event=DATA,from=2,match=down,len=494
header
event=DATA,len=494
data
494 bytes
18. Upstream and Downstream
Match filed indicates that which pattern is
used for matching.
18
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=DATA,from=2,match=down,len=494
http:
up: '^[-a-zA-Z]+ .+ HTTP/1.(0r?n|1r?n([-a-zA-Z]+: .+r?n)+)'
down: '^HTTP/1.[01] [1-9][0-9]{2} .+r?n'
proto: TCP # TCP or UDP
if: http # file name of UNIX domain socket
format: text # text or binary
body: yes # if specified 'no', only header is output
nice: 100 # the smaller a value is, the higher a priority is
# balance = 2 # flows are balanced by 2 interfaces
Configuration
Matched with the pattern of downstream
Matched with the pattern of upstream
ip1=192.168.24.54,ip2=216.58.221.196,port1=59547,port2=80,hop=0,l3=ipv4,
l4=tcp,event=DATA,from=1,match=up,len=78
19. Write Your Own Analyzers
Skelton in Pseudo Code
19
// connect to socket
s = socket();
connect(s, “/tmp/sf-tap/tcp/http”);
for (;;) {
// read header
readline(s, line);
h = parse_header(line);
// generate session ID
sid = new sessionID(h[“ip1”], h[“ip2”],
h[“port1”], h[“port2”], h[“hop”]);
if (h[“event”] == “DATA”) {
read(s, buf, h[“len”]);
}
}
21. Examples
Protocol Parsers
21
$ git clone https://github.com/SF-TAP/protocol-
parser.git
$ cd protocol-parser/http
$ sudo python3 sftap_http.py
more information is available on
https://github.com/SF-TAP/documents/blob/master/
tutorial_fabs_ubuntu1504.md