Remote file path traversal attacks for fun and profit
1. Remote File Path Traversal Attacks for Fun and Profit
Dr. Dharma Ganesan
2. Disclaimer
- The opinions expressed here are my own and do not represent the views of my employer
- The source code fragments shown here may be reused, but they come without any warranty, and I accept no responsibility for failures
- Do not apply the exploit discussed here to other systems without obtaining authorization from their owners
3. Goal
- Demonstrate how attackers can steal information from servers
- Present an anti-pattern that enables file path traversal attacks
- Discuss how to prevent file path traversal attacks (in C)
- Present some metrics comparing the number of lines before and after patching
4. Intended Audience
- Anyone interested in the foundations of secure programming (in C)
- The exploits discussed here are well known to the security community
- But I hope they are still informative for newcomers to software security
5. Context: Client-Server Architectural Style
- Clients send a request for a file to the server
  - Clients can be web browsers, telnet clients, other web clients, etc.
- The server sends the requested file to the clients
  - The server can be any program (e.g., a web server) that responds to requests
- Of course, the server should not disclose non-public files to the clients
6. Context: Client-Server Architectural Style...
[Diagram: a client sends "Request (public) File" to the Server, which returns "Response File"]
Threat: Attackers could steal files from the private folder of the server.
Caution: This threat applies not only to the web but to any distributed system
7. System under attack and Tools
- System under attack: the web server used here is part of a hacking book by Jon Erickson
  - The web server vulnerability presented here is not discussed in the book
  - These slides complement the book by exploiting a different vulnerability
- Tools
  - Telnet client - part of standard Linux installations
  - Telnet has its own security problems, but it is fine for this demo
  - You may also use a proxy or browser tooling (e.g., developer tools)
8. A simple web application - Proof of Concept
What are the attack surfaces for stealing private files?
- No forms to submit
- No JavaScript
Take some time to think before reading the rest of the slides
9. Right click to view page source of index.html
<html>
<img src="image.jpg">
</html>
So image.jpg must be the smiley face
10. Under-the-hood steps to just smile
1. Request: "GET / HTTP/1.1" from the browser to the server
2. Response: index.html from the server to the browser
3. Request: "GET /image.jpg HTTP/1.1" from the browser to the server
4. Response: image.jpg contents from the server to the browser
5. Request: the browser sends an implicit request for the URL icon (favicon)
[Diagram: the browser and the Server exchange messages 1-4 above]
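The request lines in the steps above all have the shape "GET <path> HTTP/1.1". Here is a minimal sketch of how a server could pull the path out of such a line. The function name extract_get_path and the buffer sizes are hypothetical and are not taken from the web server under attack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the path of a request line such as "GET /image.jpg HTTP/1.1"
 * into out (capacity out_size bytes).
 * Returns 0 on success, or -1 if the line is not a GET request, has no
 * " HTTP/" suffix, has an empty path, or the path does not fit in out.
 * Hypothetical helper for illustration only. */
int extract_get_path(const char *request, char *out, size_t out_size) {
    const char *start;
    const char *end;
    size_t len;

    assert(request != NULL && out != NULL);
    if (strncmp(request, "GET ", 4) != 0)
        return -1;                    /* not a GET request */
    start = request + 4;              /* first character of the path */
    end = strstr(start, " HTTP/");    /* the path ends before " HTTP/" */
    if (end == NULL)
        return -1;
    len = (size_t)(end - start);
    if (len == 0 || len >= out_size)
        return -1;                    /* empty path or too long for out */
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Note that the path comes back exactly as the client typed it. Nothing here interprets "." or ".." segments, so whatever the client sends is what the server goes on to use.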
11. Confirm these steps - take a peek at the server's log
- Just to be sure, let's look at a fragment of our web server's log:
Got request from 127.0.0.1:50628 "GET / HTTP/1.1"
  Opening './webroot/index.html' 200 OK
Got request from 127.0.0.1:50630 "GET /image.jpg HTTP/1.1"
  Opening './webroot/image.jpg' 200 OK
- The log shows that the browser actually sent two requests, as expected
- The files are delivered relative to the (public) webroot directory
12. Attack surface to reach the server's private files
- What if we target the path in one of the GET requests?
- For example, take the second GET request and tamper with the file name
- Core idea: instead of image.jpg, what if we request some other file?
13. Let's steal arbitrary file contents (telnet client)
- Instead of a web browser, let's use a telnet client
- Telnet has its own security problems, but for our purpose it is fine.
- # telnet localhost 80 (i.e., connect to our web server listening on port 80)
...
GET / HTTP/1.1 (I typed this command)
HTTP/1.0 200 OK
Server: Tiny webserver
<html>
<img src="image.jpg">
</html>
14. Let's steal the server's /etc/passwd using a telnet client
# telnet localhost 80
...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 200 OK
Server: Tiny webserver
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...
(The contents of /etc/passwd are exposed)
15. The web server log fragment and analysis
# ./tinyweb
Accepting web requests on port 80
Got request from 127.0.0.1:50750 "GET /../../../etc/passwd HTTP/1.1"
  Opening './webroot/../../../etc/passwd' 200 OK
- The attacker was able to get out of the public root directory!
- The tiny web server is clearly vulnerable to remote file content disclosure
- It appears that the server uses strcat to join the web root with the incoming file name
- We will confirm this by looking at the source code of the web server
16. Analysis of the web server code fragment
if(strncmp(request, "GET ", 4) == 0) // GET request
ptr = request+4; // ptr is the URL.
...
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // ptr points to the input url string in GET
...
}
- The anti-pattern is that the server concatenates the public web root with the input filename (which can be evil)
- The resource variable holds the filename as a string, but the string is never evaluated/canonicalized before the file is opened
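To see the anti-pattern in isolation, here is a minimal sketch that repeats the same strcpy/strcat join in a standalone function. The value "./webroot" for WEBROOT is an assumption inferred from the log fragments, and build_resource is a hypothetical name. A length check is added so the sketch itself cannot overflow, which shows that bounds checking alone does not remove the traversal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WEBROOT "./webroot"  /* assumed value, inferred from the server log */

/* Joins WEBROOT and the client-supplied URL path into out.
 * Returns 0 on success, -1 if the result would not fit in out_size bytes.
 * Hypothetical helper; it deliberately mirrors the anti-pattern. */
int build_resource(const char *url_path, char *out, size_t out_size) {
    assert(url_path != NULL && out != NULL);
    if (strlen(WEBROOT) + strlen(url_path) + 1 > out_size)
        return -1;              /* bounded, but still not validated */
    strcpy(out, WEBROOT);       /* begin with the web root */
    strcat(out, url_path);      /* append whatever the client sent */
    return 0;
}
```

For the input "/../../../etc/passwd" the result is the string "./webroot/../../../etc/passwd". The ".." segments are only text at this point. They are resolved when the file is opened, which is how the request leaves the web root.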
17. How to fix this vulnerability? (high-level idea)
- Finding this anti-pattern is not that difficult; the important part is fixing it
- For large systems, grep for strcpy followed by strcat with variable names (e.g., filename)
  - grep -A 1 "strcpy" -r * | grep "strcat" ...
- Canonicalize the path after combining the public webroot with the input filename
- Check whether the canonicalized file is within the webroot
  - If yes, we are safe and can disclose the content
  - Otherwise, raise a generic error saying that the file was not found
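The canonicalize-then-check idea can be sketched as follows. This is not the patch from the appendix: it swaps in POSIX realpath as the canonicalizer, and the function name resolves_inside is hypothetical. It also checks that the match ends at a directory boundary, so a sibling directory whose name merely begins with the root's name is not accepted.

```c
#define _XOPEN_SOURCE 700  /* for realpath(path, NULL) */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if path resolves to root itself or to something beneath it.
 * Returns 0 otherwise, including when either path cannot be resolved
 * (for example, because the file does not exist).
 * Hypothetical helper for illustration only. */
int resolves_inside(const char *root, const char *path) {
    char *real_root;
    char *real_path;
    size_t n;
    int inside = 0;

    assert(root != NULL && path != NULL);
    real_root = realpath(root, NULL);   /* malloc'ed canonical path or NULL */
    real_path = realpath(path, NULL);
    if (real_root != NULL && real_path != NULL) {
        n = strlen(real_root);
        if (strcmp(real_root, "/") == 0) {
            inside = 1;                 /* everything lives under "/" */
        } else if (strncmp(real_path, real_root, n) == 0 &&
                   (real_path[n] == '\0' || real_path[n] == '/')) {
            inside = 1;                 /* prefix ends at a path boundary */
        }
    }
    free(real_root);                    /* free(NULL) is a no-op */
    free(real_path);
    return inside;
}
```

A caller would open the file only when the result is 1 and answer with the generic "not found" error in every other case.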
18. Canonicalize and then validate filenames - core idea
- canonicalize_file_name is a library function (a GNU extension in glibc)
  - For example, if the input string is ./webroot/../../../etc/passwd
  - the output of canonicalize_file_name is /etc/passwd (in my case)
  - Do not forget to free the memory returned by canonicalize_file_name
- Check whether the canonicalized file name starts with the path of the public directory
  - See starts_with_substr in the appendix
  - As far as I know, the C standard library has no dedicated function to check a prefix
- If you find bugs in my patch (see the appendix), please contact me
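Although C has no dedicated "starts with" function, strncmp from <string.h> can express the prefix check in one line. A minimal sketch, where has_prefix is a hypothetical name and the paths in the note below are made-up examples:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if str begins with prefix, 0 otherwise.
 * An empty prefix matches every string.
 * Hypothetical helper for illustration only. */
int has_prefix(const char *str, const char *prefix) {
    assert(str != NULL && prefix != NULL);
    /* Compare only the first strlen(prefix) characters. If str is
     * shorter, strncmp hits its terminating '\0' and reports a mismatch. */
    return strncmp(str, prefix, strlen(prefix)) == 0;
}
```

One caveat applies to any plain prefix test on paths: has_prefix("/srv/webroot_private/key", "/srv/webroot") is also 1. A caller that uses the result as a containment check would additionally need to confirm that the character right after the prefix is '/' or the end of the string.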
19. The patched web server stopped the exploit
# telnet localhost 80
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 404 NOT FOUND
Server: Tiny webserver
<html><head><title>404 NOT Found</title></head><body><h1>URL not found</h1></body></html>
Connection closed by foreign host
20. The patched web server log fragment and analysis
# ./tinyweb_secure
Accepting web requests on port 80
Got request from 127.0.0.1:50816 "GET /../../../etc/passwd HTTP/1.1"
input file name = /etc/passwd
  Unsafe file './webroot/../../../etc/passwd' 404 Not Found
- The server knows that the attacker is reaching into non-public directories
- The server successfully stopped the attacker
21. Number of lines before and after patching
# wc -l tinyweb_secure.c tinyweb.c
224 tinyweb_secure.c (after patching)
122 tinyweb.c (before patching)
- To my surprise, the patched version (tinyweb_secure.c) has nearly twice as much code as the original version (tinyweb.c): 224 vs. 122 lines
  - Comments are inlined using "//", so they do not contribute much to the metrics
- This suggests to me that secure coding (in C) takes roughly twice the coding effort of "traditional" coding
  - If I add my code review and testing effort, it is at least three times more expensive!
- More study on other systems is needed to confirm my claims about effort
  - Maybe the C language and its small standard library contribute to more application code
  - Or maybe I did not patch it in a compact way - but I doubt it
22. Space of inputs for traditional vs secure programs
[Diagram: the space of input values, divided into "Valid input values" and "Evil input values"]
- Traditional programs handle only valid input values well
- Secure coding requires programs to handle evil input values, too
- The problem is that the threats (evil inputs) have to be identified up front, and the software has to be designed to resist and recover
23. Conclusion and broader applicability
- Using string concatenation to construct file names can be dangerous
  - This anti-pattern should be avoided
- The server should canonicalize file names and check the resulting file names
  - Otherwise attackers will get into private directories and steal files
- File path exploitation is not specific to web applications
  - Any client-server architecture must close this attack surface
- Using TLS between clients and the server will not stop the attack
  - In this case, TLS will just help attackers download private files securely :)
- Firewalls do not usually stop file path exploitation payloads
24. References
1. HTTP: https://developer.mozilla.org/en-US/docs/Web/HTTP
2. OWASP: https://www.owasp.org/index.php/Path_Traversal
3. Jon Erickson, Hacking - The Art of Exploitation
a. The web server of this book is exploited for buffer overflows in the book
b. My slides show a different vulnerability not discussed in the book AFAIK
4. Robert Seacord, Secure coding in C and C++.
a. There is a nice chapter dedicated to file system exploits but my slides show a detailed demo
b. Usage of strcat to construct and test filenames is also well-explained
27. How to fix this vulnerability?
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // and join it with resource path pointed by ptr.
if(is_safe_file(resource) == 1) { // Is it inside the web root? (-1 and 0 both mean "no")
  fd = open(resource, O_RDONLY, 0); // Try to open the file.
  printf("\tOpening '%s'\t", resource);
}
else {
  printf("\tUnsafe file '%s'\t", resource); // Outside the web root or not resolvable.
  fd = -1;
}
28. My implementation of is_safe_file
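/* Returns -1 if filename is NULL or the web root cannot be resolved.
 * Returns 1 if the canonical filename is inside the web root.
 * Returns 0 otherwise (including files that do not exist).
 * canonicalize_file_name is a GNU extension: #define _GNU_SOURCE
 * before #include <stdlib.h>.
 */
int is_safe_file(char *filename) {
  char *realFname;
  char *fullwebRootPath;
  size_t rootLen;
  int status = -1;
  if(NULL == filename)
    return status;
  fullwebRootPath = getWebrootFullPath();
  if(fullwebRootPath != NULL) {
    status = 0;
    realFname = canonicalize_file_name(filename); // NULL if it cannot be resolved
    if(realFname != NULL) {
      rootLen = strlen(fullwebRootPath);
      // The prefix must end at a path boundary, so that a sibling
      // directory such as <webroot>_old is not accepted.
      if(starts_with_substr(realFname, fullwebRootPath) == 1 &&
         (realFname[rootLen] == '\0' || realFname[rootLen] == '/'))
        status = 1;
      printf("input file name = %s\n", realFname);
      free(realFname);
    }
    free(fullwebRootPath);
  }
  return status;
}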
29. My implementation of starts_with_substr
/* Returns 1 if the given str starts with the given prefix.
 * Returns -1 if the arguments are invalid. Otherwise, returns 0.
 */
int starts_with_substr(char *str, char *prefix) {
  if(NULL == str || NULL == prefix)
    return -1;
  while(*prefix) {
    if(*str++ != *prefix++) return 0;
  }
  return 1;
}
30. My implementation of the get web root path
/* Returns the full web root file path. The caller must free the returned memory.
 * Needs <unistd.h> for pathconf and getcwd.
 */
char* getWebrootFullPath() {
  long size;
  char *cwd; /* current working dir */
  char *webrootPath = NULL;
  char *webrootCanoPath = NULL;
  size = pathconf(".", _PC_PATH_MAX);
  if (size <= 0) return NULL; /* pathconf failed or reported no limit */
  if ((cwd = (char *)malloc((size_t)size)) == NULL) return NULL;
31. get web root path ...
  if(getcwd(cwd, (size_t)size)) {
    size = strlen(cwd) + strlen(WEBROOT) + 2; // +2 is for '/' and '\0'
    if ((webrootPath = (char*)malloc((size_t)size)) == NULL) {
      free(cwd);
      return NULL;
    }
    strcpy(webrootPath, cwd);
    strcat(webrootPath, "/");
    strcat(webrootPath, WEBROOT);
  }
  if (webrootPath != NULL) // NULL here means getcwd failed
    webrootCanoPath = canonicalize_file_name(webrootPath);
  free(cwd);
  free(webrootPath);
  return webrootCanoPath;
}
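A possible shortcut, offered as a sketch and not as part of the patch: path canonicalizers resolve relative paths against the current working directory on their own, so the manual getcwd-and-concatenate step may not be needed. The version below uses POSIX realpath in place of canonicalize_file_name. The name webroot_full_path is hypothetical, and passing the web root as a parameter is an assumption made to keep the sketch self-contained.

```c
#define _XOPEN_SOURCE 700  /* for realpath(path, NULL) */
#include <assert.h>
#include <stdlib.h>

/* Returns the absolute, canonical form of webroot (e.g. "./webroot"),
 * resolved against the current working directory, or NULL on failure.
 * The caller must free the returned memory.
 * Hypothetical helper for illustration only. */
char *webroot_full_path(const char *webroot) {
    assert(webroot != NULL);
    return realpath(webroot, NULL);  /* allocates the result with malloc */
}
```

Because the result depends on the working directory at the time of the call, a server that changes directory after startup would want to resolve the web root once, early, and keep the result.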