[Russia] Bugs -> max, time <= T

26. Oct 2015


  1. Bugs -> max, time <= T Omar Ganiev 11/10/2015
  2. Hi! • I’m Beched • I do application security assessment and penetration testing at IncSecurity • Also I compete as RDot.Org independent team member
  3. Contents • Intro • Technical view • Algorithmic view • Whitebox • Outro
  4. The problem Common situations where the problem of rapid testing of web applications arises: • Pentesting a huge scope full of web apps. You have a couple of weeks to analyse and pwn them • A similar case: bug bounty. You want to collect (low) hanging fruits faster than others • A customer asks about the cost of your work. You want to estimate it by looking at the web app for 5 minutes • Competition (Capture The Flag). You want to pwn the tasks quickly to focus on others and to get extra points
  5. The solution • Prioritizing • Parallelizing • Automation • Guessing • Heuristics • ??? • PROFIT • That’s it?
  6. Manual testing • Tons of articles and books have been written about testing methodology (including the OWASP Testing Guide) • Manual testing includes application logic analysis along with fuzzing • Manual testing is more careful (no `or 1=1` landing in DROP queries, etc.)
  7. Manual testing • You can capture low hanging fruits in <= T time manually, but not across N applications • Generally, fully automated scanning surely sucks • But anyway we’ll focus on improving tools rather than hands =)
  8. Semi-automated testing • Nmap, Burp Suite, Recon-NG, CMSMap, RAFT, etc… • The tools are cool and save time, but still, you need to do a lot by hand, and the combination of such tools is poorly scalable
  9. Automated testing • Most pentesters write their own specialized tools for automated pentesting • Generally it is a rather complex task with a bit of rocket science • There are a lot of problems like rich application crawling or natural language processing (your program actually needs to read human language to understand the application)
  10. Automated testing • There are two main variables for measuring the complexity (speed) of a testing methodology: time (depends on CPU & memory usage) and the number of network requests • They correlate, and time can be decreased by technical measures • This is the coverage vs. request-count trade-off • Bugs -> max; time <= T; requests <= Q • We’ll mainly focus on the second parameter
  11. Automated testing • Let’s take a look at some tips’n’tricks useful for pentesting toolkit • We’ll observe technical and algorithmic ways to decrease testing time and number of network requests
  12. Contents • Intro • Technical view • Algorithmic view • Whitebox • Outro
  13. Technical view • Well-known things first • HTTP speed-up: Keep-Alive & HEAD • The HEAD method can be used for directory listing and any other checks which only need response headers (length- or time-based payloads, response splitting, etc.) • Keep-Alive is always useful: it decreases the number of connections and handshakes, and hence the server load
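The two tips above can be combined in a few lines of stdlib Python. The sketch below spins up a throwaway local HTTP server (so it is self-contained) and probes paths with HEAD over a single persistent connection; the paths and the 200-means-exists convention are illustrative assumptions, not part of the talk.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # enables keep-alive responses

    def do_HEAD(self):
        # toy target: only /admin/ "exists"
        self.send_response(200 if self.path == "/admin/" else 404)
        self.send_header("Content-Length", "1234")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet

server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One persistent connection for all probes: no repeated TCP handshakes.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)

def head_probe(path):
    """HEAD is enough when only status/headers matter (existence, length checks)."""
    conn.request("HEAD", path)
    resp = conn.getresponse()
    resp.read()  # drain the (empty) body so the connection can be reused
    return resp.status

found = [p for p in ("/admin/", "/backup/", "/test/") if head_probe(p) == 200]
print(found)  # ['/admin/']
conn.close()
server.shutdown()
```

Note that HEAD responses carry the Content-Length of the body without sending it, which is what makes length-based checks nearly free.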
  14. Technical view • Trivial paths first • Why crawl the whole site if there are sitemap.xml, robots.txt and Google dorks? • Why scan the whole site when you can detect the CMS and version and check for vulns in a database? • Why fuzz a login form a lot when you can hit the top passwords?
  15. Technical view • Scaling • Threading and horizontal scaling increase the speed very much, hence they can provide better coverage (if we limit time, but not requests) • A recent example of a distributed scanning platform is written in Go
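As a minimal sketch of the threading point: network-bound checks parallelize almost for free with a thread pool, and the same producer/consumer shape scales horizontally if workers pull targets from a shared queue on several machines. The target list and the `scan()` verdict below are dummies standing in for real probes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical target list; scan() stands in for a real network-bound check.
targets = ["http://10.0.0.%d/" % i for i in range(100)]

def scan(url):
    # placeholder verdict; a real check would fingerprint and probe the host
    return url, url.endswith(("0/", "5/"))

# Threads hide network latency while one host is waited on, others are probed.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = dict(pool.map(scan, targets))

print(sum(results.values()), "of", len(results), "targets flagged")  # 20 of 100
```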
  16. Contents • Intro • Technical view • Algorithmic view • Whitebox • Outro
  17. Algorithmic view • Algorithmic view is quite interesting. How can we increase the number of fuzzed points and checked vulnerabilities without increasing requests count? • Let’s remember the problems we face while conducting (semi-)manual testing
  18. Algorithmic view • Ever seen something like this? • How do you process it manually? • URL patterns, similar pages
  19. Algorithmic view • The already mentioned Gryffin project by Yahoo uses quite a handy algorithm: Simhash • Take a look: r215.pdf • If we build a Simhash index of pages, we can skip duplicates, saving a lot of time • Possibly it’s better to take into account not only the response body, but also the response status, headers, parameters, etc.
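A toy Simhash fits in a dozen lines; this is a sketch of the idea, not Gryffin's implementation (which also uses page DOM structure). Pages whose token sets mostly overlap get fingerprints that differ in few bits, so a crawler can skip a URL whose fingerprint is within a small Hamming distance of one already scanned.

```python
import hashlib
import re

def simhash(text, bits=64):
    """Tiny Simhash: similar token sets yield fingerprints differing in few bits."""
    v = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        for i in range(bits):
            v[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if v[i] > 0)

def hamming(a, b):
    return bin(a ^ b).count("1")

listing = "product catalog page sorted by price with pagination and search box"
near_dup = "product catalog page sorted by name with pagination and search box"
login = "administrator login form enter username and password to continue"

# Near-duplicate listings should be much closer to each other than to the login page.
print(hamming(simhash(listing), simhash(near_dup)),
      hamming(simhash(listing), simhash(login)))
```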
  20. Algorithmic view • How to gather input points (GET, POST, Cookie, headers, …)? • Classical way: automate a browser (PhantomJS) and crawl the website, processing each request • Quick way: parse forms, parse links with query strings, parse XHR parameters from JS
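The "quick way" can be sketched with the stdlib HTML parser: harvest input names from forms, query-string keys from links, and do a crude regex grep for parameters inside inline scripts. The sample HTML and the XHR regex are illustrative assumptions.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qsl

class InputCollector(HTMLParser):
    """Quick-and-dirty input-point harvesting: forms, links, inline-JS params."""
    def __init__(self):
        super().__init__()
        self.params = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            self.params.add(attrs["name"])
        if tag == "a" and attrs.get("href"):
            self.params.update(k for k, _ in parse_qsl(urlparse(attrs["href"]).query))

    def handle_data(self, data):
        # crude grep for ?param= / &param= inside inline scripts and text
        self.params.update(re.findall(r"[?&](\w+)=", data))

html = """<a href="/item?id=1&ref=top">x</a>
<form><input name="login"><input name="pwd"></form>
<script>fetch('/api?token=abc&user=1');</script>"""

c = InputCollector()
c.feed(html)
print(sorted(c.params))  # ['id', 'login', 'pwd', 'ref', 'token', 'user']
```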
  21. Algorithmic view • How to gather unknown input points? • Brute force • Quick: iterative binary search • Collect a list of common parameter names, hit them all in the query string at once and check the page for changes, then perform dichotomy
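The dichotomy step looks like this sketch, where `page_changes()` simulates one HTTP request (does the response differ from the baseline?) and the hidden parameter set is invented for the demo. One request can rule out an entire dead batch, so live parameters are isolated in roughly log-many requests instead of one per candidate.

```python
# Simulated target: responds differently when a "live" parameter is present.
HIDDEN = {"debug", "uid"}

def page_changes(params):
    """Stand-in for one HTTP request: does the response differ from baseline?"""
    return any(p in HIDDEN for p in params)

def find_params(candidates):
    """Isolate live parameters by dichotomy instead of one request each."""
    found, stack = [], [candidates]
    while stack:
        group = stack.pop()
        if not page_changes(group):
            continue  # whole batch dead: one request rules them all out
        if len(group) == 1:
            found.append(group[0])
        else:
            mid = len(group) // 2
            stack += [group[:mid], group[mid:]]
    return sorted(found)

wordlist = ["action", "debug", "id", "page", "q", "ref", "uid", "view"]
print(find_params(wordlist))  # ['debug', 'uid']
```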
  22. Algorithmic view • How to fuzz input points? • Long way: take a big list of fuzzing strings and fuzz each parameter • Quick way: construct polyglot payloads and check for a bunch of vulns at once • Take a look: olyglot-payloads-in-practice-by-avlidienbrunn-at-hackpra
  23. Algorithmic view • Polyglot payloads can be constructed because of ignored contexts (such as comments) in different languages • Example of a polyglot string: <tagxss> %0dresponse:splitting'"attributexss • A null byte or backslash should be placed last • Time-based for (Postgre|My)SQL injection: '/*! +sleep(10)*/+n1#(select 2 from (select pg_sleep(10))x)n+'
  24. Algorithmic view • Ok, what do we actually do when we look at a web app with our eyes? • We estimate the “hackableness” of the app or page and then think about how we can hack it • Why not automate thinking? %)
  25. Algorithmic view • The thinking flow is like this: “Hm… It’s enterprise .NET site with a single login form. Probably not that hackable … Hm… It’s default WordPress installation without plugins and custom themes. Probably not hackable … Hm… It’s shitty custom PHP engine with a lot of forms and input parameters. Instantly hackable! 8) “
  26. Algorithmic view • What makes us think one way or another? Let’s point out some of the features: Platform (PHP, ASP, Ruby, …), Server (Apache, Nginx, IIS, …), Engine (WordPress, Django, RoR, …), Queries (count of parameters in links on the main page), Scripts (number of script tags on the main page), Inputs (number of input tags on the main page), SSL (whether the site works over HTTPS or not)
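Most of these features can be computed from a single fetch of the main page. The sketch below uses naive regexes and header heuristics (the header names and sample values are assumptions for the demo); engine/CMS detection is omitted since it needs a fingerprint database.

```python
import re
from urllib.parse import urlparse, parse_qsl

def extract_features(url, headers, body):
    """Compute the slide's per-site features from one fetched main page."""
    return {
        "platform": ("php" if ".php" in url
                     or "PHP" in headers.get("X-Powered-By", "") else "unknown"),
        "server": headers.get("Server", "unknown").split("/")[0].lower(),
        # total count of query-string parameters in links on the page
        "queries": sum(len(parse_qsl(urlparse(h).query))
                       for h in re.findall(r'href="([^"]+)"', body)),
        "scripts": len(re.findall(r"<script\b", body, re.I)),
        "inputs": len(re.findall(r"<input\b", body, re.I)),
        "ssl": urlparse(url).scheme == "https",
    }

f = extract_features(
    "https://example.com/index.php",
    {"Server": "nginx/1.9", "X-Powered-By": "PHP/5.6"},
    '<a href="/p?id=1&s=2">x</a><script></script><input name="q">',
)
print(f)
```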
  27. Algorithmic view • The simplest vulnerable-vs-secure classifier ever: if PHP: vulnerable = True else: vulnerable = False • Ok, just kidding =)
  28. Algorithmic view Machine learning FTW!
  29. Algorithmic view
  30. Algorithmic view • Today before the talk I scanned about a thousand sites and built this decision tree on the obtained data • The actual classifier is a bit bigger than the simplest one, but the common sense is preserved %) • If the main page is a PHP script, there are at least 4 GET parameters in the links on it, and there’s at least one script tag, then the site is probably vulnerable =)
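Hand-coded, the rule described on this slide is just a few nested checks. The thresholds mirror the slide; the sample sites and the exact feature names are invented for illustration, and a real classifier would be trained on scan data rather than written by hand.

```python
def probably_vulnerable(features):
    """Hand-coded version of the slide's decision rule (illustrative thresholds)."""
    if features["platform"] != "php":
        return False            # non-PHP main page: probably not that hackable
    if features["query_params"] < 4:
        return False            # too few GET parameters in the links
    return features["script_tags"] >= 1

sites = [
    {"platform": "php", "query_params": 7, "script_tags": 3},      # custom PHP engine
    {"platform": "asp.net", "query_params": 1, "script_tags": 5},  # enterprise login
    {"platform": "php", "query_params": 0, "script_tags": 0},      # near-static PHP
]
print([probably_vulnerable(s) for s in sites])  # [True, False, False]
```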
  31. Algorithmic view • Ok, this is about cost estimation, but how can this help us to scan the site? • Ever seen this?
  32. Algorithmic view • Let’s calculate more features for each page and build a priority queue during the scan • If you do it right, /favicon.ico will be scanned last, and /admin.php will be scanned first
  33. Algorithmic view • Which features can we calculate? • Dynamic/static page: detected platform (dynamic language vs none), content-type (html vs static), extension • Response status: OK vs Forbidden vs Redirect vs Not Found vs … • A bit of NLP: if the path contains important words like admin, password, login, etc
  34. Algorithmic view
  35. Algorithmic view • Lower priority() value – higher scan priority:
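A minimal sketch with `heapq` (a min-heap, so a lower score is popped first). The feature weights and word lists are made-up illustrations of the slide's idea, not tuned values; with them, /admin.php comes out first and /favicon.ico last, as the earlier slide promises.

```python
import heapq
import os

IMPORTANT = {"admin", "login", "password", "config", "backup", "upload"}
DYNAMIC_EXT = {".php", ".asp", ".aspx", ".jsp", ".cgi"}

def priority(path, status):
    """Lower value = scanned earlier (heapq is a min-heap)."""
    score = 0
    _, ext = os.path.splitext(path.rsplit("/", 1)[-1])
    score -= 50 if ext in DYNAMIC_EXT else 0           # dynamic pages first
    score -= 30 if any(w in path.lower() for w in IMPORTANT) else 0
    score += 40 if status == 404 else 0                # dead paths later
    score += 60 if ext in {".ico", ".png", ".css", ".js"} else 0  # static junk last
    return score

queue = []
for path, status in [("/favicon.ico", 200), ("/admin.php", 200),
                     ("/index.php", 200), ("/old/", 404)]:
    heapq.heappush(queue, (priority(path, status), path))

order = [heapq.heappop(queue)[1] for _ in range(len(queue))]
print(order)  # ['/admin.php', '/index.php', '/old/', '/favicon.ico']
```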
  36. Contents • Intro • Technical view • Algorithmic view • Whitebox • Outro
  37. Whitebox • Static code analysis is much more of a rocket-science problem than blackbox testing • Modern enterprise static code analysis systems are big and still not good enough (some of them are still no better than grep) • They may have nice ads with samples, but such ad samples can probably be constructed by hand ;)
  38. Whitebox • Most pentesters have their own dirty hacks and regexps for finding the vulns • I also use a simple grep wrapper, which lets you spot security bottlenecks and obvious bugs in no time • Especially useful during CTF, when the source code is not that big • If integrated with an IDE, it can be a rather cool semi-manual analyser
  39. Whitebox • Collect a list of dangerous sinks for various languages • Take a pattern for a variable (like $.* in PHP) • Take a list of securing functions • Generate regexps with negative lookahead, which will search for patterns like this: danger_func(…not_a_securing_func(…$var
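The recipe above can be sketched in Python's `re`: build one pattern per language from the sink list, a negative lookahead over the sanitizer list, and the variable pattern. The sink/sanitizer lists and PHP snippets below are short illustrative samples, and real lists would be much longer.

```python
import re

# Illustrative sink/sanitizer lists; real ones are per-language and much longer.
SINKS = ["mysql_query", "system", "eval"]
SANITIZERS = ["intval", "escapeshellarg", "mysql_real_escape_string"]
VAR = r"\$\w+"  # PHP variable pattern, as on the slide

# Matches: danger_func( ... not_a_securing_func( ... $var
pattern = re.compile(
    r"(%s)\s*\((?:(?!%s)[^;])*(%s)"
    % ("|".join(SINKS), "|".join(SANITIZERS), VAR)
)

code = [
    'mysql_query("SELECT * FROM t WHERE id=".$id);',          # vulnerable
    'mysql_query("SELECT * FROM t WHERE id=".intval($id));',  # sanitized
    'system(escapeshellarg($cmd));',                          # sanitized
    'eval($_GET["x"]);',                                      # vulnerable
]
hits = [line for line in code if pattern.search(line)]
print(hits)  # flags only the two unsanitized lines
```

The negative lookahead makes the scan stop as soon as a securing function appears between the sink and the variable, which is exactly the `danger_func(…not_a_securing_func(…$var` shape from the slide.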
  40. Whitebox • Get the result like this • Parse it into any IDE and analyse traces
  41. Contents • Intro • Technical view • Algorithmic view • Whitebox • Outro
  42. Summary • Application testing can be made faster in many ways • Some of these ways are achievable during manual assessment, some are not • We can build a fast and scalable web application scanner based on them • It will traverse the graph of pwning paths in an efficient way and halt after hitting the requests limit
  43. Results • Some of the reviewed techniques are already implemented in repos on my GitHub (the libpywebhack repository has not been updated for years): • It will be updated as soon as I finish debugging the PoC scripts
  44. Questions?