Automated analysis of underground marketplaces


  1. Automated Analysis of Underground Marketplaces
      8th January 2014
      Aleksandar Hudic, Katharina Krombholz, Thomas Otterbein, Christian Platzer and Edgar Weippl
  2. Outline
      • Introduction
      • System Model
        – Training Phase
        – Classification Phase
      • Evaluation
      • Discussion
      • Conclusion
  3. Underground Marketplaces
      • Places where cyber-criminals trade their goods and services
      • Mostly benign IRC chatrooms and Web forums
      • Hijacked benign websites (with abandoned forums)
      • Ad hoc in nature and volatile
  4. Underground Marketplaces: Example
      • Benign IRC channel
      • A criminal creates a channel with a name that insiders know to be crime related
      • Traded good: credit card numbers
      • Channel name starts with "#cc"
  5. Problem
      • Cyber-criminals trade goods and services in publicly accessible chatrooms and Web forums
      • Forensic investigations require high manual effort (time consuming and expensive)
      • Simple web crawling is not feasible
  6. Goal
      • New and efficient method to detect and analyse underground marketplaces
      • Reduction of the manual effort
  7. Our solution
      • Machine learning
        – Information retrieval
        – Automated text categorization
        – Vector space model-based classification system
      • Discover underground marketplaces even if they are hidden among seemingly benign information channels
  8. System Model
      Two phases:
      • Training
      • Classification
  9. Training Process
      Goal: to build a classifier that determines class membership
  10. Text Preprocessing
      • Training data:
      • Text preprocessing
        – Noise reduction for higher accuracy of the classifier
        – Extract plaintext
  11. Vector Space Transformation
      • Tokenization to separate chunks of text with a specific semantic value
      • Tagging of semantically meaningful units with domain-relevant information
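
The tokenization and tagging step on this slide could look roughly like the sketch below. The slides do not list the actual tags or token rules, so the tag names (`<CARD_NUMBER>`, `<URL>`, `<CHANNEL>`), the regular expressions, and the helper names are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical tag patterns; the slides do not say which tags were used.
TAG_PATTERNS = [
    ("<CARD_NUMBER>", re.compile(r"^\d{13,16}$")),
    ("<URL>", re.compile(r"^https?://\S+$")),
    ("<CHANNEL>", re.compile(r"^#[\w-]+$")),
]


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        token = raw.strip(".,;:!?()[]\"'")
        if token:
            tokens.append(token)
    return tokens


def tag(token):
    """Replace a token with a domain tag if it matches a known pattern."""
    for label, pattern in TAG_PATTERNS:
        if pattern.match(token):
            return label
    return token


def to_term_counts(text):
    """Map a message to a sparse term-count vector (term -> count)."""
    return Counter(tag(t) for t in tokenize(text))


message = "Selling fresh cards 4111111111111111, join #cc-shop!"
counts = to_term_counts(message)
```

Replacing concrete values such as card numbers with a shared tag means that many different numbers map to the same dimension of the vector space, which is one plausible reading of "tagging with domain-relevant information".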
  12. Document Selection
      • Reduces the number of documents in the training set
      • Keeps documents that are relevant in the context
      • Documents are clustered, and representatives of each cluster are selected
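
As an illustration of "cluster, then keep one representative per cluster", here is a minimal sketch. The slides do not name the clustering algorithm, so this uses a simple greedy leader clustering over cosine similarity as a stand-in; the function names and the `threshold` value are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_representatives(docs, threshold=0.5):
    """Greedy leader clustering (a stand-in, not the authors' method).

    A document starts a new cluster if it is not similar enough to any
    existing cluster leader; the leader acts as the cluster representative.
    Returns the indices of the selected representatives.
    """
    leaders = []  # list of (document index, term-count vector)
    for i, doc in enumerate(docs):
        vec = Counter(doc.lower().split())
        if all(cosine(vec, leader_vec) < threshold for _, leader_vec in leaders):
            leaders.append((i, vec))
    return [i for i, _ in leaders]


docs = [
    "selling fresh cc dumps",
    "selling fresh cc dumps cheap",
    "anyone up for a game tonight",
    "up for a game tonight anyone",
]
representatives = select_representatives(docs)
```

With these four toy documents, the near-duplicates collapse onto the first document of each group, so the training set shrinks from four documents to two.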
  13. Feature Selection
      • Selects a subset of terms from the training set
      • Increases the accuracy of the classifier and reduces the feature space
  14. Classification Process
  15. Classification Process
      • Text preprocessing
      • Transformation to a vector space model
      • Feature weighting
      • Classification → Prediction results
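
The four steps on this slide can be sketched end to end as follows. The slides name neither the weighting scheme nor the classifier, so this sketch assumes TF-IDF weighting with a smoothed IDF and a nearest-centroid classifier with cosine similarity; the labels "marketplace" and "benign" and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Minimal preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(text, idf):
    """Feature weighting: term count times IDF; unknown terms are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_docs):
    """Sum the TF-IDF vectors per class (cosine ignores the vector length)."""
    idf = fit_idf([text for text, _ in labeled_docs])
    centroids = {}
    for text, label in labeled_docs:
        centroid = centroids.setdefault(label, Counter())
        for term, weight in tfidf(text, idf).items():
            centroid[term] += weight
    return idf, centroids


def classify(text, idf, centroids):
    """Predict the class whose centroid is most similar to the document."""
    vec = tfidf(text, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


training = [
    ("selling cc dumps now", "marketplace"),
    ("cc fullz for sale now", "marketplace"),
    ("game server is up now", "benign"),
    ("new game patch is out", "benign"),
]
idf, centroids = train(training)
prediction = classify("fresh cc dumps for sale", idf, centroids)
```

Any vector space classifier could take the place of the centroid step; it is used here only because it fits in a few lines.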
  16. Evaluation
      • Observation framework [1]: 51.3 million IRC messages
      • Crawler: 203,000 threads in 10 forums
      • Observation data was collected over a period of 11 months
      • Manually labeled
      • k-fold cross-validation
      [1] H. Fallmann, G. Wondracek, C. Platzer. Covertly probing underground economy marketplaces. DIMVA 2010
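
For readers unfamiliar with k-fold cross-validation, the splitting step can be sketched as below. The slides do not state the value of k or whether the data was shuffled, so `k_fold_splits` and the numbers in the example are illustrative assumptions.

```python
def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds.

    Every item appears in exactly one test fold. In practice the items
    would usually be shuffled first; that step is omitted here.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    # Distribute the remainder over the first folds.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


splits = list(k_fold_splits(10, 3))
```

The classifier is trained on each `train` part and scored on the matching `test` part, and the k scores are then averaged.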
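  17. Evaluation Parameters
      • $\text{precision} = \dfrac{\text{number of correct results}}{\text{number of all returned results}}$
      • $\text{recall} = \dfrac{\text{number of correct results}}{\text{number of expected results}}$
      • $F_1 = 2 \cdot \dfrac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$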
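
The three measures on this slide translate directly into code. The sketch below treats the returned and expected results as sets of item identifiers; the function name and the toy identifiers are assumptions made for the example.

```python
def precision_recall_f1(predicted, expected):
    """Compute precision, recall, and F1 from two collections of item ids.

    predicted: items the classifier returned as marketplace-related
    expected:  items that are marketplace-related according to the labels
    """
    predicted, expected = set(predicted), set(expected)
    correct = len(predicted & expected)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = precision_recall_f1(
    predicted={"a", "b", "c", "d"},
    expected={"a", "b", "e"},
)
```

Here two of the four returned items are correct (precision 1/2) and two of the three expected items were found (recall 2/3), which gives an F1 of 4/7.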
  18. Performance Evaluation: IRC Channels
      Cross-validation results for underground marketplace detection in IRC channels
  19. Performance Evaluation: IRC Channels
      Classification performance of document selection for IRC channels
  20. Performance Evaluation: Web Forums
  21. Discussion
      • 97% of the 51.3 million IRC messages were labeled correctly
      • 94% of the 203,000 threads were labeled correctly
      • In our sample: threads were less noisy → easier to extract information
      • Use case: reasonable suspicion that an information channel is used to trade illegal goods and services
  22. Conclusion
      Machine learning can be used as an efficient method to detect and analyse underground marketplaces
  23. Thank you!