AI For Good: Bad guys, messy data, & NLP


All the information needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle-in-a-stack-of-needles problem is huge and growing: unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing (NLP), the field of AI that focuses on human language, have for the first time made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured text. In this talk, I will share some of the common patterns, common mistakes, and opportunities that I see in the field.

  1. Chris Mack ● MAY 1, 2019 ● AI FOR GOOD: Bad Guys, Messy Data, & NLP
  2. LIQUID TRAVEL BAN (slide footer: AI FOR GOOD ● BASIS TECHNOLOGY)
  3. LIQUID BOMB PLOT
  4. JIHADI BRIDES TRAGEDY. Image sources: Bethnal trio: Mirror; Article: Independent
  5. ALL THE EVIDENCE EXISTS: Scotland Yard Report ● ID ● Social Activity. Image sources: Tweet: ISD Global
  6. WHAT’S AT STAKE. FINANCIAL STABILITY: Global Money Laundering Operations, 1% of Illegal Funds Captured. PUBLIC SAFETY: Deaths from Terrorist Attacks in Europe, 11,288 from 1970-2017. Sources: Terrorism: Washington Post; Money laundering: Wall Street Journal
  7. UNPACKING THE AI SYSTEM
  8. THE PROPOSED SOLUTION: NLP/NLU
  9. COMMON PATTERN: 80% of data is unstructured. Join Processed and Structured Data into Knowledge Graph. 1) Natural Language Processing Extracts Facts 2) Scored for confidence & relevance. Mine Graph For Patterns & Changes. People ● Organizations ● Locations ● Relationships. Searching ● Alerting ● Anomaly Detection ● Reporting
  10. CHALLENGES AT EVERY LEVEL ● Domains ● Languages ● Training Data ● Data Salad! ● Data Access ● Duplication ● Variation ● Ambiguity ● Semantics ● Honey Pots ● Training Data ● GIGO ● Data Overload ● Alert Bombs ● Privacy ● Trust
  11. CHALLENGES AND ANTI-PATTERNS: Identifying Context. 1) Reliance on Keywords 2) Naive Rules. Leads to False Positives and False Negatives. Example text: "... government officials were convicted of corruption. ABC Company saw a drop in sales as …"
  12. CHALLENGES AND ANTI-PATTERNS: Identifying Proper Names. 3) Name Variants 4) Name Parts (common keys). Leads to False Positives and False Negatives. Abdul-Rasheed ➔ abdul rashid, abdal rashide, abdal-rasheed, abdul-rashiyd, abdul-rachid, abd-errshiyd, abd-errchide, abd-errcheed, abd-errchiyd, …
  13. BOSTON BOMBING
  14. CHALLENGES AND ANTI-PATTERNS: “Operation Hairball”. 3) Failure to match variants 4) Failure to disambiguate 5) Failure to model what matters 6) Monolingual design
  15. CROSS-LINGUAL SEMANTIC MODELING
  16. CROSS-LINGUAL SEMANTIC MODELING: Arabic ● English ● Chinese ● Mapping algorithm ● Multilingual embeddings space
  17. CROSS-LINGUAL SEMANTIC MODELING: Machine Learning ● למידה חישובית ● Eagle Pharmaceuticals Inc. ● Eagle Drugs, Co. ● Tesla Energy Storage ● טסלה ● AI ● تيسلا موتورز ● 計算学習 ● אחסון אנרגיה
  18. AI BUILDING BLOCKS: Algorithms & High Quality Data ● NN NER ● NN CLASS ● NN RELAX ● SVM ● TEXT EMBEDDINGS ● NNs ● NL SEARCH ● CLASSIC ML ● ANOMALY DETECTION ● HMM ● SEMANTIC MODELING ● GRAPH SIMILARITY ● Data Filtering ● Classification ● Deduplication ● High Quality Annotations ● Language & domain combos ● Active Learning Feedback ● High Quality Name Pairs in every language pair ● Confidence Modeling ● Semantic Model ● Baseline “normal” ● Queries ● Visualizations
  19. PUTTING IT ALL TOGETHER: People ● Organizations ● Locations ● Relationships. Searching ● Alerting ● Anomaly Detection ● Reporting
  20. THIS TECHNOLOGY IS ALREADY AT WORK
  21. CAPTURING EL CHAPO. Source: U.S. Immigration and Customs Enforcement
  22. CAPTURING EL CHAPO. Source: El Chapo recaptured in gun battle
  23. KEY DOMAINS OF IMPACT: National Security ● Financial Services ● Law Enforcement ● Intelligence
  24. THANK YOU. Chris Mack ● Basis Technology ● I design & implement NLP / NLU solutions for good ● Please reach out! @cgmack
  25. Thank You
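
The name-variant problem on slide 12 can be illustrated with a small sketch. This is not the matching approach from the talk (the slides do not describe one). It assumes Python's standard-library `difflib.SequenceMatcher` as a stand-in similarity measure, and the helper names `normalize`, `name_similarity`, `is_variant` and the 0.7 threshold are hypothetical choices made for illustration.

```python
import re
from difflib import SequenceMatcher

# Query name and variants as listed on slide 12.
QUERY = "Abdul-Rasheed"
VARIANTS = [
    "abdul rashid", "abdal rashide", "abdal-rasheed", "abdul-rashiyd",
    "abdul-rachid", "abd-errshiyd", "abd-errchide", "abd-errcheed",
    "abd-errchiyd",
]


def normalize(name: str) -> str:
    """Lowercase and keep only a-z, so spacing and hyphens stop mattering."""
    return re.sub(r"[^a-z]", "", name.lower())


def name_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_variant(a: str, b: str, threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: treat names as variants above a threshold."""
    return name_similarity(a, b) >= threshold


if __name__ == "__main__":
    for variant in VARIANTS:
        exact = normalize(variant) == normalize(QUERY)
        score = name_similarity(QUERY, variant)
        print(f"{variant:15s} exact={exact!s:5s} "
              f"score={score:.2f} fuzzy={is_variant(QUERY, variant)}")
```

Under these assumptions, exact comparison matches none of the slide's variants, and the character-level score accepts the `abdul`/`abdal` spellings but rejects the four `abd-err...` forms. That gap is the false-negative risk the slide points at: generic string similarity alone is unlikely to cover transliteration variants, and lowering the threshold to catch them would admit more false positives.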

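The keyword anti-pattern on slide 11 can be sketched in the same spirit. The example below uses the slide's own text snippet; the keyword list and the function names `keyword_flag`, `split_sentences` and `sentence_flag` are hypothetical, and the sentence-scoped check is only a slightly less naive rule, not the contextual NLP the talk proposes.

```python
import re

# Hypothetical risk vocabulary, chosen only for this illustration.
RISK_KEYWORDS = ("corruption", "convicted", "fraud", "laundering")

# Text snippet from slide 11, ellipses kept as they appear.
SNIPPET = ("... government officials were convicted of corruption. "
           "ABC Company saw a drop in sales as …")


def keyword_flag(text: str, entity: str) -> bool:
    """Anti-pattern: flag the entity if any risk keyword occurs anywhere."""
    lowered = text.lower()
    return entity.lower() in lowered and any(k in lowered for k in RISK_KEYWORDS)


def split_sentences(text: str) -> list[str]:
    """Crude splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def sentence_flag(text: str, entity: str) -> bool:
    """Flag only when entity and keyword share a sentence (still a rule)."""
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        if entity.lower() in lowered and any(k in lowered for k in RISK_KEYWORDS):
            return True
    return False


if __name__ == "__main__":
    print("document-level keyword rule:", keyword_flag(SNIPPET, "ABC Company"))
    print("sentence-scoped rule:       ", sentence_flag(SNIPPET, "ABC Company"))
```

The document-level rule flags ABC Company because "corruption" appears somewhere in the text, a false positive; scoping the check to one sentence avoids it here. Sentence scoping will still fail on pronouns, negation, or facts spread across sentences, which is why the slide lists naive rules as an anti-pattern alongside keywords.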