AI for Good: Bad Guys, Messy Data, & NLP

All the information needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle-in-a-stack-of-needles problem is huge and growing: unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing (NLP), the field of AI that focuses on human language, have for the first time made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured textual data.

  1. Gil Irizarry, May 22, 2019. AI FOR GOOD: Bad Guys, Messy Data, & NLP
  2. LIQUID TRAVEL BAN. AI FOR GOOD ● BASIS TECHNOLOGY
  3. LIQUID BOMB PLOT
  4. JIHADI BRIDES TRAGEDY. Image sources: Bethnal trio: Mirror; article: Independent
  5. ALL THE EVIDENCE EXISTS. Scotland Yard Report; ID; Social Activity. Image source: tweet: ISD Global
  6. WHAT’S AT STAKE. FINANCIAL STABILITY: Global Money Laundering Operations, 1% of Illegal Funds Captured. PUBLIC SAFETY: Deaths from Terrorist Attacks in Europe, 11,288 from 1970-2017. Sources: terrorism: Washington Post; money laundering: Wall Street Journal
  7. UNPACKING THE AI SYSTEM
  8. THE PROPOSED SOLUTION: NLP/NLU
  9. COMMON PATTERN. COLLECT, EXTRACT, COMBINE, ANALYZE. 80% of data is unstructured. Join Processed and Structured Data into Knowledge Graph. 1) Natural Language Processing Extracts Facts 2) Scored for confidence & relevance. Mine Graph For Patterns & Changes. People, Organizations, Locations, Relationships. Searching, Alerting, Anomaly Detection, Reporting, ...
  10. CHALLENGES AT EVERY LEVEL. COLLECT, EXTRACT, COMBINE, ANALYZE. ● Domains ● Languages ● Training Data ● Data Salad! ● Data Access ● Duplication ● Variation ● Ambiguity ● Semantics ● Honey Pots ● Training Data ● GIGO ● Data Overload ● Alert Bombs ● Privacy ● Trust
  11. CHALLENGES AND ANTI-PATTERNS. COLLECT, EXTRACT, COMBINE, ANALYZE. Identifying Context: 1) Reliance on Keywords 2) Naive Rules. Leads to False Positives and False Negatives. Sample text: "... government officials were convicted of corruption. ABC Company saw a drop in sales as …"
  12. CHALLENGES AND ANTI-PATTERNS. COLLECT, EXTRACT, COMBINE, ANALYZE. Identifying Proper Names: 3) Name Variants 4) Name Parts (common keys). Leads to False Positives and False Negatives. abdul rashid, abdal rashide, abdal-rasheed, abdul-rashiyd, abdul-rachid, abd-errshiyd, abd-errchide, abd-errcheed ➔ Abdul-Rasheed
  13. BOSTON BOMBING
  14. CHALLENGES AND ANTI-PATTERNS. COLLECT, EXTRACT, COMBINE, ANALYZE. 3) Failure to match variants 4) Failure to disambiguate 5) Failure to model what matters 6) Monolingual design. “Operation Hairball”
  15. CROSS-LINGUAL SEMANTIC MODELING
  16. CROSS-LINGUAL SEMANTIC MODELING. Mapping Algorithm; Arabic, English, Chinese; Multilingual Embeddings Space
  17. CROSS-LINGUAL SEMANTIC MODELING. Machine Learning; למידה חישובית; Eagle Pharmaceuticals Inc.; Eagle Drugs, Co.; Tesla Energy Storage; טסלה; AI; تيسلا موتورز; 計算学習; אחסון אנרגיה
  18. AI BUILDING BLOCKS: Algorithms & High Quality Data. COLLECT, EXTRACT, COMBINE, ANALYZE. ● NN NER ● NN CLASS ● NN RELAX ● SVM ● TEXT EMBEDDINGS ● NNs ● NL SEARCH ● CLASSIC ML ● ANOMALY DETECTION ● HMM ● SEMANTIC MODELING ● GRAPH SIMILARITY ● Data Filtering ● Classification ● Deduplication ● High Quality Annotations ● Language & domain combos ● Active Learning Feedback ● High Quality Name Pairs in every language pair ● Confidence Modeling ● Semantic Model ● Baseline “normal” ● Queries ● Visualizations
  19. PUTTING IT ALL TOGETHER. COLLECT, EXTRACT, COMBINE, ANALYZE. People, Organizations, Locations, Relationships. Searching, Alerting, Anomaly Detection, Reporting, ...
  20. THIS TECHNOLOGY IS ALREADY AT WORK
  21. CAPTURING EL CHAPO. Source: U.S. Immigration and Customs Enforcement
  22. KEY DOMAINS OF IMPACT: National Security, Financial Services, Law Enforcement, Intelligence
  23. THANK YOU. Gil Irizarry ● Basis Technology ● I engineer NLP / NLU tech for good ● Please reach out! @conoagil
  24. Thank You
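
Slide 12 lists spelling variants of a single name (abdul rashid, abdal-rasheed, abd-errcheed, and so on) that should all resolve to Abdul-Rasheed. The following is a minimal sketch of the simplest possible approach, not the method used in the talk: it normalizes case, accents, and punctuation, then scores pairs with Python's standard `difflib.SequenceMatcher`. The function names and the 0.8 threshold are assumptions made for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents, and drop everything except a-z."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return re.sub(r"[^a-z]", "", without_accents.lower())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_variants(query: str, candidates: list[str], threshold: float = 0.8):
    """Return (candidate, score) pairs at or above the threshold, best first.

    The threshold is a hypothetical value chosen for this sketch.
    """
    scored = [(c, similarity(query, c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    variants = [
        "abdul rashid", "abdal rashide", "abdal-rasheed", "abdul-rashiyd",
        "abdul-rachid", "abd-errshiyd", "abd-errchide", "abd-errcheed",
    ]
    for name, score in match_variants("Abdul-Rasheed", variants, threshold=0.0):
        print(f"{name:15s} {score:.2f}")
```

Plain character similarity handles the close spellings reasonably well, but forms such as abd-errshiyd may score noticeably lower even though they transliterate the same name. That gap is one way to read the "failure to match variants" anti-pattern on slide 14: string distance alone does not model how names vary across scripts and transliteration conventions.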

Editor's Notes


  • All the information needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle-in-a-stack-of-needles problem is huge and growing: unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing (NLP), the field of AI that focuses on human language, have for the first time made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured textual data. In this talk, I will share some of the common patterns, common mistakes, and opportunities that I see in the field.


    These innovations unlock a new world of actionable insight, providing much-needed ammunition in the fight against fraud, money-laundering, financial crime, and terrorism.
  • As we all know, you can’t take more than small containers of liquids or gels onto commercial flights, but most people don’t know the events that led up to that regulation.

    USA “3-1-1 Liquids Rule” (source)
    Each passenger may carry liquids, gels and aerosols in travel-size containers that are 3.4 ounces or 100 milliliters. Each passenger is limited to one quart-size bag of liquids, gels and aerosols. Common travel items that must comply with the 3-1-1 liquids rule include toothpaste, shampoo, conditioner, mouthwash and lotion.

    German Rule (source)
    Containers holding liquids may not be larger than 100 ml, otherwise you may not carry them in your hand luggage. All such containers must be placed in a transparent, reclosable plastic bag with a capacity of no more than one liter (for example, an ordinary freezer bag with zipper). The bag may contain any number of containers as long as it is still possible to completely close it. Please remember: Each passenger may only take one such bag on board the plane.
  • PUNCHLINE
    In August of 2006, seven aircraft did not explode during their flight over the Atlantic. Instead of plunging into the ocean, they landed on runways—a happy ending to what would have been a human tragedy had law enforcement not been tipped off by some carefully crafted AI.
  • The sad part of this situation is that it could have been avoided.
    The data existed; the crime (effectively the manipulation and kidnapping of a minor) could have been prevented.

  • For US audience:

    US Homeland Attacks (2015-2018)

    64 plots disrupted
    21 plots executed
  • The data and technology exist to make the world a safer place, and it’s already begun to make an impact.
  • A more recent example of NLP being used to analyze documents for national security is the 2016 capture and subsequent conviction of El Chapo. To find him, intelligence officers analyzed communications from El Chapo’s network (email, phone, and SMS; SIGINT). They used semantic technology to examine the content of his and his network’s conversations, determine who was talking about drugs, and link the people mentioned in the text into a network. This led to the identification of the people involved with El Chapo, allowing agencies to locate and capture him.
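
The linking step described in the El Chapo note, turning people mentioned in topic-relevant conversations into a network, can be sketched roughly as follows. This is an illustration under stated assumptions, not the system the agencies used: it assumes an upstream NLP step has already produced, for each message, a list of mentioned people and a topic relevance score. The field names `people` and `relevance`, the 0.5 cutoff, and the placeholder names are all hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_network(messages, min_relevance=0.5):
    """Link people who are mentioned together in a topic-relevant message.

    Edge weights count how many relevant messages mention both people.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for msg in messages:
        if msg["relevance"] < min_relevance:
            continue  # skip conversations the upstream model scored as off-topic
        for a, b in combinations(sorted(set(msg["people"])), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def connected_to(graph, seed):
    """Breadth-first search: everyone reachable from the seed person."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        person = queue.popleft()
        for neighbor in graph.get(person, {}):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen - {seed}


# Hypothetical, already-extracted records (placeholder names, made-up scores).
messages = [
    {"people": ["Person A", "Person B"], "relevance": 0.9},
    {"people": ["Person B", "Person C"], "relevance": 0.8},
    {"people": ["Person C", "Person D"], "relevance": 0.1},
    {"people": ["Person A", "Person B"], "relevance": 0.7},
]
network = build_network(messages)
```

Calling `connected_to(network, "Person A")` then yields the people linked to the seed through relevant conversations. In a real pipeline the quality of this graph depends almost entirely on the extraction and scoring steps that feed it, which is where the name-matching and context challenges from the slides come in.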
