DETECTING A HACKED
TWEET
with Machine Learning and Artificial Intelligence
Sponsored by
Kory Becker 2015
http://primaryobjects.com/cms/article158
http://linkedin.com/in/korybecker
http://twitter.com/primaryobjects
APRIL 23, 2013 1:15PM
143 POINT DROP
ALL YOUR DATA ARE BELONG TO US
• Accord.NET SVM: tried a Gaussian kernel (96%), then a linear kernel (97%)
• Extract Tweets with TweetSharp
• Create Document Corpus (6,054 tweets)
• Create Vocabulary (2,225 words)
• Digitize Corpus
• Porter Stemmer (“talking” => “talk”, “explosion” => “explos”)
• Term Frequency-Inverse Document Frequency (TF*IDF)
• Word Existence
• Vector Size = Vocabulary Size | Matrix = double[6054][2225]
ACCURACY
100% TRAINING
97.38% CV
96.23% TEST
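
The three accuracy figures above come from scoring the classifier on three separate subsets of the tweets. The following is a minimal Python sketch of that bookkeeping, not the Accord.NET code used for the talk. The 60/20/20 split ratio, the fixed seed, and the names `split_dataset` and `accuracy` are assumptions for illustration; the slides do not state how the sets were divided.

```python
import random


def split_dataset(rows, train=0.6, cv=0.2, seed=0):
    """Shuffle rows and split them into training, cross validation, and test sets.

    The 60/20/20 ratio is an assumption; the talk does not give the real split.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_cv = round(len(shuffled) * cv)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_cv],
        shuffled[n_train + n_cv:],
    )


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Toy data: 100 placeholder rows standing in for digitized tweets.
train_set, cv_set, test_set = split_dataset(list(range(100)))
print(len(train_set), len(cv_set), len(test_set))

# Three of four toy predictions match their labels.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Scoring on the cross validation and test sets matters because a perfect training score alone says nothing about tweets the model has not seen.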
CONCLUSION
Kory Becker
http://linkedin.com/in/korybecker
http://twitter.com/primaryobjects
Detecting a Hacked Tweet with Machine Learning
http://primaryobjects.com/CMS/Article158
An Intelligent Approach to Image
Classification By Color
http://primaryobjects.com/CMS/Article154
Self-Programming Artificial Intelligence
http://primaryobjects.com/CMS/Article149


Detecting a Hacked Tweet with Machine Learning (5 Minute Presentation)


Editor's Notes

  1. Introduction. My name is Kory Becker. I'm a Software Architect at The Associated Press. I develop web applications by day and have a fascination with artificial intelligence. If you like, you can follow the (short) slides for this talk at slideshare.net/korybecker.
  2. What? On April 23, 2013, the stock market experienced one of its biggest flash-crash drops of the year, with the Dow Jones Industrial Average falling 143 points (roughly 1%) in a matter of minutes. Unlike the 2012 stock market blip, this one wasn't caused by an individual trade, but by a single tweet from The Associated Press account on Twitter. The tweet, of course, wasn't written by AP, but by an impostor (the Syrian Electronic Army claimed responsibility) who had temporarily gained control of the account. Could a computer program have detected the tweet as hacked? The tweet was "Breaking: Two Explosions in the White House and Barack Obama is injured". There are a couple of specific characteristics of this text. The term "Breaking" has the wrong casing for an AP tweet; AP would usually write it in all capitals. The combination of "White House" + "and" + "Barack Obama" is also rare. Maybe a computer could pick up on this? So, what did we do?
  3. How? The idea was to write a program using artificial intelligence, specifically a machine learning algorithm with supervised learning. The computer is given a list of tweets and told whether each tweet is real or fake. It can then learn the common terms in each category and (hopefully) figure out how to detect the hacked tweet. Using the Accord.NET machine learning library, I started by implementing a support vector machine (SVM) with a Gaussian kernel. SVMs work with different kernels, and a Gaussian kernel allows fitting data points in a variety of non-linear shapes (round, curvy, etc.). I extracted tweets using the TweetSharp library. I created a document corpus of about 6,000 tweets and a vocabulary of about 2,000 words. The documents were digitized by tokenizing the tweets, running the Porter stemmer to shorten words, and then creating a bag-of-words model. Each tweet's unique terms were added to the vocabulary. Then you loop through each tweet and check every vocabulary word against it. If the word appears in the tweet, you mark a 1 in that word's position in the tweet's array. If it doesn't, you mark a 0. You end up with an array of 1's and 0's for each tweet, as long as the vocabulary. This is perfect for training a machine learning program. To train and test the accuracy, the tweets were split into training, cross validation, and test sets. The computer uses the training set to learn which tweets it classifies right or wrong and to fine-tune its model. It then runs against the cross validation set to see how it does on tweets that it hasn't trained on. So, what were the results?
  4. Result? The Gaussian kernel did pretty well. It scored 99.7% accuracy on the training set and 96% on the cross validation set. The SVM was then switched to a linear kernel. This bumped the accuracy up to 100% on training and 97% on cross validation. OK, but did it detect the hacked tweet? The initial training set contained random tweets from AP and non-AP Twitter accounts. The model correctly classified AP tweets, but failed on the particular hacked tweet. I added tweets to the training set from searches such as "-from:AP obama" and "-from:AP breaking", so that it had knowledge of the actual topic. And what do you know, it worked!
  5. Conclusion. There are a lot more details in this project, including some cool learning curve charts and examples of tweets being classified. You can read my full article at http://www.primaryobjects.com/cms/article158 (the top link in the last slide). There are some code samples for setting up the SVM, and you can even download the test set results. If you're curious about artificial intelligence, I also have some other interesting articles, including Self-Programming Artificial Intelligence (the last link in the slide), where a computer program uses genetic algorithms to successfully write its own computer programs. Scary stuff! In conclusion, my name is Kory Becker. Feel free to chat if you have any questions, or connect online via @primaryobjects on Twitter or Kory Becker on LinkedIn. Thanks.
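
The digitizing steps in note 3 and the two kernels compared in notes 3 and 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Accord.NET and TweetSharp code behind the talk. `crude_stem` is a hypothetical suffix stripper standing in for a real Porter stemmer, the two example tweets are made up, and the Gaussian kernel uses the common exp(-||x - y||^2 / (2 * sigma^2)) form with an arbitrary `sigma`.

```python
import math
import re

# Hypothetical crude suffix stripper; a stand-in for a real Porter stemmer.
SUFFIXES = ("ing", "ion", "s")


def crude_stem(word):
    """Strip the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def tokenize(tweet):
    """Lowercase the tweet, split it into words, and stem each word."""
    return [crude_stem(w) for w in re.findall(r"[a-z']+", tweet.lower())]


def build_vocabulary(tweets):
    """Collect each unique stemmed term, in order of first appearance."""
    vocab, seen = [], set()
    for tweet in tweets:
        for term in tokenize(tweet):
            if term not in seen:
                seen.add(term)
                vocab.append(term)
    return vocab


def digitize(tweet, vocab):
    """Word-existence vector: 1 if the vocabulary term occurs in the tweet, else 0."""
    terms = set(tokenize(tweet))
    return [1 if term in terms else 0 for term in vocab]


def linear_kernel(x, y):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) similarity; sigma is an assumed, untuned width."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


# Made-up example tweets, not taken from the real corpus.
tweets = [
    "BREAKING: explosion near the market",
    "Traders talking about the market",
]
vocab = build_vocabulary(tweets)
a, b = (digitize(t, vocab) for t in tweets)
print(vocab)
print(a, b)
print(linear_kernel(a, b), gaussian_kernel(a, b))
```

With the real corpus, each vector would have one entry per vocabulary word (2,225 in the slides), giving the 6,054 by 2,225 matrix that the SVM trains on.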