MongoDB & Machine Learning

16,696 views

Update: Social Harvest is going open source; see http://www.socialharvest.io for more information.

My MongoSV 2011 talk about implementing machine learning and other algorithms in MongoDB, with a small real-world example at the end showing what Social Harvest is doing with MongoDB. For more updates about my research, check out my blog at www.shift8creative.com.

Published in: Technology, Education


  1. <ul>Machine Learning </ul><ul>Tom Maiaroto @shift8creative </ul>
  2. <ul>What is Machine Learning? </ul>
  3. <ul>Algorithms & Approaches </ul><ul>Decision trees, random forests, artificial neural networks, k-NN (nearest neighbour), naive Bayesian classifier </ul>
  4. <ul>Algorithms & Approaches </ul><ul>Decision trees, random forests, artificial neural networks, k-NN (nearest neighbour), naive Bayesian classifier </ul>
  5. <ul>So could machines one day rule the earth? </ul>
  6. <ul>So could machines one day rule the earth? </ul><ul>Maybe (OK, probably not) </ul>
  7. <ul>What can Machine Learning do for Apps? </ul><ul>Spam filtering </ul>
  8. <ul>What can Machine Learning do for Apps? </ul><ul>Auto-tagging </ul>
  9. <ul>What can Machine Learning do for Apps? </ul><ul>All Sorts of Categorization </ul>
  10. <ul>What can Machine Learning do for Apps? </ul><ul>Sentiment Analysis </ul>
  11. <ul>Languages Commonly Used </ul><ul><ul><li>Java </li></ul></ul><ul><ul><ul><li>Java-ML, WEKA, Apache Mahout, many more... </li></ul></ul></ul><ul><ul><li>Python </li></ul></ul><ul><ul><ul><li>NLTK, scikit-learn, PyML, a good deal more... </li></ul></ul></ul><ul><ul><li>C++ </li></ul></ul><ul><ul><ul><li>libDAI, Armadillo, Orange, tons more... </li></ul></ul></ul><ul>and then some others... </ul>
  12. <ul>Languages Commonly Used </ul><ul>http://www.mloss.org </ul>
  13. <ul>MongoDB Too! </ul><ul><ul><li>Map/Reduce
  14. Stored JavaScript
  15. Geo-spatial Indexing
  16. Replication </li></ul></ul>
  17. <ul>Geo-spatial Indexing </ul><ul>Did someone say nearest neighbour? </ul>
  18. <ul>Geo-spatial Indexing </ul><ul>Did someone say nearest neighbour? Design geeks, imagine the visualizations... </ul>
  19. <ul>Replication </ul><ul><ul><li>Store massive amounts of data
  20. Distributed performance benefits
  21. Dedicated databases for calculations </li></ul></ul><ul>All the obvious benefits. </ul>
  22. <ul>Map/Reduce </ul><ul>It's the brain. </ul>
  23. <ul>Map/Reduce </ul><ul>It's the brain. It's not just for aggregation. </ul>
  24. <ul>Map/Reduce </ul><ul>It's the brain. It's not just for aggregation. It's faster than you might think. </ul>
  25. <ul>Map/Reduce </ul><ul>It's the brain. It's not just for aggregation. It's faster than you might think. It runs in the database. </ul>
  26. <ul>Map/Reduce </ul><ul>In the computer... </ul>
  27. <ul>Example Time! </ul><ul>It's simple... Just take this... </ul>
  28. <ul>Example Time! </ul><ul>It's simple... Just take this... </ul>
  29. <ul>Example Time! </ul><ul>Just kidding... Let's Break Down a Naive Bayes Classifier </ul>
  30. <ul>Classification / Naive Bayes </ul><ul>Training the System </ul>
  31. <ul>Classification / Naive Bayes </ul><ul>Training the System: Simple... $inc </ul>
  32. <ul>Classification / Naive Bayes </ul><ul>Just Keep Count of Words per Category </ul>
  33. <ul>Classification / Naive Bayes </ul><ul>Reduce: </ul>
  34. <ul>Classification / Naive Bayes </ul><ul>Reduce: </ul>
  35. <ul>Classification / Naive Bayes </ul><ul>Finalize: </ul>
  36. <ul>Classification / Naive Bayes </ul><ul>Finalize: </ul>
  37. <ul>Classification / Naive Bayes </ul><ul>Call the Command: </ul>
  38. <ul>Classification / Naive Bayes </ul><ul>Results: </ul><ul>You can see the total words. You can also see the word counts per category. </ul>
  39. <ul>Classification / Naive Bayes </ul><ul>Results: </ul><ul>...and of course the scores per category... </ul><ul>cae = arts and entertainment, cs = science ... </ul>
  40. <ul>Classification / Naive Bayes </ul><ul><ul><li>Accurate even with little training
  41. MongoDB on a small VM took 1.7 seconds
  42. Compared to, say, PHP: 33 seconds, and it timed out
  43. More training data == exponentially faster than PHP </li></ul></ul>
  44. <ul>Classification / Naive Bayes </ul><ul><ul><li>This wasn't even a full map/reduce
  45. Your mileage will vary based on formula
  46. You can cache certain values for speed
  47. Don't forget about stored JavaScript (but use it wisely) </li></ul></ul>
  48. <ul>Porter Stemming Algorithm </ul><ul>Thank you, Martin Porter http://tartarus.org/martin/PorterStemmer </ul>
  49. <ul>Porter Stemming Algorithm </ul><ul><ul><li>Exists for nearly every language
  50. MongoDB will use JavaScript of course
  51. Decent execution time </li></ul></ul>
  52. <ul>Porter Stemming Algorithm </ul><ul><ul><li>About 2.5x faster than a PHP class
  53. 663x faster than a web browser </li></ul></ul>
  54. <ul>Porter Stemming Algorithm </ul><ul><ul><li>About 2.5x faster than a PHP class
  55. 663x faster than a web browser
  56. 7x slower than the PHP PECL extension </li></ul></ul>
  57. <ul>Real World Application </ul><ul>Social Harvest analyzes social data from the internet to determine languages spoken, gender, age, sentiment, and categories. </ul><ul>www.social-harvest.com </ul>
  58. <ul>Real World Application </ul><ul>Social Harvest: Who doesn't like pie charts? </ul>
  60. <ul>Follow Tom </ul><ul>@shift8creative www.shift8creative.com www.social-harvest.com www.union-of-rad.com </ul><ul>Thank You! </ul>
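
The Naive Bayes slides (training with $inc, reduce, finalize, results) showed their code as images, so none of it survives in this transcript. As a rough illustration of the same idea, here is a minimal sketch in plain Python rather than MongoDB's JavaScript. It is not the talk's code. The counter layout, the function names, the training sentences, the add-one smoothing, and the uniform prior over categories are all assumptions made for the example; a nested dictionary stands in for the collection that `$inc` would update.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(counts, category, text):
    """Bump the per-category counter for every word in `text`.

    Stands in for an upserting update keyed on the word, such as
    {"$inc": {"counts.<category>": 1}} (field names are hypothetical).
    """
    for word in tokenize(text):
        counts[word][category] += 1


def map_doc(word, per_category, query_words):
    """Emit one (category, partial result) pair per category for a word."""
    for category, n in per_category.items():
        matched = {word: n} if word in query_words else {}
        yield category, {"total": n, "matched": matched}


def reduce_values(values):
    """Sum partial results; the output has the same shape as a map output."""
    total, matched = 0, defaultdict(int)
    for value in values:
        total += value["total"]
        for word, n in value["matched"].items():
            matched[word] += n
    return {"total": total, "matched": dict(matched)}


def finalize(value, query_words, vocab_size):
    """Attach a log-likelihood score using add-one (Laplace) smoothing."""
    denom = value["total"] + vocab_size
    score = sum(math.log((value["matched"].get(w, 0) + 1) / denom)
                for w in query_words)
    return {**value, "score": score}


def classify(counts, text):
    """Run map, reduce, and finalize over the counters for one input text."""
    query_words = tokenize(text)
    grouped = defaultdict(list)
    for word, per_category in counts.items():
        for category, partial in map_doc(word, per_category, query_words):
            grouped[category].append(partial)
    return {category: finalize(reduce_values(partials), query_words, len(counts))
            for category, partials in grouped.items()}


# Hypothetical training sentences and categories, for illustration only.
counts = defaultdict(lambda: defaultdict(int))
train(counts, "science", "The telescope observed a distant galaxy")
train(counts, "science", "A new galaxy survey")
train(counts, "arts", "The gallery opened a new painting exhibit")

results = classify(counts, "a new telescope survey")
best = max(results, key=lambda category: results[category]["score"])
print(best, results[best])
```

Each result carries the total word count for the category, the counts of the matched words, and a score, which loosely mirrors what the results slides describe. Keeping the reduce output in the same shape as the map output means the reduce step can safely be applied more than once to partial results.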

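The Porter stemming slides give timings but no code. The full algorithm runs through several suffix-stripping steps; the sketch below covers only Step 1a (plural suffixes), written in Python rather than the JavaScript that MongoDB would run. The function name is hypothetical, and this is a partial illustration, not a complete stemmer.

```python
def porter_step_1a(word):
    """Apply only Step 1a of the Porter stemmer to a lowercase word."""
    if word.endswith("sses"):
        return word[:-2]   # SSES -> SS
    if word.endswith("ies"):
        return word[:-2]   # IES -> I
    if word.endswith("ss"):
        return word        # SS -> SS (unchanged)
    if word.endswith("s"):
        return word[:-1]   # S -> (removed)
    return word


for example in ("caresses", "ponies", "caress", "cats"):
    print(example, "->", porter_step_1a(example))
```

The rules are checked longest suffix first, so "caresses" is handled by the SSES rule before the plain S rule can apply.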