2. Once a month it's TECHtalk time! First come, first served!
< OMM Solutions GmbH >
3. Talk: The possibilities of information that can be
extracted from seemingly simple data
Speaker: Olaf Horstmann
4. • Data Science != Big Data != Machine Learning != AI
• Data Science = Extracting information from any data
• Big Data = terabytes to petabytes of data
• Machine Learning = Training algorithms to classify or predict specific data
• AI = Training an algorithm to make correct decisions in a wide range of different situations
• Data does not deliver “truth”
• Every piece of extracted information is just an assumption or hypothesis
...things to remember
Prerequisite
5. Data from a fitness-tracker
Images by Strava; https://www.strava.com/heatmap#7.00/-120.90000/38.36000/hot/all
6. Runners
Data from a fitness-tracker
Images by Strava; https://www.strava.com/heatmap#7.00/-120.90000/38.36000/hot/all
7. Cyclists
Data from a fitness-tracker
Images by Strava; https://www.strava.com/heatmap#7.00/-120.90000/38.36000/hot/all
8. Sometimes they reveal streets that are not on the map ...
Data from a fitness-tracker
Images by Strava; https://www.strava.com/heatmap#7.00/-120.90000/38.36000/hot/all
9. … somewhere in Afghanistan
Data from a fitness-tracker
Images by Strava; https://www.strava.com/heatmap#7.00/-120.90000/38.36000/hot/all
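One way to picture how such a heatmap can reveal paths that are missing from a map is to count GPS points per grid cell and then look for busy cells that no known street covers. The following is only a minimal sketch under stated assumptions: the coordinates, the grid resolution, the `mapped_cells` set and both function names are hypothetical, and nothing here reflects how Strava actually builds its heatmap.

```python
import math
from collections import Counter


def bin_points(points, cells_per_degree=1000):
    """Count how many (lat, lon) points fall into each grid cell."""
    counts = Counter()
    for lat, lon in points:
        cell = (
            math.floor(lat * cells_per_degree),
            math.floor(lon * cells_per_degree),
        )
        counts[cell] += 1
    return counts


def frequent_unmapped_cells(counts, mapped_cells, min_count=2):
    """Return busy cells that are not covered by any known street."""
    return sorted(
        cell
        for cell, n in counts.items()
        if n >= min_count and cell not in mapped_cells
    )


# Hypothetical tracker samples: three points close together, one elsewhere.
points = [
    (38.36012, -120.90034),
    (38.36018, -120.90041),
    (38.36055, -120.90047),
    (38.37250, -120.91530),
]

# Hypothetical set of grid cells that an existing map already marks as street.
mapped_cells = {(38372, -120916)}

counts = bin_points(points)
unmapped = frequent_unmapped_cells(counts, mapped_cells)
print(unmapped)
```

A cell that many trackers pass through but that no map knows about is a candidate for an unmapped path. As with everything else in this talk, that is a hypothesis to check, not a fact.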
10. Example: “Spiegel Mining” by David Kriesel
• 70,000 Spiegel Online articles
• time frame: ~2 years
• everything saved to a database
• analysis based on feature extraction
Data Science on data from a news-website
Source: http://www.dkriesel.com/spiegelmining
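To make "feature extraction" more concrete, here is a minimal sketch of turning stored article records into simple numeric features and aggregating them. The record layout (`title`, `published`, `text`), the sample articles and the chosen features are assumptions for illustration only; they are not taken from the Spiegel Mining project.

```python
from collections import Counter
from datetime import datetime

# Hypothetical article records as they might come out of a database.
articles = [
    {"title": "Example A", "published": "2015-03-02T08:15:00", "text": "one two three four"},
    {"title": "Example B", "published": "2015-03-02T21:40:00", "text": "one two"},
    {"title": "Example C", "published": "2015-03-07T10:05:00", "text": "one two three"},
]


def extract_features(article):
    """Derive a few simple features from one article record."""
    published = datetime.fromisoformat(article["published"])
    return {
        "word_count": len(article["text"].split()),
        "hour": published.hour,
        "weekday": published.weekday(),  # Monday is 0, Sunday is 6
    }


features = [extract_features(a) for a in articles]

# Two simple aggregations over the extracted features.
articles_per_weekday = Counter(f["weekday"] for f in features)
mean_word_count = sum(f["word_count"] for f in features) / len(features)

print(articles_per_weekday)
print(mean_word_count)
```

Once every article is reduced to a row of features like this, the visualisations and derivations on the following slides become plain aggregations over those rows.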
11. Step 1: Visualisation
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
12. Step 2: “Trivial” derivations
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
13. Step 2: “Trivial” derivations
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
14. Step 3: More in-depth analysis
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
15. Step 4: Somewhat unrelated interpretations
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
16. Step 5: Further unrelated interpretations
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com
17. Step 6: Combination of multiple data sources
Data Science on data from a news-website
Graphs by David Kriesel dkriesel.com; instagram.com; twitter.com
18. Data does not deliver “truth”
Remember
19. • Statistics have shown that employees who used Chrome or Firefox performed better on
employment assessment metrics and stayed on longer than those who did not.
• => “Users of the Chrome and Firefox browsers are better employees.”
• Hurricanes with “relatively feminine” names killed an average of 42 people; those with
“relatively masculine” names killed only 15 on average.
• => “Female-named hurricanes are more deadly.”
• American men 45 to 82 who skip breakfast showed a 27 percent higher risk of coronary heart
disease over a 16-year period.
• => “Skipping breakfast causes coronary heart disease”
...your turn to guess
Correlation != Causation
https://blogs.scientificamerican.com/guest-blog/9-bizarre-and-surprising-insights-from-data-science/
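A small simulation can show how two measurements end up strongly correlated without either one causing the other. In this minimal sketch a hidden factor drives both values, and the correlation largely vanishes once that factor is accounted for. All of it is synthetic: the variable names, the noise level and the seed are assumptions, and it does not claim to explain any of the three examples above.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed so the run is repeatable

# A hidden factor influences both observed values; a and b never touch each other.
hidden = [rng.gauss(0, 1) for _ in range(2000)]
a = [h + 0.3 * rng.gauss(0, 1) for h in hidden]
b = [h + 0.3 * rng.gauss(0, 1) for h in hidden]

raw = pearson(a, b)

# Subtract the hidden factor and correlate what is left over.
a_rest = [x - h for x, h in zip(a, hidden)]
b_rest = [y - h for y, h in zip(b, hidden)]
controlled = pearson(a_rest, b_rest)

print(f"raw correlation:        {raw:.2f}")
print(f"after removing hidden:  {controlled:.2f}")
```

In real data the hidden factor is usually unknown, which is why a strong correlation on its own should be read as a question, not as an answer.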