Jeff jonas big data new physics
MIT Forum of Israel
Big Data 12.3.14
1.
Big Data. New Physics. And Geospatial “Superfood”
Jeff Jonas, IBM Fellow
Chief Scientist, Context Computing
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
© 2014 IBM Corporation
2.
About the Speaker
Jeff Jonas
IBM Fellow, Chief Scientist for Context Computing
Founder and Chief Scientist of Systems Research & Development (SRD), acquired by IBM in 2005
Has been designing, building and deploying entity resolution systems for three decades
This technology is used today by defense & intelligence, financial institutions, humanitarian efforts and more
Today: Primarily focused on ‘sensemaking on streams’ with special attention towards privacy and civil liberties protections
3.
“The data must find the data and the relevance must find the user.”
4.
Trend: Organizations Are Getting Dumber
[Chart labels: Computing Power Growth, Time, Available Observation Space, Sensemaking Algorithms, Context, Enterprise Amnesia]
5.
Trend: Organizations Are Getting Dumber
WHY?
[Chart labels: Computing Power Growth, Time, Available Observation Space, Sensemaking Algorithms, Context]
6.
Algorithms at Dead End.
You Can’t Squeeze Knowledge Out of a Pixel.
7.
No Context
scrila34@msn.com
8.
Context, definition
Better understanding something by taking into account the things around it.
9.
I ducked as the bat flew my way.
Another exciting baseball game …
10.
Information in Context … and Accumulating
scrila34@msn.com
Top 200 Customer
Job Applicant
Twitter Influencer
LinkedIn Career History
AML Investigation
11.
The Puzzle Metaphor
Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors
What it represents is unknown – there is no picture on hand
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted
Some pieces may even be professionally fabricated lies
Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with
12.
270 pieces: 90%
200 pieces: 66%
150 pieces: 50%
6 pieces: 2%
30 pieces: 10% (duplicates)
Puzzling Images: Courtesy Ravensburger © 2011
13.
[Image slide, no text]
14.
[Image slide, no text]
15.
First Discovery
16.
More Data Finds Data
17.
Duplicates in Front Of Your Eyes
18.
First Duplicate Found Here
19.
[Image slide, no text]
20.
Incremental Context – Incremental Discovery
6:40pm START
22min “Hey, this one is a duplicate!”
35min “I think some pieces are missing.”
37min “Looks like a bunch of hillbillies on a porch.”
44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign … and a banjo!”
21.
150 pieces: 50%
22.
Incremental Context – Incremental Discovery
47min “We should take the sky and grass off the table.”
2hr “Let’s switch sides, and see if we can make sense of this from different perspectives.”
2hr10m “Wait, there are three … no, four puzzles.”
2hr17m “We need a bigger table.”
2hr18m “I think you threw in a few random pieces.”
23.
[Image slide, no text]
24.
How Context Accumulates
With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected
Must favor the false negative
New observations sometimes reverse earlier assertions
Some observations produce novel discovery
The emerging picture helps focus collection interests
As the working space expands, computational effort increases
Given sufficient observations, there can come a tipping point
Thereafter, confidence improves while computational effort decreases!
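The three assertions above can be sketched in a few lines of Python. This is only an illustration, not the method used in the talk or in any IBM product: the `Entity` class, the `accumulate` function, the feature strings and the `connect_at` threshold are all assumptions made for the example. Requiring two shared features before connecting is one simple way to favor the false negative.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    features: set = field(default_factory=set)
    neighbors: set = field(default_factory=set)  # indexes of "like neighbor" entities


def accumulate(entities, observation, connect_at=2):
    """Place one observation and return (assertion, entity_index).

    Hypothetical rule: connect when at least `connect_at` features are shared
    with the best-matching entity; with weaker overlap, keep the observation
    separate but record it as a neighbor; with no overlap, leave it unassociated.
    """
    obs = set(observation)
    best, best_overlap = None, 0
    for i, entity in enumerate(entities):
        overlap = len(obs & entity.features)
        if overlap > best_overlap:
            best, best_overlap = i, overlap

    if best is not None and best_overlap >= connect_at:
        entities[best].features |= obs          # 3) connected
        return "connected", best

    entities.append(Entity(features=obs))
    new = len(entities) - 1
    if best is not None:                        # 2) placed near like neighbors
        entities[new].neighbors.add(best)
        entities[best].neighbors.add(new)
        return "near", new
    return "unassociated", new                  # 1) un-associated


entities = []
print(accumulate(entities, {"name:mark smith", "dob:1978-06-12", "phone:555-0100"}))
print(accumulate(entities, {"name:mark smith", "email:ms@example.com"}))
print(accumulate(entities, {"name:mark smith", "dob:1978-06-12"}))
```

The sketch never revisits an earlier decision, so it does not cover the point that new observations sometimes reverse earlier assertions; that would need the ability to split an entity again.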
25.
Overstated Population
[Chart: Unique Identities vs. Observations; reference line: True Population]
26.
Counting Is Difficult
File 1: Mark Smith, 6/12/1978, 443-43-0000
File 2: Mark R Smith, (707) 433-0000, DL: 00001234
27.
The Rise and Fall of a Population
[Chart: Unique Identities vs. Observations; reference line: True Population]
28.
Data Triangulation
File 1: Mark Smith, 6/12/1978, 443-43-0000
File 2: Mark R Smith, (707) 433-0000, DL: 00001234
New Record: Mark Randy Smith, 443-43-0000, DL: 00001234
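The rise and fall of the identity count can be reproduced with a small union-find sketch. It rests on a deliberately simplistic assumption that is not from the slides: two records are the same identity whenever they share any strong identifier. The function name `count_identities` and the `id_number:`, `phone:` and `dl:` prefixes are hypothetical labels (the slide does not say what kind of number 443-43-0000 is).

```python
def count_identities(records):
    """Yield the number of distinct identities after each record arrives.

    Each record is a set of identifier strings. Assumed rule: sharing any
    identifier means "same identity".
    """
    parent = []   # union-find forest over record indexes
    owner = {}    # identifier -> index of the first record that carried it

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    count = 0
    for idx, identifiers in enumerate(records):
        parent.append(idx)
        count += 1                          # every new record starts as a new identity
        for ident in identifiers:
            if ident in owner:
                a, b = find(idx), find(owner[ident])
                if a != b:                  # the shared identifier joins two identities
                    parent[a] = b
                    count -= 1
            else:
                owner[ident] = idx
        yield count


file_1 = {"id_number:443-43-0000"}
file_2 = {"phone:(707) 433-0000", "dl:00001234"}
new_record = {"id_number:443-43-0000", "dl:00001234"}

print(list(count_identities([file_1, file_2, new_record])))
```

The count goes up with the first two records and drops when the third record links them, which is the shape of the population curve on the preceding slide.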
29.
Big Data [in context]. New Physics.
More data: better predictions
– Lower false positives
– Lower false negatives
More data: bad data good
– Suddenly glad your data is not perfect
More data: less compute
30.
Big Data
Pile of ____
Information In Context
31.
One Form of Context: “Expert Counting”
Is it 5 people each with 1 account … or is it 1 person with 5 accounts?
Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?
If one cannot count … one cannot estimate vector or velocity (direction and speed).
Without vector and velocity … prediction is nearly impossible.
32.
Entity Resolution Demonstration
33.
Entity Resolution Demonstration
DECEASED PERSON: George Balston, YOB: 1951, SSN: 5598, DOD: 1995
VOTER: George F Balston, YOB: 1951, D/L: 4801, 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005, Last voted: 2008
When it comes to best practices in voter matching, if only a name and year of birth match, this is insufficient proof of a match. Many different people in the U.S. share a name and year of birth. Human review is required. Unfortunately, there can be many thousands of cases just like this and state election offices don’t have the staff/budget to manually review them all.
34.
Now Consider This Tertiary DMV Record
DECEASED PERSON: George Balston, YOB: 1951, SSN: 5598, DOD: 1995
VOTER: George F Balston, YOB: 1951, D/L: 4801, 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005, Last voted: 2008
DMV: George F Balston, YOB: 1951, SSN: 5598, D/L: 4801, 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005
The DMV record contains enough features to match both the voter (name, year of birth and driver’s license) and/or the deceased persons record (name, year of birth and SSN). For the sake of argument, let’s say it matches the voter best.
35.
Features Accumulate
DECEASED PERSON: George Balston, YOB: 1951, SSN: 5598, DOD: 1995
VOTER: George F Balston, YOB: 1951, D/L: 4801, 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005, Last voted: 2008
DMV: George F Balston, YOB: 1951, SSN: 5598, D/L: 4801, 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005
The voter/DMV record now shares a name, year of birth and SSN with the deceased person. In voter matching best practices, this evidence would be sufficient to make a determination that this voter is likely deceased. This case no longer needs human review.
36.
Useful Insight Revealed!
VOTER: George F Balston, YOB: 1951, D/L: 4801, 13070 SW Karen Blvd Apt 7, Beaverton, OR 97005, Last voted: 2008
DMV: George F Balston, YOB: 1951, SSN: 5598, D/L: 4801, 3043 SW Clementine Blvd Apt 210, Beaverton, OR 97005
DECEASED PERSON: George Balston, YOB: 1951, SSN: 5598, DOD: 1995
As features accumulate it becomes possible to resolve previously unresolvable identity records.
As events and transactions accumulate – detection of relevance improves. Here we can see George who died in 1995 voted in 2008.
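The demonstration on the last four slides can be walked through as a minimal Python sketch. The matching rule (name and year of birth plus one shared identifier) paraphrases the slides; everything else is an assumption for illustration: the field names, the crude `same_name` comparison and the `merge` helper are hypothetical and do not reflect how G2 or InfoSphere Identity Insight work internally.

```python
DECEASED = {"name": "George Balston", "yob": 1951, "ssn": "5598", "dod": 1995}
VOTER = {"name": "George F Balston", "yob": 1951, "dl": "4801", "last_voted": 2008}
DMV = {"name": "George F Balston", "yob": 1951, "ssn": "5598", "dl": "4801"}


def same_name(a, b):
    """Crude comparison: first and last name tokens agree, middle initials ignored."""
    ta, tb = a.lower().split(), b.lower().split()
    return ta[0] == tb[0] and ta[-1] == tb[-1]


def is_match(a, b):
    """Name and year of birth alone are insufficient; require one more shared identifier."""
    if not (same_name(a["name"], b["name"]) and a["yob"] == b["yob"]):
        return False
    return any(k in a and k in b and a[k] == b[k] for k in ("ssn", "dl"))


def merge(a, b):
    """Accumulate features: the merged entity carries every feature of both records."""
    return {**a, **b}


print(is_match(DECEASED, VOTER))          # name + YOB only: left for human review

voter_dmv = merge(VOTER, DMV) if is_match(VOTER, DMV) else VOTER
print(is_match(voter_dmv, DECEASED))      # the SSN from the DMV record now bridges the gap

resolved = merge(voter_dmv, DECEASED)
print(resolved["last_voted"] > resolved["dod"])   # voted after the recorded year of death
```

The point of the sketch is the ordering: the voter and deceased records never match directly, and only resolve once the DMV record has contributed its SSN to the voter entity.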
37.
Expert Counting: Degrees of Difficulty
Exactly Same: Bob Jones 123455 / Bob Jones 123455
Fuzzy: Bob Jones 123455 / Robert T Jonnes 000123455
Incompatible Features: Bob Jones 123455 / bjones@hotmail
Deceit: Bob Jones 123455 / Ken Wells 550119
38.
Deceit Detection Using Context Accumulation
Deceit: Bob Jones 123455 / Ken Wells 550119
Feature Accumulation:
Robert Jones, 123455, POB 13452, DOB 03/12/73
Ken Wells, 550119, POB 999911, DOB 03/12/73, gw3e56@hotmail.com
Bob Jones, POB 13452, DOB 03/12/73, gw3e56@hotmail.com
Resolved! Robert Jones 123455 / Ken Wells 550119
39.
Skilled adversaries use “channel separation” to avoid detection.
Cell Phone #1: Unknown
Cell Phone #2: Unknown
Passport #1: William A.
Bank Acct #1: Billy K.
40.
Detection requires “channel consolidation.”
William A aka Billy K.
• Cell Phone #1
• Cell Phone #2
• Bank Acct #1
• Passport #1
41.
Take Note
To catch clever criminals, one must ...
1) Collect observations the adversary doesn’t know you have
2) Or, be able to perform compute over your observations in a manner the adversary cannot fathom
42.
InfoSphere Identity Insight v8
43.
New Think About Expert Counting
Exactly Same: Bob Jones 123455 / Bob Jones 123455
Fuzzy: Bob Jones 123455 / Robert T Jonnes 000123455
Incompatible Features: Bob Jones 123455 / bjones@hotmail
Deceit: Bob Jones 123455 / Ken Wells 550119
44.
Key Features Enable Expert Counting
People: Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc.
Cars: License Plate No., VIN, Make, Model, Year, Color, Etc.
Router: Serial Number, MAC Address, IP Address, Make, Model, Firmware Version, Etc.
45.
Consider Lying Identical Twins
PASSPORT #123, Sue, 3/3/84, Uberstan, Exp 2011
PASSPORT #123, Sue, 3/3/84, Uberstan, Exp 2011
Fingerprint
DNA
Most Trusted Authority: “Same person – trust me.”
46.
The same thing cannot be in two places … at the same time.
Two different things cannot occupy the same space … at the same time.
47.
Space & Time Enables Absolute Disambiguation
People: When, Where, Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc.
Cars: When, Where, License Plate No., VIN, Make, Model, Year, Color, Etc.
Router: When, Where, Serial Number, MAC Address, IP Address, Make, Model, Firmware Version, Etc.
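One way to use "when" and "where" as disambiguating features is an impossible-travel check: if two observations would require an implausible speed to belong to the same entity, they must be different entities. The sketch below is an assumption-laden illustration, not something taken from the talk: the function names, the `(lat, lon, unix_seconds)` tuple format, the 1000 km/h speed cap, the 100 m same-instant tolerance and the approximate city coordinates are all hypothetical choices.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def could_be_same_entity(obs_a, obs_b, max_speed_kmh=1000.0):
    """Each obs is (lat, lon, unix_seconds).

    Returns False when the implied travel speed is impossible under the
    assumed cap, i.e. the same thing would have to be in two places at once.
    """
    dist = haversine_km(obs_a[0], obs_a[1], obs_b[0], obs_b[1])
    hours = abs(obs_a[2] - obs_b[2]) / 3600.0
    if hours == 0:
        return dist < 0.1   # same instant: must be (nearly) the same place
    return dist / hours <= max_speed_kmh


salem = (44.94, -123.04)     # approximate coordinates, for illustration only
seattle = (47.61, -122.33)

print(could_be_same_entity((*salem, 0), (*seattle, 60)))         # one minute apart
print(could_be_same_entity((*salem, 0), (*seattle, 6 * 3600)))   # six hours apart
```

A check like this can only rule a match out; two observations that pass it may still belong to different entities.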
48.
“Life Arcs” Are Also Telling
Bill Smith, 4/13/67, Salem, Oregon
Address History: Tampa, FL 2008-2008; Biloxi, MS 2005-2008; NY, NY 1996-2005; Tampa, FL 1984-1996
Bill Smith, 4/13/67, Seattle, Washington
Address History: San Diego, CA 2005-2009; San Fran, CA 2005-2005; Phoenix, AZ 1990-2005; San Jose, CA 1982-1990
49.
OMG
50.
Space-Time-Travel
Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone
This data is being “de-identified” and shared with third parties – in volume and in real-time
Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours)
Re-identification (figuring out who is who) is somewhat trivial
And, oh so powerful predictions …
51.
The 10 People I Spend the Most Time With (Not at Home and Not at Work)
1. Michelle J
2. Renee M
3. Peggy M
4. Erin E
5. Joshua J
6. Ivan X
7. Bob Y
8. Amanda H
9. Dane J
10. Wesley R
He must be following me!
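A list like this can in principle be derived from nothing more than co-location counts. The following sketch assumes a toy data format that does not come from the talk: each ping is a `(person, place, hour)` tuple, places are already named, and "time together" is approximated as sharing a place in the same hour. The function name `top_companions` and the sample data are hypothetical.

```python
from collections import Counter


def top_companions(pings, me, home, work, n=10):
    """Rank other people by the number of (place, hour) cells they share with `me`.

    `pings` is a list of (person, place, hour) tuples. Cells at my home or
    work are skipped, mirroring "Not at Home and Not at Work".
    """
    my_cells = {(place, hour) for person, place, hour in pings
                if person == me and place not in (home, work)}
    counts = Counter(person for person, place, hour in pings
                     if person != me and (place, hour) in my_cells)
    return counts.most_common(n)


pings = [
    ("me", "gym", 18), ("me", "cafe", 19), ("me", "home", 22), ("me", "office", 10),
    ("michelle", "gym", 18), ("michelle", "cafe", 19),
    ("renee", "cafe", 19),
    ("bob", "office", 10),      # only seen at work: excluded
    ("ivan", "home", 22),       # only seen at home: excluded
]

print(top_companions(pings, me="me", home="home", work="office"))
```

Real location traces are coordinates and timestamps rather than named places, so a practical version would first bucket them into space and time cells; that step is left out here.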
52.
Consequences
Space-time-travel data is the ultimate biometric
It will enable enormous opportunity
It will unravel one’s secrets
It will challenge existing notions of privacy
Adoption is now accelerating at a blistering pace
53.
[Theatrical Pause]
54.
The G2 | Sensemaking Project
55.
The G2 Vision
1) Evaluate each new observation against previous observations.
2) Determine if what is being observed is relevant.
3) Deliver this actionable insight to its consumer … fast enough to do something about it while it is still happening.
4) Do this with sufficient accuracy and scale to really matter.
56.
Uniquely G2
Real “Context Computing”
– Complete Context: Contextualize diverse observations, each observation benefiting from others
– Current Context: Real-time, incremental integration
– Conflicting Context: High tolerance for disagreement, confusion and uncertainty
– Self-Correcting Context: New observations able to reverse earlier assertions
Engineered ground-up for cloud compute … in support of hemisphere-scale data
Introduce new data sources (e.g., geospatial), new entity types (e.g., vessels), new features (e.g., MAC addresses) … without schema change/re-engineering
From sense to respond in sub-200ms – fast enough to do something about the transaction while it is still happening
Unprecedented number of Privacy by Design (PbD) features baked-in
57.
Privacy by Design (PbD)
1. Full Attribution
2. Tamper Resistant Audit Log
3. Information Transfer Accounting
4. Data Tethering
5. False Negative Favoring
6. Self-Correcting False Positives
7. Analytics on Anonymized Data
http://jeffjonas.typepad.com/jeff_jonas/2012/06/privacy-by-design-in-the-era-of-big-data.html
58.
Example: Self-Correcting False Positive
Record 1: John T Smith Jr, 123 Main Street, 703 111-2000, DOB: 03/12/1984
Record 2: John T Smith, 123 Main Street, 703 111-2000, DL: 009900991
A plausible claim these two people are the same
Until this record comes into view
Record 3: John T Smith Sr, 123 Main Street, 703 111-2000, DL: 009900991
Which reveals this is a FALSE POSITIVE
59.
Example: Self-Correcting False
Positive John T Smith Jr 123 Main Street 703 111-2000 DOB: 03/12/1984 John T Smith 123 Main Street John T Smith Sr 123 Main Street 1 3 2 © 2014 IBM Corporation 59595959 123 Main Street 703 111-2000 DL: 009900991 123 Main Street 703 111-2000 DL: 009900991 New Best Practice: FIXED IN REAL-TIME (not end of month) John T Smith 123 Main Street 703 111-2000 DL: 009900991 2 2
60.
Use Cases
Maritime Domain Awareness
New system lets authorities track suspicious ships
http://www.asiaone.com/print/News/Latest%2BNews/Science%2Band%2BTech/Story/A1Story20130703-434337.html
Voter Registration Modernization
David Becker (PEW Charitable Trust) and Jeff Jonas (IBM) Discuss How G2 Has Helped Modernize Voter Registration in America
http://ibmreferencehub.com/STG/ibm_executive_edge_2013/#gensession_daytwo_jonasbecker
61.
Closing Thoughts
62.
Wish This on the Adversary
[Chart labels: Computing Power Growth, Time, Available Observation Space, Sensemaking Algorithms, Context, Enterprise Amnesia]
63.
Wish This for Yourself: Better Sensemaking Skills
[Chart labels: Computing Power Growth, Time, Available Observation Space, Sensemaking Algorithms, Context]
64.
State of the Union: Isolated Analytics
[Diagram labels: Observation Space, Structured Data Analytics, Unstructured Data Analytics, Social Network Analytics, Action]
65.
The Future: General Purpose Context Accumulation
Data Finds Data
Relevance Finds You
This is G2
[Diagram labels: Observation Space, Information In Context, Consumer (An analyst, a system, the sensor itself, etc.)]
66.
The most competitive organizations are going to make sense of what they are observing fast enough to do something about it while they are observing it.
67.
Related Blog Posts
Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel
Puzzling: How Observations Are Accumulated Into Context
Big Data. New Physics.
On A Smarter Planet … Some Organizations Will Be Smarter-er Than Others
Your Movements Speak for Themselves: Space-Time Travel Data is Analytic Super-Food!
When Federated Search Bites
Data Finds Data
Structuring Unstructured Data
Fantasy Analytics
68.
Questions?
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
69.
Big Data. New Physics. And Geospatial “Superfood”
Jeff Jonas, IBM Fellow
Chief Scientist, Context Computing
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas