Enterprise Intelligence: How Context Accumulates
- 1. Enterprise Intelligence
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
© 2012 IBM Corporation
- 2. My Background
Early '80s: Founded Systems Research & Development (SRD), a
custom software consultancy
Personally designed and deployed roughly 100 systems, a number of
which contained multiple billions of transactions describing hundreds
of millions of entities
1989 – 2003: Built numerous systems for Las Vegas casinos
including a technology known as Non-Obvious Relationship
Awareness (NORA)
2001: Funded by In-Q-Tel, the venture capital arm of the CIA
2005: IBM acquires SRD
Today: Primarily focused on 'sensemaking on streams' with
special attention to privacy and civil liberties protections
- 3. Trend: Organizations Are Getting Dumber
“Every two days now we create as much information as we did from
the dawn of civilization up until 2003.”
~ Eric Schmidt, CEO Google
[Chart: Computing Power Growth over Time. Curves: Available Observation Space; Sensemaking Algorithms. Labels: Context; Enterprise Amnesia]
- 4. Amnesia, definition
A defect in memory, especially resulting
from brain damage.
- 5. Enterprise Amnesia, definition
A defect in memory, resulting in wasted
resources, lower revenues, unnecessary
fraud losses, etc.
- 6. Trend: Organizations Are Getting Dumber
[Chart: Computing Power Growth over Time. Curves: Available Observation Space; Sensemaking Algorithms. Labels: Context; WHY?]
- 7. Algorithms at Dead End.
You Can't Squeeze Knowledge Out of a Pixel.
- 8. No Context
scrila34@msn.com
- 9. Context, definition
Better understanding
something by taking into
account the things around it.
- 10. Information in Context … and Accumulating
scrila34@msn.com
Job Applicant
Top 200 Customer
Criminal Investigation
Identity Thief
- 11. The Puzzle Metaphor
Imagine an ever-growing pile of puzzle pieces of varying sizes,
shapes and colors
What it represents is unknown – there is no picture on hand
Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
Some pieces are duplicates, missing, incomplete, low quality, or
have been misinterpreted
Some pieces may even be professionally fabricated lies
Until you take the pieces to the table and attempt assembly,
you don't know what you are dealing with
- 12. Puzzling
[Figure: pieces mixed together from several jigsaw puzzles (Vegas, Neuschwanstein Beauty, Down Home Music, Cottage Garden). Piece counts shown: 270, 200, 150, 30 (duplicates), 6. Percentages shown: 90%, 66%, 50%, 10%, 2%. Puzzle artwork © Ravensburger USA, Inc. and the respective artists and licensors.]
- 20. Incremental Context – Incremental Discovery
6:40pm START
22min “Hey, this one is a duplicate!”
35min “I think some pieces are missing.”
37min “Looks like a bunch of hillbillies on a porch.”
44min “Hillbillies, playing guitars, sitting on a porch, near a barber sign … and a banjo!”
- 22. Incremental Context – Incremental Discovery
47min “We should take the sky and grass off the table.”
2hr “Let's switch sides, and see if we can make sense of this from different perspectives.”
2hr10m “Wait, there are three … no, four puzzles.”
2hr17m “We need a bigger table.”
2hr18m “I think you threw in a few random pieces.”
- 24. How Context Accumulates
With each new observation, one of three assertions is made:
1) un-associated; 2) placed near like neighbors; or 3) connected
Must favor the false negative
New observations sometimes reverse earlier assertions
Some observations produce novel discovery
As the working space expands, computational effort increases
Given sufficient observations, there can come a tipping point
Thereafter, confidence improves while computational effort
decreases!
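The three assertions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product works: observations are modeled as sets of feature strings, `Context`, `Entity` and `connect_at` are hypothetical names, and "connected" simply means sharing at least `connect_at` features with an existing entity. Setting the threshold high is one simple way to favor the false negative.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    features: set = field(default_factory=set)
    neighbors: set = field(default_factory=set)  # ids of entities placed "near" this one


class Context:
    """Hypothetical accumulating context; not taken from the slides."""

    def __init__(self, connect_at=2):
        # Assumed threshold: shared features required before connecting.
        # A higher value favors false negatives over false positives.
        self.connect_at = connect_at
        self.entities = []

    def observe(self, features):
        features = set(features)
        best_id, best_overlap = None, 0
        for i, entity in enumerate(self.entities):
            overlap = len(entity.features & features)
            if overlap > best_overlap:
                best_id, best_overlap = i, overlap

        if best_overlap >= self.connect_at:
            # Assertion 3: connected. The entity absorbs the new features.
            self.entities[best_id].features |= features
            return ("connected", best_id)

        new_id = len(self.entities)
        self.entities.append(Entity(features))
        if best_overlap > 0:
            # Assertion 2: placed near a like neighbor, but not merged.
            self.entities[new_id].neighbors.add(best_id)
            self.entities[best_id].neighbors.add(new_id)
            return ("near", new_id)
        # Assertion 1: un-associated.
        return ("unassociated", new_id)


ctx = Context(connect_at=2)
for obs in [{"name:ann", "phone:555"},
            {"name:ann", "email:a@x"},
            {"name:ann", "phone:555", "dl:1"},
            {"name:bob"}]:
    print(ctx.observe(obs))
```

The sketch leaves out the reversal of earlier assertions and does nothing to reduce computational effort as the working space grows; it only shows the three-way decision made per observation.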
- 25. Big Data [in context]. New Physics.
More data: better predictions
– Lower false positives
– Lower false negatives
More data: bad data becomes good
– Suddenly you are glad your data is not perfect
More data: less compute
- 26. Big Data
Pile of ____ In Context
- 27. One Form of Context: “Expert Counting”
Is it 5 people each with 1 account … or is it 1
person with 5 accounts?
Is it 20 cases of H1N1 in 20 cities … or one
case reported 20 times?
If one cannot count … one cannot estimate
vector or velocity (direction and speed).
Without vector and velocity … prediction is
nearly impossible.
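A small Python sketch of the counting question: given account records, how many distinct people are behind them? It assumes, purely for illustration, that two records sharing any identifier string belong to the same entity; real matching rules are stricter than that. The function name `count_entities` and the identifier format are hypothetical.

```python
def count_entities(records):
    """Count distinct entities among records.

    Each record is a set of identifier strings. Records that share any
    identifier are grouped together (a simplifying assumption).
    """
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {}  # identifier -> index of the first record that carried it
    for i, record in enumerate(records):
        for ident in record:
            if ident in owner:
                parent[find(i)] = find(owner[ident])
            else:
                owner[ident] = i

    return len({find(i) for i in range(len(records))})


# Five accounts that all carry the same SSN.
one_person = [{f"acct:{n}", "ssn:111"} for n in range(5)]
# Five accounts, each with its own SSN.
five_people = [{f"acct:{n}", f"ssn:{n}"} for n in range(5)]

print(count_entities(one_person), count_entities(five_people))
```

Both inputs contain five records; only the resolved count tells the two situations apart.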
- 29. Entity Resolution Demonstration
VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008
DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995
When it comes to best practices in voter matching, if only a name and
year of birth match, this is insufficient proof of a match. Many
different people in the U.S. share a name and year of birth.
Human review is required.
Unfortunately, there are thousands and thousands of cases just like
this and state election offices don't have the staff (or budget) to
manually review such volumes.
- 30. Now Consider This Tertiary DMV Record
VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008
DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995
DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005
The DMV record contains enough features to match the voter record
(name, year of birth and driver's license) and/or the deceased person
record (name, year of birth and SSN). For the sake of argument, let's
say it matches the voter best.
- 31. Features Accumulate
VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008
DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995
DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005
The voter/DMV record now shares a name, year of birth and SSN with
the deceased person record. In voter matching best practices, this
evidence would be sufficient to make a determination that this voter
is in fact deceased. This case no longer needs human review.
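The demonstration can be replayed as a minimal Python sketch. The matching rule used here is an assumption that paraphrases the slides: first name, last name and year of birth must agree, and at least one strong identifier (driver's license or SSN) must also agree. The field names and the helpers `same_person` and `merge` are hypothetical.

```python
STRONG = ("dl", "ssn")  # identifiers treated as strong evidence (assumption)


def same_person(a, b):
    """Name and year of birth must agree, plus at least one strong identifier."""
    if (a["first"], a["last"], a["yob"]) != (b["first"], b["last"], b["yob"]):
        return False
    return any(k in a and k in b and a[k] == b[k] for k in STRONG)


def merge(a, b):
    """Accumulate features: keep everything known about the resolved entity."""
    merged = dict(a)
    for key, value in b.items():
        merged.setdefault(key, value)
    return merged


voter = {"first": "George", "last": "Balston", "yob": 1951,
         "dl": "4801", "last_voted": 2008}
deceased = {"first": "George", "last": "Balston", "yob": 1951,
            "ssn": "5598", "dod": 1995}
dmv = {"first": "George", "last": "Balston", "yob": 1951,
       "ssn": "5598", "dl": "4801"}

# Name and year of birth alone are not enough.
print(same_person(voter, deceased))

# The DMV record shares a driver's license with the voter record ...
voter_dmv = merge(voter, dmv)

# ... and the accumulated SSN now matches the deceased person record.
print(same_person(voter_dmv, deceased))

resolved = merge(voter_dmv, deceased)
print(resolved["last_voted"] > resolved["dod"])  # voted after date of death?
```

The voter and deceased person records never change; what changes the outcome is the SSN that the voter entity picks up from the DMV record.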
- 32. Useful Insight Revealed!
VOTER
George F Balston
YOB: 1951 D/L: 4801
13070 SW Karen Blvd Apt 7
Beaverton, OR 97005
Last voted: 2008
DMV
George F Balston
YOB: 1951 SSN: 5598 D/L: 4801
3043 SW Clementine Blvd Apt 210
Beaverton, OR 97005
DECEASED PERSON
George Balston
YOB: 1951 SSN: 5598
DOD: 1995
As features accumulate it becomes possible to resolve previously
unresolvable identity records.
As events and transactions accumulate, detection of relevance improves.
Here we can see George Balston, who died in 1995, voted in 2008.
- 36. Sense and Respond
[Diagram labels: Observation Space; New Observations; What you know]
- 37. Sense and Respond
[Diagram labels: Observation Space; Data Finds Data; Relevance Finds the Sensor (<200ms); ?; Decide]
- 38. Sense and Respond Explore and Reflect
[Diagram labels: Observation Space; Deep Reflection; Curated Data; Data Finds Data; Pattern Discovery; Directed Attention; Relevance Finds the Sensor (<200ms); ?; Decide; Relevance Finds You]
- 39. Sense and Respond Explore and Reflect
[Diagram labels: Observation Space; Deep Reflection; Curated Data; Data Finds Data; Pattern Discovery; Directed Attention; NEW INTERESTS; Relevance Finds the Sensor (<200ms); ?; Decide]
- 40. Sense and Respond Explore and Reflect
[Diagram labels: Observation Space; Deep Reflection; Curated Data; Data Finds Data; Pattern Discovery; Directed Attention; NEW INTERESTS; Relevance Finds the Sensor (<200ms); ?; Decide. IBM products overlaid on the diagram: InfoSphere Streams; Netezza; SPSS; Watson; Sensemaking; Cognos; ILog]
- 41. Sense and Respond Explore and Reflect
[Diagram labels: Observation Space; Deep Reflection; Curated Data; Data Finds Data; Pattern Discovery; Directed Attention; NEW INTERESTS; Relevance Finds the Sensor (<200ms); ?; Decide; Report and Manage]
- 42. [Diagram labels: Data Finds Data; Pattern Discovery; Directed Attention; NEW INTERESTS; Relevance Finds the Sensor (<200ms); ?; Decide; Report and Manage; Content Management; Info Management; Case Management; Data Systems; Warehousing]
- 44. The Greater the Context, the Greater the Value
[Chart: Value of Data versus Records Managed, from (Big) to (Ludicrous Big). Curves: Data in Context; Pile of Data]
- 45. Time Is Of The Essence
The better the predictions … the faster they will be wanted.
“Why did we have to wait until the end of the day for the smart answer?”
[Chart: Willingness to Wait (Batch; Day; Hour; 200ms Real-Time) versus Relevance, from (Iffy) to (Totally)]
- 47. The most competitive organizations
are going to make sense of what they are observing
fast enough to do something about it
while they are observing it.
- 48. Wish This On The Competitor
[Chart: Computing Power Growth over Time. Curves: Available Observation Space; Sensemaking Algorithms. Labels: Context; Enterprise Amnesia]
- 49. The Way Forward: Enterprise Intelligence
[Chart: Computing Power Growth over Time. Curves: Available Observation Space; Sensemaking Algorithms. Label: Context]
- 50. Related Blog Posts
Algorithms At Dead-End: Cannot Squeeze Knowledge Out Of A Pixel
Puzzling: How Observations Are Accumulated Into Context
On A Smarter Planet … Some Organizations Will Be Smarter-er Than Others
G2 | Sensemaking – One Year Birthday Today. Cognitive Basics Emerging.
- 51. Questions?
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
- 52. Enterprise Intelligence
Jeff Jonas, IBM Distinguished Engineer
Chief Scientist, IBM Entity Analytics
Email: jeffjonas@us.ibm.com
Blog: www.jeffjonas.typepad.com
Twitter: http://www.twitter.com/jeffjonas
- 54. G2 Mission Statement
1) Evaluate each new observation against previous observations.
2) Determine if what is being observed is relevant.
3) Deliver this actionable insight to its consumer … fast enough to do
something about it while it is still happening.
4) Do this with sufficient accuracy and scale to really matter.
- 55. From Pixels to Pictures to Action
[Diagram labels: Observations; Persistent Context; Consumer (an analyst, a system, the sensor itself, etc.); Data Finds Data; Relevance Finds You; This is G2]
- 56. Uniquely G2
More scalable, faster and extensible
– Designed for grid compute and sub-200ms sense and respond
Smarter
– Tolerance for disagreement (no such thing as a single version of truth)
– Support for more abstract entities (e.g., locations, products, asteroids)
– Support for more exotic features (e.g., biometrics, social circles)
Crazy stuff
– Detects on its own when it is confused and makes “note to self”
– Geospatial reasoning including a sense of here and now
Privacy by Design (PbD)
– More privacy and civil liberties enhancing features baked-in than any other
commercial technology
- 57. PbD: Self-Correcting False Positives
Record 1
John T Smith Jr
123 Main Street
703 111-2000
DOB: 03/12/1984
Record 2
John T Smith
123 Main Street
703 111-2000
DL: 009900991
A plausible claim these two people are the same
Until this record comes into view
Record 3
John T Smith Sr
123 Main Street
703 111-2000
DL: 009900991
Which reveals this is a FALSE POSITIVE
- 58. PbD: Self-Correcting False Positives
Record 1
John T Smith Jr
123 Main Street
703 111-2000
DOB: 03/12/1984
Record 2 (moved from record 1 to record 3)
John T Smith
123 Main Street
703 111-2000
DL: 009900991
Record 3
John T Smith Sr
123 Main Street
703 111-2000
DL: 009900991
New Best Practice: FIXED IN REAL-TIME (not end of month)
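One way to picture the self-correction is to re-evaluate earlier pairings every time a record arrives. The Python sketch below is an illustration only and makes several assumptions: the `score` function, its field list, the threshold of 3, and the "mutual best match" rule are all hypothetical, and the sketch forms pairs rather than larger groups.

```python
FIELDS = ("name", "address", "phone", "dl", "dob")


def score(a, b):
    """Count agreeing fields; None means the two records conflict outright."""
    sa, sb = a.get("suffix"), b.get("suffix")
    if sa and sb and sa != sb:
        return None  # e.g. Jr versus Sr cannot be the same person
    return sum(1 for f in FIELDS if f in a and f in b and a[f] == b[f])


def resolve(records, threshold=3):
    """Pair records that are each other's best match at or above threshold."""
    best = {}
    for i, a in enumerate(records):
        scored = [(score(a, b), j) for j, b in enumerate(records) if j != i]
        usable = [(s, j) for s, j in scored if s is not None and s >= threshold]
        if usable:
            best[i] = max(usable)[1]
    return {frozenset((i, j)) for i, j in best.items() if best.get(j) == i}


r1 = {"name": "John T Smith", "suffix": "Jr", "address": "123 Main Street",
      "phone": "703 111-2000", "dob": "03/12/1984"}
r2 = {"name": "John T Smith", "address": "123 Main Street",
      "phone": "703 111-2000", "dl": "009900991"}
r3 = {"name": "John T Smith", "suffix": "Sr", "address": "123 Main Street",
      "phone": "703 111-2000", "dl": "009900991"}

print(resolve([r1, r2]))      # records 1 and 2 are paired (indices 0 and 1)
print(resolve([r1, r2, r3]))  # record 2 now pairs with record 3 instead
```

Because the pairing is recomputed as soon as record 3 arrives, the earlier claim about records 1 and 2 is withdrawn immediately rather than in a periodic batch clean-up.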
- 59. Customer Facing Systems
Fraud Data Mining
Sensemaking
This System That System
Back-of-House Accounting Systems