neilellis@cazcade.com
THANK YOU FOR BEING
     my guinea pigs today
I’D LOVE YOUR FEEDBACK TOO
SERENDIPITY
How do I find what I am not looking for?
Serendipity: the occurrence and development of
 events by chance in a happy or beneficial way
HOW SERENDIPITY HELPS

• Many new inventions occur because related information
 crosses conventional boundaries, leaving its ghetto.

• Our lives are made richer by discovering ideas and
 experiences outside our comfort zones and habitual patterns.

• Serendipity accelerates information discovery by making new
 and unexpected connections.
Serendipity has led to an incredible number of discoveries.
A PRACTICAL EXERCISE
  Please ll in the forms I handed out.
WHO NEEDS SERENDIPITY?
•   B2B sites - encourage businesses to find ways of collaborating they may
    never have thought of.

•   Social sites - let people discover new friends and new interests.

•   Collaborative software - find projects that could work together in
    unexpected ways.

•   Document management - find documents that help you look at your work
    in a different way.

•   Contact management - find new people you could do business with
    who might not be in a narrowly defined field.
THE THREE STEP PLAN
STEP 1: REMOVE ISOLATION
[Diagram: Science Books, Art Books, Cookery Books]
SEMANTICALLY ISOLATED
[Diagram: User, Blog Posts, Documents]
CONTEXTUALLY ISOLATED
[Diagram: Documents]
CONTENT CONNECTED
[Diagram: Users, Documents, Blog Posts]
SOCIALLY CONNECTED
[Diagram: Documents, Users, Blog Posts]
HIGHLY CONNECTED
GET CONNECTED
•   Contextually isolated systems only show us information regarding a closed set of
    data and activities.

•   Semantically isolated systems only show us information which is similar to other
    information.

•   Content connected systems show us data items that relate to each other, which
    can cross and weaken contextual and semantic boundaries.

•   Socially connected systems show us information regarding our friends and their
    activities, weakening contextual and semantic boundaries.

•   Highly connected systems show us information with n-degrees of separation and
    multiple paths across contextual and semantic boundaries.
OUR STORAGE SYSTEMS
AFFECT HOW CONNECTED
  WE MAKE THE WORLD
FILE BASED STORAGE SEES THE
WORLD AS A SET OF NESTED
 COLLECTIONS OF ISOLATED
        INFORMATION
LIKE FILING CABINETS
OR A WAREHOUSE
RELATIONAL DATABASES AS HIGHLY
  ORGANISED COLLECTIONS OF
        INFORMATION
       WHICH INTERSECT
LIKE ENROLMENT
LIKE BANKING
OR AN OCD LARDER
AND GRAPH DATABASES AS
DISORGANISED BUT HIGHLY
 INTERCONNECTED DATA
LIKE .....
The Human Brain
Ideas
People
The Internet
Data
AND POSSIBLY EVERYTHING!
RDBMS VS GRAPH

• Highly connected systems can be modelled relatively easily on
 an RDBMS, but adding new relationships creates complexity
 and must be planned in advance.

• Querying is easier for semantically and contextually isolated
 models on an RDBMS.

• Querying is extremely messy (indeed!) for highly connected
 models on an RDBMS.
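
The first bullet above can be illustrated with a minimal in-memory sketch. It is not Neo4j and not taken from the talk: the `Graph` class, the node names and the relationship types (`WROTE`, `OWNS`, `MENTIONS`) are all hypothetical. The point is only that when a relationship is a plain (type, target) pair, a new kind of relationship can be added later without planning a schema change.

```python
from collections import defaultdict


class Graph:
    """Hypothetical minimal graph: nodes are strings, edges are (type, target) pairs."""

    def __init__(self):
        self.edges = defaultdict(list)

    def relate(self, source, rel_type, target):
        # Any relationship type is accepted; nothing has to be declared up front.
        self.edges[source].append((rel_type, target))

    def related(self, source, rel_type=None):
        # Return targets of source, optionally restricted to one relationship type.
        return [t for (r, t) in self.edges[source] if rel_type is None or r == rel_type]


g = Graph()
g.relate("alice", "WROTE", "blog-post-1")
g.relate("alice", "OWNS", "document-7")
# A relationship type nobody planned for, added after the fact.
g.relate("blog-post-1", "MENTIONS", "document-7")
```

In a relational model the `MENTIONS` link would typically need a new join table or column before it could be stored.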
STEP 2: USE MULTIPLE HOPS
[Diagram: User, User, Documents]
RECOMMEND A FRIEND
[Diagram: User, Documents]
YOU MIGHT ALSO LIKE
RDBMS VS GRAPH

• Multiple hop queries are horrific under an RDBMS in both
 performance pitfalls and legibility of queries.

• Graph databases love multiple hop logic and, one might say,
 thrive on it. It’s much easier to find related items across
 arbitrary degrees of separation and semantic barriers.
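
As a rough illustration of multiple hop logic, here is a breadth-first sketch over a plain adjacency dictionary. It stands in for what a graph database does natively; the function name `within_hops` and the sample nodes are assumptions made for this example, not part of the talk.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: hop_count} for nodes reachable from start in at most max_hops hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[start]  # the starting node is not a discovery
    return hops


# Hypothetical data: a user, a friend, a colleague of the friend, and two documents.
graph = {
    "you": ["friend"],
    "friend": ["you", "colleague", "cookery_doc"],
    "colleague": ["science_doc"],
}

one_hop = within_hops(graph, "you", 1)
three_hops = within_hops(graph, "you", 3)
```

With one hop only the friend is found; with three hops the colleague's science document appears as well, which is the kind of result a single-hop "friends of mine" query never surfaces.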
STEP 3: WEIGHT AND FILTER
WEIGHT & FILTER

• Proximity still matters: information should be closely
  connected even if it is not semantically or contextually related.

• Relevancy should relate to frequency.

• Filtering can be done manually by users choosing what to
  recommend or pass on.

• If possible, use customer feedback to adjust weighting.
RDBMS VS GRAPH

• An RDBMS has no native way to categorise relationships
 independently of the content, for example ‘like’, ‘owns’, ‘has viewed’.

• An RDBMS has no native way to add meta-data to a relationship
 to help rank its relevancy.

• Graph databases can do both of these and can quickly calculate
 the cost of traversing to an item of content.
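
A small sketch of what typed relationships with meta-data might look like, and how the cost of traversing to an item could be calculated. This is an in-memory illustration, not a database API. The `Relationship` class, the `CITES` type, the sample data and the convention that a lower weight means a stronger link are all assumptions. The cost calculation is Dijkstra's shortest-path algorithm, chosen here for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    source: str
    rel_type: str  # e.g. "LIKES", "OWNS", "HAS_VIEWED"
    target: str
    weight: float  # meta-data on the relationship; lower = stronger link (assumed)


def cheapest_costs(relationships, start, allowed_types=None):
    """Dijkstra: lowest summed weight from start to every reachable node."""
    outgoing = {}
    for rel in relationships:
        # Filter by relationship type, independently of the content it connects.
        if allowed_types is None or rel.rel_type in allowed_types:
            outgoing.setdefault(rel.source, []).append(rel)
    costs = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > costs.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for rel in outgoing.get(node, []):
            new_cost = cost + rel.weight
            if new_cost < costs.get(rel.target, float("inf")):
                costs[rel.target] = new_cost
                heapq.heappush(heap, (new_cost, rel.target))
    return costs


rels = [
    Relationship("alice", "LIKES", "doc_a", 1.0),
    Relationship("alice", "HAS_VIEWED", "doc_b", 3.0),
    Relationship("doc_a", "CITES", "doc_b", 1.0),
    Relationship("doc_b", "CITES", "doc_c", 2.0),
]

all_costs = cheapest_costs(rels, "alice")
viewed_only = cheapest_costs(rels, "alice", {"HAS_VIEWED", "CITES"})
```

Restricting the traversal to some relationship types changes both which items are reachable and how much they cost to reach, which is what makes the type and the weight useful for ranking.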
EXAMPLES
TEFLON FRYING PANS:
SERENDIPITY IN ACTION
[Diagram: Marc Grégoire, Mme. Grégoire]
INVENTED BY MARC GREGOIRE
  AT THE BEHEST OF HIS WIFE
[Diagram: Marc Grégoire, PTFE]
MARC USED PTFE ON HIS TACKLE
[Diagram: Mme. Grégoire]
HIS WIFE WANTED PANS THAT DIDN’T STICK
[Diagram: PTFE]
SEMANTICALLY ISOLATED
[Diagram: Marc Grégoire, Mme. Grégoire, PTFE]
CONTEXTUALLY ISOLATED
[Diagram: Marc Grégoire, Mme. Grégoire, PTFE]
SOCIALLY CONNECTED
[Diagram: Marc Grégoire, Mme. Grégoire, PTFE]
MULTIPLE HOPS
[Diagram: Mme. Grégoire, PTFE]
SERENDIPITY
[Diagram: Marc Grégoire, Mme. Grégoire, PTFE]
HIGHLY CONNECTED SYSTEM
RE-TWEET
RE-TWEETS

• Re-tweets allow rapid dissemination of information beyond a
 limited social group; they cross semantic and contextual
 boundaries.

• Re-tweets can be (and often are) re-tweeted, allowing multiple
 hops.

• Other Twitter users act as the filters, and we further weight by
 reputation.
HAVE YOU FILLED IN YOUR
       FORMS?
WHAT SERENDIPITY ISN’T!

• Random; random combinations of information are just noise.
 Putting Teflon on a dolphin’s nose would not be a useful
 contribution to society. Don’t confuse unexpected with random!

• Accidental; serendipity comes from an attentive, and often
 intuitive, mind receiving diverse information.

• Luck; serendipity is a cognitive process that creates new
 connections between previously unrelated concepts and realises
 the value in them.
THREE STEPS TO SERENDIPITY

• Remove Isolation. Relationships are low cost and can be
 added to data at any point, so create them and create as many
 as possible, ignoring contextual or semantic boundaries.

• Use Multiple Hops. Cross semantic and contextual
 boundaries when providing relevancy.

• Weight and Filter. The value of the information found
 should relate to the route traversed. Allow users to manually
 pass on information to others.
CODING SERENDIPITY
       How can we add serendipity into our systems?

• Information must be able to travel freely between users.

• Information should be able to travel multiple levels of
 indirection with ease.

• Information should have the maximum number of
 interconnections across semantic boundaries.

• Information relationships should be categorised and potentially
 contain meta-data required for weighting.
HOW NEO4J HELPS

• Relationships are created trivially at low cost at any time with
 no regard to semantic boundaries.

• Connected information over many hops can be retrieved
 quickly using Node#traverse or the Traversal framework.

• Relationships can have both types and properties, making
 weight and filter calculations easy.
TAKE AWAY

• Create more relationships.

• Let information cross contextual and semantic boundaries.

• Make sure relevancy is probabilistic, not deterministic.

• Serendipity is not accidental, random or lucky!

• The more heterogeneous and connected your data becomes,
 the more you should consider Neo4j.
@neilellis
neilellis@cazcade.com
AUTOMATIC WEIGHT & FILTER

• Sum the ‘weight’ of each relationship traversed to reach the node.

• Find a random number between 0 and that summed weight.

• Order the discovered nodes by this random value.

• Choose the n nodes with the lowest values.

• By using random numbers we increase serendipity without
 sacrificing relevance.
MANUAL WEIGHT & FILTER

• Re-tweeting or forwarding.

• Tell a friend.

• Like.

• etc.
OTHER EXAMPLES

•   Research papers are a semantically arranged collection of information and
    therefore create semantically isolated areas of information.

•   A lending library is another semantically isolated collection of information.

•   A project management website creates a contextually isolated set of
    information.

•   The internet is a highly connected, disorganised information storage system,
    which leads to a fair amount of serendipity. How many interesting things
    have you ‘stumbled upon’ on the internet? But it still has a tendency to have
    semantic or contextual silos. There’s still a lot of room for improvement.

Editor’s notes

  8. We’ll come back to these forms a little later.
  10. So how can we encourage serendipity?
  12. Semantically grouped collections such as science books, art books and cookery books are unlikely to refer to each other, which keeps the information isolated by its semantics. When these boundaries are crossed we get some of the inventions we saw earlier.
  13. Contextually isolated information is separated by the context the information was created in; i.e. it belongs to a single user, a single team, company or project: anything that links information together into a closed network. When scientists, companies, teams and people communicate their work or interests, great things also happen.
  14. The internet broke away from these two information ghettos by joining documents together, so our information could be connected.
  15. We’ve now moved forward into the socially connected era, where our systems encourage users to spread information: we share, recommend and forward.
  17. But we can go a stage further. Highly connected systems need to connect not just information, but people and information in arbitrary combinations. Furthermore, we need to allow this information to travel across these links in real time.
  18. History shows that when we allow information to flow fast and freely in society we see revolutions in science and spirituality. As our collective understanding increases, so does the welfare of the individual and society. So it is with information systems: by increasing the flow of information we increase its value to all those using it.
  19. Whenever information doesn’t flow, ignorance takes over, and clearly we all suffer for that.
  20. So, recommendation number one: increase connectivity.
  21. But our storage systems affect how connected we make the world.
  22. File-based systems encourage us to dump stuff together, but don’t encourage us to think about how it interconnects. So we end up seeing the world as ....
  25. Relational databases help us to organise and connect related information in a highly organised, formal manner, like ....
  29. Whereas graph databases are more like ....
  38. We also need data to escape its ghettos, and one way to do this is to allow information to travel arbitrary degrees of separation, as emails or tweets do. Not just manually, as in viral marketing, but also automatically: in status updates, suggested content, etc.
  39. We see this already in recommend a friend ....
  40. Or related documents, but the key here is to allow multiple hops across all boundaries, semantic and contextual.
  41. Multiple hop queries are horrific under an RDBMS in both performance pitfalls and legibility of queries. This is the main reason RDBMS-based systems rarely help the spread of information by automatic means and rely on users passing on information instead.
  42. But we don’t want just any old information; we still need to filter according to relevancy.
  43. But the key, I believe, when automating relevancy is not to use relevancy as a fixed, one-off judgement on whether something is visible or not, but rather as an indicator of the likelihood that the information will be visible.
  50. In a semantically isolated example, books would be written about how Teflon helps in fishing. Meanwhile frying pans are only of interest to the catering industry and would not have references to fishing equipment.
  51. In a contextually isolated system Marc would have been busy using Teflon for his fishing equipment and never mentioned it to his wife.
  52. Luckily they talked to each other.
  53. Now in this system, which is not at this point highly connected, information was able to travel multiple hops as Marc discussed his fishing equipment and his wife saw the potential application.
  55. Now we have a highly connected system that has crossed social and semantic boundaries. How long did it take before we had Teflon baking trays, cake tins, etc.? Once a semantic boundary has been broken the process accelerates and the speed at which other boundaries are broken increases.
  56. Re-tweets traverse a graph with ‘n’ degrees of separation. I can be looking at how to increase the viral nature of my new startup when I notice a tweet about the use of landing pages, which leads me to write a viral landing page. Such a collaboration is serendipitous: it is unintentional but beneficial and rewarding.
  57. Re-tweets allow rapid dissemination of information beyond a limited social group. Because of the 5 degrees of separation on Twitter, a single tweet can reach the entire 200 million user base within minutes, as shown by the news of Osama Bin Laden’s death.
  58. Please can you swap forms with one other person .... Now the information on those forms is closely related to you, because most of the people in the room have similar backgrounds. However, it’s outside of your pre-defined social group and the common semantic links between people here. For your homework I’d like you to watch that movie, listen to that music and take a look at that technology!
  60. Weight and Filter: whether users recommend, make favourite lists or send a message, maintain the source of the information for future automatic recommendations. Keep it connected.