Finding Ostriches in the Courtroom
Enabling Insight with Linguistic Visualization

                               Christopher Collins
                            University of Toronto (to Dec 2009)
      University of Ontario Institute of Technology (Jan 2010-)
Target Audience

          General Public     Domain Experts     Language Researchers

          Real-time   Single Document       Linguistic
                       Discrete Corpus        NLP
                      Continuous Corpus        CL
Problem Areas

         Real-time   Single Document     Linguistic
                      Discrete Corpus      NLP
                     Continuous Corpus      CL
Humans have reached
their cognitive capacity.
Information is overwhelming
         because of
      the naïve manner
   in which it is delivered.
External Cognition
• External cognition is the interaction
  between internal and external
  representations when performing cognitive
  tasks.
• Computational offloading is the extent to
  which external representations can reduce
  the amount of cognitive effort to solve a
  problem.
  Yvonne Rogers, New Theoretical Approaches for Human-Computer Interaction, 2004.
Document Visualization




  Collins, C.; Carpendale, S.; Penn, G. DocuBurst: Visualizing Document Content using Language Structure. Proceedings of Eurographics/IEEE VGTC Symposium on Visualization, June 2009.
Many Eyes Tag Cloud
Mihalcea and Tarau, 2004
DocuBurst
          games → game
          taken → take

          absolute,noun,10
          chair,noun,2
          moment,noun,11
          game,noun,30
          reality,noun,3
          take,verb,13
          represent,verb,17
          ...

          WordNet:
          game IS activity
          chair IS furniture
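
The steps on this slide (reduce words to lemmas, count each lemma by part of speech, then relate nouns through WordNet IS-A links) can be sketched roughly as follows. This is a minimal illustration, not DocuBurst's implementation: the `LEMMAS` and `IS_A` tables are tiny hypothetical stand-ins for a real lemmatizer and for WordNet, and the idea of rolling counts up to parent concepts is an assumption about how the hierarchy is used.

```python
from collections import Counter

# Hypothetical lemma table standing in for a real lemmatizer.
LEMMAS = {"games": "game", "taken": "take"}

# Hypothetical fragment of an IS-A hierarchy (child -> parent).
# The two pairs come from the slide; real WordNet is far larger.
IS_A = {"game": "activity", "chair": "furniture"}


def lemma_counts(tagged_tokens):
    """Count (lemma, part-of-speech) pairs from (word, pos) tokens."""
    counts = Counter()
    for word, pos in tagged_tokens:
        lemma = LEMMAS.get(word.lower(), word.lower())
        counts[(lemma, pos)] += 1
    return counts


def to_rows(counts):
    """Format counts as 'lemma,pos,count' rows, sorted for stable output."""
    return [f"{lemma},{pos},{n}" for (lemma, pos), n in sorted(counts.items())]


def roll_up(counts):
    """Add each noun's count to itself and to every IS-A ancestor."""
    totals = Counter()
    for (lemma, pos), n in counts.items():
        if pos != "noun":
            continue
        node, seen = lemma, set()
        while node is not None and node not in seen:  # guard against cycles
            seen.add(node)
            totals[node] += n
            node = IS_A.get(node)
    return totals


tokens = [("games", "noun"), ("game", "noun"), ("chair", "noun"), ("taken", "verb")]
counts = lemma_counts(tokens)
rows = to_rows(counts)
totals = roll_up(counts)
```

With a real corpus, the lemma table would be replaced by a lemmatizer and the `IS_A` map by lookups into WordNet's hypernym graph.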
U.S. Presidential Debates
Corpus Visualization

• Beyond similarity and clustering
  – How do we discern differences within and between
    document collections?




  Collins, C.; Viégas, F.; Wattenberg, M. Parallel Tag Clouds to Explore and Analyze Faceted Text Corpora. To appear in Proc. IEEE Symposium on Visual Analytics Science & Technology (VAST), 2009.
Our Data: U.S. Federal Court Decisions




Data from public.resource.org
Visualization Design

[Figure labels: Patent, Invention]

     • Size = significance of difference (G2 score)
     • Order = alphabetic
     • Edges = word occurring in multiple columns
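
As a rough sketch of how a G2 (log-likelihood ratio) score might be computed for one word, assuming a standard 2x2 contingency table that compares the word's count in one column with its count in the rest of the corpus. The function and parameter names are hypothetical, and the slide does not confirm this exact formulation.

```python
import math


def g2_score(count_a, total_a, count_b, total_b):
    """Log-likelihood ratio (G2) for one word in corpus A versus corpus B.

    count_*: occurrences of the word; total_*: all word tokens in that corpus.
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    grand = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [count_a + count_b, grand - count_a - count_b]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[i] * col_sums[j] / grand
                g2 += o * math.log(o / expected)
    return 2.0 * g2


same_rate = g2_score(10, 1000, 10, 1000)
small_gap = g2_score(12, 1000, 10, 1000)
large_gap = g2_score(30, 1000, 5, 1000)
```

A word used at the same rate in both corpora scores zero, and the score grows as the usage rates diverge, which is what lets it drive word size.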
Ostriches in the 7th Circuit
Highfalutin Judge Selya

furculum
immurement
impuissant
Bridging the Linguistic Divide

• Open APIs for data: NYT, Twitter, Google
• Open APIs for NLP?
  – Summarization
  – Keyword extraction
  – Sentiment analysis
• Toolkits and APIs for visualization: Processing, Raphaël, Flare, Flash
Visualization
 Augments
  Reading


       www.christophercollins.ca
