SlideShare ist ein Scribd-Unternehmen logo
1 von 15
Downloaden Sie, um offline zu lesen
3rd Ph.D. Review
Tracking trends on the web using novel Machine Learning
                        methods


                       Vasileios Lampos

                                advised by

                     Professor Nello Cristianini

        Intelligent Systems Laboratory, University of Bristol
                            May 25, 2010




                  Vasileios Lampos      3rd Ph.D. Review
Vasileios Lampos   3rd Ph.D. Review
Aims


  The general aims of our research project can be summarised in
  the following points:

       1   Track trends on the Web by applying Machine Learning
           methods (track expresses the notions of infer or predict as
           well)

       2   Extend current or invent new methodologies (where and if
           needed) for accomplishing our primary aim

       3   Build tools that apply the experimental/theoretical results in
           real and large-scale applications (featured research)



                            Vasileios Lampos   3rd Ph.D. Review
Being more specific

    1   Trends about what? Examples?
            Predict flu rates (epidemics)
            Infer vote intensions (politics)
            Infer traffic/weather conditions (toy problems)




                        Vasileios Lampos   3rd Ph.D. Review
Being more specific

    1   Trends about what? Examples?
            Predict flu rates (epidemics)
            Infer vote intensions (politics)
            Infer traffic/weather conditions (toy problems)

    2   Methodologies?
            Feature extraction/selection
            Exploit probabilistic relationships (PGMs)
            Regression/classification/ranking scenarios
            Active learning




                         Vasileios Lampos   3rd Ph.D. Review
Being more specific

    1   Trends about what? Examples?
            Predict flu rates (epidemics)
            Infer vote intensions (politics)
            Infer traffic/weather conditions (toy problems)

    2   Methodologies?
            Feature extraction/selection
            Exploit probabilistic relationships (PGMs)
            Regression/classification/ranking scenarios
            Active learning

    3   Applications?
            Back-end infrastructure for data collection/retrieval/mining
            Real time online tools for making and displaying predictions
            (like the Flu detector)

                         Vasileios Lampos   3rd Ph.D. Review
P1 - Summary (1 of 3)

   Title:          Tracking the flu pandemic by monitoring the Social Web
   Authors:        V. Lampos and N. Cristianini
   Submitted to:   IAPR Cognitive Information Processing 2010 (accepted)


        Twitter and Health Protection Agency data for weeks 26-49,
        2009 (on average 160,000 tweets collected per day geolocated
        in 54 urban centres in the UK)
        Frequency of 41 flu related words (markers) in Twitter
        corpus had a correlation of >80% with the HPA flu rates in
        all UK regions
        Learn a better list of weighted markers automatically:
             Generate a list of candidate markers (1560 words taken from
             flu related web pages)
             Use LASSO for feature selection

                          Vasileios Lampos   3rd Ph.D. Review
P1 - Summary (2 of 3)

   Validation schemes:
     1   Train on one region, validate regularisation parameter on
         another, test on the remaining regions (for all possible
         combinations)
                      Train/Validate (regions)         A          B           C         D            E
                                 A                     -       0.9594      0.9375     0.9348      0.9297
                                 B                  0.9455        -        0.9476     0.9267      0.9003
                                 C                  0.9154     0.9513         -       0.8188       0.908
                                 D                  0.9463     0.9459      0.9424        -        0.9337
                                 E                  0.8798     0.9506      0.9455     0.8935         -
                                                                               Total Avg.         0.9256

         97 selected words                  : lung, unwel, temperatur, like, headach, season, unusu, chronic, child,
         dai, appetit, stai, symptom, spread, diarrhoea, start, muscl, weaken, immun, feel, liver, plenti, antivir,
         follow, sore, peopl, nation, small, pandem, pregnant, thermomet, bed, loss, heart, mention, condit, ...


     2   Aggregate data from all regions, test on weeks 28 and 41
         (2009) and train using the rest of the data set


                                       Vasileios Lampos        3rd Ph.D. Review
P1 - Summary (3 of 3)

    1       Inferred vs Official flu rate in North England
                                                                                                                HPA
                              100                                                                               Inferred
                   Flu rate


                              50



                               0
                                        180        200        220         240       260       280        300   320       340
                                                                         Day Number (2009)

        2    Inferred vs Official rates in all regions (aggregated data set)
                              150
                                                                                   HPA
                                                                                   Inferred
                              100
                   Flu rate




                                              A                B                   C                 D               E
                               50



                               0
                                    0         10         20         30       40        50       60        70    80       90
                                                                                  Days



                                                   Vasileios Lampos               3rd Ph.D. Review
P2 - Summary

  Title:          Flu detector - Tracking epidemics on Twitter
  Authors:        V. Lampos, T. De Bie, and N. Cristianini
  Submitted to:   ECML PKDD 2010 Demos (under review)


       Extending and making more robust the methodology of P1
       Larger data sets (bigger time series) and more (2675)
       candidate features
       Select a list of features (markers) using BoLASSO (bootstrap
       version of LASSO)
       Then learn weights of those markers via linear least squares
       regression
       Stricter evaluation of the methodology - Available online
       Put all this into practice and come up with the Flu detector

                          Vasileios Lampos   3rd Ph.D. Review
Other activities



       Studied/implemented the necessary statistical tools and
       algorithms (in MATLAB or Java)
       Extended further the infrastructure for conducting large scale
       experiments and data retrieval on demand
       TA for Intro to AI, Data Analysis and Pattern Analysis &
       Statistical Learning
       Attended some of the ISL meetings and seminars




                        Vasileios Lampos   3rd Ph.D. Review
General goals & activities




     1   ... (content omitted)
     2   ... (content omitted)
     3   ... (content omitted)
     4   ... (content omitted)




                          Vasileios Lampos   3rd Ph.D. Review
A more time specific tentative plan




      In June: ... (content omitted)
      In July: ... (content omitted)
      In August: ... (content omitted)
      In September - November: ... (content omitted)




                       Vasileios Lampos   3rd Ph.D. Review
This is the last slide.




    Vasileios Lampos   3rd Ph.D. Review
This is the last slide.

   Any questions?




    Vasileios Lampos   3rd Ph.D. Review

Weitere ähnliche Inhalte

Ähnlich wie Example 1

Biosurveillance 2.0: Lecture at Emory University
Biosurveillance 2.0: Lecture at Emory UniversityBiosurveillance 2.0: Lecture at Emory University
Biosurveillance 2.0: Lecture at Emory UniversityTaha Kass-Hout, MD, MS
 
2009 12 06 - LOINC Workshop
2009 12 06 - LOINC Workshop2009 12 06 - LOINC Workshop
2009 12 06 - LOINC Workshopdvreeman
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)jstaaks
 
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di Tada
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di TadaBiosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di Tada
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di TadaTaha Kass-Hout, MD, MS
 
Conducting Twitter Reserch
Conducting Twitter ReserchConducting Twitter Reserch
Conducting Twitter ReserchKim Holmberg
 
Paper 9: How to increase one's ranking efficiently? (Xu)
Paper 9: How to increase one's ranking efficiently? (Xu)Paper 9: How to increase one's ranking efficiently? (Xu)
Paper 9: How to increase one's ranking efficiently? (Xu)Kent Business School
 
20080609 Loinc Workshop
20080609   Loinc Workshop20080609   Loinc Workshop
20080609 Loinc Workshopdvreeman
 
Study Proposal format-SRC.docx
Study Proposal format-SRC.docxStudy Proposal format-SRC.docx
Study Proposal format-SRC.docxKeerthana990449
 
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...InSTEDD
 
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Taha Kass-Hout, MD, MS
 
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"CS, NcState
 
Essential biology 01 statistical analysis
Essential biology 01 statistical analysisEssential biology 01 statistical analysis
Essential biology 01 statistical analysismcnewbold
 
8323 Stats - Lesson 1 - 02 Introduction General 2008
8323 Stats - Lesson 1 - 02 Introduction General 20088323 Stats - Lesson 1 - 02 Introduction General 2008
8323 Stats - Lesson 1 - 02 Introduction General 2008untellectualism
 
first lecture to elementary statistcs
first lecture to elementary statistcsfirst lecture to elementary statistcs
first lecture to elementary statistcsssuser19c049
 
viriology1) Describe and explain the structure , genomic org.docx
viriology1) Describe and explain the structure , genomic org.docxviriology1) Describe and explain the structure , genomic org.docx
viriology1) Describe and explain the structure , genomic org.docxdickonsondorris
 

Ähnlich wie Example 1 (20)

Biosurveillance 2.0: Lecture at Emory University
Biosurveillance 2.0: Lecture at Emory UniversityBiosurveillance 2.0: Lecture at Emory University
Biosurveillance 2.0: Lecture at Emory University
 
2009 12 06 - LOINC Workshop
2009 12 06 - LOINC Workshop2009 12 06 - LOINC Workshop
2009 12 06 - LOINC Workshop
 
Principlles of statistics
Principlles of statisticsPrinciplles of statistics
Principlles of statistics
 
Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)Workshop on Systematic Searching (Oslo)
Workshop on Systematic Searching (Oslo)
 
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di Tada
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di TadaBiosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di Tada
Biosurveillance: Machine Learning And Disease Surveillance by Kass-Hout Di Tada
 
Conducting Twitter Reserch
Conducting Twitter ReserchConducting Twitter Reserch
Conducting Twitter Reserch
 
Paper 9: How to increase one's ranking efficiently? (Xu)
Paper 9: How to increase one's ranking efficiently? (Xu)Paper 9: How to increase one's ranking efficiently? (Xu)
Paper 9: How to increase one's ranking efficiently? (Xu)
 
Lab 1 intro
Lab 1 introLab 1 intro
Lab 1 intro
 
20080609 Loinc Workshop
20080609   Loinc Workshop20080609   Loinc Workshop
20080609 Loinc Workshop
 
Study Proposal format-SRC.docx
Study Proposal format-SRC.docxStudy Proposal format-SRC.docx
Study Proposal format-SRC.docx
 
Experimental
ExperimentalExperimental
Experimental
 
Biosurveillance 2.0
Biosurveillance 2.0Biosurveillance 2.0
Biosurveillance 2.0
 
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
 
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
 
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"
PROMISE 2011: "Detecting Bug Duplicate Reports through Locality of Reference"
 
Essential biology 01 statistical analysis
Essential biology 01 statistical analysisEssential biology 01 statistical analysis
Essential biology 01 statistical analysis
 
8323 Stats - Lesson 1 - 02 Introduction General 2008
8323 Stats - Lesson 1 - 02 Introduction General 20088323 Stats - Lesson 1 - 02 Introduction General 2008
8323 Stats - Lesson 1 - 02 Introduction General 2008
 
first lecture to elementary statistcs
first lecture to elementary statistcsfirst lecture to elementary statistcs
first lecture to elementary statistcs
 
viriology1) Describe and explain the structure , genomic org.docx
viriology1) Describe and explain the structure , genomic org.docxviriology1) Describe and explain the structure , genomic org.docx
viriology1) Describe and explain the structure , genomic org.docx
 
InSTEDD HISA Conference
InSTEDD HISA ConferenceInSTEDD HISA Conference
InSTEDD HISA Conference
 

Kürzlich hochgeladen

Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...KokoStevan
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterMateoGardella
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfChris Hunter
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17Celine George
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 

Kürzlich hochgeladen (20)

Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 

Example 1

  • 1. 3rd Ph.D. Review Tracking trends on the web using novel Machine Learning methods Vasileios Lampos advised by Professor Nello Cristianini Intelligent Systems Laboratory, University of Bristol May 25, 2010 Vasileios Lampos 3rd Ph.D. Review
  • 2. Vasileios Lampos 3rd Ph.D. Review
  • 3. Aims The general aims of our research project can be summarised in the following points: 1 Track trends on the Web by applying Machine Learning methods (track expresses the notions of infer or predict as well) 2 Extend current or invent new methodologies (where and if needed) for accomplishing our primary aim 3 Build tools that apply the experimental/theoretical results in real and large-scale applications (featured research) Vasileios Lampos 3rd Ph.D. Review
  • 4. Being more specific 1 Trends about what? Examples? Predict flu rates (epidemics) Infer vote intensions (politics) Infer traffic/weather conditions (toy problems) Vasileios Lampos 3rd Ph.D. Review
  • 5. Being more specific 1 Trends about what? Examples? Predict flu rates (epidemics) Infer vote intensions (politics) Infer traffic/weather conditions (toy problems) 2 Methodologies? Feature extraction/selection Exploit probabilistic relationships (PGMs) Regression/classification/ranking scenarios Active learning Vasileios Lampos 3rd Ph.D. Review
  • 6. Being more specific 1 Trends about what? Examples? Predict flu rates (epidemics) Infer vote intensions (politics) Infer traffic/weather conditions (toy problems) 2 Methodologies? Feature extraction/selection Exploit probabilistic relationships (PGMs) Regression/classification/ranking scenarios Active learning 3 Applications? Back-end infrastructure for data collection/retrieval/mining Real time online tools for making and displaying predictions (like the Flu detector) Vasileios Lampos 3rd Ph.D. Review
  • 7. P1 - Summary (1 of 3) Title: Tracking the flu pandemic by monitoring the Social Web Authors: V. Lampos and N. Cristianini Submitted to: IAPR Cognitive Information Processing 2010 (accepted) Twitter and Health Protection Agency data for weeks 26-49, 2009 (on average 160,000 tweets collected per day geolocated in 54 urban centres in the UK) Frequency of 41 flu related words (markers) in Twitter corpus had a correlation of >80% with the HPA flu rates in all UK regions Learn a better list of weighted markers automatically: Generate a list of candidate markers (1560 words taken from flu related web pages) Use LASSO for feature selection Vasileios Lampos 3rd Ph.D. Review
  • 8. P1 - Summary (2 of 3) Validation schemes: 1 Train on one region, validate regularisation parameter on another, test on the remaining regions (for all possible combinations) Train/Validate (regions) A B C D E A - 0.9594 0.9375 0.9348 0.9297 B 0.9455 - 0.9476 0.9267 0.9003 C 0.9154 0.9513 - 0.8188 0.908 D 0.9463 0.9459 0.9424 - 0.9337 E 0.8798 0.9506 0.9455 0.8935 - Total Avg. 0.9256 97 selected words : lung, unwel, temperatur, like, headach, season, unusu, chronic, child, dai, appetit, stai, symptom, spread, diarrhoea, start, muscl, weaken, immun, feel, liver, plenti, antivir, follow, sore, peopl, nation, small, pandem, pregnant, thermomet, bed, loss, heart, mention, condit, ... 2 Aggregate data from all regions, test on weeks 28 and 41 (2009) and train using the rest of the data set Vasileios Lampos 3rd Ph.D. Review
  • 9. P1 - Summary (3 of 3) 1 Inferred vs Official flu rate in North England HPA 100 Inferred Flu rate 50 0 180 200 220 240 260 280 300 320 340 Day Number (2009) 2 Inferred vs Official rates in all regions (aggregated data set) 150 HPA Inferred 100 Flu rate A B C D E 50 0 0 10 20 30 40 50 60 70 80 90 Days Vasileios Lampos 3rd Ph.D. Review
  • 10. P2 - Summary Title: Flu detector - Tracking epidemics on Twitter Authors: V. Lampos, T. De Bie, and N. Cristianini Submitted to: ECML PKDD 2010 Demos (under review) Extending and making more robust the methodology of P1 Larger data sets (bigger time series) and more (2675) candidate features Select a list of features (markers) using BoLASSO (bootstrap version of LASSO) Then learn weights of those markers via linear least squares regression Stricter evaluation of the methodology - Available online Put all this into practice and come up with the Flu detector Vasileios Lampos 3rd Ph.D. Review
  • 11. Other activities Studied/implemented the necessary statistical tools and algorithms (in MATLAB or Java) Extended further the infrastructure for conducting large scale experiments and data retrieval on demand TA for Intro to AI, Data Analysis and Pattern Analysis & Statistical Learning Attended some of the ISL meetings and seminars Vasileios Lampos 3rd Ph.D. Review
  • 12. General goals & activities 1 ... (content omitted) 2 ... (content omitted) 3 ... (content omitted) 4 ... (content omitted) Vasileios Lampos 3rd Ph.D. Review
  • 13. A more time specific tentative plan In June: ... (content omitted) In July: ... (content omitted) In August: ... (content omitted) In September - November: ... (content omitted) Vasileios Lampos 3rd Ph.D. Review
  • 14. This is the last slide. Vasileios Lampos 3rd Ph.D. Review
  • 15. This is the last slide. Any questions? Vasileios Lampos 3rd Ph.D. Review