CS-GN-TEAM: internal presentation




research taster project
    temporal expressions extraction
                        Michele Filannino + You




                                                     Manchester, 15/02/2012
presentation my research taster project




cdt?


■ 4-year PhD course
■ funded by EPSRC
■ industrial partners
■ multi-disciplinary
■ new model for all PhD training within the UK

                                           15/02/2012, Michele Filannino   2 / 23




cdt?
■ 6 months of foundation period
   ●   3 postgraduate courses
        ▶   Machine Learning and Data Mining, Modelling and
            visualisation of high-dimensional data, Semi-structured data
            and the web
   ●   3 scientific methods courses
   ●   1 short taster project [6 weeks]
   ●   creativity workshops

■ 3.5 years of PhD research




where we are

■ Computer science
  ●   natural language processing
      ▶   information retrieval
           ★ information extraction

               ✦   temporal expressions extraction








or...

 ■ Computer science
   ●    data mining
        ▶   text mining
             ★ information extraction

                 ✦   temporal expressions extraction








temporal expression
       ■ natural language phrase that denotes a temporal
          entity: an interval or an instant [1]
            ●    fully-qualified: no reference to any other temporal
                 entity
                    ▶    March 15, 2001
            ●    deictic: reference to the time of utterance
                    ▶    today, yesterday, three weeks ago, last Thursday
            ●    anaphoric: reference to a timex [2] previously evoked in
                 the text
                    ▶    March 15, the next week, Saturday, at that time
[1] L. Ferro, I. Mani, B. Sundheim, and G. Wilson, “TIDES temporal annotation guidelines, v. 1.0.2,” MITRE, 2001
[2] timex: temporal expression
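
A deictic expression only gets a value once it is anchored on the time of utterance. The sketch below shows that idea for a few of the examples above. The lookup table and the function name `resolve_deictic` are hypothetical and cover only these phrases; a real system would need a grammar or a trained model.

```python
from datetime import date, timedelta

# Hypothetical lookup table: offsets in days relative to the utterance date.
DEICTIC_OFFSETS = {
    "today": 0,
    "yesterday": -1,
    "three weeks ago": -21,
}


def resolve_deictic(expression, utterance_date):
    """Anchor a deictic expression on the date of utterance."""
    offset = DEICTIC_OFFSETS[expression.lower()]
    return utterance_date + timedelta(days=offset)


utterance = date(2012, 2, 15)  # date of this presentation, used as the anchor
print(resolve_deictic("yesterday", utterance))        # 2012-02-14
print(resolve_deictic("three weeks ago", utterance))  # 2012-01-25
```

The same expression resolves to a different date for every utterance date, which is why a fully-qualified expression such as "March 15, 2001" needs no such step.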




why?

■ user’s perspective
   ●   temporal aspects of events and entities provide a
       natural mechanism for organising information.

■ machine’s perspective
   ●   improvements in
        ▶   question answering, summarisation, browsing







how?
■ annotation
  ●   recognition
      ▶   automatically detect and delimit expressions
      ▶   mostly machine-learning techniques
  ●   normalisation
      ▶   assign attribute values to all the recognised
          expressions
      ▶   using a shared and formal format (standard?)
      ▶   mostly rule-based techniques
■ reasoning or searching
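
The normalisation step can be sketched with a small rule-based function that maps a recognised duration to a period-style value such as "P2D". This is only an illustration: the word list, the unit codes and the function name `normalise_duration` are assumptions, and the value strings simply follow the "P1Q" style used in the normalisation example later in this deck.

```python
import re

# Hypothetical vocabularies; a real normaliser would cover far more cases.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}
UNIT_CODES = {"day": "D", "week": "W", "month": "M", "quarter": "Q", "year": "Y"}

DURATION_RE = re.compile(
    r"^(\w+)\s+(day|week|month|quarter|year)s?$", re.IGNORECASE
)


def normalise_duration(text):
    """Return a period value such as 'P2D', or None if no rule applies."""
    match = DURATION_RE.match(text.strip())
    if match is None:
        return None
    quantity_word = match.group(1).lower()
    unit = match.group(2).lower()
    if quantity_word.isdigit():
        quantity = int(quantity_word)
    else:
        quantity = NUMBER_WORDS.get(quantity_word)
    if quantity is None:
        return None  # vague quantities such as "few" are left unresolved
    return f"P{quantity}{UNIT_CODES[unit]}"


print(normalise_duration("two days"))    # P2D
print(normalise_duration("five years"))  # P5Y
print(normalise_duration("few months"))  # None
```

Returning None for "few months" is a deliberate choice in this sketch: a vague quantity should not be silently turned into a precise value.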




timex forms [1]

       ■ time or date references
            ●    11pm, February 14th, 2005

       ■ time references that anchor on another time
            ●    one hour after midnight, two weeks before Christmas

       ■ durations
            ●    few months, two days, five years

       ■ recurring times
            ●    every third month, twice in the hour

[1] J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009




timex forms [1]

       ■ context-dependent times
            ●    today, last year

       ■ vague references
            ●    somewhere in the middle of June, the near future

       ■ times indicated by an event
            ●    the day S. Berlusconi resigned
                   ▶    an event is considered a cover term for situations that
                        happen or occur

[1] J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009




timeline (figure, years 2000 to 2013)

■ standards, corpora and shared tasks
   ●   TimeML (standard)
   ●   TimeBank (corpus)
   ●   ACE-2004 dev & eval (TERN2004 corpus)
   ●   TempEval Task#15 (in SemEval07)
   ●   TempEval-2 Task#13 (in SemEval10)
   ●   TempEval-3 Task#1 (in SemEval13)

■ approaches
   ●   hand grammar approach (rule-based)
   ●   Maximum Entropy classifier (machine learning)
   ●   SVM (machine learning)
   ●   Conditional Random Fields (machine learning)
   ●   Markov logic network (machine learning)

■ scores shown on the timeline: 85%, 87.8%, 90.7% [1]

[1] TERN2004 corpus




standards

■ “the nice thing about standards is that you have so
  many to choose from” (Andrew S. Tanenbaum)
   ●   TimeML
   ●   DAML-Time
   ●   TIDES
   ●   ACE-TERN






standards

■ there’s a tension between
   ●   flexibility and efficiency
   ●   usability and flexibility
   ●   complexity and spreadability
   ●   flexibility and agreement







about the spreadability








about the agreement
                   TimeML Tag                                       agreement
                         TIMEX3                                            0.83
                         SIGNAL                                            0.77
                          EVENT                                            0.78
                           ALINK                                           0.81
                           SLINK                                           0.85
                           TLINK                                          0.55
Source: http://timeml.org/site/timebank/documentation-1.2.html




example: raw text



        That means Unisys must pay about $100 million in interest every
        quarter, on top of $27 million in dividends on preferred stock.




Source: TRIOS TimeBank v.0.1




example: recognition


        That means Unisys must <ev>pay</ev> about $100 million in interest
        <te>every quarter</te>, on top of $27 million in dividends on preferred
        stock.




Source: TRIOS TimeBank v.0.1
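
The recognition step above amounts to detecting an expression and marking where it starts and ends. The sketch below does this with a regular expression as a stand-in for the machine-learning recognisers that are usually used. The pattern `TIMEX_RE` and the function `tag_timexes` are hypothetical and cover only a handful of recurring-time phrases.

```python
import re

# Hypothetical pattern: matches only a few recurring-time expressions.
TIMEX_RE = re.compile(
    r"\bevery (?:third )?(?:day|week|month|quarter|year)\b", re.IGNORECASE
)


def tag_timexes(text):
    """Wrap every matched temporal expression in <te>...</te> tags."""
    return TIMEX_RE.sub(lambda m: f"<te>{m.group(0)}</te>", text)


sentence = (
    "That means Unisys must pay about $100 million in interest every "
    "quarter, on top of $27 million in dividends on preferred stock."
)
print(tag_timexes(sentence))
```

The event tag `<ev>` is left out here because deciding what counts as an event needs more than a word list.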




example: normalisation
        That means Unisys must <EVENT eid="e110" mainevent="YES"
        class="OCCURRENCE" stem="pay" tense="NONE" aspect="NONE"
        polarity="POS" pos="VERB">pay</EVENT> about $100 million in
        interest <TIMEX3 tid="t256" type="SET" value="P1Q"
        temporalFunction="false" functionInDocument="NONE"
        quant="every">every quarter</TIMEX3>, on top of $27 million in
        dividends on preferred stock.
        <TLINK lid="l32" relType="BEFORE" relatedToEvent="e110"
        eventID="e107"/>
        <TLINK lid="l26" relType="OVERLAP" eventID="e110"
        relatedToTime="t256"/>

Source: TRIOS TimeBank v.0.1
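
Once a sentence is annotated this way, the links can be read back with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of the example. The `<s>` wrapper element and the function name `event_time_links` are assumptions added for illustration, since the excerpt on the slide has no single root element.

```python
import xml.etree.ElementTree as ET

# Excerpt of the annotated sentence, with fewer attributes than on the slide.
# The <s> wrapper is an assumption: ElementTree needs a single root element.
FRAGMENT = """<s>That means Unisys must <EVENT eid="e110" class="OCCURRENCE">pay</EVENT>
about $100 million in interest <TIMEX3 tid="t256" type="SET" value="P1Q"
quant="every">every quarter</TIMEX3>, on top of $27 million in dividends on
preferred stock.
<TLINK lid="l26" relType="OVERLAP" eventID="e110" relatedToTime="t256"/></s>"""


def event_time_links(xml_text):
    """Return (event text, relation, timex text, timex value) for each
    TLINK that relates an event to a temporal expression."""
    root = ET.fromstring(xml_text)
    events = {e.get("eid"): e for e in root.iter("EVENT")}
    timexes = {t.get("tid"): t for t in root.iter("TIMEX3")}
    links = []
    for link in root.iter("TLINK"):
        event = events.get(link.get("eventID"))
        timex = timexes.get(link.get("relatedToTime"))
        if event is not None and timex is not None:
            links.append(
                (event.text, link.get("relType"), timex.text, timex.get("value"))
            )
    return links


print(event_time_links(FRAGMENT))
# [('pay', 'OVERLAP', 'every quarter', 'P1Q')]
```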




considerations
■ specialised linguistic approaches do not pay off
   ●   machine learning techniques usually perform better

■ scarcity of pre-annotated corpora
   ●   manual corpus annotation is very tricky
   ●   partially solved with TempEval-3 (2013)
        ▶   1M-word corpus automatically annotated by TRIOS

■ vibrant area in the biomedical domain




results per year for two queries (bar chart)

   year    “temporal expressions”    “temporal expressions” AND “clinical”
   2000             182                          10
   2001             180                          12
   2002             220                          16
   2003             230                          15
   2004             280                          15
   2005             310                          22
   2006             370                          36
   2007             410                          42
   2008             410                          41
   2009             412                          45
   2010             433                          44
   2011             382                          46
   2012              33                           9

Source: Google Scholar (last update 09/02/2012)



share of results per year for two queries (100% bar chart)

   year    “temporal expressions”    “temporal expressions” AND “clinical”
   2000             95%                           5%
   2001             94%                           6%
   2002             93%                           7%
   2003             94%                           6%
   2004             95%                           5%
   2005             93%                           7%
   2006             91%                           9%
   2007             91%                           9%
   2008             91%                           9%
   2009             90%                          10%
   2010             91%                           9%
   2011             89%                          11%
   2012             79%                          21%

Source: Google Scholar (last update 09/02/2012)
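
The percentages can be reproduced from the yearly counts on the previous chart. The sketch below assumes that each percentage is the clinical count divided by the sum of the two counts for that year; the counts are the data labels read off that chart, and the function name `clinical_share` is hypothetical.

```python
# Data labels read off the counts chart (Google Scholar, 2000 to 2012).
years = list(range(2000, 2013))
general = [182, 180, 220, 230, 280, 310, 370, 410, 410, 412, 433, 382, 33]
clinical = [10, 12, 16, 15, 15, 22, 36, 42, 41, 45, 44, 46, 9]


def clinical_share(clinical_count, general_count):
    """Clinical share of the yearly total, as a rounded percentage.

    Assumes the two counts are the two segments of one bar.
    """
    return round(100 * clinical_count / (clinical_count + general_count))


for year, g, c in zip(years, general, clinical):
    print(year, f"{clinical_share(c, g)}%")
```

For 2000 this gives 10 / (182 + 10), which rounds to 5%. The 2012 figure rests on only a few weeks of data, since the counts were last updated on 09/02/2012.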




considerations


■ rule-based approaches will never die
   ●   CRF and MLN are hybrids of machine learning and rule-based approaches

■ better performance means clever decomposition
   ●   how to divide the general problem into sub-problems








my to-do list
 ■ collect a corpus in the clinical field
 ■ study novel machine learning approaches
    ●   maximum likelihood, logistic regression, CRF, MLN

 ■ implement a prototype
    ●   Python or MATLAB


 ■ progress: 12 of 30 days elapsed, 18 days remaining
Thank you.

Weitere ähnliche Inhalte

Ähnlich wie My research taster project

Pushing the awareness envelope
Pushing the awareness envelopePushing the awareness envelope
Pushing the awareness envelopeIsrael Gutiérrez
 
infoavond MC 2023 - Engelse versie -.pptx
infoavond MC 2023 - Engelse versie -.pptxinfoavond MC 2023 - Engelse versie -.pptx
infoavond MC 2023 - Engelse versie -.pptxdloijen
 
Reinventing the Data Analytics Classroom
Reinventing the Data Analytics ClassroomReinventing the Data Analytics Classroom
Reinventing the Data Analytics ClassroomGalit Shmueli
 
Outline new productmktg-2012fall-sep01
Outline new productmktg-2012fall-sep01Outline new productmktg-2012fall-sep01
Outline new productmktg-2012fall-sep01Yender McLee
 
Teaching speaking skill
Teaching speaking skillTeaching speaking skill
Teaching speaking skillPrum Rotana
 
Technology Use for Developing a Writing Plan
Technology Use for Developing a Writing PlanTechnology Use for Developing a Writing Plan
Technology Use for Developing a Writing PlanDoctoralNet Limited
 
NIDOS Log frames training 14th March 2013 - Jill Gentle
NIDOS Log frames training 14th March 2013 - Jill GentleNIDOS Log frames training 14th March 2013 - Jill Gentle
NIDOS Log frames training 14th March 2013 - Jill GentleNIDOS
 
Lee then-lim cc-fp finals_l 014-251115
Lee then-lim cc-fp finals_l 014-251115Lee then-lim cc-fp finals_l 014-251115
Lee then-lim cc-fp finals_l 014-251115Xiao Yun
 
Presentation Skills Part 1 - Planning & Organizing
Presentation Skills Part 1 - Planning & OrganizingPresentation Skills Part 1 - Planning & Organizing
Presentation Skills Part 1 - Planning & OrganizingMichelle Smyth
 
Ou video analysis workshopfin3
Ou video analysis workshopfin3Ou video analysis workshopfin3
Ou video analysis workshopfin3Anne Adams
 
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106Michael Johnson
 
Enbe the journal note brief
Enbe the journal note briefEnbe the journal note brief
Enbe the journal note briefshensin1015
 
Iaf article design
Iaf article designIaf article design
Iaf article designSpark cph
 
T ueworkshoplite.01
T ueworkshoplite.01T ueworkshoplite.01
T ueworkshoplite.01ProAkademia
 

Ähnlich wie My research taster project (20)

Pushing the awareness envelope
Pushing the awareness envelopePushing the awareness envelope
Pushing the awareness envelope
 
infoavond MC 2023 - Engelse versie -.pptx
infoavond MC 2023 - Engelse versie -.pptxinfoavond MC 2023 - Engelse versie -.pptx
infoavond MC 2023 - Engelse versie -.pptx
 
Reinventing the Data Analytics Classroom
Reinventing the Data Analytics ClassroomReinventing the Data Analytics Classroom
Reinventing the Data Analytics Classroom
 
Outline new productmktg-2012fall-sep01
Outline new productmktg-2012fall-sep01Outline new productmktg-2012fall-sep01
Outline new productmktg-2012fall-sep01
 
LESSON 16
LESSON 16LESSON 16
LESSON 16
 
Teaching speaking skill
Teaching speaking skillTeaching speaking skill
Teaching speaking skill
 
Pm and tm sofial
Pm and tm sofialPm and tm sofial
Pm and tm sofial
 
Pm and tm sofia
Pm and tm sofiaPm and tm sofia
Pm and tm sofia
 
Technology Use for Developing a Writing Plan
Technology Use for Developing a Writing PlanTechnology Use for Developing a Writing Plan
Technology Use for Developing a Writing Plan
 
Uwcsea day 1v2
Uwcsea day 1v2Uwcsea day 1v2
Uwcsea day 1v2
 
Nanoteaching Bhopal PPT
Nanoteaching Bhopal PPTNanoteaching Bhopal PPT
Nanoteaching Bhopal PPT
 
NIDOS Log frames training 14th March 2013 - Jill Gentle
NIDOS Log frames training 14th March 2013 - Jill GentleNIDOS Log frames training 14th March 2013 - Jill Gentle
NIDOS Log frames training 14th March 2013 - Jill Gentle
 
Lee then-lim cc-fp finals_l 014-251115
Lee then-lim cc-fp finals_l 014-251115Lee then-lim cc-fp finals_l 014-251115
Lee then-lim cc-fp finals_l 014-251115
 
Presentation Skills Part 1 - Planning & Organizing
Presentation Skills Part 1 - Planning & OrganizingPresentation Skills Part 1 - Planning & Organizing
Presentation Skills Part 1 - Planning & Organizing
 
Ou video analysis workshopfin3
Ou video analysis workshopfin3Ou video analysis workshopfin3
Ou video analysis workshopfin3
 
NCIHC HFT53 Teaching with Intention presentation slides.pdf
NCIHC HFT53 Teaching with Intention presentation slides.pdfNCIHC HFT53 Teaching with Intention presentation slides.pdf
NCIHC HFT53 Teaching with Intention presentation slides.pdf
 
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106
Uses of Video Annotation Software to Promote Deep Learning - SoTE 2106
 
Enbe the journal note brief
Enbe the journal note briefEnbe the journal note brief
Enbe the journal note brief
 
Iaf article design
Iaf article designIaf article design
Iaf article design
 
T ueworkshoplite.01
T ueworkshoplite.01T ueworkshoplite.01
T ueworkshoplite.01
 

Mehr von Michele Filannino

Temporal information extraction in the general and clinical domain
Temporal information extraction in the general and clinical domainTemporal information extraction in the general and clinical domain
Temporal information extraction in the general and clinical domainMichele Filannino
 
Mining temporal footprints from Wikipedia
Mining temporal footprints from WikipediaMining temporal footprints from Wikipedia
Mining temporal footprints from WikipediaMichele Filannino
 
Can computers understand time?
Can computers understand time?Can computers understand time?
Can computers understand time?Michele Filannino
 
Detecting novel associations in large data sets
Detecting novel associations in large data setsDetecting novel associations in large data sets
Detecting novel associations in large data setsMichele Filannino
 
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...Michele Filannino
 
Algoritmo di text-similarity per l'annotazione semantica di Web Service
Algoritmo di text-similarity per l'annotazione semantica di Web ServiceAlgoritmo di text-similarity per l'annotazione semantica di Web Service
Algoritmo di text-similarity per l'annotazione semantica di Web ServiceMichele Filannino
 
Semantic Web Service Annotation
Semantic Web Service AnnotationSemantic Web Service Annotation
Semantic Web Service AnnotationMichele Filannino
 
Modulo di serendipità in un Item Recommender System
Modulo di serendipità in un Item Recommender SystemModulo di serendipità in un Item Recommender System
Modulo di serendipità in un Item Recommender SystemMichele Filannino
 
Serendipity module in Item Recommender System
Serendipity module in Item Recommender SystemSerendipity module in Item Recommender System
Serendipity module in Item Recommender SystemMichele Filannino
 

Mehr von Michele Filannino (10)

me_t3_october
me_t3_octoberme_t3_october
me_t3_october
 
Temporal information extraction in the general and clinical domain
Temporal information extraction in the general and clinical domainTemporal information extraction in the general and clinical domain
Temporal information extraction in the general and clinical domain
 
Mining temporal footprints from Wikipedia
Mining temporal footprints from WikipediaMining temporal footprints from Wikipedia
Mining temporal footprints from Wikipedia
 
Can computers understand time?
Can computers understand time?Can computers understand time?
Can computers understand time?
 
Detecting novel associations in large data sets
Detecting novel associations in large data setsDetecting novel associations in large data sets
Detecting novel associations in large data sets
 
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
Sviluppo di un algoritmo di similarità a supporto dell'annotazione semantica ...
 
Algoritmo di text-similarity per l'annotazione semantica di Web Service
Algoritmo di text-similarity per l'annotazione semantica di Web ServiceAlgoritmo di text-similarity per l'annotazione semantica di Web Service
Algoritmo di text-similarity per l'annotazione semantica di Web Service
 
Semantic Web Service Annotation
Semantic Web Service AnnotationSemantic Web Service Annotation
Semantic Web Service Annotation
 
Modulo di serendipità in un Item Recommender System
Modulo di serendipità in un Item Recommender SystemModulo di serendipità in un Item Recommender System
Modulo di serendipità in un Item Recommender System
 
Serendipity module in Item Recommender System
Serendipity module in Item Recommender SystemSerendipity module in Item Recommender System
Serendipity module in Item Recommender System
 

Kürzlich hochgeladen

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditSkynet Technologies
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 

Kürzlich hochgeladen (20)

Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24

My research taster project

  • 1. CS-GN-TEAM: internal presentation. Research taster project: temporal expressions extraction. Michele Filannino + You. Manchester, 15/02/2012
  • 2. cdt?
       ■ 4-year PhD course
       ■ funded by EPSRC
       ■ industrial partners
       ■ multi-disciplinary
       ■ new model for all PhD training within the UK
  • 3. cdt?
       ■ 6 months of foundation period
          ● 3 postgraduate courses
             ▶ Machine Learning and Data Mining; Modelling and visualisation of high-dimensional data; Semi-structured data and the web
          ● 3 scientific methods courses
          ● 1 short taster project [6 weeks]
          ● creativity workshops
       ■ 3.5 years of PhD research
  • 4. where we are
       ■ Computer science
          ● natural language processing
             ▶ information retrieval
                ★ information extraction
                   ✦ temporal expressions extraction
  • 5. or...
       ■ Computer science
          ● data mining
             ▶ text mining
                ★ information extraction
                   ✦ temporal expressions extraction
  • 6. temporal expression
       ■ natural language phrase that denotes a temporal entity: an interval or an instant [1]
          ● fully-qualified: no reference to any other temporal entity
             ▶ March 15, 2001
          ● deictic: reference to the time of utterance
             ▶ today, yesterday, three weeks ago, last Thursday
          ● anaphoric: reference to a timex [2] previously evoked in the text
             ▶ March 15, the next week, Saturday, at that time
       [1] L. Ferro, I. Mani, B. Sundheim, and G. Wilson, “Tides temporal annotation guidelines, v. 1.0.2,” MITRE, 2001
       [2] timex: temporal expression
  • 7. why?
       ■ user’s perspective
          ● temporal aspects of events and entities provide a natural mechanism for organising information
       ■ machine’s perspective
          ● improvements in
             ▶ question answering, summarisation, browsing
  • 8. how?
       ■ annotation
          ● recognition
             ▶ automatically detect and delimit expressions
             ▶ mostly machine-learning techniques
          ● normalisation
             ▶ assign attribute values to all the recognised expressions
             ▶ using a shared and formal format (standard?)
             ▶ mostly rule-based techniques
       ■ reasoning or searching
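The normalisation step can be illustrated with a minimal Python sketch that maps a few deictic expressions to ISO 8601 dates relative to the time of utterance. The lookup table `DEICTIC_OFFSETS` and the function `normalise` are hypothetical names introduced here for illustration; they are not part of the presentation or of any standard, and a real normaliser would need far more rules.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical rule table (an assumption for illustration): each deictic
# expression is mapped to an offset from the time of utterance.
DEICTIC_OFFSETS = {
    "today": timedelta(days=0),
    "yesterday": timedelta(days=-1),
    "tomorrow": timedelta(days=1),
    "three weeks ago": timedelta(weeks=-3),
}


def normalise(expression: str, utterance_date: date) -> Optional[str]:
    """Return an ISO 8601 date for a known deictic expression, else None."""
    offset = DEICTIC_OFFSETS.get(expression.strip().lower())
    if offset is None:
        return None  # expression not covered by the toy rule table
    return (utterance_date + offset).isoformat()


if __name__ == "__main__":
    talk_date = date(2012, 2, 15)  # date of the presentation
    for phrase in ("today", "yesterday", "three weeks ago", "last Thursday"):
        print(phrase, "->", normalise(phrase, talk_date))
```

A rule table like this is in the spirit of the "mostly rule-based" remark: the values are computed deterministically once the reference date is known.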
  • 9. timex forms [1]
       ■ time or date references
          ● 11pm, February 14th, 2005
       ■ time references that anchor on another time
          ● one hour after midnight, two weeks before Christmas
       ■ durations
          ● few months, two days, five years
       ■ recurring times
          ● every third month, twice in the hour
       [1] J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009
  • 10. timex forms [1]
       ■ context-dependent times
          ● today, last year
       ■ vague references
          ● somewhere in the middle of June, the near future
       ■ times indicated by an event
          ● the day S. Berlusconi resigned
             ▶ an event is considered a cover term for situations that happen or occur
       [1] J. Poveda, M. Surdeanu, and J. Turmo, “An analysis of Bootstrapping for the Recognition of Temporal Expressions”, 2009
  • 11. timeline (figure; axis from 2000 to 2013)
       ■ standards and corpora: TimeML (standard), TimeBank (corpus)
       ■ evaluation campaigns: ACE-2004 dev & eval (TERN2004 corpus), TempEval Task#15 (in SemEval07), TempEval-2 Task#13 (in SemEval10), TempEval-3 Task#1 (in SemEval13)
       ■ approaches: Hand grammar approach (rule-based), SVM (machine learning), Maximum Entropy Class. (machine learning), Conditional Random Fields (machine learning), Markov logic network (machine learning)
       ■ performance figures shown: 85% [1], 87.8% [1], 90.7% [1]
       [1] TERN2004 corpus
  • 12. standards
       ■ “the nice thing about standards is, there are so many to choose from” (Andrew S. Tanenbaum)
          ● TimeML
          ● DAML-Time
          ● TIDES
          ● ACE-TERN
  • 13. standards
       ■ there’s a tension between
          ● flexibility and efficiency
          ● usability and flexibility
          ● complexity and spreadability
          ● flexibility and agreement
  • 14. about the spreadability
  • 15. about the agreement
       TimeML tag    agreement
       TIMEX3        0.83
       SIGNAL        0.77
       EVENT         0.78
       ALINK         0.81
       SLINK         0.85
       TLINK         0.55
       Source: http://timeml.org/site/timebank/documentation-1.2.html
  • 16. example: raw text
       That means Unisys must pay about $100 million in interest every quarter, on top of $27 million in dividends on preferred stock.
       Source: TRIOS TimeBank v.0.1
  • 17. example: recognition
       That means Unisys must <ev>pay</ev> about $100 million in interest <te>every quarter</te>, on top of $27 million in dividends on preferred stock.
       Source: TRIOS TimeBank v.0.1
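As a rough illustration of the recognition step, the sketch below wraps a handful of temporal expressions in the `<te>` markup used in the example above. It is a toy rule-based recogniser: the pattern `TIMEX_PATTERN` and the function `tag_timexes` are assumptions made for this sketch, cover only a few of the timex forms listed earlier, and do not reproduce the machine-learning systems the presentation refers to.

```python
import re

# Illustrative patterns only (an assumption, not the method from the slides):
# a real recogniser needs far broader coverage or a trained model.
TIMEX_PATTERN = re.compile(
    r"\b(?:"
    r"every (?:\w+ )?(?:day|week|month|quarter|year)"       # recurring times
    r"|(?:few|two|three|five) (?:days|weeks|months|years)"  # durations
    r"|today|yesterday|last (?:week|month|year)"            # context-dependent
    r")\b",
    re.IGNORECASE,
)


def tag_timexes(text: str) -> str:
    """Wrap every matched temporal expression in <te>...</te> tags."""
    return TIMEX_PATTERN.sub(lambda m: f"<te>{m.group(0)}</te>", text)


if __name__ == "__main__":
    sentence = (
        "That means Unisys must pay about $100 million in interest "
        "every quarter, on top of $27 million in dividends on preferred stock."
    )
    print(tag_timexes(sentence))
```

Hand-written patterns like these are easy to read but brittle, which is one reason recognition is usually handled with learned models.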
  • 18. example: normalisation
       That means Unisys must
       <EVENT eid="e110" mainevent="YES" class="OCCURRENCE" stem="pay" tense="NONE" aspect="NONE" polarity="POS" pos="VERB">pay</EVENT>
       about $100 million in interest
       <TIMEX3 tid="t256" type="SET" value="P1Q" temporalFunction="false" functionInDocument="NONE" quant="every">every quarter</TIMEX3>,
       on top of $27 million in dividends on preferred stock.
       <TLINK lid="l32" relType="BEFORE" relatedToEvent="e110" eventID="e107"/>
       <TLINK lid="l26" relType="OVERLAP" eventID="e110" relatedToTime="t256"/>
       Source: TRIOS TimeBank v.0.1
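The normalised annotation above is XML-like, so its attributes can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is a minimal example under stated assumptions: the `<s>` wrapper element is hypothetical (added only so the fragment has a single root), and the helper names `extract_timexes` and `links_to_time` are introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Part of the annotated sentence from the example, wrapped in a
# hypothetical <s> root element so that it parses as well-formed XML.
FRAGMENT = (
    '<s>about $100 million in interest '
    '<TIMEX3 tid="t256" type="SET" value="P1Q" temporalFunction="false" '
    'functionInDocument="NONE" quant="every">every quarter</TIMEX3>, '
    'on top of $27 million in dividends on preferred stock. '
    '<TLINK lid="l26" relType="OVERLAP" eventID="e110" relatedToTime="t256"/>'
    '</s>'
)


def extract_timexes(xml_text):
    """Return id, type, value and surface text of every TIMEX3 element."""
    root = ET.fromstring(xml_text)
    return [
        {"tid": t.get("tid"), "type": t.get("type"),
         "value": t.get("value"), "text": t.text}
        for t in root.iter("TIMEX3")
    ]


def links_to_time(xml_text, tid):
    """Return (eventID, relType) pairs for TLINKs that point at the given timex."""
    root = ET.fromstring(xml_text)
    return [
        (link.get("eventID"), link.get("relType"))
        for link in root.iter("TLINK")
        if link.get("relatedToTime") == tid
    ]


if __name__ == "__main__":
    print(extract_timexes(FRAGMENT))
    print(links_to_time(FRAGMENT, "t256"))
```

Reading the annotation this way makes the split visible: the `TIMEX3` element carries the normalised value (`P1Q`), while the `TLINK` elements relate events and times to each other.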
  • 19. considerations
       ■ specialised linguistic approaches do not pay off
          ● machine learning techniques usually perform better
       ■ scarcity of pre-annotated corpora
          ● manual corpus annotation is very tricky
          ● partially solved with TempEval-3 (2013)
             ▶ 1M-word corpus automatically annotated by TRIOS
       ■ vibrant area in the bio-medical domain
  • 20. [bar chart: yearly counts, 2000 to 2012, for the queries “temporal expressions” and “temporal expressions” AND “clinical”; vertical axis from 0 to 500]
       Source: Google Scholar (last update 09/02/2012)
  • 21. [stacked bar chart: yearly percentage shares, 2000 to 2012, for the queries “temporal expressions” and “temporal expressions” AND “clinical”; vertical axis from 0% to 100%]
       Source: Google Scholar (last update 09/02/2012)
  • 22. considerations
       ■ rule-based approach will never die
          ● CRF and MLN are machine-learning hybridisations
       ■ better performance means clever decomposition
          ● how to divide the general problem into sub-problems
  • 23. my to-do list
       ■ collect a corpus in the clinical field
       ■ study novel machine learning approaches
          ● maximum likelihood, logistic regression, CRF, MLN
       ■ implement a prototype
          ● Python or MATLAB
       12 days elapsed, 18 days remaining