Harnessing the Crowds for Automating the Identification of Web APIs
Carlos Pedrinaci, Chenghua Lin, Dong Liu, John Domingue
KMi, The Open University
Web APIs are the new Web services

• Publicly offering valuable data and functionality
• Widely used and reused
• Although their use is hardly automated
Web APIs and RESTful Services

• Services based on a simpler stack of
  technologies than WS-*
 • Roughly URL + HTTP + XML/JSON
• Easy way to provide a programmatic
  interface to existing Web sites
• Seldom adopt REST principles
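
The "URL + HTTP + XML/JSON" stack can be illustrated with a minimal sketch. The endpoint, the parameters and the canned response below are hypothetical and not taken from the talk; no network call is made, the response string only stands in for what an HTTP GET might return.

```python
import json
from urllib.parse import urlencode


def build_request_url(base_url, params):
    """Encode the parameters into the query string of the URL."""
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and parameters, for illustration only.
url = build_request_url(
    "http://api.example.com/photos/search",
    {"tag": "sunset", "format": "json"},
)

# Canned JSON body standing in for the response of an HTTP GET on `url`.
response_body = '{"photos": [{"id": 1, "title": "Sunset"}]}'
data = json.loads(response_body)
titles = [photo["title"] for photo in data["photos"]]
```

The whole interface is a URL pattern plus a response format, which is why such APIs are easy to offer but, lacking a standard description, hard to find automatically.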
How to Discover Web APIs?

[Four screenshots, labelled in turn: Poor Results; OK Results; Poor Results; Out of date]
Issues for Discovering Web APIs

• There is no simple way to effectively
  and uniquely identify Web APIs
 • No standardised document
   describing the interface
 • URLs are hardly usable for this purpose
How can we automatically find Web APIs?
Hypothesis

• Every Web API provides one or more
  public documentation pages
• These pages provide the most relevant
  information for developers
‣ Web API location can be approached as a
  documentation discovery problem
Web API Identification

• Given a Web page, determine whether it
  documents an API or not
• Sometimes a hard problem even for
  humans

Collecting Documentation Pages

• Harnessing the crowds for detecting
  documentation pages

Generating a Curated Dataset

• Often the links are obsolete or point to
  general pages
Dataset Generated

• We used API Validator to process 1,872
  APIs from ProgrammableWeb
 • 43% of the URLs we started with
   (data from 2010)
 • 624 a documentation page
 • 929 not a documentation page
 • 318 skipped (server down or unclear)
Web API Identification Engine

• Web API identification as a binary
  classification problem
• Extract core features from Web pages
• Use machine learning algorithms to
  provide an identification engine
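
The binary classification framing above can be sketched as a small bag-of-words Naive Bayes classifier. This is a pure-Python illustration with Laplace smoothing, not the authors' implementation; the tokenizer, the class name and the four toy training pages are assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the log likelihood of each known word.
            score = math.log(self.doc_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for word in tokenize(text):
                if word in self.vocab:
                    score += math.log(
                        (self.word_counts[c][word] + 1)
                        / (total_words + len(self.vocab))
                    )
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Toy training pages (hypothetical); True means "documents an API".
pages = [
    "api reference endpoint parameters json response",
    "rest api method get request returns xml",
    "about our company team and careers",
    "buy cheap holiday photos and news",
]
labels = [True, True, False, False]
model = NaiveBayes().fit(pages, labels)
```

Calling `model.predict("api endpoint returns json")` classifies the page as documentation on this toy data; a real engine would be trained on the validated pages instead.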
Preliminary Experiment

• Initially used only the words of the Web
  page as features
• Trained two classifiers: Naive Bayes (NB)
  and Support Vector Machine (SVM)
• Used a simple keyword-based heuristic
  as baseline for comparison (the
  occurrence of 3 or more keywords)
 • e.g. api, input, output, GET, PUT, etc.
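
The keyword baseline can be sketched as follows. The slide only lists some of the keywords, so the set below is incomplete by construction; counting distinct keywords and matching case-insensitively are both assumptions of this sketch.

```python
import re

# Keywords named on the slide; the full list used in the experiment is not given.
KEYWORDS = {"api", "input", "output", "get", "put"}


def count_keywords(text, keywords=KEYWORDS):
    """Return how many distinct keywords occur in the page text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return len(tokens & keywords)


def is_documentation_page(text, threshold=3):
    """Baseline rule: a page is documentation if it has 3 or more keywords."""
    return count_keywords(text) >= threshold
```

For example, `is_documentation_page("The API accepts input via GET and returns output as JSON")` holds, because four of the keywords occur in the text.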
Evaluation Results

Model     Precision (%)   Recall (%)   F1 (%)   Accuracy (%)
Keyword   60.3            75.7         67.0     70.2
NB        71.0            79.2         74.8     78.6
SVM       75.4            70.8         73.1     79.0
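
For reference, the four measures in the table follow from the confusion counts of a binary classifier. The sketch below uses made-up counts, not the experiment's data, and the function name is an assumption.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 and accuracy from confusion counts."""
    precision = tp / (tp + fp)  # share of predicted documentation pages that are correct
    recall = tp / (tp + fn)     # share of true documentation pages that are found
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical counts for illustration only.
precision, recall, f1, accuracy = evaluation_metrics(tp=6, fp=2, fn=4, tn=8)
```

With these counts precision is 0.75, recall 0.6 and accuracy 0.7. This shows how a classifier can trade recall for precision, as SVM does relative to NB in the table.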
Evaluation Results

• Although preliminary, the approach
  already provides promising results
• Both NB and SVM achieve good
  accuracy (about 80%)
• Best precision (75.4%) achieved by SVM,
  which is 15 percentage points better
  than the baseline
Conclusions and Future Work

• Discovering Web APIs is becoming
  increasingly important and existing
  support is not optimal
• Web API identification is a first step
  that can well be approached as a
  documentation identification problem
• Crowd input (ProgrammableWeb and API
  Validator) has been essential
Conclusions and Future Work

• Further features are being included to
  improve the results
 • Title, URL, presence of camelCase
   words
 • Current tests have reached an
   accuracy of 82% using SGD
   (stochastic gradient descent)
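
A minimal sketch of how such extra features might be extracted from a page. The slide names the feature sources (title, URL, camelCase words) but not how they are encoded, so the feature names, the regular expression and the "mentions api" checks below are all assumptions.

```python
import re

# lowerCamelCase identifiers such as getUserInfo or listPhotos.
CAMEL_CASE = re.compile(r"\b[a-z]+(?:[A-Z][a-z0-9]+)+\b")
API_WORD = re.compile(r"\bapi\b", re.IGNORECASE)


def camel_case_words(text):
    """Return the camelCase words found in the text."""
    return CAMEL_CASE.findall(text)


def extract_features(title, url, body):
    """Build a small feature dictionary for one Web page (hypothetical encoding)."""
    return {
        "title_mentions_api": bool(API_WORD.search(title)),
        "url_mentions_api": bool(API_WORD.search(url)),
        "camel_case_count": len(camel_case_words(body)),
    }


features = extract_features(
    "Example API Documentation",
    "http://example.com/services/api/",
    "Call getUserInfo then listPhotos",
)
```

camelCase words are a plausible signal because operation and parameter names in API documentation are often written that way, while ordinary prose rarely contains them.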
Conclusions and Future Work

• A larger training set is necessary
 • Need more validated pages (help!)
 • http://iserve-dev.kmi.open.ac.uk/validator/

• A larger experiment will be carried out over
  a normal Web crawl

Thanks for your attention



Editor's notes (by slide number)

  4. A Web API - the main page
  5. A Web API - the documentation page
  6. A Web API - the offered functionality and data (example of an invocation and the XML obtained)
  8. Google?
  9-11. ProgrammableWeb. Issues:
     - requires manual registration
     - gets out of date (example)
     - discovery remains at a very coarse grain (manual categorisation, or some keywords)
     - no notion of operations and resources provided, etc.
  14. Some example documentation pages