By Sanif S S
Reg No: 10007399
S7 IT
Overview
 Understanding the terms.
 Objectives.
 In detail
o Keyword Retrieval
o Variable Retrieval
o API Specification Mining
o Function Retrieval
o Code Generation
 Experiment
 Conclusion
 References
API Specification-Based Function Search Engine
Using Natural Language Query
 API – Application Programming Interface.
 An API is a set of commands, functions, and protocols that programmers can use when building software for a specific operating system or platform.
 APIs are often exposed through header files (as in C and C++) or through packages and class libraries (as in Java).
 Ex:
o Java APIs
o ODBC for Microsoft Windows
 API specification – a description of the classes and methods inside the API.
 Each method (or function) and its use is briefly described in the API specification.
 Function search engine – as the name suggests, a search engine for all the methods in an API.
 Natural language query – a query that uses a complete sentence or question to begin a search.
 Ex:
o “What is the capital of India?”
o “How to make pizza?”
 Taken together, the title means a search engine that searches all the functions/methods in an application programming interface (API) using simple queries.
 In addition, the paper proposes a way to automatically generate function calls based on the search.
 Programmers nearly always use existing functions while developing their applications.
 The functions have grown more numerous and more diverse.
 The problem is knowing which functions they want and how to call those functions.
 The solution:
o This paper presents two novel approaches to address these problems.
o The first is an approach to find the right functions based on the API specification.
o The second is an approach to automatically generate code for a function call.
 There are two main objectives in this paper:
o Retrieving functions, and
o Generating code for function calls.
 Two different forms of query correspond to these objectives.
o The first is the “function search query”, which asks to look for functions.
o The second is the “function call query”, which asks to generate code for function calls.
Fig: Function Search Model (components: Function Search Query, Keyword Retrieval, Function Retrieval, API Document, Mining, Function Description, Function Call Query, Variable Retrieval, Code Generation, Function Call)
 Keyword retrieval is the process of extracting keywords from a function search query.
 Mining is the process of extracting contents from the API specification to support function retrieval.
 Function retrieval is the process of finding suitable functions by matching the keywords extracted from a function search query to the descriptions of functions in the API specification.
 Variable retrieval is the process of extracting variables from a function call query.
 Code generation is the process of generating code for a function call based on the variables extracted from the function call query.
Keyword Retrieval
 There are several methods to identify keywords in a natural language sentence.
 Some methods identify a keyword as a single word, while others identify a keyword phrase.
 This paper introduces four natural language processing techniques to extract keywords: POS tagging, POS filtering, stemming, and synonym generation.
Fig: Keyword Retrieval Process (labels: NL Sentence, POS Tagging, Word/POS, POS Filter, Main Word, Stemming, Original Word, Synonym Generation, keywords)
 POS tagging (part-of-speech tagging) is the technique of marking up each word in a natural language sentence (NL sentence) with its part of speech.
 POS filtering is the technique of removing stopwords such as prepositions, pronouns, conjunctions, and interjections.
 Stemming is the technique of reducing inflected (or sometimes derived) words to their root form. (Ex: “return” is the root form of the words “returns”, “returning”, and “returned”.)
 Synonym generation is the technique of identifying synonyms of the retrieved keywords.
 For the natural language query “Gets an element in the collection”, the following results are obtained in the above stages.
o POS tagging: Gets/VB an/DT element/NN in/IN the/DT collection/NN.
o POS filtering: Gets element collection.
o Stemming: Get element collection.
o Synonym generation: get – have/return; element – object/component; collection – list/set.
NOTE:
VB-Verb
DT-Determiner
NN-Noun
IN-Preposition
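The stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the input is assumed to be already POS-tagged (with “the” tagged as a determiner), the set STOP_TAGS, the suffix-stripping stem function, and the SYNONYMS table are all hypothetical stand-ins for a real tagger, stemmer, and thesaurus.

```python
# Hypothetical stopword tags: determiners, prepositions, pronouns,
# conjunctions, and interjections (Penn Treebank style tag names).
STOP_TAGS = {"DT", "IN", "PRP", "CC", "UH"}

# Hypothetical hand-written synonym table standing in for a real thesaurus.
SYNONYMS = {
    "get": ["have", "return"],
    "element": ["object", "component"],
    "collection": ["list", "set"],
}


def pos_filter(tagged):
    """Drop words whose POS tag marks them as stopwords."""
    return [word for word, tag in tagged if tag not in STOP_TAGS]


def stem(word):
    """Naive suffix stripping; a real system would use a proper stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def retrieve_keywords(tagged):
    """Map each stemmed keyword to its list of synonyms."""
    stems = [stem(word) for word in pos_filter(tagged)]
    return {s: SYNONYMS.get(s, []) for s in stems}


# The query from the example, assumed to be tagged already.
tagged_query = [
    ("Gets", "VB"), ("an", "DT"), ("element", "NN"),
    ("in", "IN"), ("the", "DT"), ("collection", "NN"),
]
keywords = retrieve_keywords(tagged_query)
```

Here keywords maps get, element, and collection to the synonym lists shown in the example.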
Variable Retrieval
 There are two kinds of objects in a function call query: words and variables.
 Many words may be related to each variable in the query.
 Each word in the query is relevant to at most one variable.
 Words that are relevant to a variable are called features of that variable.
 Every relation between words and a variable is represented by a “variable retrieval rule” derived from a corresponding syntactic rule.
 Ex: Some variable retrieval rules
o Root(sf V) -> VB(wf W) NP(sf V)
o NP(sf {v1, v2}) -> NP(vf v1) PP(vf v2)
o NP(sf V ∪ {v}) -> NP(sf V) PP(vf v)
o NP(sf V1 ∪ V2) -> NP(sf V1) PP(sf V2)
o PP(vf v[W1 W2]) -> IN(wf W1) NP(wf v[W2])
o PP(sf V) -> IN(wf W) NP(sf V)
o NP(wf W1 W2) -> NN(wf W1) NN(wf W2)
o NP(vf v[W1 W2]) -> NN(wf W1) NN(vf v[W2])
o NP(vf v[W1 W2 W3]) -> DT(wf W1) VBN(wf W2) NN(vf v[W3])
 In figure 3, a query in natural language (“Insert element e in a set at index k”) is parsed into a tree structure using the Stanford Parser tool.
 The final result is:
o e[element];
o a[a set];
o k[at index];
Fig. 3: Parsing tree for the function call query
API Specification Mining
 This section focuses on mining the API specification of Java, called the Java API specification.
 The Java API specification contains several kinds of function-related content that can be mined to support the function retrieval process and the code generation process.
 They are:
o function specification
o functionality description
o parameter features
 Function specification: structured data that describes the usage of a function.
 The information that can be extracted from it includes the function name, function scope, return type, list of parameters, and so on.
 Functionality description: unstructured data in natural language that describes the functionality of the function.
 To extract information from it, the keyword retrieval method (presented earlier) is used.
 Parameter features: unstructured data in natural language that describes the features of the parameters in the function specification.
 The necessary information is extracted from it using natural language processing techniques.
Example:
 The function add() is described in the Java API specification for ArrayList as follows.
 Function specification: public void add(int index, Object element).
 Functionality description: “Inserts the specified element at the specified position in this list”.
 Parameter features: “index - index at which the specified element is to be inserted” and “element - element to be inserted”.
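Because the function specification is structured, its fields can be pulled out mechanically. The sketch below is one possible way to do that with a regular expression. The pattern and the name parse_function_spec are assumptions made for illustration; the pattern only covers simple signatures of the form scope, return type, name, parameter list, and does not handle extra modifiers such as static or generic types that contain commas.

```python
import re

# Assumed shape: "<scope> <return type> <name>(<type> <name>, ...)".
SPEC_PATTERN = re.compile(
    r"^\s*(?P<scope>public|protected|private)\s+"
    r"(?P<return_type>[\w.<>\[\]]+)\s+"
    r"(?P<name>\w+)\s*\((?P<params>.*)\)\s*\.?\s*$"
)


def parse_function_spec(spec):
    """Extract scope, return type, name, and parameters from a signature."""
    match = SPEC_PATTERN.match(spec)
    if match is None:
        raise ValueError(f"unrecognized function specification: {spec!r}")
    params = []
    for part in match.group("params").split(","):
        part = part.strip()
        if part:
            # The last token is the parameter name, the rest is its type.
            param_type, param_name = part.rsplit(None, 1)
            params.append((param_type, param_name))
    return {
        "scope": match.group("scope"),
        "return_type": match.group("return_type"),
        "name": match.group("name"),
        "parameters": params,
    }


spec = parse_function_spec("public void add(int index,Object element)")
```

For the add() example, spec holds the scope public, the return type void, the name add, and the two parameters (int, index) and (Object, element).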
Function Retrieval
 There are three stages in the process of retrieving functions.
 Stage 1: extracting the functions related to the user’s query based on some constraints.
 Stage 2: refining the result of the previous stage by removing irrelevant functions.
 Stage 3: ranking the remaining relevant functions in descending order of their relevance to the query.
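A minimal sketch of the three stages follows. The slides do not give the constraints, the refinement criterion, or the ranking measure, so everything specific here is an assumption: relevance is scored as the number of shared keywords, refinement is a fixed threshold (min_score), and FUNCTION_KEYWORDS is a small hypothetical index of mined description keywords.

```python
# Hypothetical mined index: function name -> keywords from its description.
FUNCTION_KEYWORDS = {
    "ArrayList.add": {"insert", "element", "position", "list"},
    "ArrayList.get": {"return", "element", "position", "list"},
    "ArrayList.clear": {"remove", "element", "list"},
    "ArrayList.isEmpty": {"return", "true", "list", "empty"},
    "HashMap.put": {"associate", "value", "key", "map"},
}


def retrieve_functions(query_keywords, index, min_score=2):
    """Return function names ranked by keyword overlap with the query."""
    query = set(query_keywords)
    # Stage 1: extract every function sharing at least one keyword.
    candidates = {
        name: len(query & keywords)
        for name, keywords in index.items()
        if query & keywords
    }
    # Stage 2: refine by dropping weak matches (assumed threshold).
    refined = {n: s for n, s in candidates.items() if s >= min_score}
    # Stage 3: rank by descending score, ties broken by name.
    return sorted(refined, key=lambda n: (-refined[n], n))


ranked = retrieve_functions(["insert", "element", "list"], FUNCTION_KEYWORDS)
```

With this toy index, HashMap.put never enters the candidate set, ArrayList.isEmpty is removed in the refinement stage, and ArrayList.add is ranked first.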
Code Generation
 The standard syntax of a function call statement is object.callName(arg1, arg2, …, argk).
 To generate code for a function call, the user’s query is mapped to the corresponding function call based on its function definition.
 Two steps:
i. identifying a certain variable vj as the object o, and
ii. mapping the remaining variables to the corresponding arguments arg1, arg2, …, argk.
 In the first step, the function retrieval method is used to identify a set of functions related to the user’s query.
 However, to use this method, the “function call query” must be converted into a “function search query” by removing all variables from the query.
 The variable whose type contains at least one function related to the new query is the desired object o.
 In the second step, all other variables are set as parameters.
 For example, given the query “inserts an element <e:Object> in a collection <a:ArrayList>”, the variable a with type ArrayList contains the function add related to the new query “inserts an element in a collection”, so a.add(?) is a suitable function call.
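The two steps can be sketched as follows. This is an illustration under assumptions rather than the paper's method: the annotation syntax <name:Type> is taken from the example query, RELATED_FUNCTIONS is a hypothetical lookup table standing in for the function retrieval step, and the remaining variables are simply passed as arguments in query order, which the slides leave open (hence the “?” in the example).

```python
import re

# Assumed annotation syntax, taken from the example query: <name:Type>.
VARIABLE = re.compile(r"<(\w+):(\w+)>")

# Hypothetical lookup standing in for function retrieval: for a type and a
# function search query, the functions of that type judged to be related.
RELATED_FUNCTIONS = {
    ("ArrayList", "inserts an element in a collection"): ["add"],
}


def to_search_query(call_query):
    """Remove variable annotations and collapse leftover whitespace."""
    return " ".join(VARIABLE.sub("", call_query).split())


def generate_call(call_query):
    """Build a call string, or return None when no variable fits as object."""
    variables = VARIABLE.findall(call_query)  # [(name, type), ...]
    search_query = to_search_query(call_query)
    for name, var_type in variables:
        functions = RELATED_FUNCTIONS.get((var_type, search_query), [])
        if functions:
            # Step 1: this variable is the object o.
            # Step 2: the remaining variables become the arguments.
            args = [n for n, _ in variables if n != name]
            return f"{name}.{functions[0]}({', '.join(args)})"
    return None


call = generate_call(
    "inserts an element <e:Object> in a collection <a:ArrayList>"
)
```

For the example query, the variable a is chosen as the object because its type ArrayList has a related function, and e becomes the argument, giving the string a.add(e).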
Experiment
A. User Study
 In the first user study, ten common search tasks were designed and assigned to the participants.
 Each participant then used FSE and some other search engines to complete these tasks.
 Three search engines were given to the users for the study: FSE, Krugle, and Koder.
 In the second user study, the participants proposed over 100 requests to generate code for a function call.
 They then checked the degree of fitness between the results obtained and their requests to calculate the accuracy of FSE.
 There are four degrees of fitness: Highly Relevant, Somewhat Relevant, Somewhat Irrelevant, Highly Irrelevant.
 Highly Relevant: the top result in the set of returned solutions fits the user’s request exactly.
 Somewhat Relevant: the desired result is in the result set but not in the first position.
 Somewhat Irrelevant: the result set contains a function with the correct name but the wrong parameters.
 Highly Irrelevant: the lowest level.
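In the study these degrees were judged by the participants. Purely as an illustration, the four definitions could be mechanized as below, assuming each result is a (function name, parameter types) pair and the wanted function is known in advance. The function name fitness_degree and this representation are hypothetical.

```python
def fitness_degree(results, wanted):
    """Classify a ranked result list against the wanted (name, params) pair."""
    if not results:
        return "Highly Irrelevant"
    if results[0] == wanted:
        # The top result fits the request exactly.
        return "Highly Relevant"
    if wanted in results:
        # The desired result is present, but not in first position.
        return "Somewhat Relevant"
    if any(name == wanted[0] for name, _ in results):
        # Correct function name, wrong parameters.
        return "Somewhat Irrelevant"
    return "Highly Irrelevant"


wanted = ("add", ("int", "Object"))
degree = fitness_degree(
    [("get", ("int",)), ("add", ("int", "Object"))], wanted
)
```

In this usage example the wanted function appears second in the list, so degree is Somewhat Relevant.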
B. Results
[Bar chart comparing Krugle, Koder, and FSE for User 1, User 2, and User 3; vertical axis from 0 to 0.6]
In this figure:
 92%: correct functions that were relevant to the user’s request.
 71%: correct function in the first position of the solution set.
 7%: did not find any proper function.
Conclusion
 This paper proposes an efficient function search approach that uses the API specification.
 It also presents a novel function call generation method that generates source code to invoke functions based on variable features extracted from the user’s query.
 Finally, the authors implemented FSE, a function search engine that helps programmers quickly examine different functions that might be appropriate for a problem, obtain more information about particular functions, and automatically generate code for function calls to learn how to use a function.
References
[1] A. J. Ko, B. A. Myers, and H. H. Aung, “Six learning barriers in end-user programming systems,” in Proc. of the 2004 IEEE Symposium on Visual Languages - Human Centric Computing, ser. VLHCC ’04. IEEE Computer Society, 2004, pp. 199–206.
[2] D. Mandelin, L. Xu, R. Bodík, and D. Kimelman, “Jungloid mining: helping to navigate the api jungle,” in Proc. of the 2005 ACM SIGPLAN conference on Programming language design and implementation, ser. PLDI ’05. ACM, 2005, pp. 48–61.
[3] J. Stylos and B. A. Myers, “Mica: A web-search tool for finding api components and examples,” in
Proc. of the Visual Languages and Human-Centric Computing, ser. VLHCC ’06. IEEE Computer
Society, 2006, pp. 195–202.
[4] R. Hoffmann, J. Fogarty, and D. S. Weld, “Assieme: finding and leveraging implicit references in a
web search interface for programmers,” in Proc. of the 20th annual ACM symposium on User interface
software and technology, ser. UIST ’07. ACM, 2007, pp. 13–22.
[5] S. Thummalapenta and T. Xie, “Parseweb: a programmer assistant for reusing open source code on the web,” in Proc. of the twenty-second IEEE/ACM international conference on Automated software engineering, ser. ASE ’07. ACM, 2007, pp. 204–213.
[6] M. Grechanik, C. Fu, Q. Xie, C. McMillan, D. Poshyvanyk, and C. Cumby, “A search engine for finding
highly relevant applications,” in Proc. of the 32nd ACM/IEEE International Conference on Software
Engineering - Volume 1, ser. ICSE ’10. ACM, 2010, pp. 475–484.
[7] S. Chatterjee, S. Juvekar, and K. Sen, “Sniff: A search engine for java using free-form queries,” in
Proc. of the 12th International Conference on Fundamental Approaches to Software Engineering: Held
as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, ser. FASE
’09. Springer-Verlag, 2009, pp. 385–400.
[8] M. Grechanik, K. M. Conroy, and K. A. Probst, “Finding relevant applications for prototyping,” in
Proc. of the Fourth International Workshop on Mining Software Repositories, ser. MSR ’07. IEEE
Computer Society, 2007, pp. 12–.
[9] R. Pandita, X. Xiao, H. Zhong, T. Xie, S. Oney, and A. Paradkar, “Inferring method specifications from
natural language api descriptions,” in Proceedings of the 2012 International Conference on Software
Engineering, ser. ICSE 2012. IEEE Press, 2012, pp. 815–825.
[10] A. Fantechi, S. Gnesi, G. Lami, and A. Maccari, “Application of linguistic techniques for use case
analysis,” in Proc. of the 10th Anniversary IEEE Joint International Conference on Requirements
Engineering, ser. RE ’02. IEEE Computer Society, 2002, pp. 157–164.
[11] D. Klein and C. D. Manning, “Accurate unlexicalized parsing,” in Proc. of the 41st Annual Meeting
on Association for Computational Linguistics - Volume 1, ser. ACL ’03. Association for Computational
Linguistics, 2003, pp. 423–430.
[12] L. Kof, “Scenarios: Identifying missing objects and actions by means of computational linguistics.”
in RE. IEEE, 2007, pp. 121–130.
[13] K. Rothenhausler and H. Schutze, “Part of speech filtered word spaces,” in Proc. of the 2007
Workshop on Contextual Information in Semantic Space Models: Beyond Words and
Documents, 2007, pp. 25–32.
[14] D. Shepherd, Z. P. Fry, E. Hill, L. Pollock, and K. Vijay-Shanker, “Using natural language program
analysis to locate and understand action-oriented concerns,” in Proc. of the 6th international
conference on Aspect-oriented software development, ser. AOSD ’07. ACM, 2007, pp. 212–224.
[15] R. Hemayati, W. Meng, and C. Yu, “Semantic-based grouping of search engine results using wordnet,” in Proc. of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management, ser. APWeb/WAIM’07. Springer-Verlag, 2007, pp. 678–686.
[16] C. Manning and D. Klein. The stanford parser. [Online]. Available:
http://nlp.stanford.edu/software/lex-parser.shtml
[17] Java api. [Online]. Available: docs.oracle.com/javase/1.4.2/docs/api
[18] L. Vaughan, “New measurements for search engine evaluation proposed and tested,” Inf. Process.
Manage., vol. 40, no. 4, pp. 677–691, May 2004.
[19] Krugle inc. [Online]. Available: http://opensearch.krugle.com/
[20] Koder inc. [Online]. Available: http://www.koders.com/
[21] S. E. Sim, M. Umarji, S. Ratanotayanon, and C. V. Lopes, “How well do search engines support code
retrieval on the web?” ACM Trans. Softw. Eng. Methodol., vol. 21, no. 1, pp. 4:1–4:25, Dec. 2011
  • 1. By Sanif S S Reg No:10007399 S7 IT
  • 2. Overview  Understanding the terms.  Objectives.  In detail o Keyword Retrieval o Variable Retrieval o API Specification Mining o Function Retrieval o Code Generation  Experiment  Conclusion  References
  • 3. API Specification-Based Function Search Engine Using Natural Language Query  API – Application Programming Interface.  An API is a set of commands, functions, and protocols which programmers can use when building software for a specific operating system  APIs are usually Implemented as Header Files.  EX: o Java APIs o ODBC for Microsoft Windows
  • 4. API Specification-Based Function Search Engine Using Natural Language Query  Description about the classes and methods inside the API.  Each method(or function) and its uses are briefly described in the API Specifications.
  • 5. API Specification-Based Function Search Engine Using Natural Language Query  Function search engine is nothing but as the name suggests a search engine for all the methods in the API.
  • 6. API Specification-Based Function Search Engine Using Natural Language Query  Natural Language Query is a query that uses a complete sentence or question to begin a search.  Ex: o “What is the capital of India?” o “How to make pizza?”
  • 7. API Specification-Based Function Search Engine Using Natural Language Query  Means a search engine to search all the functions/methods in an Application programming interface(API) using simple queries.  Additionally this paper also suggests a means of generating automatic function calls based on the search.
  • 8.  Programmers nearly always use existing functions while developing their applications.  The functions have grown more numerous and more diverse.  The Problem is that ‘what functions they want’ and know ‘how to call those functions?’.  The Solution:- o This paper present two novel approaches to address these problems. o The first is the approach to find right functions based on the API specification. o The second is approach to automatically generate code for “function call”
  • 9.  There are two main objectives in this paper: o Retrieving functions, and o Generating code for function calls.  Two different forms of queries corresponding to these objectives. o The first is “function search query” which requests to look for functions. o The second is “function call query” which requests to generate code for function calls.
  • 10. Code Generation Variable Retrieval Function Description API Document Fig:Function Search Model Function Search Query Keyword Retrieval Mining Function Retrieval Function retrieval is the process of finding suitable functions by matching “the extracted keywords from a function search query” to “descriptions of functions in the API specification”. Keyword retrieval is the process of extracting keywords from a function search query Mining is the process of extracting contents in the API specification to support function retrieval Function Call Query Function Call Variable retrieval is the process of extracting Variables from a function call query Code generation is the process of generating code for a function call based on both the variables extracted from function call query.
  • 11.  There are several methods to identify keywords in a natural language sequence.  Some methods identify keyword as a simple word, while others identify a keyword phrase.  In this paper Introducing four technologies of natural language processing to extract keywords. -POS tagging, POS filtering, Stemming, Synonym generation.
  • 12. Word/POS POS Filter POS tagging (part-ofspeech tagging) is the technology to mark up a word in a natural language sentence (NL Sentence). Fig Keyword Retrieval Process NL Sentence POS Tagging Stemming keywordsSynonym Generation Main Word Original Word POS filtering is the technology to remove stopwords such as prepositions, pronouns, conjunctions, and interjections. Stemming is the technology to reduce inflected (or sometimes derived) words to their root form. (Ex: ‘return’ is the root form of words “returns, returning, returned”. Synonym generation is the technology to identify synonyms of the retrieved keywords
  • 13.  For the natural language query “Gets an element in the collection”. The followings are results obtained in the above stages. o POS Tagging: Gets/VB an/DT element/NN in/IN the/DF collection/NN. o POS Filtering: Gets element collection. o Stemming: Get element collection. o Synonym Generation: Get-have/return element-object/component collection-list/set. NOTE: VB-Verb DT-Determiner NN-Noun IN-Preposition DF-Adjective
  • 14.  Two kinds of objects in a function call query: -Words and Variables.  Many words related to each variable in the query.  Also each word in the query is only relevant to one(or zero) variable.  words, which are relevant to a variable, is called features of this variable.
  • 15.  Every relation between words and variable is represented by a “variable retrieval rule” derived from a corresponding syntactic rule.  Ex:Some variable retrieval rules o Root(sf V ) -> V B(wf W)NP(sf V ) o NP(sf fv1; v2g) -> NP(vf v1)PP(vf v2) o NP(sf V [ fvg) -> NP(sf V )PP(vf v) o NP(sf V1 [ V2) -> NP(sf V1)PP(sf V2) o PP(vf v[W1 W2]) -> IN(wf W1)NP(wf v[W2]) o PP(sf V ) -> IN(wf W)NP(sf V ) o NP(wf W1 W2) -> NN(wf W1)NN(wf W2) o NP(vf v[W1 W2]) -> NN(wf W1)NN(vf v[W2]) o NP(vf v[W1 W2 W3]) ->DT(wf W1) V BN(wf W2) NN(vf v[W3])
  • 16. In Fig. 3, the natural language query “Insert element e in a set at index k” is parsed into a tree structure using the Stanford Parser. The final result is: o e[element]; o a[a set]; o k[at index]. Fig. 3: Parsing tree for the function call query.
  • 17. This subsection focuses on mining the API specification of Java, called the Java API specification. The Java API specification contains several kinds of function-related content that can be mined to support the function retrieval process and the code generation process. They are: o function specification o functionality description o parameter features
  • 18. Function specification: structured data that describes the usage of a function. The information that can be extracted from it includes the function name, function scope, return type, list of parameters, and so on. Functionality description: unstructured data in natural language that describes what the function does. The keyword retrieval method (presented in the previous slides) is used to extract information from it. Parameter features: unstructured data in natural language that describes the features of the parameters in the function specification. The necessary information is extracted using natural language processing technologies.
  • 19. Example: The function add() is described in the Java API specification of ArrayList as follows. Function specification: public void add(int index, Object element). Functionality description: “Inserts the specified element at the specified position in this list”. Parameter features: “index - index at which the specified element is to be inserted” and “element - element to be inserted”.
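As a rough illustration of how the structured parts of this example could be mined, the Python sketch below splits a simple signature into scope, return type, name, and parameters, and maps parameter names to their descriptions. The regular expression and both function names are hypothetical; the pattern only covers plain signatures like the one above and ignores generics, arrays, annotations, and throws clauses.

```python
import re

# Hypothetical pattern for simple Java method signatures such as
# "public void add(int index,Object element)". It does not handle
# generics, arrays, annotations, or throws clauses.
SIGNATURE = re.compile(
    r"^\s*(?P<scope>public|protected|private)\s+"
    r"(?P<return_type>\w+)\s+"
    r"(?P<name>\w+)\s*\((?P<params>[^)]*)\)\s*$"
)


def parse_function_spec(spec):
    """Split a signature into scope, return type, name, and (type, name) pairs."""
    match = SIGNATURE.match(spec)
    if match is None:
        raise ValueError(f"unsupported signature: {spec!r}")
    params = []
    for part in match.group("params").split(","):
        part = part.strip()
        if part:
            param_type, param_name = part.split()
            params.append((param_type, param_name))
    return {
        "scope": match.group("scope"),
        "return_type": match.group("return_type"),
        "name": match.group("name"),
        "params": params,
    }


def parse_parameter_features(lines):
    """Map each parameter name to its description, given 'name - description' lines."""
    features = {}
    for line in lines:
        name, _, description = line.partition(" - ")
        features[name.strip()] = description.strip()
    return features


spec = parse_function_spec("public void add(int index,Object element)")
features = parse_parameter_features([
    "index - index at which the specified element is to be inserted",
    "element - element to be inserted",
])
```

The functionality description is free text, so it would go through the keyword retrieval stages rather than through a pattern like this one.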
  • 20. The function retrieval process has three stages. Stage 1: extract the functions related to the user’s query based on some constraints. Stage 2: refine the result of the previous stage by removing irrelevant functions. Stage 3: rank the remaining relevant functions in descending order of their relevance to the query.
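The slide does not state the constraints or the ranking measure, so the Python sketch below fills them in with simple assumptions: stage 1 keeps functions that share at least one keyword with the query, stage 2 drops those matching fewer than `min_overlap` keywords, and stage 3 ranks by the fraction of query keywords matched. `INDEX` is a small hypothetical catalogue, not data from the paper.

```python
# Hypothetical index: function name -> keywords mined from its description.
INDEX = {
    "ArrayList.add": {"insert", "element", "position", "list"},
    "ArrayList.get": {"return", "element", "position", "list"},
    "ArrayList.isEmpty": {"return", "true", "list", "contain"},
    "ArrayList.hashCode": {"return", "hash", "code", "value"},
}


def extract(query_keywords, index):
    """Stage 1: keep functions sharing at least one keyword with the query."""
    return {name: kws for name, kws in index.items() if kws & query_keywords}


def refine(query_keywords, candidates, min_overlap=2):
    """Stage 2: drop candidates matching fewer than `min_overlap` keywords."""
    return {name: kws for name, kws in candidates.items()
            if len(kws & query_keywords) >= min_overlap}


def rank(query_keywords, candidates):
    """Stage 3: sort by the fraction of query keywords matched, best first."""
    scored = [(len(kws & query_keywords) / len(query_keywords), name)
              for name, kws in candidates.items()]
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


def search(query_keywords, index=INDEX):
    """Run the three stages in order and return the ranked function names."""
    candidates = extract(query_keywords, index)
    return rank(query_keywords, refine(query_keywords, candidates))


results = search({"insert", "element", "list"})
```

Here `ArrayList.hashCode` is never extracted, `ArrayList.isEmpty` is extracted but removed in refinement, and the two remaining functions are ranked by overlap.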
  • 21. The standard syntax of a function call statement is object.callName(arg1, arg2, …, argk). To generate code for a function call, the user’s query is mapped to the corresponding function call based on its function definition. This takes two steps: i. identifying a certain variable vj as the object o, and ii. mapping the remaining variables to the corresponding arguments arg1, arg2, …, argk.
  • 22. In the first step, the function retrieval method is used to identify a set of functions related to the user’s query. To use this method, however, the “function call query” must first be converted into a “function search query” by removing all variables from it. The variable whose type contains at least one function related to the new query is the desired object o. In the second step, all other variables are set as parameters. For example, given the query “inserts an element <e:Object> in a collection <a:ArrayList>”, the variable a with type ArrayList contains the function add, which is related to the new query “inserts an element in a collection”, so a.add(?) is a suitable function call.
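The two steps can be sketched in Python as follows. This is a simplified illustration under assumptions: `API` is a hypothetical catalogue mapping each type to its functions and their description keywords, `keywords` is a toy normalizer, and arguments are passed in query order rather than matched against parameter features.

```python
import re

# Hypothetical catalogue: type -> {function name: description keywords}.
API = {
    "ArrayList": {"add": {"insert", "element", "collection"}},
    "Object": {},
}

# Variables are written as <name:Type> inside the function call query.
VARIABLE = re.compile(r"<(\w+):(\w+)>")
STOPWORDS = {"a", "an", "the", "in", "at", "of"}


def split_query(call_query):
    """Return (function search query, [(variable, type), ...])."""
    variables = VARIABLE.findall(call_query)
    search_query = " ".join(VARIABLE.sub("", call_query).split())
    return search_query, variables


def keywords(text):
    """Toy keyword extraction: drop stopwords, strip a trailing 's'."""
    words = [w.lower() for w in text.split() if w.lower() not in STOPWORDS]
    return {w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words}


def generate_call(call_query, api=API):
    """Step 1: find the object variable; step 2: pass the others as arguments."""
    search_query, variables = split_query(call_query)
    wanted = keywords(search_query)
    for obj, obj_type in variables:
        for func, func_keywords in api.get(obj_type, {}).items():
            if wanted & func_keywords:
                args = [name for name, _ in variables if name != obj]
                return f"{obj}.{func}({', '.join(args)})"
    return None


call = generate_call("inserts an element <e:Object> in a collection <a:ArrayList>")
```

With the example query, `e` is skipped because its type offers no related function, `a` is chosen as the object, and the generated statement follows the object.callName(arg1, …) form.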
  • 23. A. User Study. In the first user study, ten common search tasks were designed and assigned to the participants. Each participant then used FSE and some other search engines to complete these tasks. Three search engines were given to the users for the study: FSE, Krugle, and Koder.
  • 24. In the second user study, the participants submitted over 100 requests to generate code for function calls. They then rated the degree of fitness between the obtained results and their requests, which was used to calculate the accuracy of FSE. There are four degrees of fitness: Highly Relevant, Somewhat Relevant, Somewhat Irrelevant, and Highly Irrelevant. Highly Relevant: the top result in the set of returned solutions fits the user’s request exactly. Somewhat Relevant: the desired result is in the result set but not in the first position. Somewhat Irrelevant: the result set contains a function with the correct name but the wrong parameters. Highly Irrelevant: the lowest level.
  • 25. B. Results. [Figure: chart comparing Krugle, Koder, and FSE for User 1, User 2, and User 3; value axis from 0 to 0.6.]
  • 26. B. Results. In this figure: 92% of requests returned correct functions that were relevant to the user’s request; 71% returned the correct function in the first position of the solution set; 7% did not find any proper function.
  • 27. This paper proposes an efficient function search approach based on the API specification. It also presents a novel function call generation method that generates source code to invoke functions based on variable features extracted from the user’s query. Finally, FSE has been implemented: a function search engine that helps programmers quickly examine different functions that might be appropriate for a problem, obtain more information about particular functions, and automatically generate code for function calls to learn how to use a function.