SlideShare ist ein Scribd-Unternehmen logo
1 von 38
FlashMeta Microsoft PROSE SDK:
A Framework for
Inductive Program Synthesis
Oleksandr Polozov
University of Washington
Sumit Gulwani
Microsoft Research
Why do people create frameworks?
Industrialization (a.k.a. “Tech Transfer”)
2
3
4
Program Synthesis: “The Ultimate Dream” of CS
User Intent
Programming
Language
Search
Algorithm Program
5
Industrialization Time?
Flash Fill (2010-2012) Trifacta (2012-2015) SPIRAL (2000-2015)
+114
more
6
Microsoft Program Synthesis using Examples SDK
https://microsoft.github.io/prose
7
Shoulders of Giants
PROSE
Deductive
Synthesis
Syntax-Guided
Synthesis
Domain-Specific
Inductive Synthesis
8
Shoulders of Giants
PROSE
Deductive
Synthesis
Püschel et al. [IEEE '05]
Panchekha et al. [PLDI '15]
Manna, Waldinger [TOPLAS '80]
+ No invalid candidates ⟹ fast
− [Usually] complete specs
− Domain axiomatization
9
Shoulders of Giants
PROSE
Syntax-Guided
Synthesis
Alur et al. [FMCAD '13]
+ Shrinks the search space
+ Generic algorithms
− No domain-specific insights
− Limited to SMT-LIB
10
Shoulders of Giants
PROSE
Domain-Specific
Inductive Synthesis
Lau et al. [ICML '00]
Gulwani [POPL '10] etc.
Feser et al. [PLDI '15]
+ Arbitrarily complex DSLs
+ Input/output examples
− 1-2 person-years (PhD)
− One-off
11
Shoulders of Giants
Domain-Specific
Inductive Synthesis
Syntax-Guided
Synthesis
“Learn from examples”“Search over a DSL”
User
Intent
Programming
Language
⇓⇓
Deductive
Synthesis
“Deduce subexpressions”
Search
Algorithm
⇓
12
Meta-synthesizer framework
PROSE
Synthesis
Strategies
DSL
Definition
I/O
Specification
Synthesizer
Input
Output
Programs
App
PROSE
13
Domain-Specific Language
14
string output(string[] inputs) :=
| ConstStr(s)
| let string x = std.list.Kth(inputs, k) in
SubStr(x, positionPair(x));
Tuple<int, int> positionPair(string s) :=
std.Pair(positionIn(s), positionIn(s));
int positionIn(string s) := AbsPos(s, k)
| RegPos(s, std.Pair(r, r), k);
const int k; const RegularExpression r; const string s;
FlashFill (portion) as a PROSE DSL
15
Inductive Specification
16
Input-Output Examples
input state 𝜎 ⟹ output value 𝑜
“206-279-6261” ⟹ “(206) 279-6261”
“415.413.0703” ⟹ “(415) 413-0703”
“(646) 408 6649” ⟹ “(646) 408-6649”
17
When one example is too many
⟹
18
Inductive Specification
input state 𝜎 ⟹ output constraint 𝜑(out)
⟹ 𝑜𝑢𝑡 ⊒ "2010", "2014", …
19
Inductive Specification
input state 𝜎 ⟹ output constraint 𝜑(out)
∧
∨ ∨…
⊒ "2010", "2014", … ∋ "Springer" ∋ "[11]"
20
Examples are ambiguous!
21
From:
all lines ending with “Number ∘ Dot”
“Space ∘ Number ∘ Dot”
starting with “Word ∘ Space ∘ CamelCase”
Extract:
the first “Number” before a “Dot”
the last “Number” before a “Dot”
the last “Number” before a “Dot ∘ LineBreak”
the last “Number”
text between the last “Space” and the last “Dot”
the first “Comma ∘ Space” and the last “Dot ∘ LineBreak”
…and up to 1020 more candidates
22
One program is insufficient.
Program Set ⟹ Ranking
User interaction
Runtime correction
…
23
(Version Space Algebra)
Synthesis Strategy
24
Observation 1: Inverse Semantics
𝐹 𝐴, 𝐵 ⊨ 𝜙?
𝐴 ⊨ 𝜙 𝐴? 𝐵 ⊨ 𝜙 𝐵?
25
Concat(𝐹, 𝐸)
∃𝐸: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ?
∃F: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ?
𝐹 and 𝐸 are not independent!
𝜑: “Kathleen S. Fisher” ⟹ “Dr. Fisher”
“Bill Gates, Sr.” ⟹ “Dr. Gates”
𝜑 𝑓:
“Kathleen S. Fisher” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. F” ∨ …
“Bill Gates, Sr.” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. G” ∨ …
26
Observation 2: Skolemization
𝐹 𝐴, 𝐵 ⊨ 𝜙?
𝐴 ⊨ 𝜙 𝐴? 𝐵 ⊨ 𝜙 𝐵?
27
given 𝐴 𝜎 = 𝑎
Concat(𝐹, 𝐸)
∃E: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ?
Given an output of 𝐹, Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ?
𝜑: “Kathleen S. Fisher” ⟹ “Dr. Fisher”
“Bill Gates, Sr.” ⟹ “Dr. Gates”
𝜑 𝑓:
“Kathleen S. Fisher” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. F” ∨ …
“Bill Gates, Sr.” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. G” ∨ …
“Kathleen S. Fisher” ⟹ “Dr. ”
“Bill Gates, Sr.” ⟹ “Dr. ”𝐹 =
“Kathleen S. Fisher” ⟹ “Fisher”
“Bill Gates, Sr.” ⟹ “Gates”
𝜑 𝐸:
28
Inverse Semantics + Skolemization = Witness Function
∃𝐸: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ?
Given an output of 𝐹, Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ?
Witness function: 𝜑 ↦ 𝜑 𝐹
Conditional witness function: 𝜑 ∣ 𝐹 𝜎 = 𝑓 ↦ 𝜑 𝐸
Domain-Specific
Modular
No synthesis reasoning
Enable efficient deduction29
Results
30
Unifies 10+ prior POPL/PLDI/… papers
• Lau, T., Domingos, P., & Weld, D. S. (2000). Version Space Algebra and its Application to Programming by Demonstration. In
ICML (pp. 527–534).
• Kitzelmann, E. (2011). A combined analytical and search-based approach for the inductive synthesis of functional programs. KI-
Künstliche Intelligenz, 25(2), 179–182.
• Gulwani, S. (2011). Automating string processing in spreadsheets using input-output examples. In POPL (Vol. 46, p. 317).
• Singh, R., & Gulwani, S. (2012). Learning semantic string transformations from examples. VLDB, 5(8), 740–751.
• Andersen, E., Gulwani, S., & Popovic, Z. (2013). A Trace-based Framework for Analyzing and Synthesizing Educational
Progressions. In CHI (pp. 773–782).
• Yessenov, K., Tulsiani, S., Menon, A., Miller, R. C., Gulwani, S., Lampson, B., & Kalai, A. (2013). A colorful approach to text
processing by example. In UIST (pp. 495–504).
• Le, V., & Gulwani, S. (2014). FlashExtract : A Framework for Data Extraction by Examples. In PLDI (p. 55).
• Barowy, D. W., Gulwani, S., Hart, T., & Zorn, B. (2015). FlashRelate: Extracting Relational Data from Semi-Structured
Spreadsheets Using Examples. In PLDI.
• Kini, D., & Gulwani, S. (2015). FlashNormalize : Programming by Examples for Text Normalization. IJCAI.
• Osera, P.-M., & Zdancewic, S. (2015). Type-and-Example-Directed Program Synthesis. In PLDI.
• Feser, J., Chaudhuri, S., & Dillig, I. (2015). Synthesizing Data Structure Transformations from Input-Output Examples. In PLDI.
• …
31
Program Synthesis meets Software Engineering
Project Reference
Lines of Code Development Time
Original PROSE Original PROSE
FlashFill POPL 2010 12K 3K 9 months 1 month
Text Extraction PLDI 2014 7K 4K 8 months 1 month
Text Normalization IJCAI 2015 17K 2K 7 months 2 months
Spreadsheet Layout PLDI 2015 5K 2K 8 months 1 month
Web Extraction — — 2.5K — 1.5 months
32
Performance: 0.5 − 3X Original
More general ⇒ Slower Algorithmic advances ⇒ Faster
Example: FlashExtract
Learning time = 1.6 sec
2300 nodes in a VSA data structure ≈ log(# of programs)
3 examples till task completion
33
Performance: 0.5 − 3X Original
More general ⇒ Slower Algorithmic advances ⇒ Faster
Example: FlashExtract
34
Applications
35
Email Parsing in Cortana
36
ConvertFrom-String in PowerShell
37
Research: https://microsoft.github.io/prose
Play: https://microsoft.github.io/prose/demo
Contact: prose-contact@microsoft.com
See our demo @ MSR table
Thank you!
Questions?
38

Weitere ähnliche Inhalte

Was ist angesagt?

Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Simplilearn
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
butest
 

Was ist angesagt? (20)

Genetic algorithm raktim
Genetic algorithm raktimGenetic algorithm raktim
Genetic algorithm raktim
 
Machine learning in image processing
Machine learning in image processingMachine learning in image processing
Machine learning in image processing
 
Particle Swarm Optimization - PSO
Particle Swarm Optimization - PSOParticle Swarm Optimization - PSO
Particle Swarm Optimization - PSO
 
Neural networks of artificial intelligence
Neural networks of artificial  intelligenceNeural networks of artificial  intelligence
Neural networks of artificial intelligence
 
2001: An Introduction to Artificial Immune Systems
2001: An Introduction to Artificial Immune Systems2001: An Introduction to Artificial Immune Systems
2001: An Introduction to Artificial Immune Systems
 
Pattern Recognition.pptx
Pattern Recognition.pptxPattern Recognition.pptx
Pattern Recognition.pptx
 
K nearest neighbor
K nearest neighborK nearest neighbor
K nearest neighbor
 
Ai notes
Ai notesAi notes
Ai notes
 
CART: Not only Classification and Regression Trees
CART: Not only Classification and Regression TreesCART: Not only Classification and Regression Trees
CART: Not only Classification and Regression Trees
 
Support Vector Machine ppt presentation
Support Vector Machine ppt presentationSupport Vector Machine ppt presentation
Support Vector Machine ppt presentation
 
Machine learning
Machine learningMachine learning
Machine learning
 
Genetic Algorithm
Genetic AlgorithmGenetic Algorithm
Genetic Algorithm
 
Heuristic Search Techniques Unit -II.ppt
Heuristic Search Techniques Unit -II.pptHeuristic Search Techniques Unit -II.ppt
Heuristic Search Techniques Unit -II.ppt
 
Uninformed search /Blind search in AI
Uninformed search /Blind search in AIUninformed search /Blind search in AI
Uninformed search /Blind search in AI
 
State Space Search in ai
State Space Search in aiState Space Search in ai
State Space Search in ai
 
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
Machine Learning Tutorial Part - 1 | Machine Learning Tutorial For Beginners ...
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNINGARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
 
artificial neural network
artificial neural networkartificial neural network
artificial neural network
 
An introduction to Machine Learning
An introduction to Machine LearningAn introduction to Machine Learning
An introduction to Machine Learning
 

Ähnlich wie Microsoft PROSE SDK: A Framework for Inductive Program Synthesis

The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
Trey Grainger
 

Ähnlich wie Microsoft PROSE SDK: A Framework for Inductive Program Synthesis (20)

Evolving as a professional software developer
Evolving as a professional software developerEvolving as a professional software developer
Evolving as a professional software developer
 
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
ISEC'18 Keynote: Intelligent Software Engineering: Synergy between AI and Sof...
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021Hala skafkeynote@conferencedata2021
Hala skafkeynote@conferencedata2021
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge Graph
 
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked DataDedalo, looking for Cluster Explanations in a labyrinth of Linked Data
Dedalo, looking for Cluster Explanations in a labyrinth of Linked Data
 
Антон Кириллов, ZeptoLab
Антон Кириллов, ZeptoLabАнтон Кириллов, ZeptoLab
Антон Кириллов, ZeptoLab
 
On being a professional software developer
On being a professional software developerOn being a professional software developer
On being a professional software developer
 
Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)Neural Text Embeddings for Information Retrieval (WSDM 2017)
Neural Text Embeddings for Information Retrieval (WSDM 2017)
 
EDBT 2015: Summer School Overview
EDBT 2015: Summer School OverviewEDBT 2015: Summer School Overview
EDBT 2015: Summer School Overview
 
Knowledge graphs for knowing more and knowing for sure
Knowledge graphs for knowing more and knowing for sureKnowledge graphs for knowing more and knowing for sure
Knowledge graphs for knowing more and knowing for sure
 
Four Languages From Forty Years Ago
Four Languages From Forty Years AgoFour Languages From Forty Years Ago
Four Languages From Forty Years Ago
 
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail ScienceSQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
SQL is Dead; Long Live SQL: Lightweight Query Services for Long Tail Science
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Myria: Analytics-as-a-Service for (Data) Scientists
Myria: Analytics-as-a-Service for (Data) ScientistsMyria: Analytics-as-a-Service for (Data) Scientists
Myria: Analytics-as-a-Service for (Data) Scientists
 
Knowledge Graph Introduction
Knowledge Graph IntroductionKnowledge Graph Introduction
Knowledge Graph Introduction
 
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
“Towards Multi-Step Expert Advice for Cognitive Computing” - Dr. Achim Rettin...
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 
AI Beyond Deep Learning
AI Beyond Deep LearningAI Beyond Deep Learning
AI Beyond Deep Learning
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
 

Kürzlich hochgeladen

CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Kürzlich hochgeladen (20)

How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 

Microsoft PROSE SDK: A Framework for Inductive Program Synthesis

  • 1. FlashMeta Microsoft PROSE SDK: A Framework for Inductive Program Synthesis Oleksandr Polozov University of Washington Sumit Gulwani Microsoft Research
  • 2. Why do people create frameworks? Industrialization (a.k.a. “Tech Transfer”) 2
  • 3. 3
  • 4. 4
  • 5. Program Synthesis: “The Ultimate Dream” of CS User Intent Programming Language Search Algorithm Program 5
  • 6. Industrialization Time? Flash Fill (2010-2012) Trifacta (2012-2015) SPIRAL (2000-2015) +114 more 6
  • 7. Microsoft Program Synthesis using Examples SDK https://microsoft.github.io/prose 7
  • 9. Shoulders of Giants PROSE Deductive Synthesis Püschel et al. [IEEE '05] Panchekha et al. [PLDI '15] Manna, Waldinger [TOPLAS '80] + No invalid candidates ⟹ fast − [Usually] complete specs − Domain axiomatization 9
  • 10. Shoulders of Giants PROSE Syntax-Guided Synthesis Alur et al. [FMCAD '13] + Shrinks the search space + Generic algorithms − No domain-specific insights − Limited to SMT-LIB 10
  • 11. Shoulders of Giants PROSE Domain-Specific Inductive Synthesis Lau et al. [ICML '00] Gulwani [POPL '10] etc. Feser et al. [PLDI '15] + Arbitrarily complex DSLs + Input/output examples − 1-2 person-years (PhD) − One-off 11
  • 12. Shoulders of Giants Domain-Specific Inductive Synthesis Syntax-Guided Synthesis “Learn from examples”“Search over a DSL” User Intent Programming Language ⇓⇓ Deductive Synthesis “Deduce subexpressions” Search Algorithm ⇓ 12
  • 15. string output(string[] inputs) := | ConstStr(s) | let string x = std.list.Kth(inputs, k) in SubStr(x, positionPair(x)); Tuple<int, int> positionPair(string s) := std.Pair(positionIn(s), positionIn(s)); int positionIn(string s) := AbsPos(s, k) | RegPos(s, std.Pair(r, r), k); const int k; const RegularExpression r; const string s; FlashFill (portion) as a PROSE DSL 15
  • 17. Input-Output Examples input state 𝜎 ⟹ output value 𝑜 “206-279-6261” ⟹ “(206) 279-6261” “415.413.0703” ⟹ “(415) 413-0703” “(646) 408 6649” ⟹ “(646) 408-6649” 17
  • 18. When one example is too many ⟹ 18
  • 19. Inductive Specification input state 𝜎 ⟹ output constraint 𝜑(out) ⟹ 𝑜𝑢𝑡 ⊒ "2010", "2014", … 19
  • 20. Inductive Specification input state 𝜎 ⟹ output constraint 𝜑(out) ∧ ∨ ∨… ⊒ "2010", "2014", … ∋ "Springer" ∋ "[11]" 20
  • 22. From: all lines ending with “Number ∘ Dot” “Space ∘ Number ∘ Dot” starting with “Word ∘ Space ∘ CamelCase” Extract: the first “Number” before a “Dot” the last “Number” before a “Dot” the last “Number” before a “Dot ∘ LineBreak” the last “Number” text between the last “Space” and the last “Dot” the first “Comma ∘ Space” and the last “Dot ∘ LineBreak” …and up to 1020 more candidates 22
  • 23. One program is insufficient. Program Set ⟹ Ranking User interaction Runtime correction … 23 (Version Space Algebra)
  • 25. Observation 1: Inverse Semantics 𝐹 𝐴, 𝐵 ⊨ 𝜙? 𝐴 ⊨ 𝜙 𝐴? 𝐵 ⊨ 𝜙 𝐵? 25
  • 26. Concat(𝐹, 𝐸) ∃𝐸: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ? ∃F: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ? 𝐹 and 𝐸 are not independent! 𝜑: “Kathleen S. Fisher” ⟹ “Dr. Fisher” “Bill Gates, Sr.” ⟹ “Dr. Gates” 𝜑 𝑓: “Kathleen S. Fisher” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. F” ∨ … “Bill Gates, Sr.” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. G” ∨ … 26
  • 27. Observation 2: Skolemization 𝐹 𝐴, 𝐵 ⊨ 𝜙? 𝐴 ⊨ 𝜙 𝐴? 𝐵 ⊨ 𝜙 𝐵? 27 given 𝐴 𝜎 = 𝑎
  • 28. Concat(𝐹, 𝐸) ∃E: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ? Given an output of 𝐹, Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ? 𝜑: “Kathleen S. Fisher” ⟹ “Dr. Fisher” “Bill Gates, Sr.” ⟹ “Dr. Gates” 𝜑 𝑓: “Kathleen S. Fisher” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. F” ∨ … “Bill Gates, Sr.” ⟹ “D” ∨ “Dr” ∨ “Dr.” ∨ “Dr. ” ∨ “Dr. G” ∨ … “Kathleen S. Fisher” ⟹ “Dr. ” “Bill Gates, Sr.” ⟹ “Dr. ”𝐹 = “Kathleen S. Fisher” ⟹ “Fisher” “Bill Gates, Sr.” ⟹ “Gates” 𝜑 𝐸: 28
  • 29. Inverse Semantics + Skolemization = Witness Function ∃𝐸: Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐹 satisfies ___________ ? Given an output of 𝐹, Concat(𝐹, 𝐸) satisfies 𝜑 if and only if 𝐸 satisfies ___________ ? Witness function: 𝜑 ↦ 𝜑 𝐹 Conditional witness function: 𝜑 ∣ 𝐹 𝜎 = 𝑓 ↦ 𝜑 𝐸 Domain-Specific Modular No synthesis reasoning Enable efficient deduction29
  • 31. Unifies 10+ prior POPL/PLDI/… papers • Lau, T., Domingos, P., & Weld, D. S. (2000). Version Space Algebra and its Application to Programming by Demonstration. In ICML (pp. 527–534). • Kitzelmann, E. (2011). A combined analytical and search-based approach for the inductive synthesis of functional programs. KI- Künstliche Intelligenz, 25(2), 179–182. • Gulwani, S. (2011). Automating string processing in spreadsheets using input-output examples. In POPL (Vol. 46, p. 317). • Singh, R., & Gulwani, S. (2012). Learning semantic string transformations from examples. VLDB, 5(8), 740–751. • Andersen, E., Gulwani, S., & Popovic, Z. (2013). A Trace-based Framework for Analyzing and Synthesizing Educational Progressions. In CHI (pp. 773–782). • Yessenov, K., Tulsiani, S., Menon, A., Miller, R. C., Gulwani, S., Lampson, B., & Kalai, A. (2013). A colorful approach to text processing by example. In UIST (pp. 495–504). • Le, V., & Gulwani, S. (2014). FlashExtract : A Framework for Data Extraction by Examples. In PLDI (p. 55). • Barowy, D. W., Gulwani, S., Hart, T., & Zorn, B. (2015). FlashRelate: Extracting Relational Data from Semi-Structured Spreadsheets Using Examples. In PLDI. • Kini, D., & Gulwani, S. (2015). FlashNormalize : Programming by Examples for Text Normalization. IJCAI. • Osera, P.-M., & Zdancewic, S. (2015). Type-and-Example-Directed Program Synthesis. In PLDI. • Feser, J., Chaudhuri, S., & Dillig, I. (2015). Synthesizing Data Structure Transformations from Input-Output Examples. In PLDI. • … 31
  • 32. Program Synthesis meets Software Engineering Project Reference Lines of Code Development Time Original PROSE Original PROSE FlashFill POPL 2010 12K 3K 9 months 1 month Text Extraction PLDI 2014 7K 4K 8 months 1 month Text Normalization IJCAI 2015 17K 2K 7 months 2 months Spreadsheet Layout PLDI 2015 5K 2K 8 months 1 month Web Extraction — — 2.5K — 1.5 months 32
  • 33. Performance: 0.5 − 3X Original More general ⇒ Slower Algorithmic advances ⇒ Faster Example: FlashExtract Learning time = 1.6 sec 2300 nodes in a VSA data structure ≈ log(# of programs) 3 examples till task completion 33
  • 34. Performance: 0.5 − 3X Original More general ⇒ Slower Algorithmic advances ⇒ Faster Example: FlashExtract 34
  • 36. Email Parsing in Cortana 36
  • 38. Research: https://microsoft.github.io/prose Play: https://microsoft.github.io/prose/demo Contact: prose-contact@microsoft.com See our demo @ MSR table Thank you! Questions? 38

Hinweis der Redaktion

  1. Dimensions in Program Synthesis
  2. Who is the customer or provider of each piece? Depends on the application! FF: user provides examples and consumes output FE: user provides examples and consumes program MailAnts: developer provides examples and consumes program
  3. Small pieces of domain-specific insight that can be written once for an operator and reused in any DSL where it’s embedded. Most importantly, they can be written by a person without any background in … We verified it when we had implemented PROSE and built several applications on top of it.
  4. New implementations are deployed
  5. VSA volume = # of elements in the tree-based set representation # of programs is exponential in this number For FF can reach 10^30.