SlideShare ist ein Scribd-Unternehmen logo
1 von 26
VTD-XML: The Future of XML Processing Albert Guo junyuo@gmail.com
VTD-XML Motivations Behind VTD-XML Why VTD-XML? When to Use VTD-XML? Known Limitations Basic Concept Essential Classes Shortcomings Typical Programming Flows Demo Reference Agenda 2
*Numerous*, well-known issues of old XML processing models, below summarizes a few: Comparison with DOM, SAX and PULL	 http://vtd-xml.sourceforge.net/userGuide/5.html Motivations Behind VTD-XML 3
[object Object]
The world's most memory-efficient random-access XML parser.
The world's fastest XML parser
The world's fastest XPath 1.0 implementation.
The world's most efficient XML indexer that seamlessly integrates with your XML applications.
The world's only incremental-update capable XML parser capable of cutting, pasting, splitting and assembling XML documents with max efficiency.
The world's only XML parser that allows you to use XPath to process 256 GB XML documents.Why VTD-XML? 4
The scenarios that you may consider using VTD-XML Large XML files that DOM can’t handle Performance-critical transactional Web- Services/SOA applications Native XML database applications Network-based XML content switching/routing/security applications When to Use VTD-XML? 5
[object Object]
Not yet process DTD (return as a single VTD record)
Schema validation feature is planned for a future release.
Extreme long (>=512 chars) element/attribute names or ultra deep document (>= 255 levels) will cause parse exception
http://vtd-xml.sourceforge.net/userGuide/0.htmlKnown Limitations 6
Basic Concept ,[object Object]
The XML document is kept intact and un-decoded.7
Basic Concept – cont. ,[object Object]
VTD-XML performs many string operations directly on VTD records
String to VTD record comparison (both boolean and lexicographically)
Direct conversions from VTD records to ints, longs, floats and doubles
VTD record to String conversion also provided, but avoid them whenever possible for performance reasons8
Basic Concept – cont. ,[object Object]
Move a single, global cursor to different locations in the document tree
Many VTDNav’s methods identify a VTD record with its index value

Weitere ähnliche Inhalte

Ähnlich wie VTD-XML: The Future of XML Processing

AD215 - Practical Magic with DXL
AD215 - Practical Magic with DXLAD215 - Practical Magic with DXL
AD215 - Practical Magic with DXLStephan H. Wissel
 
Implementing the Genetic Algorithm in XSLT: PoC
Implementing the Genetic Algorithm in XSLT: PoCImplementing the Genetic Algorithm in XSLT: PoC
Implementing the Genetic Algorithm in XSLT: PoCjimfuller2009
 
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnelukdpe
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaData Con LA
 
Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Christopher Biow
 
Xml processing-by-asfak
Xml processing-by-asfakXml processing-by-asfak
Xml processing-by-asfakAsfak Mahamud
 
WatsonBruce Ikadega sample 1
WatsonBruce Ikadega sample 1WatsonBruce Ikadega sample 1
WatsonBruce Ikadega sample 1Bruce Watson
 
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...arnaudsoullie
 
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008IE 8 et les standards du Web - Chris Wilson - Paris Web 2008
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008Association Paris-Web
 
eProsima RPC over DDS - Connext Conf London October 2015
eProsima RPC over DDS - Connext Conf London October 2015 eProsima RPC over DDS - Connext Conf London October 2015
eProsima RPC over DDS - Connext Conf London October 2015 Jaime Martin Losa
 
Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™ Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™ Intel Software Brasil
 
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SP
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SPKrzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SP
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SPPROIDEA
 
MongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data StreamsMongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data StreamsMongoDB
 
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...apidays
 

Ähnlich wie VTD-XML: The Future of XML Processing (20)

VLSI
VLSIVLSI
VLSI
 
Odp
OdpOdp
Odp
 
VLSI
VLSIVLSI
VLSI
 
AD215 - Practical Magic with DXL
AD215 - Practical Magic with DXLAD215 - Practical Magic with DXL
AD215 - Practical Magic with DXL
 
Implementing the Genetic Algorithm in XSLT: PoC
Implementing the Genetic Algorithm in XSLT: PoCImplementing the Genetic Algorithm in XSLT: PoC
Implementing the Genetic Algorithm in XSLT: PoC
 
Overview Of Parallel Development - Ericnel
Overview Of Parallel Development -  EricnelOverview Of Parallel Development -  Ericnel
Overview Of Parallel Development - Ericnel
 
Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7Jack Gudenkauf sparkug_20151207_7
Jack Gudenkauf sparkug_20151207_7
 
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­ticaA noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
A noETL Parallel Streaming Transformation Loader using Spark, Kafka­ & Ver­tica
 
Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010Mark Logic StrangeLoop 2010
Mark Logic StrangeLoop 2010
 
Readme
ReadmeReadme
Readme
 
Xml processing-by-asfak
Xml processing-by-asfakXml processing-by-asfak
Xml processing-by-asfak
 
WatsonBruce Ikadega sample 1
WatsonBruce Ikadega sample 1WatsonBruce Ikadega sample 1
WatsonBruce Ikadega sample 1
 
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
Introduction to Industrial Control Systems : Pentesting PLCs 101 (BlackHat Eu...
 
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008IE 8 et les standards du Web - Chris Wilson - Paris Web 2008
IE 8 et les standards du Web - Chris Wilson - Paris Web 2008
 
eProsima RPC over DDS - Connext Conf London October 2015
eProsima RPC over DDS - Connext Conf London October 2015 eProsima RPC over DDS - Connext Conf London October 2015
eProsima RPC over DDS - Connext Conf London October 2015
 
Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™ Entenda de onde vem toda a potência do Intel® Xeon Phi™
Entenda de onde vem toda a potência do Intel® Xeon Phi™
 
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SP
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SPKrzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SP
Krzysztof Mazepa - Netflow/cflow - ulubionym narzędziem operatorów SP
 
An Optics Life
An Optics LifeAn Optics Life
An Optics Life
 
MongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data StreamsMongoDB World 2018: MongoDB for High Volume Time Series Data Streams
MongoDB World 2018: MongoDB for High Volume Time Series Data Streams
 
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...
Apidays Paris 2023 - Forget TypeScript, Choose Rust to build Robust, Fast and...
 

Mehr von Guo Albert

AWS IAM (Identity and Access Management) Policy Simulator
AWS IAM (Identity and Access Management) Policy SimulatorAWS IAM (Identity and Access Management) Policy Simulator
AWS IAM (Identity and Access Management) Policy SimulatorGuo Albert
 
TOEIC 準備心得
TOEIC 準備心得TOEIC 準備心得
TOEIC 準備心得Guo Albert
 
DBM專案環境建置
DBM專案環境建置DBM專案環境建置
DBM專案環境建置Guo Albert
 
JPA Optimistic Locking With @Version
JPA Optimistic Locking With @VersionJPA Optimistic Locking With @Version
JPA Optimistic Locking With @VersionGuo Albert
 
OCEJPA Study Notes
OCEJPA Study NotesOCEJPA Study Notes
OCEJPA Study NotesGuo Albert
 
OCEJPA(1Z0-898) Preparation Tips
OCEJPA(1Z0-898) Preparation TipsOCEJPA(1Z0-898) Preparation Tips
OCEJPA(1Z0-898) Preparation TipsGuo Albert
 
JPA lifecycle events practice
JPA lifecycle events practiceJPA lifecycle events practice
JPA lifecycle events practiceGuo Albert
 
XDate - a modern java-script date library
XDate -  a modern java-script date libraryXDate -  a modern java-script date library
XDate - a modern java-script date libraryGuo Albert
 
How to avoid check style errors
How to avoid check style errorsHow to avoid check style errors
How to avoid check style errorsGuo Albert
 
NIG系統報表開發指南
NIG系統報表開發指南NIG系統報表開發指南
NIG系統報表開發指南Guo Albert
 
Ease Your Effort of Putting Data into History Table
Ease Your Effort of Putting Data into History TableEase Your Effort of Putting Data into History Table
Ease Your Effort of Putting Data into History TableGuo Albert
 
NIG 系統開發指引
NIG 系統開發指引NIG 系統開發指引
NIG 系統開發指引Guo Albert
 
NIG系統開發文件閱讀步驟
NIG系統開發文件閱讀步驟NIG系統開發文件閱讀步驟
NIG系統開發文件閱讀步驟Guo Albert
 
Form Bean Creation Process for NIG System
Form Bean Creation Process for NIG SystemForm Bean Creation Process for NIG System
Form Bean Creation Process for NIG SystemGuo Albert
 
A Short Intorduction to JasperReports
A Short Intorduction to JasperReportsA Short Intorduction to JasperReports
A Short Intorduction to JasperReportsGuo Albert
 
Apply Template Method Pattern in Report Implementation
Apply Template Method Pattern in Report ImplementationApply Template Method Pattern in Report Implementation
Apply Template Method Pattern in Report ImplementationGuo Albert
 
Utilize Commons BeansUtils to do copy object
Utilize Commons BeansUtils to do copy objectUtilize Commons BeansUtils to do copy object
Utilize Commons BeansUtils to do copy objectGuo Albert
 
Apply my eclipse to do entity class generation
Apply my eclipse to do entity class generationApply my eclipse to do entity class generation
Apply my eclipse to do entity class generationGuo Albert
 
Nig project setup quickly tutorial
Nig project setup quickly tutorialNig project setup quickly tutorial
Nig project setup quickly tutorialGuo Albert
 
Spring JDBCTemplate
Spring JDBCTemplateSpring JDBCTemplate
Spring JDBCTemplateGuo Albert
 

Mehr von Guo Albert (20)

AWS IAM (Identity and Access Management) Policy Simulator
AWS IAM (Identity and Access Management) Policy SimulatorAWS IAM (Identity and Access Management) Policy Simulator
AWS IAM (Identity and Access Management) Policy Simulator
 
TOEIC 準備心得
TOEIC 準備心得TOEIC 準備心得
TOEIC 準備心得
 
DBM專案環境建置
DBM專案環境建置DBM專案環境建置
DBM專案環境建置
 
JPA Optimistic Locking With @Version
JPA Optimistic Locking With @VersionJPA Optimistic Locking With @Version
JPA Optimistic Locking With @Version
 
OCEJPA Study Notes
OCEJPA Study NotesOCEJPA Study Notes
OCEJPA Study Notes
 
OCEJPA(1Z0-898) Preparation Tips
OCEJPA(1Z0-898) Preparation TipsOCEJPA(1Z0-898) Preparation Tips
OCEJPA(1Z0-898) Preparation Tips
 
JPA lifecycle events practice
JPA lifecycle events practiceJPA lifecycle events practice
JPA lifecycle events practice
 
XDate - a modern java-script date library
XDate -  a modern java-script date libraryXDate -  a modern java-script date library
XDate - a modern java-script date library
 
How to avoid check style errors
How to avoid check style errorsHow to avoid check style errors
How to avoid check style errors
 
NIG系統報表開發指南
NIG系統報表開發指南NIG系統報表開發指南
NIG系統報表開發指南
 
Ease Your Effort of Putting Data into History Table
Ease Your Effort of Putting Data into History TableEase Your Effort of Putting Data into History Table
Ease Your Effort of Putting Data into History Table
 
NIG 系統開發指引
NIG 系統開發指引NIG 系統開發指引
NIG 系統開發指引
 
NIG系統開發文件閱讀步驟
NIG系統開發文件閱讀步驟NIG系統開發文件閱讀步驟
NIG系統開發文件閱讀步驟
 
Form Bean Creation Process for NIG System
Form Bean Creation Process for NIG SystemForm Bean Creation Process for NIG System
Form Bean Creation Process for NIG System
 
A Short Intorduction to JasperReports
A Short Intorduction to JasperReportsA Short Intorduction to JasperReports
A Short Intorduction to JasperReports
 
Apply Template Method Pattern in Report Implementation
Apply Template Method Pattern in Report ImplementationApply Template Method Pattern in Report Implementation
Apply Template Method Pattern in Report Implementation
 
Utilize Commons BeansUtils to do copy object
Utilize Commons BeansUtils to do copy objectUtilize Commons BeansUtils to do copy object
Utilize Commons BeansUtils to do copy object
 
Apply my eclipse to do entity class generation
Apply my eclipse to do entity class generationApply my eclipse to do entity class generation
Apply my eclipse to do entity class generation
 
Nig project setup quickly tutorial
Nig project setup quickly tutorialNig project setup quickly tutorial
Nig project setup quickly tutorial
 
Spring JDBCTemplate
Spring JDBCTemplateSpring JDBCTemplate
Spring JDBCTemplate
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

VTD-XML: The Future of XML Processing

  • 1. VTD-XML: The Future of XML Processing Albert Guo junyuo@gmail.com
  • 2. VTD-XML Motivations Behind VTD-XML Why VTD-XML? When to Use VTD-XML? Known Limitations Basic Concept Essential Classes Shortcomings Typical Programming Flows Demo Reference Agenda 2
  • 3. *Numerous*, well-known issues of old XML processing models, below summarizes a few: Comparison with DOM, SAX and PULL http://vtd-xml.sourceforge.net/userGuide/5.html Motivations Behind VTD-XML 3
  • 4.
  • 5. The world's most memory-efficient random-access XML parser.
  • 7. The world's fastest XPath 1.0 implementation.
  • 8. The world's most efficient XML indexer that seamlessly integrates with your XML applications.
  • 9. The world's only incremental-update capable XML parser capable of cutting, pasting, splitting and assembling XML documents with max efficiency.
  • 10. The world's only XML parser that allows you to use XPath to process 256 GB XML documents.Why VTD-XML? 4
  • 11. The scenarios that you may consider using VTD-XML Large XML files that DOM can’t handle Performance-critical transactional Web- Services/SOA applications Native XML database applications Network-based XML content switching/routing/security applications When to Use VTD-XML? 5
  • 12.
  • 13. Not yet process DTD (return as a single VTD record)
  • 14. Schema validation feature is planned for a future release.
  • 15. Extreme long (>=512 chars) element/attribute names or ultra deep document (>= 255 levels) will cause parse exception
  • 17.
  • 18. The XML document is kept intact and un-decoded.7
  • 19.
  • 20. VTD-XML performs many string operations directly on VTD records
  • 21. String to VTD record comparison (both boolean and lexicographically)
  • 22. Direct conversions from VTD records to ints, longs, floats and doubles
  • 23. VTD record to String conversion also provided, but avoid them whenever possible for performance reasons8
  • 24.
  • 25. Move a single, global cursor to different locations in the document tree
  • 26. Many VTDNav’s methods identify a VTD record with its index value
  • 27. -1 corresponds to “no such record”9
  • 30. Poor exception handling Shortcomings 12 If this method does not execute properly, it will just return false from parseFile method, and does not report any exception message.
  • 31. Add BufferedInput Stream in parseFile method to avoid running out of read buffer max size in UNIX platform Shortcomings – cont. 13 You need to modify the build.bat to rebuild VTD-XML jar file, then set it into class path. //add commons-io jar file into the first line javac-classpath .;D:ibommons-io-1.4ommons-io-1.4.jar comimpleware.java javac comimplewarepath.java javac comimplewarearser.java … Finally, you just need to execute build.bat file. Then it will generate the brand-new jar file for you.
  • 32. Typical Programming Flows Call VTDGen’s parseFile(…) Start with a byte buffer containing the content of XML, call set_doc() of VTDGen Call VTDGen’s loadIndex(…) Call VTDGen’s parse() Obtain an instance VTDNav from VTDGen Move VTDNav’s cursor manually to various locations and perform corresponding application logic Instantiate autoPilot for node iteration and XPath to perform Corresponding application logic 14
  • 34. 1. Add <age> tag after <geneder> 16
  • 35. 1. Add <age> tag after <geneder> – cont. 17 Compiled XPath expression Binded with NTDNav Assigned age value Moved to gender cursor, and added <age> tag after <gender> tag Outputted to new xml file
  • 37. 2. Remove <age> tag – cont. 19 Compiled XPath expression Binded with NTDNav Remove <age> Outputted to new xml file
  • 38. 3. Add Contact info after <age> tag 20
  • 39. 3. Add Contact info after <age> tag – cont. 21 Compiled XPath expression Binded with NTDNav Assigned age value Inserted new value after <gender> tag Outputted to new xml file
  • 40. 4. Visit XML file 22 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Albert, gender=男 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-11111111 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0911111111 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0915555555 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Mandy, gender=女 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=02-22222222 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0912222222 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:339) - name=Verio, gender=男 2009-12-12 14:30:31,187 DEBUG xml.XmlTest.visitPersonData(XmlTest.java:341) - phone=0913333333
  • 41. 4. Visit XML file – cont. 23 Compiled XPath expression Binded with NTDNav
  • 42. 4. Visit XML file – cont. 24
  • 43. Official site http://vtd-xml.sourceforge.net/ Jar files, source code, sample code http://sourceforge.net/projects/vtd-xml/files/ JavaDoc http://vtd-xml.sourceforge.net/javadoc/ Accelerate WSS applications with VTD-XML http://www.javaworld.com/javaworld/jw-01-2007/jw-01-vtd.html Reference 25
  • 44. Process SOAP with VTD-XML http://jimmyzhang.sys-con.com/node/48764/mobile XPathTutorial: http://www.w3schools.com/XPath/default.asp Sample code http://sites.google.com/site/junyuo/Home/code 26 Reference – cont.