The Ultimate Performance Challenge: “How to Make XML Perform?!” Marco Gralike
Agenda
“XML is not a ‘fast’ thing, there is a ton of parsing involved. Sorry, I never saw the point in huge XML files – they are many times larger than they should be and the amount of work involved in parsing them is incredible.” Tom Kyte, January 9, 2009, AskTom
“The foundation is there; So why not use it?” (referring to the Relational Model) Chris Date, Hotsos keynote, 2009
Relational…
XML…?
Evolution…
So why the “Hxxx” XML…?
  • If you’re a performance nerd, this is actually cool…
  • No one has figured out XML yet…
  • Solving the customer problem…
  • Back to basics…
  • Deeper understanding of the data handling issues…
What’s spoiling the soup…?
  • Free format… “XML is cool”… (aka no design effort)
  • Having to uphold the “Coding Granny Argument” (meaningful names, among others)
  • Everyone for themselves…
  • Waiting for “Codd, Date”…
  • Square wheels…
Impedance Mismatch
  • Different data models: XPath models an XML document as a tree, while most general-purpose programming languages have no native data type for a tree.
  • Different programming paradigms: XSLT is a functional language, while Java is object-oriented and Perl is procedural.
Impedance Mismatch: Effects, Costs
  • Unnecessary CPU and memory overhead
  • A lot of expensive type and encoding conversions
Containerization
The “Dimensions” in 1 XML doc. [Figure: numbered elements 1 to 6 along the axes X, Y and Z; n × rows for elements with maxOccurs="unbounded"]
Multi Dimensional Issues…
  • It’s a database…
  • It’s row based
  • It’s column based
  • It’s multiple databases…
  • More than 1 XML doc
  • Not uncommon: 1 MB and larger
XMLType – Not just a “Container”
  • Complexities of a database: “Relations”, “Redundancy”, “Nullology”, design, etc…
  • It can contain a database
  • 10 MB or bigger nowadays, more often than not…
  • Enormous, complex XSDs
XMLType – Not just a “Container”
  • Checked for XML well-formedness
      • One root element
      • Begin and end tags
  • If there is an XML Schema reference
      • XOB methods will be used if an XML Schema is available
      • DOM methods will be used if registered XML Schema information is not available
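The well-formedness checks named on this slide (one root element, matching begin and end tags) can be illustrated with a minimal Python sketch. It uses the standard library's xml.etree.ElementTree as a stand-in parser; the helper name is_well_formed is an assumption made for illustration, and the sketch says nothing about how XMLType performs the check internally.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as one well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Raised for a second root element, mismatched tags, and similar errors.
        return False
    return True


ok = is_well_formed("<employee><name>Marco</name></employee>")
two_roots = is_well_formed("<employee/><employee/>")
bad_nesting = is_well_formed("<employee><name>Marco</employee></name>")
```

Only the first document passes; the other two break the single-root rule and the begin/end tag rule respectively.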
What you want in access…
  • Fast: DDL, Selects, Inserts, Deletes, Updates
  • Specific / Smart: small XML fragments, direct access
Document Driven versus Data Driven
Structured / Semi-Structured
Common XML Parsers
  • Often DOM or Infoset based
  • CPU intensive
  • Memory intensive
  • Serializing, parsing and tree traversals happen in memory…
In Memory: Common XML Parsers
  • Often handle XML tree traversals via only ONE method
  • Not aware of whether the XML content is structured, semi-structured or unstructured
  • Not very “smart” / “content aware” regarding XML handling based on the XML tree and/or the XML data content
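The memory cost of DOM-style parsing can be contrasted with streaming in a short Python sketch. This is an illustration under assumptions: the employees document is made up, and xml.etree.ElementTree stands in for "a common XML parser"; it is not a measurement of any database parser.

```python
import io
import xml.etree.ElementTree as ET


def count_employees_streaming(source) -> int:
    """Count <employee> elements without keeping their content in memory."""
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "employee":
            count += 1
            # clear() drops the element's children, text and attributes;
            # the emptied element itself stays attached to its parent.
            elem.clear()
    return count


# Hypothetical test document with 1000 repeating elements.
doc = b"<employees>" + b"<employee><name>x</name></employee>" * 1000 + b"</employees>"

streamed = count_employees_streaming(io.BytesIO(doc))

# DOM-style alternative: the whole tree is built in memory before any access.
dom_count = len(ET.fromstring(doc).findall("employee"))
```

Both approaches return the same count; the difference is that the streaming variant discards each fragment's content as soon as it has been handled.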
XMLType Physical Storage
  • CLOB: LOB, LOB index
  • Object Relational: Varray, Types, Nested Tables, IOT, B-Tree, XML Schema
  • Binary XML: LOB, LOB index, stored in Post Parse Representation
Choosing a Storage Model
XML Data Storage (Relational World / XMLDB World)
  • XMLType column/tables
      • Hybrid: Mixed, complex [n], un/structured, XSD [y], B-Tree, IOT
      • CLOB: Document, complex n/a, unstructured, XSD [n], XMLIndex
      • Obj.Rel.: Content, complex [n], structured, XSD [y], B-Tree, IOT
      • Binary XML: Mixed, complex [y], un/structured, XSD [y/n], XMLIndex
  • XMLType Views
      • (Object) Relational Objects
      • Relational Tables
Partition XML data [Figure: tables EMPLOYEES_PROJ_TAB and PROJ_DETAILS_TAB, partitions EMP_PROJ_P11 and EMP_PROJ_P12, linked via "employees"."employee" and reference_id]
XML Partitioning
  • Object Relational partitioning
      • Equi-partitioning since Oracle 11.1.0.7.0
  • Binary XML partitioning
      • Range, List, Hash
  • Local partitioned XMLIndex
      • LOCAL keyword in the XMLIndex create syntax
      • XMLIndex is not supported for HASH partitioning
  • Partition key on a virtual column (Binary XML)
  • Partition key on a column (Object Relational)
Index Quick Sheet
Unstructured XMLIndex (UXI)
  • Path Table
  • Use path subsetting: a full-blown XMLIndex can be BIG
  • Token tables (XDB.X$......)
  • Query rewrite on tokens
  • Fuzzy searches, //
  • Optimizer statistics
  • Can be maintained manually
  • Recorded in the pending table
  • Secondary indexes possible
Unstructured XMLIndex: Path Table
  • Indexed columns
      • PATH INDEX
      • ORDER INDEX
      • VALUE INDEX
  • FUNCTION BASED
  • Not indexed: LOCATOR column, pointer to XML fragments (XDB.X$...)
  • SECONDARY INDEXES
Structured XMLIndex (SXI)
  • Content Table(s)
  • Based on XMLTABLE syntax
  • The XMLTable construct can be nested, but:
      • Only 1 extra XMLType allowed
      • VIRTUAL column is passed
  • Can be maintained manually
  • Secondary indexes possible
Structured XMLIndex: Content Table(s)
  • Indexed columns
      • KEY INDEX
      • RID INDEX
  • Indexes needed for combined XMLIndex types (mixing unstructured and structured XMLIndexes)
  • Your defined columns
  • Secondary indexes
Driving access on CONTENT [Figure: a bookstore tree (book, whitepaper, title, author, chapter, id, paragraph, content) annotated with index types: B-Tree index, secondary Oracle Text index, function-based index (XPath), structured XMLIndex, unstructured XMLIndex]
There can be only one XMLIndex…
Design
XML Schema Advantages
  • The XML Schema will be parsed only once
  • If registered in the XDB Repository:
      • The XML Schema will be cached in memory (SGA)
      • No additional parsing
      • No additional validation
XML Schema Advantages
  • The XML document structure is known, therefore:
      • No parsing is needed when loaded from disk into memory
      • XML OBject (XOB) structures can be applied
      • The memory footprint is much smaller than that of a DOM structure
      • The specific nodes needed can now be handled efficiently in memory
XDB Annotations. Hybrid: CLOB within OR
XDB Annotations (OR / Binary XML)
  • Levels: root, simpleType, complexType
  • xmlns:xdb="http://xmlns.oracle.com/xdb"
  • xdb:storeVarrayAsTable
  • xdb:defaultTable
  • xdb:maintainDOM
  • xdb:maintainOrder
  • xdb:SQLInline
  • Oracle 11.1.0.7.0, partitioning: xdb:tableProps
Mixing Logical and Physical Design
XML Schema - Query Rewrite [Figure: the bookstore tree (book, whitepaper, title, author, chapter, id, paragraph, content) with XML Schema types (String, Float) mapped to SQL types (CHAR, VARCHAR2 (20), NUMBER (15), CLOB)]
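The idea behind this slide, that typed XML content can be reached through ordinary relational access, can be approximated in a small Python sketch. It is only an analogy: the sample document, the table names book and book_author, and the use of SQLite are all assumptions made for illustration, and the code shreds the XML by hand rather than showing Oracle's query rewrite.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document; the element names follow the bookstore figure.
doc = """<bookstore>
  <book id="1"><title>XML DB</title><author>A</author><author>B</author></book>
  <book id="2"><title>SQL</title><author>C</author></book>
</bookstore>"""

root = ET.fromstring(doc)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE book_author (book_id INTEGER REFERENCES book(id), author TEXT)"
)

# Shred each repeating <book> into one parent row and n child rows.
for book in root.findall("book"):
    book_id = int(book.get("id"))
    conn.execute("INSERT INTO book VALUES (?, ?)", (book_id, book.findtext("title")))
    conn.executemany(
        "INSERT INTO book_author VALUES (?, ?)",
        [(book_id, author.text) for author in book.findall("author")],
    )

# Once shredded, the question "who wrote XML DB?" is a plain join.
authors_of_xml_db = [
    row[0]
    for row in conn.execute(
        "SELECT a.author FROM book b JOIN book_author a ON a.book_id = b.id "
        "WHERE b.title = ? ORDER BY a.author",
        ("XML DB",),
    )
]
```

The repeating author element becomes a child table, which is the relational counterpart of an element that may occur many times.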
XML Design
  • Avoid cyclic references in XML Schemata
  • For ease of maintenance: xdb: annotations
  • Is DOM validation / fidelity needed?
  • CPU / XML parsing: XML Schema validation “overhead”?
  • Index maintenance overhead when using “disk” solutions
Be aware of what you are doing!
  • Avoid unneeded (full) XML Schema validation during storage (inserts) and when generating XML
  • xdb:maintainDOM="false"
  • Avoid impedance mismatch: Java → XML → Java → XML → Relational → XML → Java (“All In One Go Objective”)
  • Avoid XML fragments, // and/or via XMLEXISTS
  • Use indexes
Keep XML small
  • Do not use / enforce Pretty Print if not needed
  • Avoid namespace reference “overkill”
      • The most used namespace is leading
      • Use short namespace references (aliases)
  • Make XML data as “sparse” as possible: <employee><name>Marco</name></employee> versus <employee name="Marco"/>
  • XML data partitioning
  • Binary XML if needed
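The size effect of these choices is easy to check on the slide's own employee example. The Python sketch below is an illustration only: it compares byte counts of the serialized text, and uses the standard library's xml.dom.minidom to produce the pretty-printed form; storage sizes inside a database will differ.

```python
import xml.dom.minidom

# The two forms from the slide.
element_form = "<employee><name>Marco</name></employee>"
attribute_form = '<employee name="Marco"/>'

# Pretty-printed variant of the element form (adds newlines and indentation).
pretty_form = xml.dom.minidom.parseString(element_form).documentElement.toprettyxml(
    indent="  "
)

sizes = {
    "attribute": len(attribute_form.encode("utf-8")),
    "element": len(element_form.encode("utf-8")),
    "pretty": len(pretty_form.encode("utf-8")),
}

for label, size in sizes.items():
    print(f"{label:>9}: {size} bytes")
```

The attribute form is the smallest and the pretty-printed form the largest; multiplied over millions of repeating elements, that difference is what the slide is warning about.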
Keep XML small (OR specific)
  • Don’t use “meaningful element names”
      • 64 KB DDL “create table” buffer
      • ORA-01792: maximum number of columns in a table or view is 1000
  • Break XML up
      • Out of line
      • CLOB (unstructured)
      • Not accessed data
  • Don’t create objects if you don’t need them
  • Use xdb:defaultTable="" for global types
Holistic Approach (Recap)
Customer Use Case [Figure: flow diagram with Oracle Advanced Queue; CLOB, BLOB and XMLType; process checks; validation against XML Schema (Java); elements shredded via XMLDOM; stored in ETL tables; processing in memory / DOM]
Duration (1000 Cases)
New XML Approach [Figure: flow diagram with Oracle Advanced Queue; CLOB and BLOB; XMLType table (O.R.); validation against XML Schema; checks; Oracle Workflow; stored in ETL tables; rewrite on disk / XOB (relational)]
Using the CBO as an XML Parser…
  • ORA-31186: Document contains too many nodes
  • Cause: Unable to load the document because it has exceeded the maximum allocated number of DOM nodes.
Using the (XML) Relational Mindset
  • Design the XSD as you would with E(E)R
  • Design for proper physical access and performance: storage, index, content awareness, partitioning
  • Overkill of “meaningful” data parsing
  • Avoid redundancy, whitespace, “Pretty Print”
  • Design with the future in mind
So in short: Balanced Design
  • Inserts, Updates & Deletes: XML future changes, index maintenance
  • Selects: in memory, via indexes
  • XML Validation: strict, lazy, client-side possibilities
Reward
  • Optimal performance
  • Outperforming XML
  • Proper design will give you a 10- to 100-fold performance increase over XML handling…
  • …also known as…ehh… standard relational database performance…

Hotsos 2010 - The Ultimate Performance Challenge: How To Make Xml Perform?

References
  • Oracle XML DB: http://www.oracle.com/pls/db112/homepage
  • XML DB FAQ thread: http://forums.oracle.com/forums/thread.jspa?threadID=410714
  • Blog: http://technology.amis.nl/blog and http://blog.gralike.com

Editor's Notes

  1. Square wheel → JSON?
  2. Emp/Dept tables, Foreign/Primary Keys… Showing here ONLY 1 XML document…