Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Real Time BOM Explosions with Apache Solr and Spark

668 Aufrufe

Veröffentlicht am

Apache Big Data Conference 2016, Vancouver BC: Talk by Andreas Zitzelsberger (@andreasz82, Principal Software Architect at QAware)

Abstract: Bills of materials (BOMs) are at the heart of every manufacturing process. Especially large BOMs can be found in the automotive industry, where a complex and highly variable product meets high production volumes.
Drawing from the experiences made in an ongoing real world project for a major car manufacturer, Andreas provides an in-depth view how Apache Solr and Apache Spark were used to power an innovative architecture that provides lightning-fast BOM explosions, demand forecasts and scenario-based planning on 20 billion records per scenario.

Veröffentlicht in: Technologie
  • Did u try to use external powers for studying? Like ⇒ www.WritePaper.info ⇐ ? They helped me a lot once.
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier
  • Very Easy To Follow. This plans are great. I was able to bring my laptop batteries and several other types of batteries back to life with your methods. Your instructions are very easy to follow. I have a few more batteries I'm going to recondition today also. ➤➤ http://ishbv.com/ezbattery/pdf
       Antworten 
    Sind Sie sicher, dass Sie …  Ja  Nein
    Ihre Nachricht erscheint hier

Real Time BOM Explosions with Apache Solr and Spark

  1. 1. REAL TIME BOM EXPLOSIONS WITH APACHE SOLR AND SPARK Andreas Zitzelsberger
  2. 2. BILLS OF MATERIAL (BOMS) EXPLAINED
  3. 3. BOMS ARE NEEDED FOR… Production Planning Forecasting Demand Scenario-Based PlanningRunning Simulations
  4. 4. Demand Resolver THE BIG PICTURE Parts and 
 abstract demands Orders / Independent Demands Production Planning Analytics BOMs / Dependent demands
  5. 5. ORDERS / INDEPENDENT DEMANDS Order Data Car Configuration Id, Order Time, … TYPE VARIANT UPHOLSTERY PAINT OPTION_1, OPTION_2, … … … … … Customer‘s orders Predicted Orders Stock cars Image: Energy Solutions International Inc
  6. 6. PARTS AND ABSTRACT DEMANDS Id Description Attributes Terms TR_1 Automatic transmission, supplier A … TYPE_1 OR VAR_2 AND NOT OPT_2 … TR_2 Automatic transmission, supplier B … TYPE_2 AND VAR_1 OR OPT_5 OR OPT_6 DT_1 150hp, automatic transmission drive train … LI_1 Xenon lights … OPT_3 OR VAR_9 OR … BT_1 Small Battery … 8 <= OPT_3*200 + VAR_9*5 + OPT_4*2 … <= 12 The constructive approach is impossible due to combinatory explosion. 
 The rules need to be evaluated during demand resolution.
  7. 7. Demand Resolver WE NEED TO OVERCOME THREE CORE PROBLEMS… Parts and 
 abstract demands Orders / Independent Demands Production Planning Analytics BOMs / Dependent demands
  8. 8. Demand Resolver 1. HOW TO RESOLVE DEMANDS AS FAST AS POSSIBLE? Parts and 
 abstract demands Orders / Independent Demands BOMs / Dependent demands
  9. 9. 2. HOW TO JOIN LARGE DATA SETS? Parts and 
 abstract demands Analytics BOMs / Dependent demands Orders / Independent Demands
  10. 10. 3. ANALYTICS ON LARGE DATA SETS? Parts and 
 abstract demands Analytics BOMs / Dependent demands Orders / Independent Demands
  11. 11. Demand Resolver 1. HOW TO RESOLVE DEMANDS AS FAST AS POSSIBLE? Parts and 
 abstract demands Orders / Independent Demands On to problem #1… BOMs / Dependent demands
  12. 12. THE SIMPLE SOLUTION…
  13. 13. A BETTER SOLUTION…
  14. 14. 2 million parts 15 million abstract demands OR: ONLY CALCULATE WHAT IS STRICTLY NECESSARY Step 2: REFINE Refine the superset by evaluating the usage terms in O(n). Actual demands Post- Filter Pre- Filter Abstract Demands Step 1: FILTER Pre-filter demands with a fast index in O(log n). Yields a superset of the required parts. ~ 6000 possible demands ~ 3000 actual demands
  15. 15. PRE-FILTER USING CANDIDATE AND A PROSCRIPTION SETS ■ Consider ((OPT_1) OR (OPT_2)) AND NOT(OPT_3 OR OPT_4) ■ Candidate set: C(t) = {OPT_1, OPT_2} ■ Proscription set: P(t) = {OPT_3, OPT_4} ■ Preselect by the criterium: ■ At least one condition from C(t) is satisfied ■ None of the conditions from P(t) are satisfied
  16. 16. EXAMPLE CANDIDATE AND PROSCRIPTION SETS Candidate set Proscription set A part can only be used together with “white” or “blue” paint. Entry: PART_1, C-PAINT:[white, blue] Query: C-PAINT=blue A part cannot be used with “yellow” paint. Entry: PART_2, P-PAINT:[yellow] Query: NOT P-PAINT=yellow
  17. 17. Solr Server SOLR IMPLEMENTATION PART_ID: “…” // Part attributes // …. TERMS: […] TYPE_C: […] TYPE_P : […] VARIANT_C : […] VARIANT_P : […] OPTIONS_C : […] OPTIONS_P : […] Part If evaluate(TERMS) accept Else drop Post filter Bulk Solr query for N orders Bulk export of the search results Dependent Demands (Solr) Demand Resolver
  18. 18. WHY APACHE SOLR ■ Really fast ■ Powerful faceting ■ Solr cloud ■ Extensible ■ Most importantly, it just works
  19. 19. BOMs / Dependent demands Demand Resolver 1. HOW TO RESOLVE DEMANDS AS FAST AS POSSIBLE? Parts and 
 abstract demands Orders / Independent Demands
  20. 20. 2. HOW TO JOIN TWO LARGE DATA SETS? Parts and 
 abstract demands Analytics BOMs / Dependend demands Orders / Independent Demands let‘s continue with problem #2…
  21. 21. FORWARD AND BACKWARD JOINS ON PARTS Orders / Independent Demands Part Usages /
 Dependend Demands Parts / Abstract Demands Given: Order criteria (TYPE= XY12) 
 and parts criteria (PART_ID = 4711 OR 4712) Task: Find all orders that satisfy the order criteria
 AND contain at least one part, that satisifies the part criteria Return all orders and the parts for each order that satisfy the part criteria. Id … Id …Order Id Part Id 7.000.000 2.000.000 21.000.000.000
  22. 22. ■ Nobody wants to join 7 million x 21 billion x 2 million records. THIS IS ACTUALLY HARD ■ Nobody wants to store 21 billion records.
  23. 23. PARTIAL DENORMALIZATION TO THE RESCUE Orders <10 Mio. records instead of 21 billion Ids Attributes Part Ids 1 … 1, 3, 786,… 2 … 2, 6, 786,… 3 .. 3, 75, 95, .. Ids Attributes Order Ids 1 … 54, 23321, .. 2 … 1, 4, 8652, .. 3 … 864, 23452, … Parts
  24. 24. SIMPLE, BUT NOT EASY: CUSTOM JOIN LOGIC IN SOLR Step 1 Pre-filter the parts using an index search Step 2 Pre-filter the orders using an index search Step 3 Compute the intersection of both supersets All parts Searched parts with attribute A= XY All orders Orders to be built in Q4 with attribute A Intersect records based on technical idsOrders to be built in Q4 Final Result • Custom Search Components • Solr Search Components
  25. 25. WE ONLY EXECUTE THE JOIN IF WE MUST No filter Filter by order attributes Filter by part attributes Filter by order and part attributes Facets on parts Index JOIN Index JOIN Facects on orders Index Index JOIN JOIN List parts JOIN JOIN JOIN JOIN List orders Index Index JOIN JOIN Result sets may be unwieldingly large. Early termination for interactive queries if the result grows too large.
  26. 26. Orders / Independent Demands BOMs / Dependend demands 2. HOW TO JOIN TWO LARGE DATA SETS? Parts and 
 abstract demands Analytics
  27. 27. 3. ANALYTICS ON VERY LARGE DATA SETS? Parts and 
 abstract demands Analytics BOMs / Dependend demands Orders / Independent Demands What about problem #3?
  28. 28. WHY APACHE SPARK ■ Fast ■ High-Level functional API ■ Well integrated ■ Batch semantics ■ Works well with Solr (If you use your own connector…)
  29. 29. TECHNICAL ARCHITECTURE Apache Solr Apache Spark Demand Resolver API Facade Loader Source Systems
  30. 30. Hardware Node 1 Solr Instance Shard 20 Hardware Node 2 Solr Instance Shard 30 Spark Worker Spark Worker Shard 21Shard 11 Spark Master ZooKeeper . . . . . . Spark Master ZooKeeper (Leader) API FacadeAPI Facade Load-Balancer Jenkins + Loader (Leader)Jenkins + Loader (Backup) Hardware Node n Solr Instance Shard n Shard n . . . Spark Worker Add nodes as required OPERATIONS ARCHITECTURE API Facade
  31. 31. BULK EXPORT WITH A SHARED OFF-HEAP CACHE Lucene Segment Lucene Segment Lucene Segment Shared Off-Heap Cache 1 DocId Order DocId Part Web interface 2 3 Business Facade Object data 4 http://chronicle.software/products/chronicle-map/
  32. 32. Orders / Independent Demands BOMs / Dependend demands 3. ANALYTICS ON VERY LARGE DATA SETS? Parts and 
 abstract demands Analytics
  33. 33. LESSONS LEARNED https://en.wikipedia.org/wiki/Melk_Abbey
  34. 34. OPTIMIZE LATE, BUT DO OPTIMIZE © 2015 Max Siedentopf
  35. 35. SOME OF OUR OPTIMIZATIONS Bulk RequestHandler Binary DocValue support Boolean interpreter as postfilter Mass data binary response format Search components with 
 custom JOIN algorithm Solving thousands of orders with one request Be able to store data effective using our own JOIN implementation. Speed up the access to persisted data dramatically using binary doc values. 0111 0111 Use the standard Solr cinary codec with an optimized data- model that reduce the amount 
 of data by a factor of 8. Computing BOM explosions Enable Solr with custom post filters to filter documents using stored bool expessions .
  36. 36. LOW LEVEL OPTIMIZATIONS CAN YIELD GREAT RETURNS October 14 January 15 May 15 October 15 4,9 ms 0,28 ms 24 ms TimetocalculatetheBoMforoneorder 0,08 ms Scoring (-8%) Solr–Query Parser (-25%) Stat-Cache (-8%) Using String DocValues (-28%) Development of the processing time Demand Calulation Service PoC Profiling result and the some improvements to reduce the query time.
  37. 37. DON’T TRUST PAPER, DO POCS
  38. 38. IN CONCLUSION
  39. 39. THANK YOU @andreasz82 andreas.zitzelsberger@qaware.de

×