Ayasdi and Teradata launched a strategic partnership to enable mainstream business users and knowledge experts at large organizations to rapidly discover and act upon critical insights hidden in their massive and complex datasets.
Ayasdi & Teradata: Applying Topological Data Analysis to Complex Data
1. Teradata Partners Conference ’14
Applying Topological Data Analysis to Complex Data
Abhishek Gupta, Senior Engineer
2. Ayasdi makes the world’s complex data useful by extracting powerful insights automatically.
• Ayasdi named one of the Top 10 Most Innovative Companies in Big Data for 2013
• These big data companies are ones to watch
• The Structure Data Awards: Machine Learning / Artificial Intelligence
• Top 100 Private Companies – Big Data/Analytics
• Named by Mary Meeker as one of the most interesting companies in the data/analytics space
Company Confidential & Proprietary 2
3. The Promise of Big Data
Information → Understanding → Business Impact
4. Why Do Current Approaches Fail?
Today’s Approach to Analytics
Hypothesis
Challenges:
• Incomplete and missing insights
• Depends on humans to scale
• Slow responses due to iteration
5. A New Approach Is Required
Algorithms & Compute
OR
Benefits:
• Automated understanding
• Comprehensive
• Fast
6. Comparison
Traditional Analytics | Ayasdi Approach
Hypothesis | Algorithms & Compute
Verifies | Explains
Labor Intensive | Automated
Analysts and Data Scientists | Domain Experts
or
7. Ayasdi’s topological framework incorporates, unifies and enhances other disciplines. Because of these properties it has extraordinary reach and effectiveness.
Statistics
Machine Learning
Geometry
8. Ayasdi & Teradata Partnership
SQL Code
DDL
Data pushed
through analysis
Key Benefit:
Making your ETL process simpler.
9. Use Case: Anomaly Detection
Leader in flash memory storage and software
ABOUT THE DATA
• Data consists of die level test information for 1 wafer
• 12,000+ dies, with 100+ tests run on each die
• Network was built using all the test columns
• Test Result column with pass/fail flag used as metadata
GOAL OF THE ANALYSIS
• Identify different subgroups of dies based on similar test information
• Find tests that uniquely identify failed die subgroups
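The slides do not describe how the network is built from the test columns. As a rough illustration only, the sketch below follows the publicly described Mapper construction from topological data analysis: project each row through a lens function, cover the lens range with overlapping intervals, cluster the rows inside each interval, and connect clusters that share rows. The function names, the lens (first test value), the parameters `n_intervals`, `overlap`, and `eps`, and the toy die data are all assumptions made for the example, not Ayasdi's implementation.

```python
from itertools import combinations
from math import dist


def _single_linkage(rows, members, eps):
    """Group row indices into connected components of the
    'closer than eps' graph (a simple single-linkage clustering)."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited if dist(rows[i], rows[j]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(cluster)
    return clusters


def mapper_graph(rows, lens, n_intervals=4, overlap=0.25, eps=1.0):
    """Return (nodes, edges): nodes are sets of row indices, edges join
    nodes that have at least one row in common."""
    values = [lens(r) for r in rows]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero-width intervals
    nodes = []
    for k in range(n_intervals):
        # Each interval is widened on both sides so neighbours overlap.
        start = lo + k * width - overlap * width
        end = lo + (k + 1) * width + overlap * width
        members = [i for i, v in enumerate(values) if start <= v <= end]
        nodes.extend(_single_linkage(rows, members, eps))
    edges = [(a, b) for a, b in combinations(range(len(nodes)), 2)
             if nodes[a] & nodes[b]]
    return nodes, edges


# Toy data (hypothetical): each die has two test measurements.
dies = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (3.0, 3.0),
        (5.0, 5.0), (5.2, 5.1), (9.0, 0.5)]
nodes, edges = mapper_graph(dies, lens=lambda r: r[0],
                            n_intervals=3, overlap=0.2, eps=3.0)
```

Metadata such as a pass/fail flag plays no part in building the network here; it is only attached to the nodes afterwards, which matches the slide's description of Test Result as metadata.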
Fortune 500 and S&P 500 company, $5B+ in revenue
11. Use Case: Anomaly Detection
Network nodes colored by Rows in Node (color scale: High to Low)
12. Use Case: Anomaly Detection
Key Takeaway:
Tight concentration of dies that pass their tests in the middle of the cluster
Test Result=True
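Coloring a network by row count or by a metadata flag comes down to a per-node aggregate. The sketch below is a minimal illustration of that idea, assuming each node is a non-empty set of row indices and the pass/fail flag is a list of booleans; `summarize_nodes` and the sample data are hypothetical and not taken from the slides.

```python
def summarize_nodes(nodes, passed):
    """For each node (a non-empty collection of row indices), return its
    row count and the fraction of its rows whose pass flag is True."""
    summary = []
    for members in nodes:
        n_rows = len(members)
        n_pass = sum(1 for i in members if passed[i])
        summary.append({"rows": n_rows, "pass_fraction": n_pass / n_rows})
    return summary


# Hypothetical example: three nodes over six dies; node 1 mixes pass and fail.
example_nodes = [{0, 1, 2}, {2, 3}, {4, 5}]
example_passed = [True, True, True, False, False, False]
node_summary = summarize_nodes(example_nodes, example_passed)
```

Nodes with a pass fraction near 1 would show up as the passing region, and nodes near 0 as candidate failure regions.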
13. Use Case: Anomaly Detection
Key Takeaway:
Two distinct regions of dies failing their tests
→ Next action: investigate the “why”
Test Result=False
14. Use Case: Anomaly Detection
Select first failure group to view underlying wafer properties
Test Result=False
15. Use Case: Anomaly Detection
KS (Kolmogorov-Smirnov) scores for test 13 show correlations for specific failures
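One way to read a KS score in this setting is as a two-sample Kolmogorov-Smirnov statistic: for each test column, compare the values inside the selected group against the values outside it, and rank tests by the largest gap between the two empirical distributions. The slides do not state how the scores were computed, so the sketch below is an assumption; `ks_statistic`, `rank_tests_by_ks`, and the two-column toy table are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest absolute gap between the
    empirical CDFs of the two samples (0 = identical, 1 = fully separated)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the maximum gap is attained at an observed value
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def rank_tests_by_ks(table, group):
    """table maps test name -> list of per-die values; group is a set of die
    indices. Returns (test name, KS score) pairs, highest score first."""
    scores = []
    for name, column in table.items():
        inside = [v for i, v in enumerate(column) if i in group]
        outside = [v for i, v in enumerate(column) if i not in group]
        scores.append((name, ks_statistic(inside, outside)))
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical data: dies 3-5 form the selected failure group.
table = {
    "test_3": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8],
    "test_13": [0.1, 0.2, 0.1, 5.0, 5.2, 4.9],
}
ranking = rank_tests_by_ks(table, group={3, 4, 5})
```

In this toy table `test_13` separates the group completely and ranks first, while `test_3` barely differs between the group and the rest.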
16. Use Case: Anomaly Detection
Select second failure group to view underlying wafer properties
17. Use Case: Anomaly Detection
KS scores for tests 8, 11, and 3 show correlations for specific failures
18. Use Case: Anomaly Detection
Leader in flash memory storage and software
CHALLENGE
• Pinpoint wafer anomalies that result in scrap and lost revenue
• Previously required at least two days of analysis to identify even the most systemic anomalies
SOLUTION
• Accelerated the analysis of wafer data and yield rates to identify and resolve issues
• Identified additional systemic anomalies previously dismissed as “random”
• Estimated to save a hundred million dollars in the first year from a reduction in scrap, achieved by reducing yield loss by 10%
Fortune 500 and S&P 500 company, $5B+ in revenue