This webinar will focus on the promise AI holds for organizations in every industry and of every size, and on how to overcome today's challenges of preparing the organization for AI and planning AI applications.
The foundation for AI is data. You must have enough data to analyze and build models. Your data determines the depth of AI you can achieve (for example, statistical modeling, machine learning, or deep learning) and its accuracy. The increased availability of data is the single biggest contributor to the uptake of AI in the areas where it is thriving. Indeed, data’s highest use in the organization soon will be training algorithms. AI is providing a powerful foundation for impending competitive advantage and business disruption.
ADV Slides: Data Curation for Artificial Intelligence Strategies
1. Data Curation for Artificial
Intelligence Strategies
Presented by: William McKnight
President, McKnight Consulting Group
@williammcknight
www.mcknightcg.com
(214) 514-1444
2. AI in Action
• Enhance in-car navigation using computer vision
• Reduce cost of handling misplaced items
• Improve call center experiences with chatbots
• Improve financial fraud detection and reduce costly false positives
• Automate paper-based, human-intensive processes and reduce document verification
• Predict flight delays based on maintenance records and past flights, in order to reduce costs associated with delays
3. What’s New is Deep Learning
• AI: 1950s
• Machine Learning: 2000s
– supervised learning, unsupervised learning,
reinforcement learning
• Deep Learning: 2010s
– Higher Predictive Accuracy
– Can Analyze All Data Sets
Deep Learning allows more complex problems to be
tackled, and others to be solved with higher accuracy,
with less cumbersome manual fine-tuning
4. AI Affects the Entire Organization
• Strategic
• Technical
• Operational
• Talent
• Data
5. Where to Look for AI Opportunities
• The products you make and the services you offer
• The supply chain for those products and services
• Business operations (hiring, procurement, after-
sale service, etc.)
• The intelligence used in determining and designing
your product and service set
• The intelligence used in the marketing/approval
funnel for your products and services
6. AI is on the Data Maturity Spectrum
Maturity Level 4 (of 5):
• Data Strategy
– Data as asset in financial statements / executives; All development is within architecture; All in on AI
• Architecture
– EDW with DQ above standard; 3 & 5 year architecture plans
• Technology
– DI=streaming; Graph db for relationship data; Specialized analytic stores for workloads with requirements not suited for the EDW; EDW columnar; No ODS; minimal cubes; MDM – all functions for all major subject areas; Looking at GPU DBMS
• Organization
– Data Governance by subject area across all major subject areas; Organizational Change Management program is part of all projects; True Self-Service Business Intelligence; Chief Information Architect
7. AI Data
• Governance and Quality
• Curated, Most/All Data
• At Scale, History
• High Velocity
• Integrated
• Training Data Curation
8. Data to Collect
• This is wide ranging, spanning all current data
• eCommerce
• ERP / CRM
• IoT (e.g., Heavy Industry, Factory, Consumer, Health, Aircraft)
– Equipment performance
– Forecast breakdowns
– Health risk
• Publicly available (e.g., governmental)
• Third party
• Be careful of overfitting
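The overfitting caution above can be illustrated with a minimal sketch. The rows, the labels, and the choice of a nearest-neighbor model are hypothetical and not from the slides; the point is only that a model which memorizes its training rows can look perfect on them and still do poorly on rows it has not seen.

```python
def nearest_neighbor(train):
    """Model that memorizes training pairs and predicts the closest one's label."""
    def predict(x):
        return min(train, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def error_rate(model, rows):
    """Fraction of rows whose predicted label differs from the true label."""
    return sum(model(x) != y for x, y in rows) / len(rows)


# Hypothetical labeled rows: (sensor value, failed?). The training labels
# include a few noisy points that do not follow the overall trend.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 0), (8, 1), (9, 1)]
holdout = [(2.9, 0), (3.2, 0), (6.8, 1), (7.2, 1), (1.5, 0), (8.5, 1)]

model = nearest_neighbor(train)
train_error = error_rate(model, train)      # memorized rows, so no mistakes
holdout_error = error_rate(model, holdout)  # noise was memorized too

print(f"train error: {train_error:.2f}, holdout error: {holdout_error:.2f}")
```

Holding out rows the model never trained on, as done here, is one common way to notice the gap before a model goes to production.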
9. AI Data
• Call center recordings and chat logs
– content and data relationships as well as answers to questions
• Streaming sensor data, historical maintenance records and search logs
– use cases and user problems
• Customer account data and purchase history
– similarities in buyers and predict responses to offers
• Email response metrics
– processed with text content of offers to surface buyer segments.
• Product catalogs and data sheets
– sources of attributes and attribute values.
• Public references
– procedures, tool lists, and product associations.
• YouTube video content audio tracks
– converted to text and mined for product associations.
• User website behaviors
– correlated with offers and dynamic content.
• Sentiment analysis, user-generated content, social graph data, and other external data sources
– mined and recombined to yield knowledge and user-intent signals.
10. Example: Data for Predictive Maintenance
• Structured Data
– Time Series
– Events
– Graph
• Unstructured Data
– Text
– Image
– Sound
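As one illustration of the time-series item above, here is a minimal sketch of turning raw sensor readings into rolling-window features that a maintenance model could train on. The readings, the window size, and the function name `rolling_mean` are hypothetical and not from the slides.

```python
from collections import deque
from statistics import mean


def rolling_mean(readings, window):
    """Return the mean of the last `window` readings at each step.

    Early positions use however many readings are available so far.
    """
    recent = deque(maxlen=window)  # drops the oldest value automatically
    features = []
    for value in readings:
        recent.append(value)
        features.append(mean(recent))
    return features


# Hypothetical vibration readings from one piece of equipment.
vibration = [1.0, 1.0, 1.0, 4.0, 7.0]
smoothed = rolling_mean(vibration, window=3)
print(smoothed)
```

A rising rolling mean is the kind of derived signal that can be joined with event and maintenance records before training.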
11. Where to put data for Machine Learning
• Cloud Storage
• DBMS
• HDFS
– optimized for sequential read/writes
• Unstructured Data Stores
• Text-based serializations (CSV, JSON)
– for interoperability
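The interoperability point about text-based serializations can be sketched as follows, using only the Python standard library. The record fields (`asset_id`, `hours`, `failed`) are hypothetical. The sketch shows one trade-off: JSON preserves numbers and booleans, while CSV returns every value as a string and leaves type restoration to the reader.

```python
import csv
import io
import json

# Hypothetical maintenance records; field names are illustrative only.
records = [
    {"asset_id": "pump-1", "hours": 1200, "failed": False},
    {"asset_id": "pump-2", "hours": 3400, "failed": True},
]

# JSON keeps numbers and booleans typed across the round trip.
json_text = json.dumps(records)
from_json = json.loads(json_text)

# CSV is plain text: every value comes back as a string, so the reader
# must restore the types itself.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["asset_id", "hours", "failed"])
writer.writeheader()
writer.writerows(records)

buffer.seek(0)
from_csv = [
    {
        "asset_id": row["asset_id"],
        "hours": int(row["hours"]),
        "failed": row["failed"] == "True",
    }
    for row in csv.DictReader(buffer)
]

print(from_json == records, from_csv == records)
```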
12. AI Pattern
1. Hire/Grow Data Science
2. Uncouple AI from Organizational Constraints
– While Conforming the Organization
3. Ideation
4. Compile Data!
– Internal and External
5. Label Data
6. Build Model
7. Prototype
8. Iterate
9. Productionalize
10. Scale
13. Algorithm & Data Matching
• Naive Bayes Classification
• Ordinary Least Squares Regression
• Logistic Regression
Try Multiple; Run Contests
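"Try Multiple; Run Contests" can be sketched as a small cross-validated comparison. This is a minimal illustration under stated assumptions, not the presenter's method: the data is made up, the contestants are a single-feature ordinary least squares fit and a predict-the-mean baseline, and the helper names (`fit_mean`, `fit_ols`, `cv_mse`) are hypothetical.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_ols(xs, ys):
    """Ordinary least squares for one feature: y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x


def cv_mse(fit, xs, ys, folds=3):
    """Mean squared error over held-out folds (fold k holds out rows i with i % folds == k)."""
    errors = []
    for k in range(folds):
        train = [i for i in range(len(xs)) if i % folds != k]
        test = [i for i in range(len(xs)) if i % folds == k]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((model(xs[i]) - ys[i]) ** 2 for i in test)
    return mean(errors)


# Hypothetical data with a roughly linear trend.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0]

contestants = {"mean baseline": fit_mean, "ols": fit_ols}
scores = {name: cv_mse(fit, xs, ys) for name, fit in contestants.items()}
winner = min(scores, key=scores.get)
print(winner, scores)
```

Scoring every candidate on the same held-out folds keeps the contest fair; other algorithms from the list above could be added to `contestants` in the same way.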
14. AI Business Use Case Examples
• Marketing – segmentation analysis, campaign
effectiveness
• Cybersecurity – proactive data collection and analysis
of threats
• Smart Cities – track vehicle movements, traffic data,
environmental factors to optimize traffic lights,
ensure smooth flow and manage tolling
• Retail, Manufacturing – Supply flow, Customer flow
• Oil and Gas – determine drilling patterns, ensure
maximum utilization of assets, manage operational
expenses, ensure safety, predictive maintenance
• Life Sciences – study the human genome (100s of
MB/person) to improve health
Enterprise Data Domains
• Customer
• Employee
• Partner
• Patient
• Supplier
• Product
• Bill of Materials
• Assets
• Equipment
• Media
• Agencies
• Branches
• Facilities
• Franchises
• Stores
• Account
• Certifications
• Contracts
• Financials
• Policies
17. Satellite or Aerial Data
https://medium.com/the-downlinq/car-localization-and-counting-with-overhead-imagery-an-interactive-exploration-9d5a029a596b
18. Corporate Requirements > Data
• The split of the necessary AI/ML between the 'edge' of corporate
users and the software itself is still to be determined
• Math
– floating point arithmetic, deep statistics, and linear algebra
• GPUs
• Python
– easy to program and it is good enough
– NumPy and pandas libraries are available
• TensorFlow
– adds a computational/symbolic graph to Python
• R and MATLAB
– optimized for math with features such as direct slice and dice of matrices
and rich libraries to draw from
• Java and Scala
– work well with Hadoop and Spark respectively