Big Data, First Steps
First Steps on Big Data
Alexandre Simundi
Apr 2016
Agenda
• What’s all this fuss about?
• Technical Challenges & Solutions
• Big Data Analytics
• Where to find knowledge?
Software Engineer / Solutions Architect
Over 11 years of experience in the IT market
Passionate about distributed architecture and big data technologies
LinkedIn: https://cl.linkedin.com/in/simundi
Contact: simundi@bdatalabs.com
Twitter: @simundi
Alexandre Simundi
1890s: motion picture invented
New experience through multiplication of data
New industry
New profession
New _ _ _ _ path
The real revolution is not in the machine
that calculates data but in the data itself
and how we use it
Big Data: a revolution that will transform how we live, work and think
5%
95% of data is unstructured
Structured
Name | Country | Football team | Favorite food
Alexandre | Brazil | Grêmio | Barbecue
Unstructured
Hi, my name is Alexandre and I’m from Brazil. I support the best football team in the world: Grêmio, and my favorite food is barbecue.
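The unstructured sentence carries the same four facts as the structured row, but a program has to extract them first. A minimal sketch of that step, assuming a hand-written regular expression that only fits this one example sentence (the pattern and the `to_record` helper are illustrative, not something from the talk):

```python
import re

# Hypothetical pattern tailored to this single sentence; real free text
# needs much more robust techniques than one regular expression.
PATTERN = re.compile(
    r"my name is (?P<name>\w+).*?"
    r"from (?P<country>\w+).*?"
    r"world:\s*(?P<team>\w+).*?"
    r"favorite food is\s+(?P<food>\w+)",
    re.IGNORECASE | re.DOTALL,
)


def to_record(text):
    """Return a dict of the fields found in the text, or None if no match."""
    match = PATTERN.search(text)
    return match.groupdict() if match else None


text = (
    "Hi, my name is Alexandre and I'm from Brazil. I support the best "
    "football team in the world: Grêmio, and my favorite food is barbecue."
)
record = to_record(text)
print(record)
```

Any sentence phrased differently falls through and returns `None`, which is a small illustration of why unstructured data is harder to analyze than a table.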
If you can’t analyze all your data, you are blind to its
opportunities
Volume Variety
Velocity Veracity
Value
Source: http://www.halogensoftware.com/uploads/blog/whats-your-credit-score-using-predictive-analytics-with-hr-data/_thumb/848x450/whats-your-credit-score-using-predictive-analytics-with-hr-data.png
Datafication
N = all
Farecast analyzed 200 billion flight-price records to predict the best moment to buy a ticket.
accuracy: 75%
average saving: $50
Correlation
Shows what, not why.
Knowing what is often good enough.
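To make "shows what, not why" concrete, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The numbers are made up for illustration (days before departure vs. ticket price) and are not Farecast data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up sample: days before departure and the observed ticket price.
days_before = [60, 45, 30, 14, 7, 1]
price = [210, 220, 250, 310, 380, 450]

r = pearson(days_before, price)
print(f"r = {r:.2f}")
```

A value of `r` close to -1 says prices tend to rise as departure approaches. It says nothing about why they rise, which is exactly the trade-off the slide describes.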
Technical Challenges & Solutions
MapReduce Paper
http://research.google.com/archive/mapreduce-osdi04-slides/index.html
Why MapReduce: to reduce the cost of distributed computation
Hadoop Ecosystem
• Distributed file system (HDFS)
• Computation framework (MapReduce)
• Resource management (YARN)
Diagram: Master and Worker nodes (data in, result out)
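The master/worker picture can be sketched in a few lines of single-process Python. This is only an illustration of the map, shuffle and reduce steps using a word count; it is not Hadoop code, and the function names are assumptions chosen for readability.

```python
from collections import defaultdict


def map_phase(chunk):
    """Worker step: emit a (word, 1) pair for every word in one chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer step: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(chunks):
    """Master step: hand each chunk to a worker, then combine the results."""
    mapped = []
    for chunk in chunks:  # on a real cluster these calls run in parallel
        mapped.extend(map_phase(chunk))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


chunks = ["big data first steps", "first steps on big data", "data data data"]
counts = run_job(chunks)
print(counts)
```

Because each chunk is mapped independently, the loop in `run_job` is the part a framework can spread across many cheap machines.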
Big Data Databases
Big Data Analytics
“We are drowning in data, but
starving for knowledge!”
(John Naisbitt, 1982)
Data Mining
Combination of AI and statistical analysis to discover
information that is “hidden” in the data
Data → Analysis → Insights → Decision → Action
Business Value
Analytics Maturity Levels
Machine Learning
• Statistics + AI
• Predictive Methods
– Use some variables to predict unknown or future values of other variables
• Descriptive Methods
– Find human-interpretable patterns that describe the data
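As one possible instance of a predictive method, the sketch below fits a straight line by ordinary least squares and uses it to predict a future value. The data and variable names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data: month number (known variable) and sales (variable to predict).
months = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # predicted value for the unseen month 6
print(f"forecast for month 6: {forecast:.2f}")
```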
Supervised vs Unsupervised
• Supervised
– Learning in the presence of an expert/teacher
– Training data set is labeled with a class value
– Goal: predict a class or value label
• Unsupervised
– No knowledge of the output class/value
– Data is NOT labeled
– Goal: learn patterns/groupings
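The difference can be sketched on the same six made-up numbers. The supervised part learns one centroid per given label (a nearest-centroid classifier); the unsupervised part is a tiny one-dimensional k-means that never sees the labels. Both are simplified illustrations, and the evenly spread starting centers are an assumption of this sketch.

```python
def centroid(points):
    return sum(points) / len(points)


def train_supervised(labeled):
    """labeled: list of (value, label). Learn one centroid per known label."""
    by_label = {}
    for value, label in labeled:
        by_label.setdefault(label, []).append(value)
    return {label: centroid(vals) for label, vals in by_label.items()}


def predict(model, value):
    """Return the label whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(model[label] - value))


def cluster_unsupervised(values, k=2, rounds=10):
    """1-D k-means: no labels, groups are discovered from the data alone."""
    if k < 2 or len(values) < k:
        raise ValueError("need k >= 2 and at least k values")
    ordered = sorted(values)
    # Assumption: start with centers spread evenly over the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    groups = []
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(centers[i] - v))
            groups[nearest].append(v)
        centers = [centroid(g) if g else c for g, c in zip(groups, centers)]
    return groups


labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.5, "large"), (4.8, "large")]
model = train_supervised(labeled)
print(predict(model, 4.0))

groups = cluster_unsupervised([value for value, _ in labeled])
print(groups)
```

The clustering recovers the same two groups, but it cannot name them "small" and "large"; attaching meaning to the groups is left to a person.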
Cross-Industry Standard Process for Data Mining (CRISP-DM)
Where to find knowledge?
https://www.coursera.org/specializations/big-data
Content:
• Introduction to Big Data
• Hadoop Platform and Application Framework
• Introduction to Big Data Analytics
• Machine Learning With Big Data
• Graph Analytics for Big Data
• Big Data - Capstone Project
Things to learn in the Hadoop Zoo
• HDFS
• MapReduce
• Apache Pig
• Apache Hive
• Spark
http://www.cloudera.com/downloads/quickstart_vms/5-5.html
http://hortonworks.com/downloads/#sandbox
Machine Learning
• Python
• R https://www.r-project.org
• Mahout http://mahout.apache.org/
• MLlib http://spark.apache.org/mllib
• H2O http://www.h2o.ai
• Dato https://dato.com
• KNIME https://www.knime.org
• bdatalabs (soon) http://www.bdatalabs.com
https://www.coursera.org/specializations/machine-learning
Content:
• Machine Learning Foundations: A Case Study Approach
• Machine Learning: Regression
• Machine Learning: Classification
• Machine Learning: Clustering & Retrieval
• Machine Learning: Recommender Systems & Dimensionality Reduction
Google Machine Learning
Microsoft Azure
Open Datasets
• https://www.kaggle.com/datasets
• https://github.com/caesar0301/awesome-public-datasets#machine-learning
Take away
1. Big Data is a new sector of the industry (Photography vs Film)
2. N = All
3. Build on your strengths, work on your weaknesses
4. Take advantage of open source power
5. Give it back
Editor's Notes

  1. A little story: in 1826, someone managed to capture and record light for the first time. In 1890, motion picture cameras were invented and companies were established.
  2. 90% of the world’s data was created in the last 2 years.
  3. Photo with 5%: http://science-all.com/images/mona-lisa/mona-lisa-06.jpg Talk about structured and unstructured data. How powerful
  4. Talk about the scenarios. If you or your business has a Facebook page, a blog, a TripAdvisor listing, or any other page where people leave comments, you should be able to analyze this data. If watching customers online and not watching what they are doing, what they are wearing
  5. Flight tickets
  6. At its core, correlation quantifies the statistical relationship between two data values. A strong correlation means that when one of the data values changes, the other is highly likely to change as well. With correlation there is no certainty, only probability. But if a correlation is strong, the likelihood of a link is high. Predictions based on correlations lie at the heart of big data.
  7. What is data mining? A combination of AI and statistical analysis to discover information that is “hidden” in the data. History: emerged in the late 1980s and flourished in the 1990s. Its roots can be traced back along three family lines: classical statistics, artificial intelligence, and machine learning. AI uses heuristics to simulate the human brain. Machine learning blends AI heuristics with advanced statistical analysis. What can be hidden in data? Associations, sequences, classifications, forecasts, anomalies, and groupings/clusters/segments.
  8. Data Businesspeople are those most focused on the organization and how data projects yield profit. Data Creatives: we think of Data Creatives as the broadest of data scientists, those who excel at applying a wide range of tools and technologies to a problem, or creating innovative prototypes at hackathons; the quintessential Jack of All Trades. Data Developers: we think of Data Developers as people focused on the technical problem of managing data: how to get it, store it, and learn from it. Data Researchers: one of the interesting career paths that leads to a title like “data scientist” starts with academic research in the physical or social sciences, or in statistics.