Big Data Analytics from
a Practitioner's View
Sep 2013
Raghu Kashyap
About Raghu Kashyap
Areas of Responsibility
• Data Insights Group (Site Analytics, Competitive Intelligence, Big Data)
• Orbitz India, supporting Analytics and BI teams
• US, Europe, Australia (APAC)
Personal
• Director – Data Insights Group
• Strong technology background (13 years), with passion for and experience in analytics (4 years) and big data (3.5 years)
• Master's in Computer Science
• Golf, traveling, helping non-profit organizations, spending time with my wife and 2 boys
• Twitter: @ragskashyap
• Blog: http://kashyaps.com
• Email: raghu.kashyap@orbitz.com
Orbitz Worldwide
Challenges
• Lack of multi-dimensional capabilities
• Heavy investment in the tools
• Precision vs. accuracy
• Data governance
Challenges (continued)
• No data unification or uniform platform across organizations and business units
• No easy data extraction capabilities
Hadoop history at OWW
Web Analytics & Big Data
• OWW generates a couple of million air and hotel searches every day.
• Massive amounts of data: over a hundred GB of log data per day.
• Expensive and difficult to store and process this data using the existing data infrastructure.
Love Thy Hadoop
• Long-term storage for very large data sets.
• Open access for developers and analysts.
• Allows ad-hoc querying of data and rapid deployment of reporting applications.
Hadoop Growth
Hadoop Cluster
Treemap of HDFS storage
Approach with Hadoop and ETL
Pipeline components (from the slide diagram): raw logs, MapReduce, flat files, event model, ETL, external tables, GP connector, data warehouse (Greenplum)
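As a rough illustration of the MapReduce step that turns raw logs into flat files, here is a minimal pure-Python sketch. The log layout (timestamp, session id, event type), the function names, and the sample lines are all assumptions made for this example; the slides do not describe the actual log format or job code. The tab-delimited rows stand in for the flat files that a warehouse could then read through external tables.

```python
from collections import defaultdict

def map_line(line):
    """Mapper: parse one raw log line into ((date, event_type), 1) pairs."""
    parts = line.strip().split()
    if len(parts) < 3:
        return []  # skip malformed lines
    timestamp, _session_id, event_type = parts[:3]
    return [((timestamp[:10], event_type), 1)]

def reduce_counts(pairs):
    """Reducer: sum the counts for each (date, event_type) key."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)

def to_flat_rows(totals):
    """Format the reduced output as tab-delimited rows, sorted for stable output."""
    return [f"{day}\t{event}\t{count}" for (day, event), count in sorted(totals.items())]

def run_job(lines):
    """Run the map and reduce phases in-process over an iterable of log lines."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return to_flat_rows(reduce_counts(pairs))

# Hypothetical sample input; the real log format is not given in the slides.
sample_log = [
    "2013-09-01T10:00:00 s1 hotel_search",
    "2013-09-01T10:00:05 s2 air_search",
    "2013-09-01T10:01:00 s1 hotel_search",
    "garbage",
]
flat_rows = run_job(sample_log)
```

On a real cluster the mapper and reducer would run as separate distributed tasks; running them in one process here only shows the shape of the data at each stage.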
Opportunities
Machine Learning
Site Analytics Data
PPC bidding efficiencies
Internal log analysis. Hgrep
MVT testing
Advanced Analytics
Show me the money
EFX – Every Friggin X
PPC bidding efficiencies
MAC vs. PC
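A comparison like "Mac vs. PC" can be sketched as average spend grouped by a platform guessed from the user-agent string. Everything below is an assumption for illustration: the substring rules, the function names, and the sample bookings are made up and are not taken from the Orbitz analysis.

```python
from collections import defaultdict

def platform_of(user_agent):
    """Very rough platform guess from a user-agent string (illustrative only)."""
    ua = user_agent.lower()
    # Check mobile first: iPhone/iPad user agents also mention "Mac OS X".
    if "iphone" in ua or "ipad" in ua or "android" in ua:
        return "mobile"
    if "macintosh" in ua:
        return "mac"
    if "windows" in ua:
        return "pc"
    return "other"

def average_spend_by_platform(bookings):
    """bookings: iterable of (user_agent, amount). Returns {platform: mean amount}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for user_agent, amount in bookings:
        platform = platform_of(user_agent)
        totals[platform] += amount
        counts[platform] += 1
    return {p: totals[p] / counts[p] for p in totals}

# Hypothetical bookings, not real data.
sample_bookings = [
    ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4)", 300.0),
    ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4)", 200.0),
    ("Mozilla/5.0 (Windows NT 6.1; WOW64)", 180.0),
    ("Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X)", 90.0),
]
averages = average_spend_by_platform(sample_bookings)
```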
Marketing Channel optimization
Orbitz.com marketing channels:
• Direct
• Paid – Brand
• Paid – Non-Brand
• SEO – Brand
• SEO – Non-Brand
• Email
• Meta
• Travel Research
• Affiliates
• Display Ads
Hotel Rate Cache optimization
Data is collected as part of RCDC.
Includes every live rate search (aka burst) performed by our hotel stack.
Raw data: ~200 GB compressed, 10^8 records.
Extraction: <40 GB compressed, 10^9 records.
MVT
Analyze behavioral and test data from our MVT testing
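One common way to compare a control group with a test group is a two-proportion z-test on conversion rates. The sketch below uses only the standard library. The choice of test, the function name, and the counts are assumptions for illustration; the slides do not say which statistical method was used.

```python
import math

def two_proportion_z(conv_control, n_control, conv_test, n_test):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_control = conv_control / n_control
    p_test = conv_test / n_test
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_control + conv_test) / (n_control + n_test)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_test))
    z = (p_test - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 100 of 1000 control visitors convert, 150 of 1000 test visitors.
z_score, p_value = two_proportion_z(100, 1000, 150, 1000)
```

A small p-value suggests the difference between the groups is unlikely to be chance alone. With many tests running at once, some correction for multiple comparisons would also be worth considering.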
DWH Log analysis
• Analyze Greenplum DB logs within Hadoop to understand data usage patterns.
• Impact analysis
• Hadoop used to analyze the last 30 days of DB logs.
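A usage-pattern analysis of this kind can be sketched as counting which tables appear in logged queries within a 30-day window. The log shape (a run date plus the SQL text), the regular expression, and the table names are assumptions for this example; real Greenplum logs carry more fields and would need real SQL parsing.

```python
import re
from collections import Counter
from datetime import date, timedelta

# Naive pattern: the identifier right after FROM or JOIN. Real SQL needs a parser.
TABLE_PATTERN = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)

def table_usage(log_entries, today, window_days=30):
    """Count table references in queries run within the last `window_days` days.

    log_entries: iterable of (run_date, sql_text) tuples.
    """
    cutoff = today - timedelta(days=window_days)
    usage = Counter()
    for run_date, sql in log_entries:
        if run_date < cutoff:
            continue  # outside the analysis window
        for table in TABLE_PATTERN.findall(sql):
            usage[table.lower()] += 1
    return usage

# Hypothetical log entries; table names are made up.
sample_entries = [
    (date(2013, 9, 1), "SELECT * FROM bookings b JOIN hotels h ON b.hotel_id = h.id"),
    (date(2013, 8, 20), "select count(*) from Bookings"),
    (date(2013, 6, 1), "SELECT * FROM old_table"),
]
usage = table_usage(sample_entries, today=date(2013, 9, 15))
```

For impact analysis, tables that never show up in the window are candidates for review before a schema change or retirement.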
HIPPO is your best friend
• Expect organizational resistance from
unanticipated directions
• You can do wonders in the analytics area if you get buy-in.
Lessons Learnt
Analytics using Big Data comes with a price.
Data Governance
Senior leadership buy-in
"I can't tell you the key to success, but the key to failure is trying to please everyone." - Ed Sheeran
How to capitalize on Big Data?
Learn from people who have already
done this.
DO NOT reinvent the wheel
Buy vs. build balance
Build once and leverage in multiple places.
Go where clients don't want to go or can't go in terms of execution.
What matters to Practitioners?
Things change dramatically in the
world of analytics
Being Agile is very important
Dashboards and Reports can take
you only to a certain level
Buy-in from key groups is important
Grow the business and impress the boss
Thank you


Editor's notes

  1. A website is just like a store: you have millions of people visiting every day and shopping on your site. So where does analytics fit in? Web analytics is the invisible shopper who goes around the store and watches everyone's behavior. It takes those behavioral attributes and helps the business with insights: know the travel details, such as how many travelers, what kind of travelers, and any preferred carriers or hotels; understand the shopping patterns (does the visitor shop only on weekends, or only on Thursdays?); focus on visit patterns (how many times does the visitor come to the site before buying anything?); and learn the page navigation (does the visitor view 100 pages on every visit, or know exactly what to look at?).
  2. Big Old Elephant: http://wallszone.com/wallpapers/Big_Old_Elephant_HD_1080P_1920x1080_2916.jpg
  3. Last year at eMetrics I had interesting tweet exchanges with a few folks. In essence, we were talking about the importance of visitor-level granularity of data and how it will impact personalization. Here are three use cases that are being enabled through Big Data at OWW. First, what our CEO affectionately calls EFX: we use Hadoop to analyze attributes from Site Analytics, internal logs (multiple application logs, NOT just weblogs), and MVT logs, all of which in essence funnel into our regression models. One of the key wins from the Machine Learning team was to analyze, build, and implement the recommendation engine for our hotel search. The data, from our Site Analytics and some other internal application logs, was analyzed using Hadoop MapReduce jobs. The results we saw were astonishing: a 7% interaction rate, 37% likely to continue deeper into the funnel, and a 2.6% increase in booking-path engagement. The beauty of this is that the Big Data analytics is fed into a machine, which learns and changes as time progresses. Second, PPC bidding based on Site Analytics data, which in turn feeds our PPC channels. The results are very encouraging, and this helps us with regression analysis. The final use case is where we learned that Mac users in general tend to spend more money on our sites.
  4. We run hundreds of tests on our site. We need to analyze consumer patterns and behaviors so that we can make the consumer experience very good. One of the tests we perform is around search results: based on the visitor, we usually alter the order of the search results to make them more personal, which helps the user find the right hotel very easily. Imagine that you take a yearly vacation in Honolulu, Hawaii, and always stay at the same hotel. If we know this about you, we can serve you better by showing that hotel at the top. We also do a lot of testing around layout, colors, and button placement. Eventually much of this data flows through our Hadoop infrastructure, which enables us to perform modeling exercises and analysis of our control and test groups.
  5. We faced organizational resistance to deploying Hadoop, not from management but from other technical teams. It required persistence to convince them that we needed to introduce a new hardware spec to support Hadoop. Our CTO was a big believer and provided technical guidance with Big Data, which helped a lot in making this a success at the organization. It's not every day that you have the CMO and GVP asking their team members to get data out of Hadoop.
  6. Here are some key learnings from our experience and some thoughts for you to consider. If you have the strength of technology, go for it. This needs heavy investment from a time and resource perspective. As I have mentioned many times, data without analysis is worthless. Senior leadership buy-in is a must; we had huge support from our CTO, CEO, and CMO. Data governance is a must. Lastly, if you want to succeed, you need to fight the tough battles and make the tough choices.