Yahoo! Product Intelligence
                                                       MicroStrategy World 2008


           Amr A. Awadallah, PhD
           VP Product Intelligence Engineering
           Jan 2008

Yahoo! Inc – Amr A. Awadallah – MicroStrategy World 2008
About Yahoo! and its purpose

     Yahoo! who?
     Yahoo! is the world’s largest global online network of integrated
     services and is one of the most trafficked Internet destinations
     worldwide. For more than 12 years, Yahoo! has been changing
     the way people communicate with each other, conduct
     transactions, and access/share/create information.

     Our Purpose:
     Yahoo! powers and delights our communities of users,
     advertisers, and publishers – all of us united in creating
     indispensable experiences, and fueled by trust.

Yahoo! Fast Facts
     Our Name:
     Yahoo! = Yet Another Hierarchical Officious Oracle!
     Dictionary definition = “rude, unsophisticated, uncouth”

     Employees:
     14,000 Yahoos worldwide

     Global presence:
     Operations in over 20 markets and regions around the world
     Available in over 20 languages to 477M users.

     Milestones:
     Founded in 1994
     Incorporated in 1995
     Public in 1996

     Proud to be:
     Fortune 500 Company
     Fortune ‘100 Best Company to Work For’
Yahoo! Pillars and Strategic Objectives

        Three Strategic Pillars:

        •    Insights will be leveraged to deliver 10x relevant experiences
        •    Open, allow publishers to use our content/services, and vice versa
        •    Partner-of-Choice

        Three Strategic Objectives:

        •    Online starting point for most consumers
        •    Platforms that attract the most developers and publishers
        •    A must buy for most advertisers



       •     We believe the focus on relevance as a measure will create a unifying
             focus to our work and drive increased value in everything we do.
       •     We are building the largest content, services, and advertising exchange.
       •     Example Strategic Partners: eBay, AT&T, Comcast, Newspaper
             Consortium, Bebo, WebMD, Cars.com, Forbes.com, Ziff Davis Media, DivX,
             Hearst.


Product Intelligence Engineering (PIE)



            Continuously generate and
          leverage insights to maximize
             sustainable value created
           through interactions within the
                Yahoo! ecosystem

Why Measure?




       • “If you can’t measure it, you can’t fix it”
       • “If you can’t measure it, you can’t grow it”
       • “If you can’t measure it, you can’t build it”




We Support More Than a Dozen Datamarts


       • My Yahoo!/Front Page                 • Flickr
       • Yahoo! Search                        • Yahoo! Groups
       • Yahoo! Mail                          • Yahoo! Answers
       • Yahoo! Toolbar                       • Yahoo! News
       • Yahoo! Messenger                     • Yahoo! Finance
       • Yahoo! Local                         • Yahoo! Travel
       • Yahoo! Video                         • Yahoo! Shopping
       • Yahoo! Sports                        • Yahoo! Entertainment
Our Data Stack

       MicroStrategy Standard Reporting
       MicroStrategy Advanced Reporting
       MicroStrategy Data Modeling
       Data Mart Database (currently Oracle): 200GB/day
       Data Mart ETL phase 2 (mixed tools)
       Data Mart ETL phase 1 (aggregations in grid)
       Foundational Warehouse (Link Octopus): 10TB/day
       Warehouse ETL
       Log Collection (Data Highway)
       Instrumentation (Universal Link Tracking)


Groups That Use Our Systems

       • Business Operations/Finance
       • Product Management
       • Research and Development
       • User Experience and Design
       • Product Marketing
       • Advertising Sales
Example Dashboards




Case Study: A/B testing for Shopping.yahoo.com
 Is it better to show items in a list or in a grid?

 Test Bucket A: List
 Test Bucket B: Grid
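
One common way to decide between two test buckets is a two-proportion z-test on a click metric. The sketch below is illustrative only: the slides do not say which statistical test was applied, the function name `two_proportion_z_test` is hypothetical, and the counts are made-up placeholders rather than Yahoo! results.

```python
import math

def two_proportion_z_test(clicks_a, users_a, clicks_b, users_b):
    """Return (z, two-sided p-value) for H0: both buckets share one click rate."""
    p_a = clicks_a / users_a
    p_b = clicks_b / users_b
    # Pooled rate under the null hypothesis of no difference.
    pooled = (clicks_a + clicks_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for bucket A (list) and bucket B (grid).
z, p = two_proportion_z_test(clicks_a=1200, users_a=10000,
                             clicks_b=1320, users_b=10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between buckets is unlikely to be noise. It says nothing about whether the buckets were assigned cleanly, which is a separate check.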

Dashboard Advice

   • It's very hard (impossible) to reach a single
     universal metric that summarizes how the
     product is doing
   • Proper visualization tools are very important
     since there are a lot of numbers to examine
   • Averages are nice, but histograms tell the real
     story
   • Trends are your friend: keep history, lots of it
   • Sampling is very dangerous if not used properly
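
The point about averages and histograms can be seen in a few lines. This is a minimal sketch with invented session lengths; the `histogram` helper and both data sets are assumptions made for illustration.

```python
from collections import Counter
from statistics import mean

def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of a bin."""
    return Counter((v // bin_width) * bin_width for v in values)

# Hypothetical session lengths in minutes for two products.
steady = [10] * 10          # every user spends about 10 minutes
skewed = [1] * 9 + [91]     # most users leave quickly, one stays very long

print(mean(steady), mean(skewed))   # both averages are 10
print(sorted(histogram(steady, 10).items()))
print(sorted(histogram(skewed, 10).items()))
```

Both products report the same average, yet the histograms show one has uniformly engaged users and the other is carried by a single outlier.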

Example Metrics & Dimensions

          Example Metrics:
          • Click Probability (Cookies Clicked/Cookies Visited)
          • Click Yield (Clicks Per Thousand Cookies)
          • RPC (Revenue Per Cookie)
          • Sessions/Cookie/Week (or Month)
          • Time/Cookie/Week (or Month)
          • Retention Rate (percent of Cookies that returned)
          Example Dimensions:
          • Demographics (Gender, Age, Income, Tenure)
          • Geographics (Country, State, Zip, DMA)
          • Content ID
          • Access Modality (PC, PDA, Cell Phone, Net Speed, Browser, OS)
          • Traffic Source (Organic, SEM, Affiliate, Marketing Campaign)
          • Bucket Test ID
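
The metric definitions above translate directly into code. The sketch below assumes a simplified input of `(cookie_id, clicks, revenue)` rows; that row layout, the function names, and the sample values are hypothetical and are not taken from the Yahoo! datamarts.

```python
def cookie_metrics(visits):
    """Compute cookie-level metrics from (cookie_id, clicks, revenue) rows."""
    visited = set()
    clicked = set()
    total_clicks = 0
    total_revenue = 0.0
    for cookie_id, clicks, revenue in visits:
        visited.add(cookie_id)
        if clicks > 0:
            clicked.add(cookie_id)
        total_clicks += clicks
        total_revenue += revenue
    n = len(visited)  # assumes at least one visit row
    return {
        "click_probability": len(clicked) / n,   # cookies clicked / cookies visited
        "click_yield": 1000 * total_clicks / n,  # clicks per thousand cookies
        "rpc": total_revenue / n,                # revenue per cookie
    }

def retention_rate(cookies_before, cookies_after):
    """Percent of cookies from the earlier period that returned in the later one."""
    returned = cookies_before & cookies_after
    return 100 * len(returned) / len(cookies_before)

# Hypothetical visit rows: (cookie_id, clicks, revenue in dollars).
visits = [("a", 2, 0.50), ("b", 0, 0.00), ("c", 1, 0.25), ("a", 1, 0.25)]
metrics = cookie_metrics(visits)
retention = retention_rate({"a", "b", "c", "d"}, {"a", "c", "e"})
print(metrics)
print(retention)
```

To slice by a dimension such as Bucket Test ID, the same functions could be applied to the rows of each dimension value separately.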
Evolution of Our BI systems

  • Started with grep ☺ and generated
    static HTML dashboards
  • Evolved to load a few aggregates into
    MySQL with a dynamic Perl dashboard
  • Today we load a lot of aggregates for
    many metrics/dimensions into Oracle
    and use MicroStrategy for reporting.
  • Next?
Next: A Unified Datamart (aka EDW)

   • We currently have 18+ separate datamart
     silos across all of Yahoo!’s products.
   • We will bring all of these datamarts under
     the same umbrella so that we can easily do
     cross-Yahoo! analytics in the datamart.
   • The data-model will need to be a hybrid
     data-model that supports horizontal
     uniformity but allows for vertically deep
     metrics and dimensions based on the
     product (e.g. "mail sends" is unique to Mail)
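
One way to picture such a hybrid model is a row type with shared columns plus a product-specific extension. This is a sketch under stated assumptions: the `DailyFact` class, its field names, and the sample numbers are hypothetical and do not describe the actual Yahoo! schema.

```python
from dataclasses import dataclass, field

@dataclass
class DailyFact:
    """One datamart row: shared (horizontal) columns plus product-specific extras."""
    # Horizontal columns every product reports, enabling cross-product queries.
    date: str
    product: str
    cookies: int
    page_views: int
    # Vertical metrics that only make sense for one product, e.g. mail sends.
    product_metrics: dict = field(default_factory=dict)

rows = [
    DailyFact("2008-01-15", "mail", cookies=100, page_views=900,
              product_metrics={"mail_sends": 250}),
    DailyFact("2008-01-15", "search", cookies=300, page_views=1200,
              product_metrics={"queries": 800}),
]

# Cross-product analytics use only the shared columns.
total_page_views = sum(r.page_views for r in rows)
# Product-deep analytics read the vertical metrics for one product.
mail_sends = sum(r.product_metrics.get("mail_sends", 0)
                 for r in rows if r.product == "mail")
print(total_page_views, mail_sends)
```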
Next: Click and View-stream Analytics
  • Our current datamarts hold aggregate data only,
    which limits the questions that can be asked
    (we can still answer these questions from
    the LinkOctopus warehouse, but this requires an
    engineer to write SQL because of the complex schema).
  • We will expand our datamarts to include
    event-level data (both click and view-stream events).
    This will cause a large explosion in size and number
    of rows (from 200GB/day to 10TB/day).
  • The data model will need to be a hybrid
    that supports event-level data but also
    aggregates (for performance and longevity).
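
The relationship between event-level rows and aggregates can be sketched as a simple roll-up, followed by the growth factor implied by the quoted volumes. The event layout, the `roll_up` function, and the sample rows are assumptions for illustration; only the 200GB/day and 10TB/day figures come from the slides.

```python
from collections import Counter

# Hypothetical event-level rows: (date, product, event_type, cookie_id).
events = [
    ("2008-01-15", "shopping", "view", "a"),
    ("2008-01-15", "shopping", "click", "a"),
    ("2008-01-15", "shopping", "view", "b"),
    ("2008-01-16", "shopping", "view", "a"),
]

def roll_up(event_rows):
    """Aggregate events into counts per (date, product, event_type)."""
    return Counter((date, product, kind) for date, product, kind, _ in event_rows)

aggregates = roll_up(events)
print(aggregates[("2008-01-15", "shopping", "view")])

# Growth factor from the quoted volumes, using decimal units (1 TB = 1000 GB).
growth = (10 * 1000) / 200
print(growth)
```

Aggregates can always be recomputed from events, but not the other way around, so keeping both trades storage for the ability to ask new questions later.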
Challenges we have


     • Load data before new business day starts
     • Operational stability
     • Data quality: Bot filtering, Cookie churn
     • Instrumentation Automation
     • Columnar access control
     • Scalable dimension, segmentation, and
       event-level processing

MicroStrategy: Things we like, Things we want


       Things We Like:
       • It writes SQL for us ☺
       • It creates Web GUIs ☺
       • We love the new Flash functionality
       • We like Personalized Page Execution in NCS

       Things We Want:
       • Cross-mart dashboarding
       • URL functionality to send out report links
       • Better Portal SDK
       • Clickstream visualization
       • Better NCS Debugging
       • Intelligent Prompts
       • Better search on support.microstrategy.com

 
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
Tampa BSides - The No BS SOC (slides from April 6, 2024 talk)
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 

Yahoo Microstrategy 2008

  • 1. Yahoo! Product Intelligence, MicroStrategy World 2008. Amr A. Awadallah, PhD, VP Product Intelligence Engineering. Jan 2008. Yahoo! Inc – Amr A. Awadallah – MicroStrategy World 2008
  • 2. About Yahoo! and its purpose
      Yahoo! who? Yahoo! is the world’s largest global online network of integrated services and is one of the most trafficked Internet destinations worldwide. For more than 12 years, Yahoo! has been changing the way people communicate with each other, conduct transactions, and access/share/create information.
      Our Purpose: Yahoo! powers and delights our communities of users, advertisers, and publishers – all of us united in creating indispensable experiences, and fueled by trust.
  • 3. Yahoo! Fast Facts
      • Our Name: Yahoo! = Yet Another Hierarchical Officious Oracle! Dictionary definition = “rude, unsophisticated, uncouth”
      • Employees: 14,000 Yahoos worldwide
      • Global presence: Operations in over 20 markets and regions around the world. Available in over 20 languages to 477M users.
      • Milestones: Founded in 1994. Incorporated in 1995. Public in 1996.
      • Proud to be: Fortune 500 Company; Fortune ‘100 Best Company to Work For’
  • 4. Yahoo! Pillars and Strategic Objectives
      Three Strategic Pillars:
      • Insights will be leveraged to deliver 10x relevant experiences
      • Open, allow publishers to use our content/services, and vice versa
      • Partner-of-Choice
      Three Strategic Objectives:
      • Online starting point for most consumers
      • Platforms that attract the most developers and publishers
      • A must buy for most advertisers
      Notes:
      • We believe the focus on relevance as a measure will create a unifying focus to our work and drive increased value in everything we do.
      • We are building the largest content, services, and advertising exchange.
      • Example Strategic Partners: eBay, AT&T, Comcast, Newspaper Consortium, Bebo, WebMD, Cars.com, Forbes.com, Ziff Davis Media, DivX, Hearst.
  • 5. Product Intelligence Engineering (PIE): Continuously generate and leverage insights to maximize sustainable value created through interactions within the Yahoo! ecosystem.
  • 6. Why Measure?
      • “If you can’t measure it, you can’t fix it”
      • “If you can’t measure it, you can’t grow it”
      • “If you can’t measure it, you can’t build it”
  • 7. We Support More Than a Dozen Datamarts
      • Yahoo MY/Frontpage
      • Flickr
      • Yahoo Search
      • Yahoo Groups
      • Yahoo Mail
      • Answers
      • Yahoo Toolbar
      • Yahoo News
      • Yahoo Messenger
      • Yahoo Finance
      • Yahoo Local
      • Yahoo Travel
      • Yahoo Video
      • Yahoo Shopping
      • Yahoo Sports
      • Yahoo Ent
  • 8. Our Data Stack (top of the stack first)
      • MicroStrategy Standard Reporting
      • MicroStrategy Advanced Reporting
      • MicroStrategy Data Modeling
      • Data Mart Database (currently Oracle), 200GB/day
      • Data Mart ETL phase 2 (mixed tools)
      • Data Mart ETL phase 1 (aggregations in grid)
      • Foundational Warehouse (Link Octopus), 10TB/day
      • Warehouse ETL
      • Log Collection (Data Highway)
      • Instrumentation (Universal Link Tracking)
  • 9. Groups That Use Our Systems
      • Business Operations/Finance
      • Product Management
      • Research and Development
      • User Experience and Design
      • Product Marketing
      • Advertising Sales
  • 10. Example Dashboards
  • 11. Case Study: A/B testing for Shopping.yahoo.com
      Is it better to show items in a list or in a grid?
      • Test Bucket A: List
      • Test Bucket B: Grid
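One common way to compare two test buckets like the list and grid layouts on slide 11 is a two-proportion z-test on a per-cookie click rate. The sketch below is only an illustration under stated assumptions: the slide does not say which metric or test was used, and the function name and the bucket counts are hypothetical.

```python
import math


def two_proportion_z(clicks_a, visitors_a, clicks_b, visitors_b):
    """Compare click rates of bucket A and bucket B.

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided
    and based on the normal approximation with a pooled rate.
    """
    rate_a = clicks_a / visitors_a
    rate_b = clicks_b / visitors_b
    pooled = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts, not taken from the talk.
rate_list, rate_grid, z, p = two_proportion_z(100, 1000, 150, 1000)
print(f"list={rate_list:.3f} grid={rate_grid:.3f} z={z:.2f} p={p:.4f}")
```

A small p-value suggests the difference between buckets is unlikely to be noise, but it says nothing about whether the chosen metric is the right one for the product.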
  • 12. Dashboard Advice
      • It’s very hard (impossible) to reach a single universal metric that summarizes how the product is doing
      • Proper visualization tools are very important since there are a lot of numbers to examine
      • Averages are nice, but histograms tell the real story
      • Trends are your friend; keep history, lots of it
      • Sampling is very dangerous if not used properly
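The point about averages and histograms on slide 12 can be shown with a small sketch. Both data sets below are invented for illustration (hypothetical sessions per cookie per week), and the `histogram` helper is an assumption of this example, not something from the talk. The two sets share the same mean but have very different shapes.

```python
from statistics import mean


def histogram(values, bin_width):
    """Count values per bin; each key is the lower edge of its bin."""
    counts = {}
    for v in values:
        edge = (v // bin_width) * bin_width
        counts[edge] = counts.get(edge, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical sessions per cookie per week for two products.
steady = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 41]

print(mean(steady), mean(skewed))  # identical averages
print(histogram(steady, 5))        # every cookie in one bin
print(histogram(skewed, 5))        # nine light users and one heavy user
```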
  • 13. Example Metrics & Dimensions
      Example Metrics:
      • Click Probability (Cookies Clicked/Cookies Visited)
      • Click Yield (Clicks Per Thousand Cookies)
      • RPC (Revenue Per Cookie)
      • Sessions/Cookie/Week (or Month)
      • Time/Cookie/Week (or Month)
      • Retention Rate (percent of Cookies that returned)
      Example Dimensions:
      • Demographics (Gender, Age, Income, Tenure)
      • Geographics (Country, State, Zip, DMA)
      • Content ID
      • Access Modality (PC, PDA, Cell Phone, Net Speed, Browser, OS)
      • Traffic Source (Organic, SEM, Affiliate, Marketing Campaign)
      • Bucket Test ID
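The metric definitions on slide 13 translate directly into arithmetic over per-cookie records. The sketch below follows the definitions as stated on the slide; the `CookieStats` record, its field names, and the sample rows are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CookieStats:
    """One visiting cookie in the reporting period (hypothetical schema)."""
    cookie_id: str
    clicks: int
    revenue: float
    returned: bool  # did this cookie come back in the following period?


def summarize(rows):
    """Compute the slide's example metrics over a list of CookieStats."""
    visited = len(rows)
    if visited == 0:
        raise ValueError("need at least one visiting cookie")
    clicked = sum(1 for r in rows if r.clicks > 0)
    total_clicks = sum(r.clicks for r in rows)
    total_revenue = sum(r.revenue for r in rows)
    returned = sum(1 for r in rows if r.returned)
    return {
        "click_probability": clicked / visited,       # cookies clicked / cookies visited
        "click_yield": 1000 * total_clicks / visited,  # clicks per thousand cookies
        "rpc": total_revenue / visited,                # revenue per cookie
        "retention_rate": 100 * returned / visited,    # percent of cookies that returned
    }


# Made-up sample rows.
sample = [
    CookieStats("a", 0, 0.0, True),
    CookieStats("b", 2, 0.5, False),
    CookieStats("c", 1, 0.25, True),
    CookieStats("d", 0, 0.0, False),
]
print(summarize(sample))
```

Grouping the rows by one of the dimensions (for example Bucket Test ID) before calling `summarize` would give the same metrics per segment.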
  • 14. Evolution of Our BI systems
      • Started with grep ☺ and generated static HTML dashboards
      • Evolved to load a few aggregates into MySQL with a dynamic Perl dashboard
      • Today we load a lot of aggregates for many metrics/dimensions into Oracle and use MicroStrategy for reporting.
      • Next?
  • 15. Next: A Unified Datamart (aka EDW)
      • We currently have 18+ separate datamart silos across all of Yahoo!’s products.
      • We will bring all of these datamarts under the same umbrella so that we can easily do cross-Yahoo! analytics in the datamart.
      • The data-model will need to be a hybrid data-model that supports horizontal uniformity but allows for vertically deep metrics and dimensions based on the product (e.g. mail sends is unique to mail)
  • 16. Next: Click and View-stream Analytics
      • Our current datamarts have aggregate data only, which limits the number of questions that can be asked (we can still answer these questions from the LinkOctopus warehouse, but this requires an engineer to develop SQL due to the complex schema).
      • We will expand our datamarts to include event-level data (both click and view-stream events); this will cause a large explosion in size and number of rows (from 200GB/day to 10TB/day).
      • The data-model will need to be a hybrid data-model that supports event-level data but also aggregates (for performance and longevity)
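The hybrid idea on slide 16, keeping event-level rows while also maintaining aggregates, can be sketched as a roll-up from raw events to daily summary rows. This is a minimal illustration, not the actual Yahoo! data model: the event tuple layout, the `roll_up` function, and the sample events are all assumptions.

```python
from collections import defaultdict


def roll_up(events):
    """Aggregate (day, product, cookie, kind) events into daily summaries.

    Returns a dict keyed by (day, product, kind) with the event count and
    the number of distinct cookies seen for that key.
    """
    counts = defaultdict(int)
    cookies = defaultdict(set)
    for day, product, cookie, kind in events:
        key = (day, product, kind)
        counts[key] += 1
        cookies[key].add(cookie)
    return {k: {"events": counts[k], "cookies": len(cookies[k])} for k in counts}


# Made-up event-level rows: click and view events for one product.
events = [
    ("2008-01-14", "mail", "c1", "view"),
    ("2008-01-14", "mail", "c1", "view"),
    ("2008-01-14", "mail", "c1", "click"),
    ("2008-01-14", "mail", "c2", "view"),
]
print(roll_up(events))
```

The aggregate rows stay small and can be kept for a long time, while the event-level rows answer questions the aggregates were not designed for. Taking 1 TB as 1000 GB, the volumes quoted on the slide differ by a factor of about 50.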
  • 17. Challenges we have
      • Load data before new business day starts
      • Operational stability
      • Data quality: Bot filtering, Cookie churn
      • Instrumentation Automation
      • Columnar access control
      • Scalable dimension, segmentation, and event-level processing
  • 18. MicroStrategy: Things we like, Things we want
      Things We Like:
      • It writes SQL for us ☺
      • It creates Web GUIs ☺
      • We love the new Flash functionality
      • We like Personalized Page Execution in NCS
      • Intelligent Prompts
      Things We Want:
      • Cross-mart dashboarding
      • URL functionality to send out report links
      • Better Portal SDK
      • Clickstream visualization
      • Better NCS Debugging
      • Better search on support.microstrategy.com