How Big Data Can Help Your Business:
Case Studies from ReadWriteWeb
David Strom
StampedeCon
August 1, 2012
david@strom.com
Download this here: http://slideshare.net/davidstrom
My publications
Editorial management positions:
Some oddball stuff
•   Planes, trains and automobiles
•   Fun with maps
•   Big and little ovens
•   Lessons learned from P&G
•   Noteworthy scientists
•   And of course sex!
StartupCompass.co
The reason behind
Three skills for big data CEOs
• Strategic data planning. Data is the new raw
  material for any business.
• Analytical skills. CEOs should be incredibly
  smart about asking the right questions.
• Technology skills. Embrace the technology
  and make it a key part of your CEO skill set.
More from Jeff Jonas



         vs.
Mason’s 5-step Big Data process
•   Obtain
•   Scrub
•   Explore
•   Model
•   Interpret
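
The five steps above can be sketched as a tiny pipeline. This is only an illustration under stated assumptions, not Mason's own code: the sales rows, the function names and the choice of a straight-line fit for the "model" step are all made up for the example.

```python
from statistics import mean

# Hypothetical raw input: "day,units_sold" rows, some of them malformed.
RAW_ROWS = ["1,10", "2,12", "3,", "4,16", "bad row", "5,18"]


def obtain():
    """Obtain: pull the raw data from wherever it lives (here, a constant)."""
    return list(RAW_ROWS)


def scrub(rows):
    """Scrub: keep only rows that parse into two whole numbers."""
    clean = []
    for row in rows:
        parts = [p.strip() for p in row.split(",")]
        if len(parts) == 2 and parts[0].isdigit() and parts[1].isdigit():
            clean.append((int(parts[0]), int(parts[1])))
    return clean


def explore(points):
    """Explore: compute a few summary numbers before modeling."""
    units = [y for _, y in points]
    return {"count": len(points), "min": min(units), "max": max(units)}


def model(points):
    """Model: fit a least-squares line, units = slope * day + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def interpret(slope):
    """Interpret: turn the fitted number into a statement someone can act on."""
    trend = "rising" if slope > 0 else "falling or flat"
    return f"Sales are {trend} by about {abs(slope):.1f} units per day."


def run_pipeline():
    points = scrub(obtain())
    summary = explore(points)
    slope, intercept = model(points)
    return summary, slope, intercept, interpret(slope)


if __name__ == "__main__":
    print(run_pipeline())
```

Note that the scrub step shrinks the data before anything else touches it, which keeps every later step simpler.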
Questions?
David Strom
david@strom.com
314 277 7832
@dstrom (Twitter)
http://strominator.com

Big Data examples

Editor's notes

  1. Let’s look at planes, trains and automobiles first.
  2. http://www.inside-r.org/howto/mining-twitter-airline-consumer-sentiment Meanwhile, the immediacy and accessibility of Twitter provide a real-time glimpse into consumers' frustration, as you can see in this collection of just three tweets. Jeffrey Breen of Cambridge Aviation Research put this together to show sentiment analysis.
  3. Here is his flowchart of how he put this all together, using R and various other data collection tools.
  4. http://www.forbes.com/sites/toddwoody/2012/05/23/fedex-delivers-on-green-goals-with-electric-trucks/ To tackle what is essentially a Big Data dilemma, FedEx is collaborating with General Electric (which is providing the company with commercial charging stations), utility Con Edison and Columbia University researchers, who are developing artificial intelligence programs to manage when and where the electric trucks charge in a 10-vehicle pilot project. “We’re collecting data on what is the load on the facility, what is the load of each truck, how many miles does that truck drive,” says Sondhi. “The algorithms from Columbia will identify that a truck is going to drive 16 miles tomorrow, so don’t give it 30 amps, give it 8 amps so we minimize the load on the entire facility.”
  5. http://www.wired.com/autopia/2012/05/ford-sync-insurance/ Currently, Ford collects and aggregates data from the 4 million vehicles that use in-car sensing and remote app management software to create a virtuous cycle of information. The data allows Ford engineers to glean information on a range of issues, from how drivers are using their vehicles, to the driving environment, to electromagnetic forces affecting the vehicle, and feedback on other road conditions that could help them improve the quality, safety, fuel economy and emissions of the vehicle. Here you see a typical Sync dash of a Ford sedan. Drivers willing to share how many miles they’ve traveled could get discounts of between 10 and 40 percent in exchange for providing State Farm with a more accurate picture of their vehicle-use habits, which the insurer obtains by directly accessing the cars' Sync telematics systems electronically. Your car has become a data hub, with USB ports, an SD card reader, Bluetooth connections to your phone and even a mobile Wi-Fi hotspot.
  6. http://transport.wspgroup.fi/hklkartta/defaultEn.aspx You can watch the positions of the various trains in Helsinki as they move about the map here.
  7. Speaking of maps, there are thousands of big data mapping apps. Google Maps is certainly popular, but another site called Crowdmap makes it even easier. Here is a map of sexual violence against Syrian women that was built with that service, at https://womenundersiegesyria.crowdmap.com/
  8. http://geospaced.blogspot.co.uk/2012/07/world-wine-web.html David Smith put this together from data on about 400 wineries in the Napa Valley area. Not only can you scroll and zoom the map, but clicking on one of the winery markers will tell you its address and whether an appointment is required for tastings. He worked with Barry Rowlingson, who used OpenStreetMap and his own R package to build this map.
  9. http://www.inside-r.org/howto/quantifying-uncertainty-it-estimates Accurate estimates of IT work effort are critical for deciding where in technology a business should invest. Lacking experience with similar projects, the business is often at a loss for hard data. In this article, we describe how the power and convenience of R helped us with the elicitation task, in other words, with quantifying the uncertainty around IT project lifespans using probability distributions. We show how R's built-in functionality makes the elicitation task painless, while demonstrating how the methodology can be implemented in a user-friendly format. The power of R's probability toolbox allowed us to rapidly prototype an application that carried the basic concepts of elicitation into the IT project management space.
  10. http://www.inside-r.org/howto/towards-ideal-steel-plant-online-liquid-steel-temperature-prediction-using-r R seems a suitable means of providing accurate, understandable and automatable models for the desired temperature predictions. The R project has proved most useful for implementing the calculated results, as has the external control of its functionality in a process automation environment. The presented mathematical approach and the developed R code and framework program enable steel plant production engineers and technical staff to plan, carry out and adjust their tasks on the basis of highly stable and precise temperature preset values. Instead of adding offsets and thresholds to the assumed heat target temperatures (and with them extra processing time and energy at each processing step) just to be on the safe side and deliver the melt above the final casting temperature rather than below it, the new temperature prediction model will allow for the optimization of process stability, throughput and material quality in the steel plant, especially in ladle treatment.
  11. We are looking at a hospital autoclave, which is used for sterilizing instruments. This is just one type of industrial equipment that Axeda is working with other companies to rig with sensors and cellular connections. Each of these devices has an IP address and an Internet connection, so its use can be monitored remotely and its supply, maintenance and management can all be optimized without having to go and look at the machine itself. “Typically engineers would find logs through customer tickets and it would take months to find trends based on call center traffic.” You can collect data about uptime, need for repairs, machine run completion and detergent levels into a smartphone app that hospital employees can use.
  12. Startup Compass collects data from tens of thousands of startups around the world, then creates best practices, recommendations and benchmarks to help entrepreneurs make better product and business decisions. First, startups learn which key performance indicators (KPIs) actually matter. Most startups don’t even know which KPIs they should track or why they should track them. Second, they learn how their KPIs compare to other companies’ KPIs, so they will know if they’re on the right track. See, for example, their customer acquisition costs. The third thing they learn is what actions they need to be taking. “We help businesses take the next steps.”
  13. http://practicalanalytics.wordpress.com/2012/02/28/proctor-gamble-quadrupling-analytics-expertise/ This is Procter & Gamble’s Business Sphere big data situation room in its Cincinnati HQ. A big data analyst drives these large screens that display data visualizations on sales, market share, ad spending and the like, so everyone in the meeting is seeing the same information based on 4 billion daily transactions of P&G products. P&G isn’t after new data types; it still wants to share and analyze point-of-sale, inventory, ad spending, and shipment data. What’s new is the higher frequency and speed at which P&G gets that data, and the finer granularity. Even with all this gear, P&G has about two-thirds of the real-time data it needs.
  14. They are trying to address the reason behind the “why?”: was it a bad TV ad, out-of-stock shelves, or a competitor’s new product or price cut that caused a problem? Right now, the P&G IT team is working on automating analysis of the why, so employees get alerts when key events like a supply chain snafu or rival product launch happen. Their data visualizations can answer questions such as: Is a sales dip in detergent in France because of one retailer, so that’s where to focus? Is that retailer buying less only in France, or across Europe?
  15. http://www.readwriteweb.com/cloud/2012/02/strata-2012-3-essential-skills.php Diego Saenz of Data Driven CEO
  16. Jeff Jonas is a data scientist who now works for IBM. One of his jobs was designing the casino security systems in Las Vegas, where he currently lives. He worked for the surveillance intelligence group of several casinos and automated various manual processes, adding facial recognition software that was key to slowing down the MIT card counting group. "We built [another] system to immediately identify risk in real time so they could get these people out of the casino quickly." This software is still offered by IBM as its InfoSphere Identity Insight event processing and identity tracking technology.
  17. "If someone has three phone numbers - no big deal. On the other hand, if someone has five different dates of birth, that just doesn't seem quite right, does it? That would be confusing. Why is this important? Well, if you are looking to analytics to make important decisions, wouldn't you want to know during the decision making process if there was related confusion ... before [any] action is taken."
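The kind of "confusion" check described in this note can be sketched as follows. This is a minimal illustration, not IBM's product: the record layout, the field names (`person_id`, `phone`, `dob`), and the thresholds are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical identity records; field names and values are invented.
RECORDS = [
    {"person_id": "p1", "phone": "555-0001", "dob": "1970-01-01"},
    {"person_id": "p1", "phone": "555-0002", "dob": "1970-01-01"},
    {"person_id": "p1", "phone": "555-0003", "dob": "1970-01-01"},
    {"person_id": "p2", "phone": "555-0100", "dob": "1980-05-05"},
    {"person_id": "p2", "phone": "555-0100", "dob": "1982-07-09"},
]

def conflicting_values(records, field, max_distinct=1):
    """Return people who have more than max_distinct different values for field."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["person_id"]].add(rec[field])
    return {
        pid: sorted(values)
        for pid, values in seen.items()
        if len(values) > max_distinct
    }

# Several phone numbers are tolerated (assumed limit of 3);
# more than one date of birth is flagged.
phone_flags = conflicting_values(RECORDS, "phone", max_distinct=3)
dob_flags = conflicting_values(RECORDS, "dob", max_distinct=1)
```

Here the person with three phone numbers is not flagged, while the person with two dates of birth is, so the conflict is visible before any decision is made on that record.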
  18. http://www.readwriteweb.com/enterprise/2011/09/measuring-the-lifespan-of-shar.php Hilary Mason found that shortened links posted to Twitter have a mean half-life of 2.8 hours. Facebook boosts that to 3.2 hours, and direct sharing has a half-life of 3.4 hours. YouTube, however, beats them all hands down with a half-life of 7.4 hours. In other words, you might get a slight edge by posting to Facebook versus Twitter (if you don't do both), but the content matters most. Good (or controversial) stuff rises to the top and has a longer life. Uninteresting stuff sinks quickly.
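To make the half-life figures concrete, here is a rough sketch of how a link's half-life could be computed from hourly click counts. The definition used (the time after the peak hour by which half of the clicks from the peak onward have arrived) is an assumption for illustration, as are the function name and the sample numbers; it is not necessarily the exact method behind the figures quoted here.

```python
def half_life_hours(clicks_per_hour):
    """Whole hours after the peak hour by which half of the clicks
    counted from the peak onward have arrived.

    Assumes one click count per hourly bucket, in time order.
    """
    if not clicks_per_hour:
        raise ValueError("need at least one hourly click count")
    peak = clicks_per_hour.index(max(clicks_per_hour))
    tail = clicks_per_hour[peak:]  # the peak hour and everything after it
    half = sum(tail) / 2
    running = 0
    for hours_after_peak, count in enumerate(tail):
        running += count
        if running >= half:
            return hours_after_peak
    return len(tail) - 1

# Invented example: the link peaks in its second hour and then decays.
EXAMPLE_CLICKS = [5, 40, 30, 20, 6, 4]
EXAMPLE_HALF_LIFE = half_life_hours(EXAMPLE_CLICKS)
```

Hourly buckets keep the sketch short but give only whole-hour answers; finer-grained timestamps would be needed to arrive at fractional values such as 2.8 hours.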
  19. You need to start thinking about how to make your data sets smaller. "Big Data usually refers to a data set that is too big to fit into your available memory, or too big to store on your own hard drive, or too big to fit into an Excel spreadsheet," says Mason. This is the "scrub" step. The smaller the data set, the easier it is to manipulate.
  20. Mason and others have mentioned the now-iconic Enron email archive, which has since passed into the public domain. A number of big data researchers use it to test their email algorithms, and it is available from a number of online academic websites.
  21. http://strataconf.com/strata2012/public/schedule/detail/22449 Jesper Andersen gave this talk at Strata earlier this year. He showed how to integrate basic public data from the city, street and mapping data from Open Street Maps, real estate and rental listings data, and data from social services like Foursquare, Yelp and Instagram, and how to analyze photographs of streets from mapping services, to create a holistic view of a very famous street in San Francisco, Haight Street. Surprisingly, you'll find a lot of Swedish folks on the upper half of Haight Street. Not surprisingly for San Francisco, many people on Haight speak Spanish or Japanese. Tweet stream analysis found more negative sentiment on the lower part of the street, which corresponds with higher crime stats.
  22. The Associated Press has launched a content analysis tool that is used to search the millions of articles in their archives to create custom archive products for their customers. Users can query for particular keywords, and the AP can use the search query traffic to see trending topics and deliver article collections to particular B2B customers. For example, they could create references on a particular subject or moment in time. The project makes use of a solution from MarkLogic. From "AP Creates New Big Data Approach to its Article Archive" (David Strom, March 19th, 2012): If you are looking for large content repositories, you probably can't get much larger than the article archive of the Associated Press. MarkLogic is a major Big Data enabler that is used by many different kinds of publishers, such as Lexis/Nexis, for this type of purpose. We have written about prior efforts by the AP to help modernize their archives, such as a project to provide non-profits with free information feeds. The AP didn't start out by using the MarkLogic solution, but tried to implement a more traditional relational database structure, only to run into problems. Their archives are in XML, which made it difficult to design the right kind of data structures. Plus, they didn't have a consistent metadata collection across the archives. The MarkLogic implementation took 16 weeks from start to finish and was the first time that the AP had made use of their services.
It enables them to run complex Boolean searches across millions of articles in their content archive and get back precise results in seconds or minutes instead of days or weeks. This much quicker response time is already transforming their B2B product offerings and helping them manage searches of unstructured content in near real time.
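As a toy illustration of what a Boolean search over an article archive involves, the sketch below builds a small inverted index and answers AND, OR, and AND NOT queries with set operations. It is an assumption-laden example in plain Python: the three headlines are invented, and it says nothing about how MarkLogic or the AP's system is actually implemented.

```python
from collections import defaultdict

# Hypothetical mini-archive; the headlines are invented.
ARTICLES = {
    1: "Senate passes budget bill",
    2: "Budget talks stall in Senate",
    3: "Olympic games open in London",
}

def build_index(articles):
    """Map each lowercased word to the set of article ids containing it."""
    index = defaultdict(set)
    for doc_id, text in articles.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_all(index, *terms):
    """Boolean AND: articles containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

def search_any(index, *terms):
    """Boolean OR: articles containing at least one term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.union(*sets) if sets else set()

INDEX = build_index(ARTICLES)
both = search_all(INDEX, "senate", "budget")    # senate AND budget
either = search_any(INDEX, "olympic", "bill")   # olympic OR bill
not_stalled = both - search_any(INDEX, "stall")  # (senate AND budget) AND NOT stall
```

Because each query touches only the precomputed word sets rather than rescanning every article, lookups of this kind stay fast as the archive grows, which is the general idea behind getting answers in seconds instead of days.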
  23. http://www.readwriteweb.com/hack/2012/02/data-scraping-comes-of-age-wit.php The company is called ScraperWiki.com and was started by Julian Todd and Aidan McGuire, two U.K.-based analysts who have long been involved in opening up government data to the public.
  24. This shows data mined from UN peacekeeping troop levels, as one example of what you can do with the ScraperWiki site. They have lots of public data sets that anyone can analyze, and they try to help journalists publish the information.
  25. Appistry is used for FedEx's logistics apps, for Sprint's fraud detection services, and at defense contractor Northrop Grumman. San Francisco-based Presidio Health used a variety of products to boost its cloud performance. "Presidio had to handle a 16 times increase in data volume in a year and replace some aging hardware," says its CTO Thomas Gregory. It was able to increase its computing power by 70% without increasing the costs of its IT equipment. "We didn't want a lot of capital expense, and we wanted an environment that was safe and could spread our risk around." The company uses a combination of Eclipse and Spring-based open source software and Appistry for handling its cloud services management. "Appistry has integration with Spring, it was easy to use and saved us months of effort to move our software into this environment," he said. "Plus we don't have to expose any of our services externally."
  26. http://blog.okcupid.com/index.php/gay-sex-vs-straight-sex/ OK, on to sex. The dating site OkCupid looked through more than 4 million matches that it has made to find patterns in gay and straight sexual preferences. The median number of sexual partners for both men and women is six, exploding the myth that gays are more promiscuous.
  27. Here are straight people who either have had or would like to have a same-sex experience in the continental U.S. and lower Canada. You can see some sharp geographic divides. Awesomely, the mountain West lives up to its Brokeback reputation, and Canada is orange nearly coast-to-coast. Even in the yellow and blue areas, you can see pockets of gay curiosity in interesting places: Austin, Madison, Asheville. Anywhere soy milk is served, basically. This is based on millions of responses. On average, active users have answered about 3000 questions, hidden the profiles of several thousand users they aren't interested in, and voted for about 4000 profiles.
  28. When OkCupid asked its members factual questions, this is how the answers sorted out by sexual preference and gender. We always knew that women were smarter.
  29. Kaggle routinely hosts various big data contests, and this one, which concluded last month, was a way for Facebook to evaluate prospective employees. More than 400 people submitted entries.
  30. http://www.theatlantic.com/technology/archive/2012/05/the-perfect-milk-machine-how-big-data-transformed-the-dairy-industry/256423/ Still think big data is a lot of bull? Well, not according to the USDA. While there are 8 million Holstein dairy cows in the United States, there is exactly one bull that has been scientifically calculated to be the very best in the land. He goes by the name of Badger-Bluff Fanny Freddie, and he has 346 daughters who are on the books already. Their equations predicted from his DNA that he would be the best bull. A USDA research geneticist reviewed pedigree records and looked at things such as milk production and fat and protein content to optimize the breed. To give you an idea of how this industry has changed, in 1942 the average dairy cow produced less than 5,000 pounds of milk in its lifetime. Now, the average cow produces over 21,000 pounds of milk.