AWS and Mechanical Turk for the automotive industry. Contains AWS automotive case studies, AWS overview, Mechanical Turk use case and application examples in automotive industry, Mechanical Turk background.
2. It’s about the Customer
AWS Customers & Use Cases
Innovation
Amazon Mechanical Turk
Flexible Workforce
AWS Overview
Overview, Solution of Choice, Workloads
Daniel Gray
Amazon Mechanical Turk
Principal, Business Development
Seattle, WA
4. Volkswagen
Case Study #1: Volkswagen – real-time supply chain analytics
Design
Chain
Supply
Chain
Upstream
(Supplier, Design Partners)
Downstream
(Distributors, Dealers,
Leasing/Rental Companies,
Customers)
Real-time
Supply Chain
Analytics
Connected Car
Mobile Apps
Dynamic Web
Applications
Rendering
Value Proposition
• Scalability
• Agility
• Automation
• Global Coverage
Engineering
Workplace
PLM
5. Case Study #1: Volkswagen - cont’d
• Input batches
• Processing
temperature,
humidity, pressure
• Dimensions
• Weight
• Strength
• Date of shipment
• Carrier
• Shipping service
• Batch ID
Commodity
• Input batches
• Processing
temperature,
humidity, pressure
• Dimensions
• Weight
• Strength
• Date of shipment
• Carrier
• Shipping service
• Serial no.
Part
• Input serial no.
• Processing
temperature,
humidity, pressure
• Dimensions
• Weight
• Date of shipment
• Carrier
• Shipping service
• Serial no.
Module
• Model
• Configuration
• Input serial no.
• Processing
temperature,
humidity, pressure
• Dimensions
• Weight
• Destination
• Dealer, customer
• Chassis no.
Car
• Tracking ID
• Label
• Pick-up station
• Delivery station
• Current location
• Environmental
data
• Tracking ID
• Label
• Pick-up station
• Delivery station
• Current location
• Environmental
data
• Tracking ID
• Label
• Pick-up station
• Delivery station
• Current location
• Environmental
data
AWS Kinesis
6. Case Study #1: cont’d
Amazon Web Services
AZ AZ AZ
Durable, highly consistent storage replicates data
across three data centers (availability zones)
Aggregate and
archive to S3
Millions of
sources producing
100s of terabytes
per hour
Front
End
Authentication
Authorization
Ordered stream
of events supports
multiple readers
Real-time
dashboards
and alarms
Machine learning
algorithms or
sliding window
analytics
Aggregate analysis
in Hadoop or a
data warehouse
Inexpensive: $0.028 per million puts
Worldwide realtime Data Ingestion Service
7. Case Study #2: Siemens – PLM & Teamcenter
http://www.plm.automation.siemens.com/en_us/products/teamcenter/cloud-plm.shtml
• Hybrid or Full Cloud
• Training, Dev &
Test, HA Production,
Disaster Recovery
8. Case Study #3: Lamborghini - Dynamic Web Applications
Reduced
infrastructure
costs by 50%
Reduced time to
market to near Zero
9. Mechanical Turk is a
marketplace to
access a workforce
of over 500K
Workers in 190
countries.
Flexible Workforce
10. Crowdsourcing is
this decade’s cloud
computing.
Wired Magazine
Crowdsourcing.org 2011 study of crowdsourcing by vertical
11. Automotive use cases on AWS Mechanical Turk
Market Research
• Data collection
• Survey
• Web research
• Sentiment analysis
• Data training &
labeling
• User studies
• List assembly
Data Cleansing
• Verification
• Deduping
• Categorization
• Merging
• Moderation
• Normalization
Document Processing
• Structured, semi,
unstructured
• Transcription &
extraction
• Validation/ verification
• Data entry
• OCR text correction
Automotive parts supplier
• Data enrichment and normalization of
product ASINs
• Leveraged Mechanical Turk’s scalable
Workforce to scale in-line with
weekly/monthly workloads
One of world's largest automotive
manufacturers
• Collects training data for machine learning
algorithms.
• Used Mechanical Turk to recruit Workers
in order to categorize data.
• Adding Mechanical Turk to the R&D
toolkit for unstructured data and text
analytics, sentiment analysis, and data
mining.
Leading foreign automotive manufacturer
• Survey, market research and data collection
• Using Mechanical Turk to construct qualified
panels of Workers, collected market
feedback to better guide product
development in both design and marketing
Example applications
17. Amazon
S3
Amazon
Mechanical Turk
Requester
client server
(1 Application
Master, 14
Application
instances)
Mobile Client
Mobile Client
MySQL DB
Instance
MySQL DB
Instance
Amazon
Mechanical Turk
server (1
Application Master,
8 Application
Instances)
RabbitMQ
WorkersAssignments
Human
Intelligence
Tasks (HIT) Requester
20. 2. Pace of Innovation
New Service
Announcements & Updates
2008: 24
2009: 48
2010: 61
2011: 82
2012: 159
2013: 280
3. Global Infrastructure
10 Regions
25 Availability Zones
51 Edge Locations
“AWS is the overwhelming
market share leader, with
more than five times
the compute capacity
in use than the aggregate
total of the other fourteen
providers.”
4. Capacity
Increased agility has become the #1 reason businesses choose the AWS cloud…
1. Workload Support
AWS Global Infrastructure
Application Services
Networking
Deployment & Administration
Database
Storage
Compute
21. Lower Costs with AWS Up-Front and Increase Savings as Your
Usage Grows
Source: IDC Whitepaper, sponsored by
Amazon, “The Business Value of Amazon
Web Services Accelerates Over Time.”
July 2012
1
“Average of 400 servers
replaced per customer”
Replace up-front capital
expense with low
variable cost
2
38 Price
Reductions
Economies of scale allow
us to continually lower
costs
3
Pricing model choice to
support variable &
stable workloads
4
Save more money as
you grow bigger
On-demand
Reserved
Spot
Tiered Pricing
Volume Discounts
Custom Pricing
22. Enterprises Can’t Afford to be Slow
Add New Dev Environment
Add New Prod Environment
Add New Environment in Japan
Add 1,000 Servers
Remove 1,000 Servers
Deploy 1 PB Data Warehouse
Shut down 1 PB Data Warehouse
AWS:
Infrastructure in Minutes
Old World:
Infrastructure in Weeks
Everything changes with this kind of agility
23. A culture of Innovation: Experiment Often & Fail without Risk
On-Premises
Experiment Infrequently
Failure is expensive
Less Innovation
Experiment Often
Fail quickly at a low cost
More Innovation
$ Millions
Nearly $0
24. Many Enterprises Worry That These are the Only Two Choices
Build a
“Private”
Cloud
Rip everything out
and move to
AWS
#1 #2
25. The Good News is that Cloud isn’t an ‘All or Nothing’ Choice
Corporate
Data Centers
On-Premises
Resources
Cloud
Resources
Integration
26. Active Directory
Network Configuration
Encryption
Backup Appliances
Your On-Premises
Apps
Corporate
Data Centers
Users & Access Rules (IAM)
Your Private Network (VPC)
Encryption (S3, RDS, HSM)
Backups (Storage Gateway)
Your Cloud Apps
AWS Direct Connect
Integrating AWS with your existing On-Premises Infrastructure
27. Tools to help customers manage resources across environments
Single Pane of Glass
Management Tool Partners
28. Engage with us if…
1. You’re processing human-intensive workloads that can be micro-tasked…
2. You’re depending on internal FTEs and/or outsource/offshore vendors…
3. Your workloads are on/off, fast growth, predictable/variable peaks, backlog…
4. You’re considering outsourcing part or all of your workloads requiring human
touch (i.e. judgment)…
Editor's Notes
My name is Daniel Gray, Principal Sales with Amazon Mechanical Turk, based in Seattle, WA. During the next 30-45 minutes, I’d like to illustrate how financial services Requesters are shifting their human-intensive use cases, such as data collection, language/translation processing, unstructured data processing, and many others, to Mechanical Turk’s flexible Workforce.
INNOVATE: Requesters have multiple use cases that can be broken down into repeatable tasks for the Workforce to perform, rather than taxing internal resources or even vendors with spiky workloads. Mechanical Turk is ideal as an on-demand utility for ‘human judgment.’
Today’s session will cover 3 areas:
1. The Customer: how traditional work models are unable to efficiently process the volume, velocity and variety of data needs that organizations have today.
2. Mechanical Turk: The marketplace to access a flexible workforce of over 500K Workers across 190 countries. Requesters can access the Workforce directly or indirectly, via the Web UI and/or the API.
3. AWS Overview: Why AWS is selected as the Solution of Choice.
AWS is solving problems for big organizations across many verticals and geographies. We’re extremely proud of our customer list and happy to know that we’re providing good outcomes and better results for some of the best firms in the world.
The AWS Premier Consulting Partner designation highlights the top APN Consulting Partners globally that have distinguished themselves by investing significantly in their AWS practice, growing their AWS business, providing exceptional customer service and helping a large number of customers run their applications on AWS. We have announced 22 consulting partners as 2014 premier partners.
You might have questions about security in the cloud, but our biggest and most conservative customers have found that we’re able to meet their security requirements, and often we can provide a better security profile than what they can deliver internally. The AWS cloud infrastructure has been designed and managed in alignment with regulations, standards, and best-practices including HIPAA and ISO 27001.
Recently we announced AWS CloudTrail, a service that records API calls made on your account and delivers log files to your Amazon S3 bucket. CloudTrail provides increased visibility into AWS user activity that occurs within an AWS account and allows you to track changes that were made to AWS resources. This allows enterprises to run comprehensive security analysis and better manage their governance and compliance efforts.
MECHANICAL TURK HISTORY
Amazon had millions of Web pages that described individual products, but it needed to weed out duplicate pages.
Software could help, but algorithmically eliminating all the duplicates was impossible.
Thus Mechanical Turk was born: a Web site where people would look at product pages and be paid a few cents for every duplicate page they correctly identified.
Mr. Bezos figured that what had been useful to Amazon would be valuable to other businesses, too. In November 2005, Amazon made Mechanical Turk’s API public.
Mechanical Turk Overview
1. Crowdsourcing, and specifically Mechanical Turk, gives businesses access to on-demand, scalable resources to solve their business problems. Requesters are typically seeking to achieve the following business goals via Mechanical Turk:
- Reduce cost, transform fixed costs to variable expense
- Improve scalability or elasticity (i.e. bursting up & down) in line with workload type (i.e. on-demand)
- Accelerate time-to-market
- Improve quality / accuracy
- Increase revenue
2. Workforce: The ability to tap into a workforce of over 500,000 people around the world can enable you to move faster.
3. Fixed costs: By shifting from a “fixed cost” model to an “on-demand” model, companies outsource work without making long-term commitments, and they can iterate faster. Requesters are typically in one or both of the following situations:
Seeking to supplement, or shift from, using FTEs, staff augmentation, or traditional outsourcing (i.e. on-the-bench) to perform routine, human-intensive tasks.
Seeking to train/test their algorithms by quickly developing large sets of data.
4. Iterate faster: Especially in an agile product development environment, if you can iterate faster and fail cheaply, you can innovate faster and focus on your core business rather than on operational challenges like data cleansing.
A. Empowering developers to build against your platform doesn’t just create value for partners
B. it expands the ecosystem, increases retention, and drives up the value of the platform.
C. Most importantly, end customers win…when all their products work seamlessly together.
Let’s sync on the definition of ‘crowdsourcing’ because it can actually mean different things. The umbrella of crowdsourcing breaks out into 4 distinct groups:
Microtasking: Mechanical Turk is optimized for microtasking; here, you’re breaking down a larger project into atomic level tasks in order for an army of Workers to perform at scale. Instead of paying Workers by time, you’re shifting the paradigm to paying Workers by result.
Macrotasking:
Crowdfunding:
Contests:
Crowdsourcing.org did a study in 2011 that broke down the use of ‘crowdsourcing’ by vertical; since then, we’ve only seen an increase in the number of companies across all verticals (some verticals more quickly than others) using Mechanical Turk for a variety of use cases.
Mechanical Turk: Story continues
We already talked about how Amazon needed to solve the ‘data cleansing’ problem by finding and eliminating duplicates.
Using Amazon.com pages for illustration purposes, let me share 2 more typical use case examples
Think of Mechanical Turk use cases in 2 primary buckets:
Data collection: In data collection, the crowd collects data for you, in the form of cleansing, aggregating, moderating, categorizing, transcribing, rating, authoring, surveys/research, attributing or tagging.
Data training: In data training, the crowd helps you quickly develop large sets of training data to help train your algorithms
4. [CLICK] This shows an example of attribution, where missing product data is added by Workers, improving searchability.
…and here’s another example use case of categorization, where Workers identify relevant product results to improve search relevancy.
Now I’ve shared some examples of data problems that Amazon needs to tackle at scale (deduplication, attribution, categorization). What are the Work Model options to do this?
Insourcing: Is a work model that uses in-house staff to perform work. Fixed resources impact cost, speed and scalability, while ideally achieving the greatest savings in quality. [CLICK]
Outsourcing: Is a work model that contracts a service provider to perform work, using workers ‘on-the-bench’. Savings are improved, but efficiencies are still not as optimized as possible. [CLICK]
Crowdsourcing: Is a distributed work model that breaks work down to the most efficient task-level, and accesses Workers directly, on-demand. Accuracy or quality is the biggest misunderstanding about Crowdsourcing…
Is NOT an Open Call: This means building and managing a qualified and screened community of individuals to complete the work, NOT launching work into the unknown.
Is Secure and Reliable: because it’s working on a successful infrastructure which includes Worker quality, workflows, and best practices.
Is Scalable: Whether you have workload types that fit the i) on/off, ii) fast growth, iii) variable peaks, or iv) predictable peaks, the crowd can burst up/scale down on-demand, and on-task.
Is not the silver bullet…for all projects or tasks. Some projects are better kept in-house and/or outsourced. But more and more enterprises are shifting human-intensive, routine work to Mechanical Turk…and leveraging the Human API call.
Cost of Ownership
A Requester’s cost of ownership in the marketplace comprises i) Worker fees and ii) the Amazon fee. Amazon Mechanical Turk collects a 10% commission on top of the reward amount you set for Workers. For example, if a HIT reward is set to $0.20, Amazon Mechanical Turk collects $0.02 for each assignment. The minimum commission charged is $0.005 per assignment. When you grant a bonus, Amazon Mechanical Turk collects 10% of the bonus amount, or a minimum of $0.005 per bonus payment. If you choose to send HITs exclusively to Photo Moderation or Categorization Masters, an additional 20% fee applies.
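As a rough illustration of the fee arithmetic in these notes, here is a minimal Python sketch. The function names are hypothetical, and treating the 20% Masters fee as a percentage of the reward is an assumption; the notes do not say what that fee is applied to.

```python
from decimal import Decimal

COMMISSION_RATE = Decimal("0.10")    # 10% commission on a reward or bonus
MIN_COMMISSION = Decimal("0.005")    # minimum commission per assignment or bonus
MASTERS_RATE = Decimal("0.20")       # additional fee for Masters-only HITs


def commission(amount):
    """Commission on a reward or bonus: 10%, but never less than $0.005."""
    return max(Decimal(amount) * COMMISSION_RATE, MIN_COMMISSION)


def assignment_cost(reward, masters=False):
    """Total Requester cost for one approved assignment."""
    reward = Decimal(reward)
    fee = commission(reward)
    if masters:
        # Assumption: the extra 20% is charged on the reward amount.
        fee += reward * MASTERS_RATE
    return reward + fee


def project_cost(reward, tasks, assignments_per_task, masters=False):
    """Cost of a batch where every assignment is approved."""
    return assignment_cost(reward, masters) * tasks * assignments_per_task
```

With a $0.20 reward, `assignment_cost("0.20")` comes to $0.22 per assignment, which matches the $0.02 commission in the example above. Decimal arithmetic is used so that fractions of a cent are not lost to floating-point rounding.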
Mechanical Turk is the marketplace that gives you PROGRAMMATIC access to a cost-effective, scalable, global workforce of over 500K Workers in 190 countries.
Like AWS’ ‘cloud’ offerings, which provide access to scalable computing power, Mechanical Turk provides access to scalable human power / or human judgment. In other words, there are tasks humans can do, better, that computing technology cannot do alone, so think of Mechanical Turk as the ‘human API’ call.
[CLICK]
Mechanical Turk’s Partner network is comprised of Consulting Partners, and Technology Partners, intended to ease access and usage of the Mechanical Turk marketplace. As part of the Partner network, Ed from Top Image Systems will profile their technology, its value to you, and how it integrates Mechanical Turk.
Let’s highlight 3 Partners, and their value-add proposition to the Requester.
[CLICK]
[CLICK]
[CLICK]
Let’s take a closer look at how Mechanical Turk works at a high-level, and then illustrate how some Requesters structure their Mechanical Turk workflows for success.
Begin with a project…and define the goals & key components of your project. For example, your goal might be to clean your business listing database so that you have accurate information for consumers. The sub-components of your project might be to categorize the businesses by listing type (i.e., restaurant or service) and verify that the related address and phone number are current.
Break it into tasks and design your HIT…so many Workers can work in parallel and faster. For example, if you have 1,000 listings to verify, each listing could be an individual task. Next, design your Human Intelligence Tasks (HITs) by writing crisp and clear instructions, identifying the specific outputs and inputs desired and how much you will pay to have work completed. Calculating reward is a function of defining a competitive effective hourly rate, prorating based on task completion time, competitive marketplace rates, and throttling your cost/accuracy/productivity levers relative to your target performance metrics.
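The proration step of the reward calculation can be sketched as follows. This is a minimal sketch that covers only the "effective hourly rate prorated by task completion time" part; marketplace rates and the cost/accuracy/productivity levers are not modeled. The function name and the sample figures are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def reward_per_assignment(target_hourly_rate, seconds_per_task):
    """Prorate a target effective hourly rate down to one task, rounded to the cent."""
    rate = Decimal(str(target_hourly_rate))
    seconds = Decimal(str(seconds_per_task))
    reward = rate * seconds / Decimal(3600)
    return reward.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

For instance, a hypothetical target of $6.00 per hour on a task that takes about 30 seconds works out to a $0.05 reward. Timing a small pilot batch first gives a more reliable estimate of the completion time than guessing it.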
[CLICK]
Publish HITs to the marketplace…hundreds, thousands, even millions at a time. For example, each HIT can have multiple assignments so that different Workers can provide answers to the same set of questions and you can compare the results to form an agreed-upon answer.
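Comparing several Workers' answers to the same HIT to form an agreed-upon answer can be sketched with a simple plurality vote. This is one possible approach, not a description of how Mechanical Turk itself adjudicates; the function name, the normalization step, and the `min_agreement` threshold are assumptions.

```python
from collections import Counter


def plurality_answer(answers, min_agreement=2):
    """Return the most common answer, or None when there is no clear winner."""
    if not answers:
        return None
    # Normalize case and surrounding whitespace so trivial differences
    # between Workers' submissions do not split the vote.
    counts = Counter(a.strip().lower() for a in answers)
    ranked = counts.most_common(2)
    top_answer, top_votes = ranked[0]
    tied = len(ranked) > 1 and ranked[1][1] == top_votes
    if tied or top_votes < min_agreement:
        return None  # no agreement: send to another Worker or to manual review
    return top_answer
```

Using an odd number of assignments per HIT, such as three, reduces the chance of a tie on tasks with two possible answers.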
Workers accept assignments…for special skills, you can Qualify the Workforce. For example, if Workers need special skills, specific geography, or specific marketplace rating to complete your tasks, you can require that they pass a Qualification test before they are allowed to work on your HITs.
[CLICK]
Workers submit assignments for review. When a Worker completes your HIT, he or she submits an assignment for you to review.
Approve or reject assignments…you pay only for approved work. When your work items have been completed, you can review the results and approve or reject them.
Complete your project…Congratulations; your project has been completed and your Workers paid!
Going directly to the crowd….
…requires companies to define their tasks more precisely….so that anyone who reads their instructions can successfully complete the task.
There’s no one “right” way to structure your work. However, approaching Mechanical Turk is similar to my story – do you jump right into PPT and start creating slides…or do you storyboard first? Establishing the blueprints for your architecture, & workflows, and learning market dynamics & best practices increases the probability of achieving accuracy, cost-efficiency and productivity at scale.
Because Mechanical Turk is on-demand…it is easy to spin up a project, measure it, and optimize based on the results.
Here’s an example of accessing Mechanical Turk via a Partner. I want to highlight a couple things here:
Multiple AWS technologies working together (S3 and Mechanical Turk)
Best practices, including a defined adjudication strategy (qualification, quality control, plurality, known answers), market dynamics, HIT ergonomics
Workflow, designed and tested prior to scaling work.
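The "known answers" element of an adjudication strategy can be illustrated with a short sketch: a few tasks with answers the Requester already knows are mixed into the work, and each Worker is scored on those tasks only. The data layout and function name below are hypothetical.

```python
def accuracy_on_known_answers(submissions, known_answers):
    """Score each Worker on the tasks whose correct answer is already known.

    submissions:   {worker_id: {task_id: answer}}
    known_answers: {task_id: correct_answer}
    Workers who saw no known-answer tasks are left out of the result.
    """
    scores = {}
    for worker_id, answers in submissions.items():
        graded = [task_id for task_id in answers if task_id in known_answers]
        if not graded:
            continue
        correct = sum(answers[t] == known_answers[t] for t in graded)
        scores[worker_id] = correct / len(graded)
    return scores
```

A Requester might then accept work only from Workers above a chosen accuracy threshold, which ties this check back to qualification and quality control.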
This is a simple view of the set of services that we offer. At the core is the compute, storage and data services that are the heart of our offering. We then surround these offerings with a range of supporting components like management tools, networking services and application services. All these capabilities are hosted within our global data center footprint that allows you to consume services without having to build out your own facilities or procure hardware equipment.
This view shows the number of new services and features launched since our inception. In 2010, we launched 61 significant services and features, in 2012 it was 159, and this year alone, we have launched 245 services and features. The pace of innovation is accelerating at AWS.
Our data center footprint is global, spanning 5 continents with highly redundant clusters of data centers in each region. Our footprint is expanding continuously as we increase capacity, redundancy and add locations to meet the needs of our customers around the world.
AWS has been named a leader in the Gartner MQ for Cloud IaaS for the third year in a row. Not only that, Gartner notes that AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers.
Cost is the conversation starter when it comes to cloud. There are many pieces to the cost conversation when it comes to AWS and your own infrastructure. The first advantage you get in the cloud is that you don’t have to lay out capital expense for hardware and infrastructure before you know the demand. In essence, you convert your capital expense into variable expense. That variable expense on AWS is lower than what most companies can achieve on their own because AWS runs at massive scale, and we pass the benefits of that scale to our customers in the form of lower pricing. There are multiple pricing models in AWS, so you can optimize your spend depending on your workload requirements. And the more you use AWS, the lower your costs are. We have tiered pricing and, for customers doing large data center migrations, we have negotiated custom pricing to make their transitions cost-effective.
Enterprises cannot afford to be slow, but if you ask an enterprise leader how long it takes to get a server for running a workload, the typical time frame is 10 to 18 weeks. In the cloud you can spin up thousands of servers in minutes and experiment quickly. If the experiment doesn’t work out, you can spin down those instances and stop paying for them.
This is a big difference from the old world. In the cloud, you can instantly spin up and down clusters, petabyte-size data warehouses, and new production or dev environments. Everything changes with this kind of agility.
We see our customers do amazing things when they reduce the cost of experimentation. It moves IT from being a roadblock, where each idea costs lots of money and takes lots of time, to being an enabler, where you can launch a speculative project quickly and cheaply. It allows firms to take more chances on ideas, and gives them a shot at winning big, as opposed to being scared to even try.
Many enterprises understand the value proposition of cloud, but worry that using a cloud or on-premises infrastructure is a binary choice. It is not. We understand that enterprises have a number of on-premises data centers that they are not ready to retire yet; what they really want is the ability to use their on-premises data centers easily with AWS.
We have spent the last couple of years making this integration simpler and easier, and this is an area where we’ll be spending significant resources in the future.
We have launched several features to support this vision of integrating your on-premises infrastructure with AWS. For identity federation we have the ability to integrate with Active Directory and SAML. We have built a number of network capabilities, including Amazon Virtual Private Cloud that allows you to practically cordon off part of our network and deploy AWS resources into it. Many enterprises have deployed VPC as an extension of their existing data centers.
We also have AWS Direct Connect, which allows private connections between your data center and AWS. We continue to encrypt all our persistent data. We also have Storage Gateway, a virtual appliance that allows you to store your primary data in Amazon S3 and retain your frequently accessed data locally, or to store your primary data locally and asynchronously back up point-in-time snapshots of this data to Amazon S3.
We have also worked with a number of third-party providers to provide an easier view so that you can have a single pane of glass to manage your applications. This lets you view your deployments in on-premises and AWS environments in one view. We work with BMC and CA and others to make this easier for customers.
To summarize, AWS is a great fit for you if you’re building new applications, facing a technical refresh during the next year, or planning to add capacity for your growing workloads.