Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Cloudera Academic Partnership: Teaching Hadoop to the Next Generation of Data Professionals
1. 1
Mark Morrissey | Director of Education Programs
May 2013
Cloudera Academic Partnership
Teaching Hadoop to the Next Generation of Data Professionals
3. 3
The Big Data Problem Is Getting BiggerDataGrowth
END-USER
APPLICATIONS
THE INTERNET
MOBILE DEVICES
SOPHISTICATED
MACHINES
STRUCTURED DATA – 10%
COMPLEX DATA – 90%
1980 TODAY
4. 4
THE HADOOP WAYTHE OLD WAY
Single data platform for
storage, BI, analytics, and
app serving
Multiple platforms
for multiple workloads
SIMPLE, UNIFIED, EFFICIENT
• Data stored on scalable, low-cost platform
• Hadoop handles a variety of workloads
• Perform end-to-end workflows on single system
• Provides broad data access across departments
COMPLEX, FRAGMENTED, COSTLY
• Data siloed by department or line of business
• Mostly stored in expensive specialized systems
• Analysts only retrieve data into EDW when needed
• Nobody in the organization has a complete view
5. 5
Public Sector & Government
The intelligence community securely manages, audits, and analyzes massive video data sets to fight terrorism
Hadoop Powers Big Data Breakthroughs
Healthcare & Biotech
Scalably index and cross-query patient, pharma, and sensor data to diagnose many illnesses prior to onset
Agriculture & Sustainability
Scientists and farmers around the world collaborate on seed genome experiments to improve crop yield
Technology & E-Commerce
Take a 360o view of online behaviors to customize interactions, including real-time recommendations
Financial Services & Insurance
Banks ingest, transform, and scrutinize data from multiple silos to detect/prevent fraud and mitigate risk
7. 7
32,000trained professionals by 2015
Rising demand for Big Data
and analytics experts but a
DEFICIENCY OF TALENT
will result in a shortfall of
Source: Accenture “Analytics in Action,“ March 2013.
8. 25%
$115K
8
Big Data Talent Shortage: Build or Buy?
An engineer with Hadoop skills requires a min. salary premium of
Hadoop developers are now the top paid in tech, starting at
Sources: Business Insider, “10 Tech Skills That Will Instantly Net You A $100,000+ Salary,” 11 August 2012.
Business Insider, “30 Tech Skills That Will Instantly Net You A $100,000+ Salary,” 21 February 2013.
GigaOm, “Big Data Skills Bring Big Dough,” 17 February 2012.
$300K
Compensation for a very senior Data Scientist opens at
10. 10
1 Dedicated CAP Courseware
Focus on real-world developer and admin roles
2
3
Discounted Technical Training
Lowest price available for live Hadoop classes
6 Ongoing Learning
Video tutorials and e-learning complement training
7 Most Relevant Platform & User Base
CDH deployed more than all other distributions combined
8 Align with the Leading Name in Hadoop
Cloudera widely recognized as a major Big Data innovator
Leader in Certification
Over 5,000 accredited Cloudera professionals
4 State-of-the-Art Curriculum
Courses updated regularly as Hadoop evolves 9 Global Community of Hadoop Experts
Access to Cloudera’s private forums on tech and curriculum
Why Partner with Cloudera University?
5 Depth of Training Material
Hands-on labs and VMs support instruction10University Software License
Free access to enterprise-grade Cloudera Manager
11. 11
The Cloudera Academic Partnership
Colleges, Universities, Trade Schools
Technical Degree from Accredited Institution
Cloudera Certified Hadoop Professional Credential
Cloudera University Training and Materials
New Generation of Qualified Data Professionals
12. 12
Goals of the Partnership
• Train engineering students on Apache Hadoop to future-proof their
core technical education and address the Big Data talent gap
• Provide high-quality curricula, instruction, and courseware to
universities who want to improve graduate competitiveness
• Certify students with Hadoop professional credentials to
complement their degrees and increase hireability
• Supply research departments state-of-the-art technology
to promote the next wave of breakthroughs in Big Data
• Offer colleges and universities options for data
management and analytics degree program requirements
13. 13
We were able to jumpstart an Introduction to Big
Data Analytics course thanks to the support of
Cloudera. The materials provided, including the
lab setup, are integral to the class.
14. 14
Partnership Roles and Responsibilities
• Program management
• Authorized course materials
• Cloudera Manager license
• Discounted teacher training
• Discounted certification exams
Cloudera
• Authorized faculty instructor
• Classroom facilities and labs
• Recruitment and registration
• Course materials distribution
• High-quality learning experience
Academic Partners
15. 15
Hadoop Course Overview
Duration
One semester: 12 three-hour sessions
Prerequisites
Java programming course or experience
Learning Objectives
• Hadoop technology and ecosystem overview
• How HDFS and MapReduce work
• Develop and unit test MapReduce applications
• MapReduce combiners, partitioners, and the distributed cache
• MapReduce algorithms, debugging, and best practices
• Integrate Hadoop into the data center
• Hardware configurations and network considerations for the Hadoop cluster
• Optimize Hadoop settings for cluster performance
• FairScheduler and multiple-user SLAs
• Hadoop cluster maintenance and management
16. 16
My goal is to cut up to 12 months out of the
training that graduates need when they get
hired. But there is not great clarity in the
academic world about what Big Data is.
Cloudera has helped tremendously with that.
18. 18
CDH (Cloudera’s Distribution including Apache Hadoop)
Store
Land structured and complex data in a scalable,
cost-effective repository
1
Process & Analyze
Transform data in parallel and query at the speed
of thought
2
Integrate
Interoperate with existing platforms, systems, and
applications: BI, ETL, RDBMS, analytics, visualization
3
19. 19
Cloudera Manager
Deploy
Install, configure, and start your cluster in three
simple steps
1
Configure & Optimize
Ensure optimal settings for all hosts and services2
Integrate
Find and fix problems quickly; view current and
historical activity and resource usage
3
20. 20
Legacy systems were preventing our labs from
mapping their genome sequences in a timely
manner. Our partnership with Cloudera will cut
the time required by scientists to deliver data
from weeks to days and, eventually, to hours.
21. 21
Everything You Need to Bring Hadoop to Campus
Curriculum
Materials
Lab
Resources
Virtual
Machines
CDH
Downloads
Software
License
Teacher
Training
22.
23. • Submit questions in the Q&A panel
• Watch on-demand video of this webinar
at http://cloudera.com
• Contact CAP at
academic_partnerships@cloudera.com
• Follow Cloudera University @ClouderaU
• Thank you for attending!
Learn more about the Cloudera
Academic Partnership and join at
http://cloudera.com/cap
Register for Cloudera training at
http://university.cloudera.com
Use discount code CAP_10 to save 10%
on new enrollments in Cloudera-
delivered training classes until July 1
Use discount code 15off2 to save 15% on
enrollments in two or more Cloudera-
delivered training classes until July 1