Data Con LA 2020
Description
You just got hired by a large "tech startup". They're a hip travel agency like Kayak, "revolutionizing the airline industry" by developing an AI that negotiates the best airline deals on behalf of passengers. In reality, they are developing the AI to jack up ticket prices as it learns passengers' preferences. They run their tech on the latest Google Cloud technologies, so you figure it's a great place to sharpen your skills as a Data Engineer despite the company's broken ethical compass. We teach Cloud Data Engineering to beginner and intermediate developers through a fun and engaging story. You will build a complete data-driven AI pipeline: ingest six years' worth of real flight records, profile 30M+ users, and process 100M+ live streaming events while learning tools such as BigQuery, Dataflow (Apache Beam), Dataproc (Apache Spark), Pub/Sub (Kafka), Bigtable, and Airflow (Cloud Composer). During our talk, we will:
* Discuss the latest Serverless Data Architecture on GCP
* Explore the architectural decisions behind our Data Pipeline
* Run a live demo from our course
Speaker
Parham Parvizi, Tura Labs, Founder / Data Engineer
Building Modern Data Pipelines on GCP via a FREE online Bootcamp
1. The FREE Data Engineering Bootcamp
Building Modern
Serverless Data Pipelines
2. Empower Your Inner Data Engineer!
We strive to create a fun learning environment
that produces competent and confident Data
Engineers.
Our Mission:
○ Empower Early Career Engineers
○ Low Barrier Accessibility for a Highly
Technical Field
○ Create Excitement to Continue
Learning
○ Teaching a Modern Stack
3. Who’s this for?
○ Data Enthusiasts
○ Developers Transitioning to Big Data &
Data Science
○ Cloud Engineers
○ Data Pipeline Developers
○ Lunch Break Learners
○ After Work Over Achievers
○ Need to Expand My Horizon-ers
○ The Forever Learner & Backend
Lovers
4. Our Team
The team behind Tura Labs is curious,
collaborative, and eager to work on the
cutting-edge of Cloud and Big Data
technologies. Our developers strive to make
life easier for other developers.
5. Technologies
Google Cloud Platform - $300 Cloud Credit / Big Data Tools
○ Cloud Storage (HDFS)
○ BigQuery (Hive)
○ Bigtable (HBase)
○ Dataflow (Beam)
○ Dataproc (Spark)
○ Pub/Sub (Kafka)
○ Cloud Composer (Airflow)
○ Cloud Run (Docker)
○ Cloud Functions
○ Cloud Spanner
○ Cloud ML
○ Data Catalog
○ App Engine
○ Looker
8. Chapter 1
Loading reference data into Google
BigQuery using Python pandas.
Technologies:
Pandas
SQL
Google Cloud Storage
Google Cloud BigQuery
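To give a feel for the kind of load Chapter 1 describes, here is a minimal sketch, not the course's actual code. The helper names, the CSV path, and the table ID are hypothetical; `load_table_from_dataframe` is the google-cloud-bigquery client call for loading a pandas DataFrame, and it needs `pyarrow` installed.

```python
import re


def normalize_column(name: str) -> str:
    """Turn a CSV header into a BigQuery-friendly column name."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip()).strip("_").lower()
    # BigQuery column names must start with a letter or underscore.
    return f"_{cleaned}" if cleaned[:1].isdigit() else cleaned


def load_reference_table(csv_path: str, table_id: str) -> int:
    """Read a small CSV with pandas and load it into BigQuery.

    csv_path and table_id are hypothetical placeholders, e.g.
    "airports.csv" and "my-project.reference.airports".
    """
    # Imported lazily so normalize_column works without these packages.
    import pandas as pd
    from google.cloud import bigquery

    df = pd.read_csv(csv_path)
    df.columns = [normalize_column(c) for c in df.columns]

    client = bigquery.Client()  # uses application default credentials
    job_config = bigquery.LoadJobConfig(write_disposition="WRITE_TRUNCATE")
    job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
    job.result()  # block until the load job finishes
    return len(df)
```

Truncating on each load is one reasonable choice for small reference tables that are reloaded whole; an append-only table would use a different write disposition.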
9. Chapter 2
Parallel loading flight records using
Cloud Dataflow (Apache Beam)
Technologies:
Google Cloud Storage
Google Cloud BigQuery
Google Cloud Dataflow (Apache
Beam)
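A parallel load like the one in Chapter 2 could be sketched with the Apache Beam Python SDK roughly as follows. This is an illustration under assumptions: the column list, the input pattern, and the table name are made up for the example and are not taken from the course material.

```python
import csv

# Assumed column layout of the flight CSV files (hypothetical).
FIELDS = ["flight_date", "airline", "origin", "dest", "dep_delay"]


def parse_flight(line: str) -> dict:
    """Parse one CSV line into a row dict; a blank delay becomes None."""
    values = next(csv.reader([line]))
    row = dict(zip(FIELDS, values))
    row["dep_delay"] = float(row["dep_delay"]) if row["dep_delay"] else None
    return row


def run(input_pattern: str, table: str, pipeline_args=None) -> None:
    """Read CSV files in parallel and append the rows to a BigQuery table."""
    # Imported lazily so parse_flight can be tried without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
        (
            p
            | "Read" >> beam.io.ReadFromText(input_pattern, skip_header_lines=1)
            | "Parse" >> beam.Map(parse_flight)
            | "Write" >> beam.io.WriteToBigQuery(
                table,
                schema="flight_date:STRING,airline:STRING,origin:STRING,"
                       "dest:STRING,dep_delay:FLOAT",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

The same pipeline runs locally or on Cloud Dataflow depending on the runner passed in `pipeline_args` (for example `--runner=DataflowRunner` plus project, region, and temp location flags).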
12. Chapter 3
Parallel loading flight records.
Learning Google Cloud Dataproc
(Apache Spark) to process 30M+
historical records.
Technologies:
Google Cloud Dataproc (Apache
Spark)
13. Chapter 4
Flexing our architectural muscles.
Designing data models and pipelines
while becoming familiar with design
best practices and guidelines.
Technologies:
Common data architect tools &
techniques
14. Chapter 5
Exploring real-time data processing.
Streaming ingestion of
website logs via Cloud Pub/Sub
(Apache Kafka) and Cloud Dataflow
(Apache Beam).
Technologies:
Google Cloud Dataflow (Apache
Beam)
Google Cloud Pub/Sub (Apache Kafka)
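A streaming ingestion step of the kind Chapter 5 describes might look like the sketch below, which counts page views per one-minute window. It assumes each Pub/Sub message is a UTF-8 JSON object with a `page` field; that message shape, the subscription path, and the output table are hypothetical.

```python
import json


def parse_event(message: bytes) -> dict:
    """Decode one Pub/Sub payload (assumed to be UTF-8 encoded JSON)."""
    return json.loads(message.decode("utf-8"))


def run(subscription: str, table: str, pipeline_args=None) -> None:
    """Count page views per 60-second window and write them to BigQuery."""
    # Imported lazily so parse_event can be tried without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms.window import FixedWindows

    options = PipelineOptions(pipeline_args, streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Read" >> beam.io.ReadFromPubSub(subscription=subscription)
            | "Parse" >> beam.Map(parse_event)
            | "KeyByPage" >> beam.Map(lambda e: (e["page"], 1))
            | "Window" >> beam.WindowInto(FixedWindows(60))
            | "Count" >> beam.CombinePerKey(sum)
            | "ToRow" >> beam.Map(lambda kv: {"page": kv[0], "views": kv[1]})
            | "Write" >> beam.io.WriteToBigQuery(
                table, schema="page:STRING,views:INTEGER"
            )
        )
```

Windowing before the combine is what lets an unbounded stream produce finite results: each key is summed once per window rather than over the whole stream.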
15. Chapter 6
Developing an OLTP system to
monitor live ticket sales via Pub/Sub
and Google Bigtable (Apache
HBase).
Technologies:
Google Cloud Dataflow (Apache
Beam)
Google Cloud Bigtable (Apache
HBase)
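For a Bigtable-backed view of live ticket sales, much of the design work is in the row key. The sketch below shows one common pattern, a reversed timestamp so the newest sale for a flight sorts first. The key layout, the `sales` column family, and the function names are assumptions made for illustration, not the course's schema.

```python
import datetime

# Larger than any current epoch-milliseconds value, so the difference
# below stays positive and fits in 13 digits.
MAX_MILLIS = 10**13


def sale_row_key(flight_id: str, sold_at: datetime.datetime) -> bytes:
    """Build a row key that sorts the newest sale for a flight first."""
    millis = int(sold_at.timestamp() * 1000)
    return f"{flight_id}#{MAX_MILLIS - millis:013d}".encode("utf-8")


def record_sale(project: str, instance_id: str, table_id: str,
                flight_id: str, price: float) -> None:
    """Write one sale to Bigtable (the 'sales' family is assumed to exist)."""
    # Imported lazily so sale_row_key works without the client library.
    from google.cloud import bigtable

    now = datetime.datetime.now(datetime.timezone.utc)
    table = bigtable.Client(project=project).instance(instance_id).table(table_id)
    row = table.direct_row(sale_row_key(flight_id, now))
    row.set_cell("sales", b"price", str(price).encode("utf-8"), timestamp=now)
    row.commit()
```

Because Bigtable stores rows in lexicographic key order, a prefix scan on `flight_id#` then returns the most recent sales first without any sorting on the client.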
16. Chapter 7
Advanced analytics using Google
BigQuery. Preparing intelligence for
our AI.
Technologies:
Google Cloud BigQuery
17. Chapter 8
Building the evil price-gouging AI
on top of our data pipelines: a
continuously running AI that keeps
updating ticket prices based on
supply and demand.
Technologies:
Machine Learning
Google Cloud BigQuery ML
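With BigQuery ML, a model is trained by running a `CREATE MODEL` statement over a table. The following sketch builds such a statement for a linear regression on price; the feature columns, the model name, and the source table are invented for the example and say nothing about the model the course actually builds.

```python
def build_model_sql(model: str, source_table: str) -> str:
    """Return a BigQuery ML statement that trains a price regression model.

    The feature columns below are hypothetical placeholders.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        "OPTIONS (model_type = 'linear_reg', input_label_cols = ['price']) AS\n"
        "SELECT days_to_departure, seats_left, searches_last_hour, price\n"
        f"FROM `{source_table}`"
    )


def train(model: str, source_table: str) -> None:
    """Run the training statement and wait for it to finish."""
    # Imported lazily so build_model_sql works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    client.query(build_model_sql(model, source_table)).result()
```

Rerunning `train` on a schedule is one simple way to get the "continuously running" behavior described above, since `CREATE OR REPLACE MODEL` retrains on whatever the source table currently holds.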
18. Chapter 9
Pipeline automation, monitoring, and
metrics with Cloud Composer (Apache
Airflow). The glue to keep everything
together.
Technologies:
Google Cloud Composer (Apache
Airflow)
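Airflow models a pipeline as a directed acyclic graph of tasks and runs each task only after its upstream tasks succeed. The standard-library sketch below illustrates just that ordering idea; it deliberately does not use the Airflow API, and the task names are hypothetical stand-ins for the chapters above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
# Task names are hypothetical.
PIPELINE = {
    "load_reference_data": set(),
    "load_flight_records": {"load_reference_data"},
    "build_analytics": {"load_flight_records"},
    "train_pricing_model": {"build_analytics"},
}


def run_order(dependencies: dict) -> list:
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for task in run_order(PIPELINE):
        print("run", task)
```

In a real Cloud Composer deployment each entry would be an Airflow operator, and the scheduler would add retries, scheduling, and monitoring on top of this ordering.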
19. Chapter 10
Creating a Data Hub and exposing our
AI via REST API. Building a Data-
Driven backend.
Technologies:
Flask (Python)
REST API
Google App Engine
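A REST endpoint of the sort Chapter 10 mentions could be sketched in Flask as below. The route, the in-memory `PRICES` table, and the function names are hypothetical; a real service would query the data hub instead of a dictionary.

```python
# Stand-in for a lookup against the data hub (hypothetical data).
PRICES = {"AA100": 329.0}


def lookup_price(flight_id: str):
    """Return a (body, status) pair for the requested flight."""
    price = PRICES.get(flight_id)
    if price is None:
        return {"error": "unknown flight"}, 404
    return {"flight_id": flight_id, "price": price}, 200


def create_app():
    """Build the Flask app; imported lazily so lookup_price works without Flask."""
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/flights/<flight_id>/price")
    def price(flight_id):
        body, status = lookup_price(flight_id)
        return jsonify(body), status

    return app
```

Keeping the lookup separate from the route makes the pricing logic testable without starting a web server, and App Engine only needs the object returned by `create_app()` as its WSGI entry point.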