This document provides instructions for building a big data application on AWS that collects and analyzes web server logs. It discusses using Amazon Kinesis to collect logs with a Firehose delivery stream into an S3 bucket. It then covers using Kinesis Analytics to process the logs in real time by writing SQL queries that compute metrics and detect anomalies. Finally, it discusses loading the processed logs into Amazon Redshift for interactive querying and visualizing insights with Amazon QuickSight.
Activity 4B: ETL Job in Glue
• Close the Script Editor tips window (if it appears)
• In the Glue Script Editor, copy in the ETL code by clicking the “Open Glue ETL Code” link in Student Resources
• Ensure that the database name (db_name) and table name match the database and table created by the Glue Crawler
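The step above can be sketched in code. The following is a minimal outline of the shape of a Glue ETL script, showing where the crawler-created database and table names plug in; `weblogs`, `useractivity`, and the output bucket are placeholder assumptions, not the lab's actual names, and the `awsglue` modules are available only inside a Glue job run.

```python
# Minimal sketch of a Glue ETL script (placeholder names throughout).
# Edit these two values to match what YOUR Glue Crawler created:
DB_NAME = "weblogs"        # placeholder: your crawler's database name
TBL_NAME = "useractivity"  # placeholder: your crawler's table name

def main():
    # These imports are provided by the Glue runtime, not by pip,
    # so they are kept inside main(): run this only as a Glue job.
    import sys
    from awsglue.utils import getResolvedOptions
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the crawled table from the Glue Data Catalog.
    source = glue_context.create_dynamic_frame.from_catalog(
        database=DB_NAME, table_name=TBL_NAME)

    # ... transforms (e.g. ApplyMapping) would go here ...

    # Write the result back to S3 as Parquet (bucket is a placeholder).
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options={"path": "s3://your-processed-bucket/"},
        format="parquet")
    job.commit()
```

If db_name or the table name do not match the catalog entries the crawler produced, `create_dynamic_frame.from_catalog` fails at job start, which is why the lab calls this check out explicitly.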
Activity 5C: Interactive Querying with Amazon Athena
• Run interactive queries (copy the SQL queries from “Athena SQL” in Student Resources) and see the results on the console
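The same queries can be submitted programmatically instead of through the console. Below is a hedged sketch using boto3's Athena client; the query text, table, database, and results bucket are placeholder assumptions — copy the real SQL from “Athena SQL” in Student Resources.

```python
# Placeholder query: top 10 client IPs by request count from a crawled
# web-log table (table/column names are assumptions, not the lab's).
EXAMPLE_QUERY = (
    "SELECT request_ip, COUNT(*) AS requests "
    "FROM useractivity "
    "GROUP BY request_ip "
    "ORDER BY requests DESC "
    "LIMIT 10"
)

def run_athena_query(query, database="weblogs",
                     output="s3://your-athena-results-bucket/"):
    """Submit a query to Athena and return its execution id.

    Requires AWS credentials; database and output location are
    placeholders for the lab's actual values.
    """
    import boto3  # imported lazily so the sketch reads without AWS access
    client = boto3.client("athena")
    resp = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output},
    )
    return resp["QueryExecutionId"]
```

Athena queries run asynchronously: `start_query_execution` returns an id you poll with `get_query_execution` until the state is SUCCEEDED, then fetch rows with `get_query_results` — the console does this polling for you.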
Activity 6A: Open the Zeppelin interface
1. Copy the Zeppelin endpoint in the Student Resources section in qwiklabs
2. Click on the “Open Zeppelin Notebook” link in the Student Resources section to open the Zeppelin link in a new window
3. Download the file (or copy and save it to a file with a .json extension)
4. Import the notebook using the Import Note link on the Zeppelin interface
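A common failure in step 4 is saving the page's HTML rather than the raw note JSON. As a quick local sanity check before importing, you can verify the saved file parses as a Zeppelin note (Zeppelin notes are JSON objects with a `name` and a `paragraphs` list); the sample content below is a placeholder, not the lab's notebook.

```python
import json

def looks_like_zeppelin_note(text):
    """Return True if text parses as a Zeppelin note (name + paragraphs)."""
    try:
        note = json.loads(text)
    except ValueError:
        return False
    return (isinstance(note, dict)
            and "name" in note
            and isinstance(note.get("paragraphs"), list))

# Minimal placeholder note body; a real export has many more fields.
sample = '{"name": "Web Log Analysis", "paragraphs": []}'
```

If the check fails, re-download the file or re-save the copied text with a `.json` extension before using Import Note.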
Activity 7B: Connect to Amazon Redshift
Note: Use “dbadmin” as the username. You can get the Amazon Redshift database password from qwikLABS by navigating to the “Connection details” section (see below)
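For reference, the same connection can be made from Python. The sketch below assumes the psycopg2 driver (Redshift speaks the PostgreSQL wire protocol); the endpoint and database name are placeholders, the username is “dbadmin” as noted above, and the password comes from the qwikLABS “Connection details” section.

```python
# Placeholder connection parameters for the lab's Redshift cluster.
CONN = {
    "host": "your-cluster.abc123.us-east-1.redshift.amazonaws.com",
    "port": 5439,            # Redshift's default port
    "dbname": "logs",        # placeholder database name
    "user": "dbadmin",       # username given in the lab note
    "password": "<from qwikLABS Connection details>",
}

def dsn(params):
    """Render the parameters as a libpq-style connection string."""
    return " ".join(f"{k}={v}" for k, v in params.items())

def query_redshift(sql, params=CONN):
    """Run one query and return all rows (needs network + psycopg2)."""
    import psycopg2  # imported lazily so the sketch reads without the driver
    with psycopg2.connect(**params) as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
```

SQL clients (e.g. SQL Workbench/J, as many AWS labs use) take the same five values; whichever tool you use, the host, port, and database come from the cluster's connection details and only the password is lab-specific.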