SlideShare ist ein Scribd-Unternehmen logo
1 von 48
Downloaden Sie, um offline zu lesen
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Rahul Pathak (@rahulpathak)
GM Amazon Athena & Amazon EMR
December 1, 2016
BDA303
NEW LAUNCH!
Intro to Amazon Athena
What to Expect from the Session
• Overview of Amazon Athena
• Key Features
• Customer Examples
• Q&A
Challenges Customers Faced
• Significant amount of work required to analyze data in
Amazon S3
• Users often only have access to aggregated data sets
• Managing a Hadoop cluster or data warehouse requires
expertise
Introducing Amazon Athena
Amazon Athena is an interactive query service
that makes it easy to analyze data directly from
Amazon S3 using Standard SQL
Athena is Serverless
• No Infrastructure or
administration
• Zero Spin up time
• Transparent upgrades
Amazon Athena is Easy To Use
• Log into the Console
• Create a table
• Type in a Hive DDL Statement
• Use the console Add Table wizard
• Start querying
Amazon Athena is Highly Available
• You connect to a service endpoint or log into the console
• Athena uses warm compute pools across multiple
Availability Zones
• Your data is in Amazon S3, which is also highly available
and designed for 99.999999999% durability
Query Data Directly from Amazon S3
• No loading of data
• Query data in its raw format
• Text, CSV, JSON, weblogs, AWS service logs
• Convert to an optimized form like ORC or Parquet for the best
performance and lowest cost
• No ETL required
• Stream data from directly from Amazon S3
• Take advantage of Amazon S3 durability and availability
Use ANSI SQL
• Start writing ANSI SQL
• Support for complex joins, nested
queries & window functions
• Support for complex data types
(arrays, structs)
• Support for partitioning of data by
any key
• (date, time, custom keys)
• e.g., Year, Month, Day, Hour or
Customer Key, Date
Familiar Technologies Under the Covers
Used for SQL Queries
In-memory distributed query engine
ANSI-SQL compatible with extensions
Used for DDL functionality
Complex data types
Multitude of formats
Supports data partitioning
Amazon Athena Supports Multiple Data Formats
• Text files, e.g., CSV, raw logs
• Apache Web Logs, TSV files
• JSON (simple, nested)
• Compressed files
• Columnar formats such as Apache Parquet & Apache ORC
• AVRO support – coming soon
Amazon Athena is Fast
• Tuned for performance
• Automatically parallelizes
queries
• Results are streamed to console
• Results also stored in S3
• Improve Query performance
• Compress your data
• Use columnar formats
Amazon Athena is Cost Effective
• Pay per query
• $5 per TB scanned from S3
• DDL Queries and failed queries are free
• Save by using compression, columnar formats, partitions
A Sample Pipeline
A Sample Pipeline
Ad-hoc access to raw data using SQL
A Sample Pipeline
Ad-hoc access to data using Athena
Athena can query
aggregated datasets as well
Re-visiting Challenges
Significant amount of work required to analyze data in Amazon S3
No ETL required. No loading of data. Query data where it lives
Users often only have access to aggregated data sets
Query data at whatever granularity you want
Managing a Hadoop cluster or data warehouse requires expertise
No infrastructure to manage
Accessing Amazon Athena
Simple Query
editor with key
bindings
Autocomplete
functionality
Catalog
Tables and columns
Can also see a detailed view
in the catalog tab
You can also check the
properties. Note the location.
Use the JDBC Driver
QuickSight allows you to connect to data from a wide variety of AWS, third-party, and on-
premises sources including Amazon Athena
Amazon RDS
Amazon S3
Amazon Redshift
Amazon Athena
Using Amazon Athena with Amazon QuickSight
JDBC also Provides Programmatic Access
/* Setup the driver */
Properties info = new Properties();
info.put("user", "AWSAccessKey");
info.put("password", "AWSSecretAccessKey");
info.put("s3_staging_dir", "s3://S3 Bucket Location/");
Class.forName("com.amazonaws.athena.jdbc.AthenaDriver");
Connection connection =
DriverManager.getConnection("jdbc:awsathena://athena.us-east-
1.amazonaws.com:443/", info);
Creating a Table and Executing a Query
/* Create a table */
Statement statement = connection.createStatement();
ResultSet queryResults = statement.executeQuery("CREATE EXTERNAL TABLE
tableName ( Col1 String ) LOCATION ‘s3://bucket/tableLocation");
/* Execute a Query */
Statement statement = connection.createStatement();
ResultSet queryResults = statement.executeQuery("SELECT * FROM
cloudfront_logs");
Creating Tables and Querying Data
Creating Tables - Concepts
• Create Table Statements (or DDL) are written in Hive
• High degree of flexibility
• Schema on Read
• Hive is SQL like but allows other concepts such “external
tables” and partitioning of data
• Data formats supported – JSON, TXT, CSV, TSV, Parquet
and ORC (via Serdes)
• Data in stored in Amazon S3
• Metadata is stored in an a metadata store
Athena’s Internal Metadata Store
• Stores Metadata
• Table definition, column names, partitions
• Highly available and durable
• Requires no management
• Access via DDL statements
• Similar to a Hive Metastore
Running Queries is Simple
Run time
and data
scanned
PARQUET
• Columnar format
• Schema segregated into footer
• Column major format
• All data is pushed to the leaf
• Integrated compression and
indexes
• Support for predicate
pushdown
Apache Parquet and Apache ORC – Columnar Formats
ORC
• Apache Top level project
• Schema segregated into footer
• Column major with stripes
• Integrated compression,
indexes, and stats
• Support for Predicate
Pushdown
Converting to ORC and PARQUET
• You can use Hive CTAS to convert data
• CREATE TABLE new_key_value_store
• STORED AS PARQUET
• AS
• SELECT col_1, col2, col3 FROM noncolumartable
• SORT BY new_key, key_value_pair;
• You can also use Spark to convert the file into PARQUET / ORC
• 20 lines of Pyspark code, running on EMR
• Converts 1TB of text data into 130 GB of Parquet with snappy conversion
• Total cost $5
https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-spark-parquet-conversion
Pay By the Query - $5/TB Scanned
• Pay by the amount of data scanned per query
• Ways to save costs
• Compress
• Convert to Columnar format
• Use partitioning
• Free: DDL Queries, Failed Queries
Dataset Size on Amazon S3 Query Run time Data Scanned Cost
Logs stored as Text
files
1 TB 237 seconds 1.15TB $5.75
Logs stored in
Apache Parquet
format*
130 GB 5.13 seconds 2.69 GB $0.013
Savings 87% less with Parquet 34x faster 99% less data scanned 99.7% cheaper
Use Cases
Athena Complements Amazon Redshift & Amazon EMR
Amazon S3
EMR Athena
QuickSight
Redshift
Customers Using Athena
DataXu – 180TB of Log Data per Day
CDN
Real Time
Bidding
Retargeting
Platform
Kinesis Attribution & ML
S3
Reporting
Data Visualization
Data
Pipeline
ETL(Spark SQL)
Ecosystem of tools and services
Amazon Athena
Thank you!
Remember to complete
your evaluations!

Weitere ähnliche Inhalte

Was ist angesagt?

Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaArchitetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaAmazon Web Services
 
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...Amazon Web Services
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesAmazon Web Services
 
Introduction to AWS Step Functions
Introduction to AWS Step FunctionsIntroduction to AWS Step Functions
Introduction to AWS Step FunctionsAmazon Web Services
 
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...Amazon Web Services
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRAmazon Web Services
 
AWS Innovate: Build a Data Lake on AWS- Johnathon Meichtry
AWS Innovate: Build a Data Lake on AWS- Johnathon MeichtryAWS Innovate: Build a Data Lake on AWS- Johnathon Meichtry
AWS Innovate: Build a Data Lake on AWS- Johnathon MeichtryAmazon Web Services Korea
 
使用 AWS Step Functions 開發 Serverless 服務
使用 AWS Step Functions 開發 Serverless 服務使用 AWS Step Functions 開發 Serverless 服務
使用 AWS Step Functions 開發 Serverless 服務Amazon Web Services
 
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAmazon Web Services
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSAmazon Web Services
 
Securing Serverless Architecture
Securing Serverless ArchitectureSecuring Serverless Architecture
Securing Serverless ArchitectureAmazon Web Services
 
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...Amazon Web Services
 
Cost Optimisation with AWS
Cost Optimisation with AWSCost Optimisation with AWS
Cost Optimisation with AWSIan Massingham
 
ENT302 Deep Dive on AWS Management Tools
ENT302 Deep Dive on AWS Management ToolsENT302 Deep Dive on AWS Management Tools
ENT302 Deep Dive on AWS Management ToolsAmazon Web Services
 
Getting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceGetting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceAmazon Web Services
 
BDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesBDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesAmazon Web Services
 
Your First Data Lake on AWS_Simon Elisha
Your First Data Lake on AWS_Simon ElishaYour First Data Lake on AWS_Simon Elisha
Your First Data Lake on AWS_Simon ElishaHelen Rogers
 

Was ist angesagt? (20)

Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastrutturaArchitetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
Architetture Serverless: concentrarsi sull'idea, non sull'infrastruttura
 
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...
Database Migration: Simple, Cross-Engine and Cross-Platform Migrations with M...
 
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use CasesBDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
BDA307 Real-time Streaming Applications on AWS, Patterns and Use Cases
 
The Best of re:invent 2016
The Best of re:invent 2016The Best of re:invent 2016
The Best of re:invent 2016
 
Introduction to AWS Step Functions
Introduction to AWS Step FunctionsIntroduction to AWS Step Functions
Introduction to AWS Step Functions
 
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
BDA308 Serverless Analytics with Amazon Athena and Amazon QuickSight, featuri...
 
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMRBDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
BDA302 Deep Dive on Migrating Big Data Workloads to Amazon EMR
 
Almacenamiento en la nube con AWS
Almacenamiento en la nube con AWSAlmacenamiento en la nube con AWS
Almacenamiento en la nube con AWS
 
AWS Innovate: Build a Data Lake on AWS- Johnathon Meichtry
AWS Innovate: Build a Data Lake on AWS- Johnathon MeichtryAWS Innovate: Build a Data Lake on AWS- Johnathon Meichtry
AWS Innovate: Build a Data Lake on AWS- Johnathon Meichtry
 
使用 AWS Step Functions 開發 Serverless 服務
使用 AWS Step Functions 開發 Serverless 服務使用 AWS Step Functions 開發 Serverless 服務
使用 AWS Step Functions 開發 Serverless 服務
 
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWSAWS January 2016 Webinar Series - Getting Started with Big Data on AWS
AWS January 2016 Webinar Series - Getting Started with Big Data on AWS
 
Big Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWSBig Data Architectural Patterns and Best Practices on AWS
Big Data Architectural Patterns and Best Practices on AWS
 
Securing Serverless Architecture
Securing Serverless ArchitectureSecuring Serverless Architecture
Securing Serverless Architecture
 
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...
AWS re:Invent 2016: Workshop: Building Your First Big Data Application with A...
 
Cost Optimisation with AWS
Cost Optimisation with AWSCost Optimisation with AWS
Cost Optimisation with AWS
 
ENT302 Deep Dive on AWS Management Tools
ENT302 Deep Dive on AWS Management ToolsENT302 Deep Dive on AWS Management Tools
ENT302 Deep Dive on AWS Management Tools
 
Getting Started with AWS Database Migration Service
Getting Started with AWS Database Migration ServiceGetting Started with AWS Database Migration Service
Getting Started with AWS Database Migration Service
 
BDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practicesBDA303 Serverless big data architectures: Design patterns and best practices
BDA303 Serverless big data architectures: Design patterns and best practices
 
Your First Data Lake on AWS_Simon Elisha
Your First Data Lake on AWS_Simon ElishaYour First Data Lake on AWS_Simon Elisha
Your First Data Lake on AWS_Simon Elisha
 
Databases
DatabasesDatabases
Databases
 

Andere mochten auch

An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon AthenaJulien SIMON
 
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...Amazon Web Services
 
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...Amazon Web Services
 
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon AuroraNEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon AuroraAmazon Web Services
 
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...Amazon Web Services
 
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)Amazon Web Services
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightAmazon Web Services
 
NEW LAUNCH! Bringing AWS Lambda to the Edge
NEW LAUNCH! Bringing AWS Lambda to the EdgeNEW LAUNCH! Bringing AWS Lambda to the Edge
NEW LAUNCH! Bringing AWS Lambda to the EdgeAmazon Web Services
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...Amazon Web Services
 
AWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesAWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesDoiT International
 
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)Amazon Web Services
 
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQL
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQLNEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQL
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQLAmazon Web Services
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Amazon Web Services
 
Big Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaBig Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaJulien SIMON
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Amazon Web Services
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...Amazon Web Services
 
Programming in C [Module One]
Programming in C [Module One]Programming in C [Module One]
Programming in C [Module One]Abhishek Sinha
 
Concepts of C [Module 2]
Concepts of C [Module 2]Concepts of C [Module 2]
Concepts of C [Module 2]Abhishek Sinha
 
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...Amazon Web Services
 

Andere mochten auch (20)

An overview of Amazon Athena
An overview of Amazon AthenaAn overview of Amazon Athena
An overview of Amazon Athena
 
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...
AWS re:Invent 2016: Managing and Supporting the Windows Platform on AWS (GPSS...
 
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...
NEW LAUNCH! Delivering Powerful Graphics-Intensive Applications from the AWS ...
 
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon AuroraNEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
NEW LAUNCH! Introducing PostgreSQL compatibility for Amazon Aurora
 
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...
NEW LAUNCH! Introducing AWS Batch: Easy and efficient batch computing on Amaz...
 
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)
AWS re:Invent 2016: The Secret to SaaS (Hint: It's Identity) (GPSSI404)
 
Visualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSightVisualizing Big Data Insights with Amazon QuickSight
Visualizing Big Data Insights with Amazon QuickSight
 
NEW LAUNCH! Bringing AWS Lambda to the Edge
NEW LAUNCH! Bringing AWS Lambda to the EdgeNEW LAUNCH! Bringing AWS Lambda to the Edge
NEW LAUNCH! Bringing AWS Lambda to the Edge
 
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
AWS re:Invent 2016: Building Big Data Applications with the AWS Big Data Plat...
 
AWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL QueriesAWS Athena vs. Google BigQuery for interactive SQL Queries
AWS Athena vs. Google BigQuery for interactive SQL Queries
 
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Lex (MAC304)
 
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQL
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQLNEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQL
NEW LAUNCH! Intro to Amazon Athena. Analyze data in S3, using SQL
 
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
Best Practices for Building a Data Lake with Amazon S3 - August 2016 Monthly ...
 
Big Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon AthenaBig Data answers in seconds with Amazon Athena
Big Data answers in seconds with Amazon Athena
 
Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301Building a Server-less Data Lake on AWS - Technical 301
Building a Server-less Data Lake on AWS - Technical 301
 
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
AWS re:Invent 2016: Big Data Architectural Patterns and Best Practices on AWS...
 
Programming in C [Module One]
Programming in C [Module One]Programming in C [Module One]
Programming in C [Module One]
 
Concepts of C [Module 2]
Concepts of C [Module 2]Concepts of C [Module 2]
Concepts of C [Module 2]
 
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...
AWS re:Invent 2016: The Cloud Is Enterprise Ready: How the University of Ariz...
 
Clickstream & Social Media Analysis using Apache Spark
Clickstream & Social Media Analysis using Apache SparkClickstream & Social Media Analysis using Apache Spark
Clickstream & Social Media Analysis using Apache Spark
 

Ähnlich wie NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.

Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQL
Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQLAnnouncing Amazon Athena - Instantly Analyze Your Data in S3 Using SQL
Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQLAmazon Web Services
 
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料Amazon Web Services
 
Los Angeles AWS Users Group - Athena Deep Dive
Los Angeles AWS Users Group - Athena Deep DiveLos Angeles AWS Users Group - Athena Deep Dive
Los Angeles AWS Users Group - Athena Deep DiveKevin Epstein
 
Serverlesss Big Data Analytics with Amazon Athena and Quicksight
Serverlesss Big Data Analytics with Amazon Athena and QuicksightServerlesss Big Data Analytics with Amazon Athena and Quicksight
Serverlesss Big Data Analytics with Amazon Athena and QuicksightAmazon Web Services
 
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017Interactive Analytics on AWS - AWS Summit Tel Aviv 2017
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017Amazon Web Services
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)Amazon Web Services Korea
 
Amazon Athena Capabilities and Use Cases Overview
Amazon Athena Capabilities and Use Cases Overview Amazon Athena Capabilities and Use Cases Overview
Amazon Athena Capabilities and Use Cases Overview Amazon Web Services
 
Denver AWS Users' Group meeting - September 2017
Denver AWS Users' Group meeting - September 2017Denver AWS Users' Group meeting - September 2017
Denver AWS Users' Group meeting - September 2017David McDaniel
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWSAmazon Web Services
 
Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Amazon Web Services
 
What's New with Big Data Analytics
What's New with Big Data AnalyticsWhat's New with Big Data Analytics
What's New with Big Data AnalyticsAmazon Web Services
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS Amazon Web Services
 
Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Amazon Web Services
 
Building Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudBuilding Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudAmazon Web Services
 
Serverless Big Data Analytics with Amazon Athena and QuickSight
Serverless Big Data Analytics with Amazon Athena and QuickSightServerless Big Data Analytics with Amazon Athena and QuickSight
Serverless Big Data Analytics with Amazon Athena and QuickSightAmazon Web Services
 

Ähnlich wie NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL. (20)

Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQL
Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQLAnnouncing Amazon Athena - Instantly Analyze Your Data in S3 Using SQL
Announcing Amazon Athena - Instantly Analyze Your Data in S3 Using SQL
 
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
使用 Amazon Athena 直接分析儲存於 S3 的巨量資料
 
Los Angeles AWS Users Group - Athena Deep Dive
Los Angeles AWS Users Group - Athena Deep DiveLos Angeles AWS Users Group - Athena Deep Dive
Los Angeles AWS Users Group - Athena Deep Dive
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Serverlesss Big Data Analytics with Amazon Athena and Quicksight
Serverlesss Big Data Analytics with Amazon Athena and QuicksightServerlesss Big Data Analytics with Amazon Athena and Quicksight
Serverlesss Big Data Analytics with Amazon Athena and Quicksight
 
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017Interactive Analytics on AWS - AWS Summit Tel Aviv 2017
Interactive Analytics on AWS - AWS Summit Tel Aviv 2017
 
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
AWS CLOUD 2017 - Amazon Athena 및 Glue를 통한 빠른 데이터 질의 및 처리 기능 소개 (김상필 솔루션즈 아키텍트)
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Amazon Athena Capabilities and Use Cases Overview
Amazon Athena Capabilities and Use Cases Overview Amazon Athena Capabilities and Use Cases Overview
Amazon Athena Capabilities and Use Cases Overview
 
Introduction to Amazon Athena
Introduction to Amazon AthenaIntroduction to Amazon Athena
Introduction to Amazon Athena
 
Denver AWS Users' Group meeting - September 2017
Denver AWS Users' Group meeting - September 2017Denver AWS Users' Group meeting - September 2017
Denver AWS Users' Group meeting - September 2017
 
(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS(BDT317) Building A Data Lake On AWS
(BDT317) Building A Data Lake On AWS
 
Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3Querying and Analyzing Data in Amazon S3
Querying and Analyzing Data in Amazon S3
 
What is Amazon Athena
What is Amazon AthenaWhat is Amazon Athena
What is Amazon Athena
 
What's New with Big Data Analytics
What's New with Big Data AnalyticsWhat's New with Big Data Analytics
What's New with Big Data Analytics
 
AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS AWS March 2016 Webinar Series Building Your Data Lake on AWS
AWS March 2016 Webinar Series Building Your Data Lake on AWS
 
Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3
 
Building Data Lakes in the AWS Cloud
Building Data Lakes in the AWS CloudBuilding Data Lakes in the AWS Cloud
Building Data Lakes in the AWS Cloud
 
Serverless Big Data Analytics with Amazon Athena and QuickSight
Serverless Big Data Analytics with Amazon Athena and QuickSightServerless Big Data Analytics with Amazon Athena and QuickSight
Serverless Big Data Analytics with Amazon Athena and QuickSight
 

Mehr von Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Mehr von Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Kürzlich hochgeladen

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Kürzlich hochgeladen (20)

Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

NEW LAUNCH! Intro to Amazon Athena. Easily analyze data in S3, using SQL.

  • 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Rahul Pathak (@rahulpathak) GM Amazon Athena & Amazon EMR December 1, 2016 BDA303 NEW LAUNCH! Intro to Amazon Athena
  • 2. What to Expect from the Session • Overview of Amazon Athena • Key Features • Customer Examples • Q&A
  • 3. Challenges Customers Faced • Significant amount of work required to analyze data in Amazon S3 • Users often only have access to aggregated data sets • Managing a Hadoop cluster or data warehouse requires expertise
  • 4. Introducing Amazon Athena Amazon Athena is an interactive query service that makes it easy to analyze data directly from Amazon S3 using Standard SQL
  • 5. Athena is Serverless • No Infrastructure or administration • Zero Spin up time • Transparent upgrades
  • 6. Amazon Athena is Easy To Use • Log into the Console • Create a table • Type in a Hive DDL Statement • Use the console Add Table wizard • Start querying
  • 7. Amazon Athena is Highly Available • You connect to a service endpoint or log into the console • Athena uses warm compute pools across multiple Availability Zones • Your data is in Amazon S3, which is also highly available and designed for 99.999999999% durability
  • 8. Query Data Directly from Amazon S3 • No loading of data • Query data in its raw format • Text, CSV, JSON, weblogs, AWS service logs • Convert to an optimized form like ORC or Parquet for the best performance and lowest cost • No ETL required • Stream data from directly from Amazon S3 • Take advantage of Amazon S3 durability and availability
  • 9. Use ANSI SQL • Start writing ANSI SQL • Support for complex joins, nested queries & window functions • Support for complex data types (arrays, structs) • Support for partitioning of data by any key • (date, time, custom keys) • e.g., Year, Month, Day, Hour or Customer Key, Date
  • 10. Familiar Technologies Under the Covers Used for SQL Queries In-memory distributed query engine ANSI-SQL compatible with extensions Used for DDL functionality Complex data types Multitude of formats Supports data partitioning
  • 11. Amazon Athena Supports Multiple Data Formats • Text files, e.g., CSV, raw logs • Apache Web Logs, TSV files • JSON (simple, nested) • Compressed files • Columnar formats such as Apache Parquet & Apache ORC • AVRO support – coming soon
  • 12. Amazon Athena is Fast • Tuned for performance • Automatically parallelizes queries • Results are streamed to console • Results also stored in S3 • Improve Query performance • Compress your data • Use columnar formats
  • 13. Amazon Athena is Cost Effective • Pay per query • $5 per TB scanned from S3 • DDL Queries and failed queries are free • Save by using compression, columnar formats, partitions
  • 15. A Sample Pipeline Ad-hoc access to raw data using SQL
  • 16. A Sample Pipeline Ad-hoc access to data using Athena Athena can query aggregated datasets as well
  • 17. Re-visiting Challenges Significant amount of work required to analyze data in Amazon S3 No ETL required. No loading of data. Query data where it lives Users often only have access to aggregated data sets Query data at whatever granularity you want Managing a Hadoop cluster or data warehouse requires expertise No infrastructure to manage
  • 19. Simple Query editor with key bindings
  • 23. Can also see a detailed view in the catalog tab
  • 24. You can also check the properties. Note the location.
  • 25.
  • 26. Use the JDBC Driver
  • 27. QuickSight allows you to connect to data from a wide variety of AWS, third-party, and on- premises sources including Amazon Athena Amazon RDS Amazon S3 Amazon Redshift Amazon Athena Using Amazon Athena with Amazon QuickSight
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33. JDBC also Provides Programmatic Access /* Setup the driver */ Properties info = new Properties(); info.put("user", "AWSAccessKey"); info.put("password", "AWSSecretAccessKey"); info.put("s3_staging_dir", "s3://S3 Bucket Location/"); Class.forName("com.amazonaws.athena.jdbc.AthenaDriver"); Connection connection = DriverManager.getConnection("jdbc:awsathena://athena.us-east- 1.amazonaws.com:443/", info);
  • 34. Creating a Table and Executing a Query /* Create a table */ Statement statement = connection.createStatement(); ResultSet queryResults = statement.executeQuery("CREATE EXTERNAL TABLE tableName ( Col1 String ) LOCATION ‘s3://bucket/tableLocation"); /* Execute a Query */ Statement statement = connection.createStatement(); ResultSet queryResults = statement.executeQuery("SELECT * FROM cloudfront_logs");
  • 35. Creating Tables and Querying Data
  • 36. Creating Tables - Concepts • Create Table Statements (or DDL) are written in Hive • High degree of flexibility • Schema on Read • Hive is SQL like but allows other concepts such “external tables” and partitioning of data • Data formats supported – JSON, TXT, CSV, TSV, Parquet and ORC (via Serdes) • Data in stored in Amazon S3 • Metadata is stored in an a metadata store
  • 37. Athena’s Internal Metadata Store • Stores Metadata • Table definition, column names, partitions • Highly available and durable • Requires no management • Access via DDL statements • Similar to a Hive Metastore
  • 38. Running Queries is Simple Run time and data scanned
  • 39. PARQUET • Columnar format • Schema segregated into footer • Column major format • All data is pushed to the leaf • Integrated compression and indexes • Support for predicate pushdown Apache Parquet and Apache ORC – Columnar Formats ORC • Apache Top level project • Schema segregated into footer • Column major with stripes • Integrated compression, indexes, and stats • Support for Predicate Pushdown
  • 40. Converting to ORC and PARQUET • You can use Hive CTAS to convert data • CREATE TABLE new_key_value_store • STORED AS PARQUET • AS • SELECT col_1, col2, col3 FROM noncolumartable • SORT BY new_key, key_value_pair; • You can also use Spark to convert the file into PARQUET / ORC • 20 lines of Pyspark code, running on EMR • Converts 1TB of text data into 130 GB of Parquet with snappy conversion • Total cost $5 https://github.com/awslabs/aws-big-data-blog/tree/master/aws-blog-spark-parquet-conversion
  • 41. Pay By the Query - $5/TB Scanned • Pay by the amount of data scanned per query • Ways to save costs • Compress • Convert to Columnar format • Use partitioning • Free: DDL Queries, Failed Queries Dataset Size on Amazon S3 Query Run time Data Scanned Cost Logs stored as Text files 1 TB 237 seconds 1.15TB $5.75 Logs stored in Apache Parquet format* 130 GB 5.13 seconds 2.69 GB $0.013 Savings 87% less with Parquet 34x faster 99% less data scanned 99.7% cheaper
  • 43. Athena Complements Amazon Redshift & Amazon EMR Amazon S3 EMR Athena QuickSight Redshift
  • 45. DataXu – 180TB of Log Data per Day CDN Real Time Bidding Retargeting Platform Kinesis Attribution & ML S3 Reporting Data Visualization Data Pipeline ETL(Spark SQL) Ecosystem of tools and services Amazon Athena
  • 46.