Visualising Big Data
Big Data Visualisation with Hadoop, Hive and
Excel 2013
Sponsors
Explore Everything PASS Has to Offer
Free SQL Server and BI Web Events

Free 1-day Training Events

Regional Event

This is Community

Business Analytics Training

Local User Groups Around the World

Session Recordings

PASS Newsletter

Free Online Technical Training

About me
• Director-at-Large (Elect), PASS Board, from January 2014
• SQL Server MVP
• Blogger, data strategist, public speaker, technologist
• Joint owner of Copper Blue Consulting Ltd

Agenda
• Overview of Big Data Technologies
• Data Visualisation with Office365 and PowerBI
• Hive
• Visualising Big Data with Microsoft
Big Data.
HDInsight Ecosystem
[Diagram components: ODBC; Distributed Processing (MapReduce); Distributed Storage (HDFS); Azure Data Marketplace; Windows Azure Storage]
What is Hadoop?

“Flexible and Available
Architecture for Large Scale
computation and data
processing on a network of
highly available commodity
hardware.”
Hadoop’s Lineage

* Resource: Kerberos Konference (Yahoo) – 2010
Data Visualisation Background
“We have the tools. All we’ve got to do is imagine what could be. We can reinvent the present; we can transform the world around us.”
Jason Silva

Almost 50% of your brain is dedicated to visual processing.
David van Essen

Researchers found that colour visuals increase the willingness to read by 80%.

About 70% of your sensory receptors are in your eyes.
Why is Data Visualisation Important?

It’s clearly a budget. It has a lot of numbers in it. (George W Bush)

I could never figure out where the decimal point went. (Lord Randolph Churchill)
The Unknown Unknowns
That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know. (Donald Rumsfeld)
What is the purpose of Hive?
Hive is a solution to a business problem:
How do you analyse large amounts of data?

Data Scientists want to study data
Communicate with the data

Businesses want to reap benefits of data
Results that make sense of the data

What is the purpose of Hive?
Hive is a data warehousing system for Hadoop
To meet the needs of businesses, data scientists, analysts and BI
professionals

Data, Summarized
Fit a structure onto data

Data, Analyzed
Analysis of Large Datasets stored in Hadoop File Systems
SQL-Like language called HiveQL
Custom mappers and reducers when HiveQL isn’t enough
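
When HiveQL isn't enough, Hive can stream rows through an external script via its TRANSFORM ... USING clause, passing columns as tab-separated text and reading tab-separated rows back. The following is a minimal sketch of such a custom mapper in Python. The two-column layout (user_id, url) and the function names are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Iterator
from urllib.parse import urlparse


def map_line(line: str) -> str:
    """Map one tab-separated row (user_id, url) to (user_id, host)."""
    user_id, url = line.rstrip("\n").split("\t")
    host = urlparse(url).netloc or "unknown"
    return f"{user_id}\t{host}"


def run_mapper(lines: Iterable[str]) -> Iterator[str]:
    """Apply map_line to every non-empty input row."""
    for line in lines:
        if line.strip():
            yield map_line(line)


# A streaming script would read sys.stdin and print each result;
# a small in-memory sample (hypothetical data) stands in for that here.
sample = ["42\thttp://example.com/page?x=1\n", "7\tnot-a-url\n"]
mapped = list(run_mapper(sample))
for out in mapped:
    print(out)
```

Keeping the per-row logic in a plain function makes it easy to test without a cluster before wiring it into a query.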

Agenda
Hive solves the business problem of analysing large amounts of data
• What is the purpose of Hive?
• Why Hive?
• A history of Hive
• What are Hive’s constituents?
Why Hive?
Can’t Hadoop be used to solve these problems?
Why is there a need for Hive?

Writing MR jobs in Java can be difficult
You don’t know it’s wrong until it’s fallen over!

Joining Large Datasets can be difficult
Learning Curve

Hive History

What can Hive offer you?
Hive can help with a range of business problems:
• Log processing
• Predictive modelling
• Hypothesis testing
• Business intelligence
Hive is not a replacement for SQL
So don’t throw out your SQL Server instances!

• Hive is for processing large data sets that may span hundreds, or even thousands, of machines
• Hive has a high overhead for starting a job: it translates queries to MapReduce jobs, so even small queries take time
• Hive does not cache data as SQL Server does
• Hive performance tuning is mainly Hadoop performance tuning
• The query engines look similar, but the architectures differ and serve different purposes

Agenda
Hive solves the business problem of analysing large amounts of data
• What is the purpose of Hive?
• Why Hive?
• A history of Hive
• What are Hive’s constituents?
  • Hive as a SQL-like Language Query Tool
  • Hive as a Translation Tool
  • Hive as a Structuring Tool

HiveQL
HiveQL is a SQL-like language
It outputs naturally occurring groups for further analysis

Easy Data Summarization
Large Datasets, summarized
Fit a structure onto data

Analysis of Large Datasets stored in Hadoop file systems
SQL-Like language called HiveQL
Custom mappers and reducers when HiveQL isn’t enough

HiveQL Queries like SQL Queries?
Similarities in Syntax and Features
Similar features

SELECT
FROM
WHERE
GROUP BY / HAVING
Table Aliases
Computed Columns
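
To make the list above concrete, here is a small sketch that exercises SELECT, FROM, WHERE, GROUP BY / HAVING, a table alias and a computed column. It runs against an in-memory SQLite database purely as a stand-in, because no Hive cluster is assumed here; the page_views table and its data are hypothetical. The query text should read much the same in HiveQL, though that is not verified by this example.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a Hive table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, page TEXT, seconds INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("UK", "/home", 30),
        ("UK", "/pricing", 90),
        ("US", "/home", 10),
        ("US", "/home", 20),
        ("US", "/docs", 120),
        ("DE", "/home", 5),
    ],
)

query = """
SELECT pv.country,                        -- table alias
       COUNT(*)               AS views,
       SUM(pv.seconds) / 60.0 AS minutes  -- computed column
FROM page_views pv
WHERE pv.seconds >= 10
GROUP BY pv.country
HAVING COUNT(*) >= 2
"""
# Sort in Python so the printed order does not depend on the engine.
rows = sorted(conn.execute(query).fetchall())
for row in rows:
    print(row)
```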

HiveQL Queries like SQL Queries?
Similarities in Syntax and Features
Similar features

Aggregate Functions
Nested Select
CASE
LIKE / RLIKE
JOIN
ORDER BY / SORT BY
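
A second sketch covers the features on this slide: an aggregate function, a nested SELECT, CASE, LIKE, JOIN and ORDER BY. As before, SQLite is only a convenient stand-in and the customers and orders tables are hypothetical. RLIKE and SORT BY are left out because SQLite does not provide them.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Hive tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER, name TEXT);
CREATE TABLE orders (customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Acme Ltd'), (2, 'Apex Ltd'), (3, 'Birch plc');
INSERT INTO orders VALUES (1, 250.0), (1, 50.0), (2, 40.0), (3, 900.0);
""")

query = """
SELECT t.name,
       t.total,
       CASE WHEN t.total >= 100 THEN 'large' ELSE 'small' END AS band
FROM (
    -- nested SELECT: total order value per customer whose name starts with A
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    WHERE c.name LIKE 'A%'
    GROUP BY c.name
) t
ORDER BY t.total DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```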

How does Hive work?
Hive as a Translation Tool
Compiles and executes queries

Hive translates the SQL query into one or more MapReduce jobs
When there are several jobs, they are chained together
Queries are compiled and executed
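
The idea of translating a query into chained jobs can be sketched in a few lines of Python. This is a toy, single-process model and not Hive's actual planner: a GROUP BY with COUNT(*) becomes one map/shuffle/reduce job, and an ORDER BY on the result becomes a second job fed by the first. All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Run one simplified MapReduce job: map, shuffle by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    output = []
    for key in sorted(groups):  # reducers see keys in sorted order
        output.extend(reducer(key, groups[key]))
    return output


# Job 1: SELECT country, COUNT(*) ... GROUP BY country
def count_mapper(view):
    country, _page = view
    yield country, 1


def count_reducer(country, ones):
    yield country, sum(ones)


# Job 2: ORDER BY views DESC, fed by the output of job 1
def order_mapper(row):
    country, views = row
    yield -views, country  # negated key: ascending key order = descending views


def order_reducer(neg_views, countries):
    for country in sorted(countries):
        yield country, -neg_views


page_views = [
    ("UK", "/home"), ("US", "/home"), ("US", "/docs"),
    ("DE", "/home"), ("US", "/pricing"), ("UK", "/docs"),
]
counts = run_job(page_views, count_mapper, count_reducer)
ranked = run_job(counts, order_mapper, order_reducer)
print(ranked)
```

Each extra job means another round of job start-up and intermediate data, which is one way to picture why a query that looks small can still take time.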

How does Hive work?
Hive as a structuring Tool
Creates a schema around the data
Tables stored in Directories

Hive Tables
Rows and columns, like SQL tables

Hive Metastore
Namespace with a set of tables
Holds table definitions
Physical Layout
Column Types
Partition Information
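
As a rough illustration of "a schema around the data", the sketch below models a metastore as a dictionary of table definitions (location, column types, partition keys) and applies a definition to raw delimited lines only when they are read. It is a simplified model and not the Hive Metastore API; the class, the functions, the directory path and the data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class TableDefinition:
    name: str
    location: str  # directory holding the table's files (hypothetical path)
    columns: List[Tuple[str, Callable]]  # (column name, Python cast standing in for the type)
    delimiter: str = "\t"
    partition_keys: List[str] = field(default_factory=list)


# The "metastore": a namespace mapping table names to their definitions.
metastore: Dict[str, TableDefinition] = {}


def create_table(definition: TableDefinition) -> None:
    """Register a table definition; the underlying files are not touched."""
    metastore[definition.name] = definition


def read_rows(table_name: str, raw_lines: List[str]) -> List[dict]:
    """Apply the stored schema to raw text lines at read time."""
    definition = metastore[table_name]
    rows = []
    for line in raw_lines:
        parts = line.rstrip("\n").split(definition.delimiter)
        rows.append(
            {name: cast(value) for (name, cast), value in zip(definition.columns, parts)}
        )
    return rows


create_table(
    TableDefinition(
        name="page_views",
        location="/data/page_views",
        columns=[("country", str), ("seconds", int)],
        partition_keys=["view_date"],
    )
)
rows = read_rows("page_views", ["UK\t30\n", "US\t120\n"])
print(rows)
```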

Hive and SQL Data Types

Hive          SQL
Tinyint       Tinyint
SmallInt      Smallint
Int           Int
BigInt        BigInt
Boolean       Bit (setting as NOT NULL)
Float         Real
Double        Float
BigDecimal    Decimal
String        Char, varchar, nvarchar, ntext, text, image
Binary        Binary
Timestamp     Timestamp (note that this is being deprecated); RowVersion
Hive Mathematical Operations
Primitive Types
• Plus
• Negative
• Addition
• Subtraction
• Multiplication
• Division
• Modulus

Complex Types
• Arrays
• Maps
• Structs
• Union
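
For readers more at home in a general-purpose language, the complex types have rough Python analogues: an array behaves like a list, a map like a dict, a struct like a record with named fields, and a union like a value that may be one of several types. The sketch below is only an analogy, and the Customer and Address records are hypothetical.

```python
from typing import Dict, List, NamedTuple, Union


class Address(NamedTuple):  # analogue of a struct with two string fields
    city: str
    postcode: str


class Customer(NamedTuple):
    name: str
    phones: List[str]            # analogue of an array of strings
    attributes: Dict[str, str]   # analogue of a map from string to string
    address: Address             # nested struct
    reference: Union[int, str]   # analogue of a union of int and string


row = Customer(
    name="Acme Ltd",
    phones=["0131 000 0000", "0141 000 0000"],
    attributes={"tier": "gold"},
    address=Address(city="Edinburgh", postcode="EH1"),
    reference=1001,
)

# Element access mirrors index, key and field lookups on the complex types.
first_phone = row.phones[0]
tier = row.attributes["tier"]
city = row.address.city
print(first_phone, tier, city)
```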

Visualising Big Data

Self-Service

Insights
Actions
Different Tools for Different Jobs

Power View
• Highly visual design experience
• Power View is an interactive, ad hoc query and visualization experience.
• It is for business question ‘mystery’ solving

Power Map
• Power Map is a new 3D visualization add-in for Excel, helping you to analyse geographical and temporal data
• Mapping
• Exploring
• Interacting

Data where you want it

Data you want about ‘where’

Data you want to share

Your data…. Fresh.

Demo

What did we learn from the demo?

JOIN US for our second annual event to get the best learning for
analyzing, managing, and sharing business information and
insights through the Microsoft Data Platform of technologies.
Don’t be shy… questions?
Thank you for listening
Sponsors

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Kürzlich hochgeladen (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Data Visualisation with Hadoop Mashups, Hive, Power BI and Excel 2013

  • 1. Visualising Big Data: Big Data Visualisation with Hadoop, Hive and Excel 2013
  • 3. Explore Everything PASS Has to Offer: Free SQL Server and BI Web Events; Free 1-day Training Events; Regional Event; This is Community; Business Analytics Training; Local User Groups Around the World; Session Recordings; PASS Newsletter; Free Online Technical Training
  • 4. About me: Director-At-Large (Elect), PASS Board, from Jan 2014; SQL Server MVP; blogger, data strategist, public speaker, technologist; joint owner of Copper Blue Consulting Ltd
  • 5. Agenda: Overview of Big Data Technologies; Data Visualisation with Office 365 and Power BI; Hive; Visualising Big Data with Microsoft
  • 7. HDInsight Ecosystem: ODBC; Distributed Processing (Map Reduce); Distributed Storage (HDFS); Azure Data Marketplace; Windows Azure Storage
  • 8. What is Hadoop? “Flexible and Available Architecture for Large Scale computation and data processing on a network of highly available commodity hardware.”
  • 9. Hadoop’s Lineage * Resource: Kerberos Konference (Yahoo) – 2010
  • 10. Data Visualisation Background: “We have the tools. All we’ve got to do is imagine what could be. We can reinvent the present; we can transform the world around us.” (Jason Silva)
  • 11. Almost 50% of your brain is dedicated to visual processing (David van Essen). Researchers found that colour visuals increase the willingness to read by 80%. About 70% of your sensory receptors are in your eyes.
  • 12. Why is Data Visualisation Important? “It’s clearly a budget. It has a lot of numbers in it.” (George W Bush) “I could never figure out where the decimal point went.” (Lord Randolph Churchill)
  • 13.
  • 14. The Unknown Unknowns: “That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” (Donald Rumsfeld)
  • 15.
  • 16. What is the purpose of Hive? Hive is a solution to a business problem: how do you analyse large amounts of data? Data scientists want to study data: communicate with the data. Businesses want to reap the benefits of data: results that make sense of the data.
  • 17.
  • 18. What is the purpose of Hive? Hive is a data warehousing system for Hadoop, built to meet the needs of businesses, data scientists, analysts and BI professionals. Data, summarized: fit a structure onto data. Data, analyzed: analysis of large datasets stored in Hadoop file systems; a SQL-like language called HiveQL; custom mappers and reducers when HiveQL isn’t enough.
  • 19. Agenda: Hive solves the business problem of analysing large amounts of data. What is the purpose of Hive? Why Hive? A history of Hive. What are Hive’s constituents?
  • 20. Why Hive? Can’t Hadoop be used to solve these problems? Why is there a need for Hive? Writing MR jobs in Java can be difficult. You don’t know it’s wrong until it’s fallen over! Joining large datasets can be difficult. Learning curve.
  • 21. Agenda: Hive solves the business problem of analysing large amounts of data. What is the purpose of Hive? Why Hive? A history of Hive. What are Hive’s constituents?
  • 24. What can Hive offer you? Hive can help with a range of business problems: log processing, predictive modelling, hypothesis testing and business intelligence.
  • 25. Hive is not a replacement for SQL, so don’t throw out your SQL Server instances! Hive is for processing large data sets that may span hundreds, or even thousands, of machines. Hive has a high overhead for starting a job: it translates queries to MR, so it takes time. Hive does not cache data the way SQL Server does. Hive performance tuning is mainly Hadoop performance tuning. The query engines are similar, but the architectures differ because they serve different purposes.
  • 26. Agenda: Hive solves the business problem of analysing large amounts of data. What is the purpose of Hive? Why Hive? A history of Hive. What are Hive’s constituents? Hive as a SQL-like language query tool; Hive as a translation tool; Hive as a structuring tool.
  • 27. HiveQL: HiveQL is a SQL-like language. It outputs naturally occurring groups for further analysis. Easy data summarization: large datasets, summarized; fit a structure onto data. Analysis of large datasets stored in Hadoop file systems; a SQL-like language called HiveQL; custom mappers and reducers when HiveQL isn’t enough.
  • 28. HiveQL Queries like SQL Queries? Similarities in syntax and features. Similar features: SELECT; FROM; WHERE; GROUP BY / HAVING; table aliases; computed columns.
  • 29. HiveQL Queries like SQL Queries? Similarities in syntax and features. Similar features: aggregate functions; nested SELECT; CASE; LIKE / RLIKE; JOIN; ORDER BY / SORT BY.
  • 30. How does Hive work? Hive as a translation tool: compiles and executes queries. Hive translates the SQL query to a Map Reduce job. These are chained together. Queries are compiled and executed.
  • 31. How does Hive work? Hive as a structuring tool: creates a schema around the data; tables are stored in directories. Hive tables: rows and columns, like SQL tables. Hive Metastore: a namespace with a set of tables; holds table definitions (physical layout, column types, partition information).
  • 32. Hive and SQL Data Types (Hive / SQL): Tinyint / Tinyint; SmallInt / Smallint; Int / Int; BigInt / BigInt; Boolean / Bit (setting as NOT NULL); Float / Float; Double / Real; BigDecimal / Decimal
  • 33. Hive and SQL Data Types (Hive / SQL): String / Char, varchar, nvarchar, ntext, text, image; Binary / binary; Timestamp / Timestamp (note that this is being deprecated), RowVersion
  • 34. Hive Mathematical Operations. Primitive types: plus, negative, addition, subtraction, multiplication, division, modulus. Complex types: arrays, maps, structs, union.
  • 35. How does Hive work? Hive as a structuring tool: creates a schema around the data; tables are stored in directories. Hive tables: rows and columns, like SQL tables. Hive Metastore: a namespace with a set of tables; holds table definitions (physical layout, column types, partition information).
  • 37. Different Tools for Different Jobs: Power View and Power Map. Highly visual design experience. Power Map is a new 3D visualization add-in for Excel that helps you analyse geographical and temporal data. Power View is an interactive, ad hoc query and visualization experience; it is for business question ‘mystery’ solving. Mapping; exploring; interacting.
  • 38. Data where you want it
  • 39. Data you want about ‘where’
  • 40. Data you want to share
  • 43. What did we learn from the demo?
  • 44. JOIN US for our second annual event to get the best learning for analyzing, managing, and sharing business information and insights through the Microsoft Data Platform of technologies.
  • 45. Don’t be shy… questions?
  • 46. Thank you for listening
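
Slides 30 and 31 describe Hive as a structuring tool (a schema fitted onto files in a directory) and as a translation tool (a SQL-like query compiled to a Map Reduce job). The Python sketch below illustrates that idea only; it is not how Hive is implemented. The web_logs table, its columns, the sample rows and every function name are hypothetical, chosen for illustration.

```python
from collections import defaultdict

# Hypothetical table definition, standing in for what the Hive Metastore holds:
# column names and types, plus the field delimiter of the underlying files.
WEB_LOGS_SCHEMA = {
    "columns": [("page", str), ("status", int), ("bytes", int)],
    "delimiter": "\t",
}

# Made-up raw lines, as they might sit in a file in the table's directory.
RAW_LINES = [
    "/home\t200\t512",
    "/about\t200\t128",
    "/home\t404\t0",
    "/home\t200\t256",
]


def apply_schema(line, schema):
    """Structuring step: fit the schema onto one raw line at read time."""
    fields = line.split(schema["delimiter"])
    return {name: cast(value)
            for (name, cast), value in zip(schema["columns"], fields)}


def map_phase(rows):
    """Map: emit (page, 1) for every row that passes WHERE status = 200."""
    for row in rows:
        if row["status"] == 200:
            yield row["page"], 1


def shuffle(pairs):
    """Shuffle: group the emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values per key, giving COUNT(*) per page."""
    return {key: sum(values) for key, values in groups.items()}


def count_ok_hits_per_page(lines, schema):
    # Rough equivalent of the query:
    #   SELECT page, COUNT(*) FROM web_logs WHERE status = 200 GROUP BY page
    rows = (apply_schema(line, schema) for line in lines)
    return reduce_phase(shuffle(map_phase(rows)))


result = count_ok_hits_per_page(RAW_LINES, WEB_LOGS_SCHEMA)
print(result)
```

In this sketch the WHERE clause lands in the map step and the GROUP BY aggregate in the reduce step. A query with several stages, such as a join followed by a sort, would plausibly need more than one such job run in sequence, which is one way to read the chaining mentioned on slide 30.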