Introducing BigSheets
Spreadsheet-Style Tool
for IBM InfoSphere BigInsights

Cynthia M. Saracco
Senior Solution Architect
IBM Silicon Valley Lab
What is BigSheets?
Browser-based analytics tool for business users.

Why BigSheets?
Business users need a non-technical approach for analyzing Big Data.
Translating untapped data into actionable business insights is a common requirement.
Visualizing and drilling down into enterprise and Web data promotes new business intelligence.

How can BigSheets help?
Spreadsheet-like interface enables business users to gather and analyze data easily.
Built-in “readers” can work with data in several common formats (JSON, CSV, TSV, …)
Users can combine and explore various types of data to identify “hidden” insights.

© 2013 IBM Corporation
What you can do with BigSheets
Model “big data” collected from various sources in spreadsheet-like structures
Filter and enrich content with built-in functions
Combine data in different workbooks
Visualize results through spreadsheets, charts
Export data into common formats (if desired)
No programming knowledge needed!

Sample Scenario
Data gathering
• WebCrawler app
• DBMS import app
• BoardReader app
• Accelerators
• Flume
• Hadoop commands
• ...

Data storage
• Distributed file system
• Web-based file browser and administration

Data exploration, manipulation, and analysis
• BigSheets

InfoSphere BigInsights

Blue italics = IBM technology

Technology

Working with BigSheets
Create workbook (spreadsheet-style structure) to model target data
Customize workbook through graphical editor and built-in functions
– Filter data
– Manipulate data (e.g., concatenate fields)
– Combine data from multiple workbooks

“Run” workbook: apply work to full data set
Explore results in spreadsheet format and/or create charts
Optionally, export your data
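
BigSheets itself needs no programming, but the filter, manipulate, and combine steps above can be sketched in plain Python to show what each one does to the rows. This is only an analogy, not the BigSheets implementation or API: the sample rows, column names, and helper functions are all hypothetical.

```python
# Hypothetical sample rows standing in for two workbooks (not real BigSheets data).
posts = [
    {"id": 1, "first": "Ana", "last": "Lee", "lang": "en"},
    {"id": 2, "first": "Raj", "last": "Rao", "lang": "fr"},
    {"id": 3, "first": "Mia", "last": "Cho", "lang": "en"},
]
scores = [{"id": 1, "score": 0.9}, {"id": 3, "score": 0.4}]


def filter_rows(rows, column, value):
    """Keep only rows whose column equals value (the 'Filter data' step)."""
    return [row for row in rows if row[column] == value]


def concatenate(rows, left, right, target):
    """Add a column that joins two fields with a space (the 'Manipulate data' step)."""
    return [{**row, target: f"{row[left]} {row[right]}"} for row in rows]


def combine(rows_a, rows_b, key):
    """Inner-join two row sets on a shared key (the 'Combine data' step)."""
    index = {row[key]: row for row in rows_b}
    return [{**row, **index[row[key]]} for row in rows_a if row[key] in index]


# "Run": apply the whole chain of steps to the full (here, tiny) data set.
english = filter_rows(posts, "lang", "en")
named = concatenate(english, "first", "last", "name")
result = combine(named, scores, "id")

for row in result:
    print(row["name"], row["score"])
```

In the product, the same chain is built through the graphical editor and applied to the full data set when the workbook is run.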

What are Workbooks?
Spreadsheet-like structures defined by user
Based on data accessible in BigInsights

Creating a Workbook (one approach)
From BigSheets tab of Web console, click New Workbook button
Supply input
– Workbook name
– Source file (select from file system directory tree)
– Appropriate “reader” (data format translator)
• Built-in readers for Web data, JSON, CSV, TSV, Hive, etc.
• User-written plug-ins supported

Save the workbook
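
The idea of a “reader” as a data format translator can be illustrated with a small Python sketch: one function that turns raw text in different formats into the same row structure. This is an assumption-laden illustration, not how BigSheets readers are written. The function name `read_rows`, the format labels, and the sample data are hypothetical.

```python
import csv
import io
import json


def read_rows(text, fmt):
    """Translate raw text in one of three formats into a list of dict rows."""
    if fmt == "json":
        # Assumes the input is a JSON array of flat objects.
        return json.loads(text)
    if fmt in ("csv", "tsv"):
        delimiter = "," if fmt == "csv" else "\t"
        # Assumes the first line is a header row naming the columns.
        return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    raise ValueError(f"no reader for format: {fmt}")


# The same logical row, supplied in three different formats.
csv_rows = read_rows("name,city\nAna,Lima\n", "csv")
tsv_rows = read_rows("name\tcity\nAna\tLima\n", "tsv")
json_rows = read_rows('[{"name": "Ana", "city": "Lima"}]', "json")

print(csv_rows == tsv_rows == json_rows)
```

Whatever the source format, the rest of the workbook sees the same rows and columns, which is why the reader is chosen once, when the workbook is created.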

Customizing a workbook
Work with built-in editor
Add / delete columns
Filter data
Specify formulas to compute new values using spreadsheet-style syntax
Apply built-in or custom macro functions
– Supplied text analytic functions for popular business entities: person, location, phone number, etc.

...

Visualizing results
Built-in charting facility aids analysis
Pie charts, bar charts, tag clouds, maps, etc.
Hover over sections to reveal details

Exporting data
Useful for sharing with downstream applications
Several common formats supported
Save to distributed file system or display in browser (Save As -> local file)
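
For a downstream application, an export amounts to serializing the workbook's rows into a text format. The slide does not name the supported formats, so the sketch below assumes CSV and JSON as two plausible ones. The rows and function names are hypothetical, and this is not the BigSheets export code.

```python
import csv
import io
import json

# Hypothetical result rows, as they might appear in a workbook after a run.
rows = [{"name": "Ana Lee", "score": 0.9}, {"name": "Mia Cho", "score": 0.4}]


def to_csv(rows):
    """Serialize rows to CSV text with a header line taken from the first row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows to a JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(rows)
json_text = to_json(rows)
print(csv_text)
```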

On-demand videos
Available on YouTube’s IBM Big Data Channel at http://www.youtube.com/user/ibmbigdata
“Analyzing Social Media for IBM Watson”
“Big Data Patent Analysis with BigSheets”
“Big Data for Business Users”
“BigSheets in Action”
See the full list of videos at http://tinyurl.com/biginsights

Supplemental

Inspecting runtime statistics

Displaying the workflow diagram

Built-in text analysis functions
Included with BigInsights Version 2.1
BigSheets functions for extracting common business entities from text-based columns
– Address, EmailAddress, Country, Person, etc.
– Based on pre-built text extractor library provided with BigInsights

Add Sheet -> Function -> Categories -> entities
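
To give a feel for what an entity function such as EmailAddress does to a text-based column, here is a minimal Python sketch. The real functions rely on the pre-built text extractor library shipped with BigInsights. The regular expression below is a deliberately simplified stand-in, and the pattern, function name, and sample comments are all assumptions made for illustration.

```python
import re

# Simplified pattern (an assumption for illustration); a production extractor
# handles many more address forms and edge cases than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical text-based column, one comment per row.
comments = [
    "Contact ana.lee@example.com for details",
    "No address here",
    "Write to raj@example.org or mia@example.net",
]

# One list of extracted entities per input row, like a new derived column.
extracted = [extract_emails(comment) for comment in comments]
for comment, emails in zip(comments, extracted):
    print(emails)
```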


Big Data: Technical Introduction to BigSheets for InfoSphere BigInsights

  • 1. Introducing BigSheets: Spreadsheet-Style Tool for IBM InfoSphere BigInsights. Cynthia M. Saracco, Senior Solution Architect, IBM Silicon Valley Lab
  • 2. What is BigSheets?
      – Browser-based analytics tool for business users.
      – Why BigSheets?
          • Business users need a non-technical approach for analyzing Big Data.
          • Translating untapped data into actionable business insights is a common requirement.
          • Visualizing and drilling down into enterprise and Web data promotes new business intelligence.
      – How can BigSheets help?
          • Spreadsheet-like interface enables business users to gather and analyze data easily.
          • Built-in “readers” can work with data in several common formats (JSON, CSV, TSV, …)
          • Users can combine and explore various types of data to identify “hidden” insights.
    © 2013 IBM Corporation
  • 3. What you can do with BigSheets
      – Model “big data” collected from various sources in spreadsheet-like structures
      – Filter and enrich content with built-in functions
      – Combine data in different workbooks
      – Visualize results through spreadsheets, charts
      – Export data into common formats (if desired)
      – No programming knowledge needed!
  • 4. Sample Scenario (InfoSphere BigInsights)
      – Data gathering
          • WebCrawler app
          • DBMS import app
          • BoardReader app
          • Accelerators
          • Flume
          • Hadoop commands
          • ...
      – Data storage
          • Distributed file system
          • Web-based file browser and administration
      – Data exploration, manipulation, and analysis
          • BigSheets
      – Blue italics = IBM technology
  • 6. Working with BigSheets
      – Create workbook (spreadsheet-style structure) to model target data
      – Customize workbook through graphical editor and built-in functions
          • Filter data
          • Manipulate data (e.g., concatenate fields)
          • Combine data from multiple workbooks
      – “Run” workbook: apply work to full data set
      – Explore results in spreadsheet format and/or create charts
      – Optionally, export your data
  • 7. What are Workbooks?
      – Spreadsheet-like structures defined by user
      – Based on data accessible in BigInsights
  • 8. Creating a Workbook (one approach)
      – From BigSheets tab of Web console, click New Workbook button
      – Supply input
          • Workbook name
          • Source file (select from file system directory tree)
          • Appropriate “reader” (data format translator): built-in readers for Web data, JSON, CSV, TSV, Hive, etc.; user-written plug-ins supported
      – Save the workbook
  • 9. Customizing a workbook
      – Work with built-in editor
      – Add / delete columns
      – Filter data
      – Specify formulas to compute new values using spreadsheet-style syntax
      – Apply built-in or custom macro functions
          • Supplied text analytic functions for popular business entities: person, location, phone number, etc.
      – ...
  • 10. Visualizing results
      – Built-in charting facility aids analysis
      – Pie charts, bar charts, tag clouds, maps, etc.
      – Hover over sections to reveal details
  • 11. Exporting data
      – Useful for sharing with downstream applications
      – Several common formats supported
      – Save to distributed file system or display in browser (Save As -> local file)
  • 12. On-demand videos
      – Available on YouTube’s IBM Big Data Channel at http://www.youtube.com/user/ibmbigdata
          • “Analyzing Social Media for IBM Watson”
          • “Big Data Patent Analysis with BigSheets”
          • “Big Data for Business Users”
          • “BigSheets in Action”
      – See the full list of videos at http://tinyurl.com/biginsights
  • 14. Inspecting runtime statistics
  • 15. Displaying the workflow diagram
  • 16. Built-in text analysis functions
      – Included with BigInsights Version 2.1
      – BigSheets functions for extracting common business entities from text-based columns
          • Address, EmailAddress, Country, Person, etc.
          • Based on pre-built text extractor library provided with BigInsights
      – Add Sheet -> Function -> Categories -> entities
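
BigSheets itself needs no programming, so the following is only a plain-Python analogy of the workbook steps listed in slides 6, 8, 9, 11, and 16: a "reader" translates a source format into rows, the workbook is filtered and enriched, and the result is exported. It does not use any BigSheets or BigInsights API. The sample data, column names, function names, and the simplified email pattern are all hypothetical and chosen for illustration.

```python
import csv
import io
import json
import re

# Hypothetical sample data standing in for a source file in the file system.
SAMPLE_CSV = """first,last,country,comment
Ana,Lopez,ES,Contact me at ana@example.com
Bo,Chen,US,No contact details given
Cy,Diaz,ES,Write to cy.diaz@example.org today
"""

# Deliberately simplified pattern (an assumption for this sketch);
# a real entity extractor would be far more thorough.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def read_csv(text):
    """'Reader' step: translate raw CSV text into rows keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def customize(rows, country):
    """Filter rows, concatenate two fields, and extract an email entity."""
    result = []
    for row in rows:
        if row["country"] != country:  # filter data
            continue
        match = EMAIL_RE.search(row["comment"])  # entity extraction from a text column
        result.append({
            "full_name": row["first"] + " " + row["last"],  # concatenate fields
            "email": match.group(0) if match else "",
        })
    return result


def export_json(rows):
    """Export step: serialize the customized rows to a common format."""
    return json.dumps(rows, indent=2)


workbook = customize(read_csv(SAMPLE_CSV), country="ES")
print(export_json(workbook))
```

Each function maps loosely to one stage of the slides: `read_csv` to choosing a reader, `customize` to editing the workbook, and `export_json` to exporting data for downstream applications.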