SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
Sai Paravastu
Principal
BAR360
BIG Data and BIRT
Source and Definition
Big data is a collection of data sets so large and complex that it becomes difficult to
process using currently on-hand database management tools or traditional data
processing applications
Source: Wikipedia
Web Logs RFID Sensors Social Networks
Internet Text Searches
Call Detail
Records
Astronomy
Atmospheric
Info
Genomics Biogeochemical Biological
Military Surveillance Medical Records E-Commerce Video
Top Big Data Challenges
• Data integration
– Top challenge
– Integrating disparate data, different sources, different formats is difficult
• Getting started with the right project
– Building the right team
– Determine the top business problem
• Architecting a big data system.
– High volume, high frequency data
– Build unified information architecture
• Lack of skills or staff
– Some hire externally / university hires.
– Others try to re-train from within.
– Cross pollinate skills from another part of the organization
– Build centers of excellence that help with the training
Source:TDWI
Big Data Growth Drivers
• Increased awareness of the Big Data benefits
– Not just web, financial services, pharmaceuticals, retail
• Increased maturity of Big Data software
– Data stores, analytical engines
• Increased availability of professional services
– Supporting business use cases
• Increased investment in infrastructure
– Google, Facebook, Amazon
Source: Wikibon Community
Big Data Opportunities
• A single view opens new paths to value
– ACQUIRE NEW CUSTOMERS
– GROW EXISTING RELATIONSHIPS
– RETAIN VALUABLE CUSTOMERS
• Preventative maintenance, resource optimization and proactive
intervention
– PREVENTATIVE MAINTENANCE
– RESOURCE OPTIMIZATION
– BEHAVIORAL INSIGHT
• Data discovery opens new paths to value
– EXPLORE
– COMBINE
New Approaches to Big Data Processing & Analytics
Traditional tools and technologies are
straining
• New approaches to data processing
– Commodity hardware to scale
– Parallel processing techniques
– Non-relational data storage
capabilities
– Unstructured, semi-structured data
• Better analytics
– Advanced visualization
– Data mining
Source: Wikibon Community
New Approaches to Big Data Processing & Analytics
– Hadoop Approach
• Data broken into “parts”
• Loaded into file system
• Multiple nodes
• MapReduce
• Batch-style historical analysis
– NoSQL
• Cassandra, MongoDB, CouchDB, HBase*
• Discrete data stored among large volumes
• Higher performance than relational data sources
– Massively Parallel Analytic Databases
• Quickly ingest mostly structured data
• Minimal data modeling
• Scale to petabytes of data
• Near real-time results to complex SQL
Source: Wikibon Community
Big Data or Little Data - How Do You Display Yours?
The Eclipse Foundation would like to better understand
how developers are using Eclipse with big data and
reporting projects.
Eclipse Promoted the Survey this survey to get the pulse of what technologies where in
demand related to Eclipse/BIRT technologies
60% of 518 responders claimed to be big data users
Eclipse BIRT Survey
0 10 20 30 40 50
None
Hadoop
BIRT
MongoDB
R
Hive
Mahout
Cassandra
Talend
What Big Data technologies are you using with Eclipse?
Eclipse BIRT Survey - Technology Choices
Note: Responders could choose more than one option
BIRT supports Big Data
September 2004 BIRT Project proposal accepted, and project launched
June 2005 1.0 Eclipse Report Designer, Report Engine, Chart Engine
December 2005 2.0 Support for a wide variety of common layouts
June 2006 2.1 Advanced parameters, ability to join data sets, …
June 2007 2.2 Dynamic crosstab support, web services data source, …
June 2008 2.3 JavaScript Debugger, BiDi Support, Charts in Crosstabs, …
June 2009 2.5 Page aggregates, Multiple drill-downs in Charts, …
June 2010 2.6 New charts, more chart control, developer productivity, …
June 2011 3.7 POJO Runtime, Hive/Hadoop, Open Office emitters…
June 2012 4.2 Maven Support, Excel Data Source, Relative Time Periods…
June 2013 4.3 POJO Data Source, MongoDB/Cassandra support, client JS
June 2014 4.4 Minor release
BIRT Offers many ways to get data
Standard Data Sources
Flat File (CSV, TSV, SSV, PSV)
Hive Data Source (Hadoop)
Cassandra Scripted Data Source
MongoDB Data Source
JDBC Textual or Graphical
Web Service - XPath syntax
XML - XPath syntax
XLS/XLSX
Scripted Data Source Written in Java or
JavaScript
Open Data Access (ODA) DTP Project
Community Contributions
GoogleDocs
XML/A
Casandra
REST
MongoDB
Multi-Flat File
GitHub
Twitter JSON Search
Dropbox usage
YQL
Google Analytics
LinkedIn
Facebook FQL
BIRT Data Access
BIRT Community
BIRT OEM Adopters
Software and Technology Companies
Eclipse / Hadoop Ecosystem
Connecting to Hadoop
Hortonworks Sandbox Demo
http://hortonworks.com/hadoop-tutorial/birt_reporting_tutorial/
Connecting to Hadoop
Hive JDBC – HQL Sub Query Example
Hive JDBC – get_json_object UDF
Hive JDBC – RegExP Example
Hive JDBC – HQL Hints example
Hive JDBC – Transform Example
Cassandra – Scripted Data Set Example
BIRT in the World
of Big Data
Questions?
BIRT Open Source – Thank you
Forrester Wave Report
Actuate proposed and started BIRT
Business Intelligence and Reporting Tools
Project a top-level Eclipse project
Professional open source
Primary development resources
funded by Actuate
IBM uses BIRT in over 40 products for reporting
We have a pool of local BIRT resources
We regularly provide Training in BIRT
We organise regular BIRT User Group conferences
Search “BUG Downunder” to join our user group
learn@bar360.com.au

Weitere ähnliche Inhalte

Was ist angesagt?

Data Governance with Profisee, Microsoft & CCG
Data Governance with Profisee, Microsoft & CCG Data Governance with Profisee, Microsoft & CCG
Data Governance with Profisee, Microsoft & CCG CCG
 
Enterprise Data World Webinar: A Strategic Approach to Data Quality
Enterprise Data World Webinar: A Strategic Approach to Data Quality Enterprise Data World Webinar: A Strategic Approach to Data Quality
Enterprise Data World Webinar: A Strategic Approach to Data Quality DATAVERSITY
 
Enterprise Data Management
Enterprise Data ManagementEnterprise Data Management
Enterprise Data ManagementBhavendra Chavan
 
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...DATAVERSITY
 
Revolution In Data Governance - Transforming the customer experience
Revolution In Data Governance - Transforming the customer experienceRevolution In Data Governance - Transforming the customer experience
Revolution In Data Governance - Transforming the customer experiencePaul Dyksterhouse
 
Data Governance and MDM | Profisse, Microsoft, and CCG
Data Governance and MDM | Profisse, Microsoft, and CCGData Governance and MDM | Profisse, Microsoft, and CCG
Data Governance and MDM | Profisse, Microsoft, and CCGCCG
 
How to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachHow to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachPrecisely
 
Kickstart a Data Quality Strategy to Build Trust in Data
Kickstart a Data Quality Strategy to Build Trust in DataKickstart a Data Quality Strategy to Build Trust in Data
Kickstart a Data Quality Strategy to Build Trust in DataPrecisely
 
How to Structure the Data Organization
How to Structure the Data OrganizationHow to Structure the Data Organization
How to Structure the Data OrganizationRobyn Bollhorst
 
Unlocking Success in the 3 Stages of Master Data Management
Unlocking Success in the 3 Stages of Master Data ManagementUnlocking Success in the 3 Stages of Master Data Management
Unlocking Success in the 3 Stages of Master Data ManagementPerficient, Inc.
 
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...Denodo
 
Data Management Meets Human Management - Why Words Matter
Data Management Meets Human Management - Why Words MatterData Management Meets Human Management - Why Words Matter
Data Management Meets Human Management - Why Words MatterDATAVERSITY
 
Data governance
Data governanceData governance
Data governanceMD Redaan
 
Sustaining Data Governance and Adding Value for the Long Term
Sustaining Data Governance and Adding Value for the Long TermSustaining Data Governance and Adding Value for the Long Term
Sustaining Data Governance and Adding Value for the Long TermFirst San Francisco Partners
 
Complying with Cybersecurity Regulations for IBM i Servers and Data
Complying with Cybersecurity Regulations for IBM i Servers and DataComplying with Cybersecurity Regulations for IBM i Servers and Data
Complying with Cybersecurity Regulations for IBM i Servers and DataPrecisely
 
Linking Data Governance to Business Goals
Linking Data Governance to Business GoalsLinking Data Governance to Business Goals
Linking Data Governance to Business GoalsPrecisely
 
Governance beyond master data
Governance beyond master dataGovernance beyond master data
Governance beyond master dataGary Allemann
 
The Key Reason Why Your DG Program is Failing
The Key Reason Why Your DG Program is FailingThe Key Reason Why Your DG Program is Failing
The Key Reason Why Your DG Program is FailingCCG
 

Was ist angesagt? (20)

Data Governance with Profisee, Microsoft & CCG
Data Governance with Profisee, Microsoft & CCG Data Governance with Profisee, Microsoft & CCG
Data Governance with Profisee, Microsoft & CCG
 
Enterprise Data World Webinar: A Strategic Approach to Data Quality
Enterprise Data World Webinar: A Strategic Approach to Data Quality Enterprise Data World Webinar: A Strategic Approach to Data Quality
Enterprise Data World Webinar: A Strategic Approach to Data Quality
 
Enterprise Data Management
Enterprise Data ManagementEnterprise Data Management
Enterprise Data Management
 
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
Subscribing to Your Critical Data Supply Chain - Getting Value from True Data...
 
Revolution In Data Governance - Transforming the customer experience
Revolution In Data Governance - Transforming the customer experienceRevolution In Data Governance - Transforming the customer experience
Revolution In Data Governance - Transforming the customer experience
 
Data Governance and MDM | Profisse, Microsoft, and CCG
Data Governance and MDM | Profisse, Microsoft, and CCGData Governance and MDM | Profisse, Microsoft, and CCG
Data Governance and MDM | Profisse, Microsoft, and CCG
 
How to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First ApproachHow to Build Data Governance Programs That Last: A Business-First Approach
How to Build Data Governance Programs That Last: A Business-First Approach
 
Kickstart a Data Quality Strategy to Build Trust in Data
Kickstart a Data Quality Strategy to Build Trust in DataKickstart a Data Quality Strategy to Build Trust in Data
Kickstart a Data Quality Strategy to Build Trust in Data
 
How to Structure the Data Organization
How to Structure the Data OrganizationHow to Structure the Data Organization
How to Structure the Data Organization
 
Unlocking Success in the 3 Stages of Master Data Management
Unlocking Success in the 3 Stages of Master Data ManagementUnlocking Success in the 3 Stages of Master Data Management
Unlocking Success in the 3 Stages of Master Data Management
 
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...
TDWI Spotlight: Enabling Data Self-Service with Security, Governance, and Reg...
 
Data Management Meets Human Management - Why Words Matter
Data Management Meets Human Management - Why Words MatterData Management Meets Human Management - Why Words Matter
Data Management Meets Human Management - Why Words Matter
 
Strategy For Data Quality
Strategy For Data QualityStrategy For Data Quality
Strategy For Data Quality
 
Data governance
Data governanceData governance
Data governance
 
Sustaining Data Governance and Adding Value for the Long Term
Sustaining Data Governance and Adding Value for the Long TermSustaining Data Governance and Adding Value for the Long Term
Sustaining Data Governance and Adding Value for the Long Term
 
Complying with Cybersecurity Regulations for IBM i Servers and Data
Complying with Cybersecurity Regulations for IBM i Servers and DataComplying with Cybersecurity Regulations for IBM i Servers and Data
Complying with Cybersecurity Regulations for IBM i Servers and Data
 
Linking Data Governance to Business Goals
Linking Data Governance to Business GoalsLinking Data Governance to Business Goals
Linking Data Governance to Business Goals
 
Data Quality
Data QualityData Quality
Data Quality
 
Governance beyond master data
Governance beyond master dataGovernance beyond master data
Governance beyond master data
 
The Key Reason Why Your DG Program is Failing
The Key Reason Why Your DG Program is FailingThe Key Reason Why Your DG Program is Failing
The Key Reason Why Your DG Program is Failing
 

Ähnlich wie Eclipse day Sydney 2014 BIG data presentation

Introduction to BIG DATA
Introduction to BIG DATA Introduction to BIG DATA
Introduction to BIG DATA Zeeshan Khan
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneySai Paravastu
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)Moacyr Passador
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Group
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amirydatastack
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolutionitnewsafrica
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dataconomy Media
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big DataFrank Kienle
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with HadoopPhilippe Julio
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupScott Mitchell
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2RojaT4
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoopSri Kanth
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureCaserta
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSStéphane Fréchette
 

Ähnlich wie Eclipse day Sydney 2014 BIG data presentation (20)

Introduction to BIG DATA
Introduction to BIG DATA Introduction to BIG DATA
Introduction to BIG DATA
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
 
Skillwise Big Data part 2
Skillwise Big Data part 2Skillwise Big Data part 2
Skillwise Big Data part 2
 
Skilwise Big data
Skilwise Big dataSkilwise Big data
Skilwise Big data
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
Dr. Christian Kurze from Denodo, "Data Virtualization: Fulfilling the Promise...
 
Introduction Big Data
Introduction Big DataIntroduction Big Data
Introduction Big Data
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User GroupBig Data and BI Tools - BI Reporting for Bay Area Startups User Group
Big Data and BI Tools - BI Reporting for Bay Area Startups User Group
 
Big data unit 2
Big data unit 2Big data unit 2
Big data unit 2
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Big data
Big dataBig data
Big data
 
Incorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic ArchitectureIncorporating the Data Lake into Your Analytic Architecture
Incorporating the Data Lake into Your Analytic Architecture
 
Modernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APSModernizing Your Data Warehouse using APS
Modernizing Your Data Warehouse using APS
 
TSE_Pres12.pptx
TSE_Pres12.pptxTSE_Pres12.pptx
TSE_Pres12.pptx
 

Kürzlich hochgeladen

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Kürzlich hochgeladen (20)

"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

Eclipse day Sydney 2014 BIG data presentation

  • 2. Source and Definition Big data is a collection of data sets so large and complex that it becomes difficult to process using currently on-hand database management tools or traditional data processing applications Source: Wikipedia Web Logs RFID Sensors Social Networks Internet Text Searches Call Detail Records Astronomy Atmospheric Info Genomics Biogeochemical Biological Military Surveillance Medical Records E-Commerce Video
  • 3. Top Big Data Challenges • Data integration – Top challenge – Integrating disparate data, different sources, different formats is difficult • Getting started with the right project – Building the right team – Determine the top business problem • Architecting a big data system. – High volume, high frequency data – Build unified information architecture • Lack of skills or staff – Some hire externally / university hires. – Others try to re-train from within. – Cross pollinate skills from another part of the organization – Build centers of excellence that help with the training Source:TDWI
  • 4. Big Data Growth Drivers • Increased awareness of the Big Data benefits – Not just web, financial services, pharmaceuticals, retail • Increased maturity of Big Data software – Data stores, analytical engines • Increased availability of professional services – Supporting business use cases • Increased investment in infrastructure – Google, Facebook, Amazon Source: Wikibon Community
  • 5. Big Data Opportunities • A single view opens new paths to value – ACQUIRE NEW CUSTOMERS – GROW EXISTING RELATIONSHIPS – RETAIN VALUABLE CUSTOMERS • Preventative maintenance, resource optimization and proactive intervention – PREVENTATIVE MAINTENANCE – RESOURCE OPTIMIZATION – BEHAVIORAL INSIGHT • Data discovery opens new paths to value – EXPLORE – COMBINE
  • 6. New Approaches to Big Data Processing & Analytics Traditional tools and technologies are straining • New approaches to data processing – Commodity hardware to scale – Parallel processing techniques – Non-relational data storage capabilities – Unstructured, semi-structured data • Better analytics – Advanced visualization – Data mining Source: Wikibon Community
  • 7. New Approaches to Big Data Processing & Analytics – Hadoop Approach • Data broken into “parts” • Loaded into file system • Multiple nodes • MapReduce • Batch-style historical analysis – NoSQL • Cassandra, MongoDB, CouchDB, HBase* • Discrete data stored among large volumes • Higher performance than relational data sources – Massively Parallel Analytic Databases • Quickly ingest mostly structured data • Minimal data modeling • Scale to petabytes of data • Near real-time results to complex SQL Source: Wikibon Community
  • 8. Big Data or Little Data - How Do You Display Yours? The Eclipse Foundation would like to better understand how developers are using Eclipse with big data and reporting projects. Eclipse Promoted the Survey this survey to get the pulse of what technologies where in demand related to Eclipse/BIRT technologies 60% of 518 responders claimed to be big data users Eclipse BIRT Survey
  • 9. 0 10 20 30 40 50 None Hadoop BIRT MongoDB R Hive Mahout Cassandra Talend What Big Data technologies are you using with Eclipse? Eclipse BIRT Survey - Technology Choices Note: Responders could choose more than one option
  • 10. BIRT supports Big Data September 2004 BIRT Project proposal accepted, and project launched June 2005 1.0 Eclipse Report Designer, Report Engine, Chart Engine December 2005 2.0 Support for a wide variety of common layouts June 2006 2.1 Advanced parameters, ability to join data sets, … June 2007 2.2 Dynamic crosstab support, web services data source, … June 2008 2.3 JavaScript Debugger, BiDi Support, Charts in Crosstabs, … June 2009 2.5 Page aggregates, Multiple drill-downs in Charts, … June 2010 2.6 New charts, more chart control, developer productivity, … June 2011 3.7 POJO Runtime, Hive/Hadoop, Open Office emitters… June 2012 4.2 Maven Support, Excel Data Source, Relative Time Periods… June 2013 4.3 POJO Data Source, MongoDB/Cassandra support, client JS June 2014 4.4 Minor release
  • 11. BIRT Offers many ways to get data Standard Data Sources Flat File (CSV, TSV, SSV, PSV) Hive Data Source (Hadoop) Cassandra Scripted Data Source MongoDB Data Source JDBC Textual or Graphical Web Service - XPath syntax XML - XPath syntax XLS/XLSX Scripted Data Source Written in Java or JavaScript Open Data Access (ODA) DTP Project Community Contributions GoogleDocs XML/A Casandra REST MongoDB Multi-Flat File GitHub Twitter JSON Search Dropbox usage YQL Google Analytics LinkedIn Facebook FQL BIRT Data Access
  • 13. BIRT OEM Adopters Software and Technology Companies
  • 14. Eclipse / Hadoop Ecosystem
  • 15. Connecting to Hadoop Hortonworks Sandbox Demo http://hortonworks.com/hadoop-tutorial/birt_reporting_tutorial/
  • 17. Hive JDBC – HQL Sub Query Example
  • 18. Hive JDBC – get_json_object UDF
  • 19. Hive JDBC – RegExP Example
  • 20. Hive JDBC – HQL Hints example
  • 21. Hive JDBC – Transform Example
  • 22. Cassandra – Scripted Data Set Example
  • 23.
  • 24. BIRT in the World of Big Data Questions?
  • 25. BIRT Open Source – Thank you Forrester Wave Report Actuate proposed and started BIRT Business Intelligence and Reporting Tools Project a top-level Eclipse project Professional open source Primary development resources funded by Actuate IBM uses BIRT in over 40 products for reporting We have a pool of local BIRT resources We regularly provide Training in BIRT We organise regular BIRT User Group conferences Search “BUG Downunder” to join our user group learn@bar360.com.au