SlideShare ist ein Scribd-Unternehmen logo
1 von 13
Downloaden Sie, um offline zu lesen
Building a data warehouse with
Pentaho and Docker
Wellington Marinho
wpmarinho@globo.com
Sources
https://github.com/wmarinho/edw_cenipa
OPEN DATA CASE STUDY: CENIPA - AERONAUTICAL ACCIDENT INVESTIGATION AND PREVENTION CENTER
http://dados.gov.br/dataset/ocorrencias-aeronauticas-da-aviacao-civil-brasileira
Architecture
GitHub
docker-pentaho
( Dockerfile / scripts )
pentaho-biserver:5.4
( imagem)
edw-cenipa
( Dockerfile / scripts )
BI SERVER / PDI
PROJETO EDW
pentaho-kettle:5.4
( imagem)
BI SERVER
PDI
Docker Hub
Jenkins + Docker Compose
Amazon EC2
BI SERVER
Amazon EC2
PDI
Amazon RDS
Postgresql / Redshift
ETL
Data Sources
Dashboards – Aeronautical Accident & Incident
http://localhost/pentaho/plugin/cenipa/api/ocorrencias
Business Analytics
CASE STUDY- EDW CENIPA
EDW CENIPA is a opensource project designed to enable analysis of aeronautical incidentes that occured
in the brazilian civil aviation. The project uses techniques and BI tools that explore innovative low-cost
technologies. Historically, Business Intelligence platforms are expensive and impracticable for small projects.
BI projects require specialized skills and high development costs. This work aims to break this barrier.
All analyzes are based on open data provided by CENIPA with historical events of the last 10 years :
• http://dados.gov.br/dataset/ocorrencias-aeronauticas-da-aviacao-civil-brasileira
The graphics were inspired by the report available on the link:
• http://www.cenipa.aer.mil.br/cenipa/index.php/estatisticas/estatisticas/panorama.
Tools
Here are some resources, tools and platforms that were used to develop and deploy the project
• Amazon Web Services - https://aws.amazon.com/
• Linux Operating System - CentOS 6 / Ubuntu 14
• GitHub - https://github.com/ - Powerful collaboration, code review, and code management for
open source and private projects
• Docker - https://www.docker.com/ - An open platform for distributed applications for developers and
sysadmins.
• Pentaho - http://www.pentaho.com/ e http://community.pentaho.com/ - Big data integration and analytics
solutions.
Requirements
• Linux Operating System 4GB RAM and 10GB available hard disk space
• Docker v1.7.1
• CentOS: https://docs.docker.com/installation/centos/
• Ubuntu: https://docs.docker.com/installation/ubuntulinux/
• Mac : https://docs.docker.com/installation/mac/
• Docker Compose v1.4.2 - https://docs.docker.com/compose/install/
$ yum update -y
$ yum install -y docker
$ service docker start
$ usermod -a -G docker ec2-user
$ yum install -y git
$ pip install -U docker-compose
$ PATH=$PATH:/usr/local/bin
Fast deployment on Amazon Linux AMI
Pentaho + Docker – Building an image from a Dockerfile
FROM java:7
MAINTAINER Wellington Marinho wpmarinho@globo.com
# Init ENV
ENV BISERVER_VERSION 5.4
ENV BISERVER_TAG 5.4.0.1-130
ENV PENTAHO_HOME /opt/pentaho
# Apply JAVA_HOME
RUN . /etc/environment
ENV PENTAHO_JAVA_HOME $JAVA_HOME
ENV PENTAHO_JAVA_HOME /usr/lib/jvm/java-1.7.0-openjdk-amd64
ENV JAVA_HOME /usr/lib/jvm/java-1.7.0-openjdk-amd64
# Install Dependences
RUN apt-get update; apt-get install zip -y; 
apt-get install wget unzip git -y; 
apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*;
RUN mkdir ${PENTAHO_HOME};
# Download Pentaho BI Server
RUN /usr/bin/wget --progress=dot:giga http://downloads.sourceforge.net/project/pentaho/Business%20Intelligence%20Server/${BISERVER_VERSION}/biserver-ce-${BISERVER_TAG}.zip
-O /tmp/biserver-ce-${BISERVER_TAG}.zip; 
/usr/bin/unzip -q /tmp/biserver-ce-${BISERVER_TAG}.zip -d $PENTAHO_HOME; 
rm -f /tmp/biserver-ce-${BISERVER_TAG}.zip $PENTAHO_HOME/biserver-ce/promptuser.sh; 
sed -i -e 's/(exec ".*") start/1 run/' $PENTAHO_HOME/biserver-ce/tomcat/bin/startup.sh; 
chmod +x $PENTAHO_HOME/biserver-ce/start-pentaho.sh
RUN useradd -s /bin/bash -d ${PENTAHO_HOME} pentaho; chown -R pentaho:pentaho ${PENTAHO_HOME};
#Always non-root user
USER pentaho
WORKDIR /opt/pentaho
EXPOSE 8080
CMD ["sh", "/opt/pentaho/biserver-ce/start-pentaho.sh"]
Pentaho BI Server
$ docker build -t pentaho/biserver:5.4 .
$ docker run --rm -p 8080:8080 -it pentaho/biserver:5.4
Building an image and runing docker container
Open Pentaho BI Server
Deploying Project
Deploying EDW CENIPA project
$ wget -O - https://raw.githubusercontent.com/wmarinho/edw_cenipa/master/easy_install | sh
Check if containers are running
$ docker ps
The project has 3 containers :
• edwcenipa_db_1 – PostgreSQL database container
• edwcenipa_pdi_1 – Pentaho Data Integration container
• edwcenipa_biserver_1 – Pentaho BI Server container
Check logs
$ docker logs -f edwcenipa_pdi_1
$ docker logs -f edwcenipa_biserver_1
Installation can take over 30 minutes , depending of server configuration and Internet bandwidth .
Docker Compose
docker-composse.yml – Define and run all docker applications
pdi:
image: image_cenipa/pdi
links:
- biserver:edw_biserver
volumes:
- /data/stage:/tmp/stage
environment:
- PGHOST=172.17.42.1
- PGUSER=pgadmin
- PGPASSWORD=pgadmin.
- PENTAHO_DI_JAVA_OPTIONS=-Xmx2014m -XX:MaxPermSize=256m
biserver:
image: image_cenipa/biserver
ports:
- "80:8080"
links:
- db:edw_db
environment:
- PGUSER=pgadmin
- PGPASSWORD=pgadmin.
- INSTALL_PLUGIN=saiku
- CUSTOM_LAYOUT=y
db:
image: wmarinho/postgresql:9.3
ports:
- "5432:5432"
Pentaho + Docker + Amazon
$ SUBNET_ID=
$ SGROUP_IDS=
$ KEY_NAME=
$ aws ec2 run-instances 
--image-id ami-e3106686 
--instance-type c4.large 
--subnet-id ${SUBNET_ID} 
--security-group-ids ${SGROUP_IDS} 
--key-name ${KEY_NAME} 
--associate-public-ip-address 
--user-data "https://raw.githubusercontent.com/wmarinho/edw_cenipa/master/aws/user-data.sh" 
--count 1
With the following command and the appropriate credentials , you can run the project on
Amazon Web Services. REMEMBER to replace the variables before running the command (check
the parameters in the AWS console) .
Thank you!
Sources:
https://github.com/wmarinho/edw_cenipa
https://github.com/wmarinho/docker-pentaho
https://hub.docker.com/r/wmarinho/pentaho/
Thanks:
Marcelo Módolo – Globosat
Caio Moreno – IT4Biz
Fernando Maia – IT4Biz

Weitere ähnliche Inhalte

Was ist angesagt?

The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...Databricks
 
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...DoKC
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Nelson Calero
 
MAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMarkus Michalewicz
 
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...Yasunori Goto
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino ProjectMartin Traverso
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAltinity Ltd
 
New Generation Oracle RAC Performance
New Generation Oracle RAC PerformanceNew Generation Oracle RAC Performance
New Generation Oracle RAC PerformanceAnil Nair
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
 
Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replicationVenu Ryali
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLSeveralnines
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBAlluxio, Inc.
 
XDP in Practice: DDoS Mitigation @Cloudflare
XDP in Practice: DDoS Mitigation @CloudflareXDP in Practice: DDoS Mitigation @Cloudflare
XDP in Practice: DDoS Mitigation @CloudflareC4Media
 
Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022Grant McAlister
 
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...Altinity Ltd
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationAlex Moundalexis
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLDatabricks
 

Was ist angesagt? (20)

The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
 
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
 
MAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19cMAA Best Practices for Oracle Database 19c
MAA Best Practices for Oracle Database 19c
 
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...
The Forefront of the Development for NVDIMM on Linux Kernel (Linux Plumbers c...
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
State of the Trino Project
State of the Trino ProjectState of the Trino Project
State of the Trino Project
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
New Generation Oracle RAC Performance
New Generation Oracle RAC PerformanceNew Generation Oracle RAC Performance
New Generation Oracle RAC Performance
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
 
SQL Database Mirroring setup
SQL Database Mirroring setupSQL Database Mirroring setup
SQL Database Mirroring setup
 
Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replication
 
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQLWebinar slides: An Introduction to Performance Monitoring for PostgreSQL
Webinar slides: An Introduction to Performance Monitoring for PostgreSQL
 
Scalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDBScalable Filesystem Metadata Services with RocksDB
Scalable Filesystem Metadata Services with RocksDB
 
XDP in Practice: DDoS Mitigation @Cloudflare
XDP in Practice: DDoS Mitigation @CloudflareXDP in Practice: DDoS Mitigation @Cloudflare
XDP in Practice: DDoS Mitigation @Cloudflare
 
Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022Full Page Writes in PostgreSQL PGCONFEU 2022
Full Page Writes in PostgreSQL PGCONFEU 2022
 
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
A Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQLA Deep Dive into Query Execution Engine of Spark SQL
A Deep Dive into Query Execution Engine of Spark SQL
 

Andere mochten auch

Building Data Integration and Transformations using Pentaho
Building Data Integration and Transformations using PentahoBuilding Data Integration and Transformations using Pentaho
Building Data Integration and Transformations using PentahoAshnikbiz
 
Indic threads pune12-accelerating computation in html 5
Indic threads pune12-accelerating computation in html 5Indic threads pune12-accelerating computation in html 5
Indic threads pune12-accelerating computation in html 5IndicThreads
 
Docker Ecosystem: Engine, Compose, Machine, Swarm, Registry
Docker Ecosystem: Engine, Compose, Machine, Swarm, RegistryDocker Ecosystem: Engine, Compose, Machine, Swarm, Registry
Docker Ecosystem: Engine, Compose, Machine, Swarm, RegistryMario IC
 
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12Puppet
 
Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho Uday Kothari
 
Introduction to docker swarm
Introduction to docker swarmIntroduction to docker swarm
Introduction to docker swarmWalid Ashraf
 
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?Carlos Sanchez
 
Migración de datos con OpenERP-Kettle
Migración de datos con OpenERP-KettleMigración de datos con OpenERP-Kettle
Migración de datos con OpenERP-Kettleraimonesteve
 
Advanced ETL2 Pentaho
Advanced ETL2  Pentaho Advanced ETL2  Pentaho
Advanced ETL2 Pentaho Sunny U Okoro
 
NGINX Plus PLATFORM For Flawless Application Delivery
NGINX Plus PLATFORM For Flawless Application DeliveryNGINX Plus PLATFORM For Flawless Application Delivery
NGINX Plus PLATFORM For Flawless Application DeliveryAshnikbiz
 
Jenkins Peru Meetup Docker Ecosystem
Jenkins Peru Meetup Docker EcosystemJenkins Peru Meetup Docker Ecosystem
Jenkins Peru Meetup Docker EcosystemMario IC
 
Elementos ETL - Kettle Pentaho
Elementos ETL - Kettle Pentaho Elementos ETL - Kettle Pentaho
Elementos ETL - Kettle Pentaho valex_haro
 
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) Italy
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) ItalyClustering with Docker Swarm - Dockerops 2016 @ Cento (FE) Italy
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) ItalyGiovanni Toraldo
 
Scaling Jenkins with Docker and Kubernetes
Scaling Jenkins with Docker and KubernetesScaling Jenkins with Docker and Kubernetes
Scaling Jenkins with Docker and KubernetesCarlos Sanchez
 
Pentaho | Data Integration & Report designer
Pentaho | Data Integration & Report designerPentaho | Data Integration & Report designer
Pentaho | Data Integration & Report designerHamdi Hmidi
 
Docker Ecosystem - Part II - Compose
Docker Ecosystem - Part II - ComposeDocker Ecosystem - Part II - Compose
Docker Ecosystem - Part II - ComposeMario IC
 
Docker swarm introduction
Docker swarm introductionDocker swarm introduction
Docker swarm introductionEvan Lin
 
Load Balancing Apps in Docker Swarm with NGINX
Load Balancing Apps in Docker Swarm with NGINXLoad Balancing Apps in Docker Swarm with NGINX
Load Balancing Apps in Docker Swarm with NGINXNGINX, Inc.
 

Andere mochten auch (20)

Building Data Integration and Transformations using Pentaho
Building Data Integration and Transformations using PentahoBuilding Data Integration and Transformations using Pentaho
Building Data Integration and Transformations using Pentaho
 
Indic threads pune12-accelerating computation in html 5
Indic threads pune12-accelerating computation in html 5Indic threads pune12-accelerating computation in html 5
Indic threads pune12-accelerating computation in html 5
 
Docker Ecosystem: Engine, Compose, Machine, Swarm, Registry
Docker Ecosystem: Engine, Compose, Machine, Swarm, RegistryDocker Ecosystem: Engine, Compose, Machine, Swarm, Registry
Docker Ecosystem: Engine, Compose, Machine, Swarm, Registry
 
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12
Continuous Development with Jenkins - Stephen Connolly at PuppetCamp Dublin '12
 
Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
Introduction to docker swarm
Introduction to docker swarmIntroduction to docker swarm
Introduction to docker swarm
 
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?
Scaling Jenkins with Docker: Swarm, Kubernetes or Mesos?
 
Migración de datos con OpenERP-Kettle
Migración de datos con OpenERP-KettleMigración de datos con OpenERP-Kettle
Migración de datos con OpenERP-Kettle
 
Tao zhang
Tao zhangTao zhang
Tao zhang
 
Advanced ETL2 Pentaho
Advanced ETL2  Pentaho Advanced ETL2  Pentaho
Advanced ETL2 Pentaho
 
NGINX Plus PLATFORM For Flawless Application Delivery
NGINX Plus PLATFORM For Flawless Application DeliveryNGINX Plus PLATFORM For Flawless Application Delivery
NGINX Plus PLATFORM For Flawless Application Delivery
 
Jenkins Peru Meetup Docker Ecosystem
Jenkins Peru Meetup Docker EcosystemJenkins Peru Meetup Docker Ecosystem
Jenkins Peru Meetup Docker Ecosystem
 
Elementos ETL - Kettle Pentaho
Elementos ETL - Kettle Pentaho Elementos ETL - Kettle Pentaho
Elementos ETL - Kettle Pentaho
 
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) Italy
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) ItalyClustering with Docker Swarm - Dockerops 2016 @ Cento (FE) Italy
Clustering with Docker Swarm - Dockerops 2016 @ Cento (FE) Italy
 
Scaling Jenkins with Docker and Kubernetes
Scaling Jenkins with Docker and KubernetesScaling Jenkins with Docker and Kubernetes
Scaling Jenkins with Docker and Kubernetes
 
Pentaho | Data Integration & Report designer
Pentaho | Data Integration & Report designerPentaho | Data Integration & Report designer
Pentaho | Data Integration & Report designer
 
Docker Ecosystem - Part II - Compose
Docker Ecosystem - Part II - ComposeDocker Ecosystem - Part II - Compose
Docker Ecosystem - Part II - Compose
 
Docker swarm introduction
Docker swarm introductionDocker swarm introduction
Docker swarm introduction
 
Load Balancing Apps in Docker Swarm with NGINX
Load Balancing Apps in Docker Swarm with NGINXLoad Balancing Apps in Docker Swarm with NGINX
Load Balancing Apps in Docker Swarm with NGINX
 

Ähnlich wie Building a data warehouse with Pentaho and Docker

Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker皓鈞 張
 
Docker Container As A Service - Mix-IT 2016
Docker Container As A Service - Mix-IT 2016Docker Container As A Service - Mix-IT 2016
Docker Container As A Service - Mix-IT 2016Patrick Chanezon
 
BBL Premiers pas avec Docker
BBL Premiers pas avec DockerBBL Premiers pas avec Docker
BBL Premiers pas avec Dockerkanedafromparis
 
Develop with docker 2014 aug
Develop with docker 2014 augDevelop with docker 2014 aug
Develop with docker 2014 augVincent De Smet
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudJung-Hong Kim
 
Dockerizing a Symfony2 application
Dockerizing a Symfony2 applicationDockerizing a Symfony2 application
Dockerizing a Symfony2 applicationRoman Rodomansky
 
From Docker to Production - ZendCon 2016
From Docker to Production - ZendCon 2016From Docker to Production - ZendCon 2016
From Docker to Production - ZendCon 2016Chris Tankersley
 
Deploying Windows Containers on Windows Server 2016
Deploying Windows Containers on Windows Server 2016Deploying Windows Containers on Windows Server 2016
Deploying Windows Containers on Windows Server 2016Ben Hall
 
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Patrick Chanezon
 
Architecting .NET Applications for Docker and Container Based Deployments
Architecting .NET Applications for Docker and Container Based DeploymentsArchitecting .NET Applications for Docker and Container Based Deployments
Architecting .NET Applications for Docker and Container Based DeploymentsBen Hall
 
Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesRobert Lemke
 
Improve your Java Environment with Docker
Improve your Java Environment with DockerImprove your Java Environment with Docker
Improve your Java Environment with DockerHanoiJUG
 
Docker module 1
Docker module 1Docker module 1
Docker module 1Liang Bo
 
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...The Incredible Automation Day
 
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Roberto Hashioka
 
DevOPS training - Day 2/2
DevOPS training - Day 2/2DevOPS training - Day 2/2
DevOPS training - Day 2/2Vincent Mercier
 
Docker for developers on mac and windows
Docker for developers on mac and windowsDocker for developers on mac and windows
Docker for developers on mac and windowsDocker, Inc.
 
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on ContainersWSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on ContainersLakmal Warusawithana
 
Deploying WSO2 Middleware on Containers
Deploying WSO2 Middleware on ContainersDeploying WSO2 Middleware on Containers
Deploying WSO2 Middleware on ContainersImesh Gunaratne
 
WebSphere and Docker
WebSphere and DockerWebSphere and Docker
WebSphere and DockerDavid Currie
 

Ähnlich wie Building a data warehouse with Pentaho and Docker (20)

Introduction to Docker
Introduction to DockerIntroduction to Docker
Introduction to Docker
 
Docker Container As A Service - Mix-IT 2016
Docker Container As A Service - Mix-IT 2016Docker Container As A Service - Mix-IT 2016
Docker Container As A Service - Mix-IT 2016
 
BBL Premiers pas avec Docker
BBL Premiers pas avec DockerBBL Premiers pas avec Docker
BBL Premiers pas avec Docker
 
Develop with docker 2014 aug
Develop with docker 2014 augDevelop with docker 2014 aug
Develop with docker 2014 aug
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
 
Dockerizing a Symfony2 application
Dockerizing a Symfony2 applicationDockerizing a Symfony2 application
Dockerizing a Symfony2 application
 
From Docker to Production - ZendCon 2016
From Docker to Production - ZendCon 2016From Docker to Production - ZendCon 2016
From Docker to Production - ZendCon 2016
 
Deploying Windows Containers on Windows Server 2016
Deploying Windows Containers on Windows Server 2016Deploying Windows Containers on Windows Server 2016
Deploying Windows Containers on Windows Server 2016
 
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
Docker Azure Friday OSS March 2017 - Developing and deploying Java & Linux on...
 
Architecting .NET Applications for Docker and Container Based Deployments
Architecting .NET Applications for Docker and Container Based DeploymentsArchitecting .NET Applications for Docker and Container Based Deployments
Architecting .NET Applications for Docker and Container Based Deployments
 
Scaleable PHP Applications in Kubernetes
Scaleable PHP Applications in KubernetesScaleable PHP Applications in Kubernetes
Scaleable PHP Applications in Kubernetes
 
Improve your Java Environment with Docker
Improve your Java Environment with DockerImprove your Java Environment with Docker
Improve your Java Environment with Docker
 
Docker module 1
Docker module 1Docker module 1
Docker module 1
 
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...
TIAD 2016 : Real-Time Data Processing Pipeline & Visualization with Docker, S...
 
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
Real-Time Data Processing Pipeline & Visualization with Docker, Spark, Kafka ...
 
DevOPS training - Day 2/2
DevOPS training - Day 2/2DevOPS training - Day 2/2
DevOPS training - Day 2/2
 
Docker for developers on mac and windows
Docker for developers on mac and windowsDocker for developers on mac and windows
Docker for developers on mac and windows
 
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on ContainersWSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
WSO2ConEU 2016 Tutorial - Deploying WSO2 Middleware on Containers
 
Deploying WSO2 Middleware on Containers
Deploying WSO2 Middleware on ContainersDeploying WSO2 Middleware on Containers
Deploying WSO2 Middleware on Containers
 
WebSphere and Docker
WebSphere and DockerWebSphere and Docker
WebSphere and Docker
 

Kürzlich hochgeladen

Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...amitlee9823
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachBoston Institute of Analytics
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNKTimothy Spann
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...karishmasinghjnh
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 

Kürzlich hochgeladen (20)

Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Hsr Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Detecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning ApproachDetecting Credit Card Fraud: A Machine Learning Approach
Detecting Credit Card Fraud: A Machine Learning Approach
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 

Building a data warehouse with Pentaho and Docker

  • 1. Building a data warehouse with Pentaho and Docker Wellington Marinho wpmarinho@globo.com Sources https://github.com/wmarinho/edw_cenipa OPEN DATA CASE STUDY: CENIPA - AERONAUTICAL ACCIDENT INVESTIGATION AND PREVENTION CENTER http://dados.gov.br/dataset/ocorrencias-aeronauticas-da-aviacao-civil-brasileira
  • 2. Architecture GitHub docker-pentaho ( Dockerfile / scripts ) pentaho-biserver:5.4 ( imagem) edw-cenipa ( Dockerfile / scripts ) BI SERVER / PDI PROJETO EDW pentaho-kettle:5.4 ( imagem) BI SERVER PDI Docker Hub Jenkins + Docker Compose Amazon EC2 BI SERVER Amazon EC2 PDI Amazon RDS Postgresql / Redshift ETL Data Sources
  • 3. Dashboards – Aeronautical Accident & Incident http://localhost/pentaho/plugin/cenipa/api/ocorrencias
  • 5. CASE STUDY- EDW CENIPA EDW CENIPA is a opensource project designed to enable analysis of aeronautical incidentes that occured in the brazilian civil aviation. The project uses techniques and BI tools that explore innovative low-cost technologies. Historically, Business Intelligence platforms are expensive and impracticable for small projects. BI projects require specialized skills and high development costs. This work aims to break this barrier. All analyzes are based on open data provided by CENIPA with historical events of the last 10 years : • http://dados.gov.br/dataset/ocorrencias-aeronauticas-da-aviacao-civil-brasileira The graphics were inspired by the report available on the link: • http://www.cenipa.aer.mil.br/cenipa/index.php/estatisticas/estatisticas/panorama.
  • 6. Tools Here are some resources, tools and platforms that were used to develop and deploy the project • Amazon Web Services - https://aws.amazon.com/ • Linux Operating System - CentOS 6 / Ubuntu 14 • GitHub - https://github.com/ - Powerful collaboration, code review, and code management for open source and private projects • Docker - https://www.docker.com/ - An open platform for distributed applications for developers and sysadmins. • Pentaho - http://www.pentaho.com/ e http://community.pentaho.com/ - Big data integration and analytics solutions.
  • 7. Requirements • Linux Operating System 4GB RAM and 10GB available hard disk space • Docker v1.7.1 • CentOS: https://docs.docker.com/installation/centos/ • Ubuntu: https://docs.docker.com/installation/ubuntulinux/ • Mac : https://docs.docker.com/installation/mac/ • Docker Compose v1.4.2 - https://docs.docker.com/compose/install/ $ yum update -y $ yum install -y docker $ service docker start $ usermod -a -G docker ec2-user $ yum install -y git $ pip install -U docker-compose $ PATH=$PATH:/usr/local/bin Fast deployment on Amazon Linux AMI
  • 8. Pentaho + Docker – Building an image from a Dockerfile FROM java:7 MAINTAINER Wellington Marinho wpmarinho@globo.com # Init ENV ENV BISERVER_VERSION 5.4 ENV BISERVER_TAG 5.4.0.1-130 ENV PENTAHO_HOME /opt/pentaho # Apply JAVA_HOME RUN . /etc/environment ENV PENTAHO_JAVA_HOME $JAVA_HOME ENV PENTAHO_JAVA_HOME /usr/lib/jvm/java-1.7.0-openjdk-amd64 ENV JAVA_HOME /usr/lib/jvm/java-1.7.0-openjdk-amd64 # Install Dependences RUN apt-get update; apt-get install zip -y; apt-get install wget unzip git -y; apt-get clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*; RUN mkdir ${PENTAHO_HOME}; # Download Pentaho BI Server RUN /usr/bin/wget --progress=dot:giga http://downloads.sourceforge.net/project/pentaho/Business%20Intelligence%20Server/${BISERVER_VERSION}/biserver-ce-${BISERVER_TAG}.zip -O /tmp/biserver-ce-${BISERVER_TAG}.zip; /usr/bin/unzip -q /tmp/biserver-ce-${BISERVER_TAG}.zip -d $PENTAHO_HOME; rm -f /tmp/biserver-ce-${BISERVER_TAG}.zip $PENTAHO_HOME/biserver-ce/promptuser.sh; sed -i -e 's/(exec ".*") start/1 run/' $PENTAHO_HOME/biserver-ce/tomcat/bin/startup.sh; chmod +x $PENTAHO_HOME/biserver-ce/start-pentaho.sh RUN useradd -s /bin/bash -d ${PENTAHO_HOME} pentaho; chown -R pentaho:pentaho ${PENTAHO_HOME}; #Always non-root user USER pentaho WORKDIR /opt/pentaho EXPOSE 8080 CMD ["sh", "/opt/pentaho/biserver-ce/start-pentaho.sh"]
  • 9. Pentaho BI Server $ docker build -t pentaho/biserver:5.4 . $ docker run --rm -p 8080:8080 -it pentaho/biserver:5.4 Building an image and runing docker container Open Pentaho BI Server
  • 10. Deploying Project Deploying EDW CENIPA project $ wget -O - https://raw.githubusercontent.com/wmarinho/edw_cenipa/master/easy_install | sh Check if containers are running $ docker ps The project has 3 containers : • edwcenipa_db_1 – PostgreSQL database container • edwcenipa_pdi_1 – Pentaho Data Integration container • edwcenipa_biserver_1 – Pentaho BI Server container Check logs $ docker logs -f edwcenipa_pdi_1 $ docker logs -f edwcenipa_biserver_1 Installation can take over 30 minutes , depending of server configuration and Internet bandwidth .
  • 11. Docker Compose docker-composse.yml – Define and run all docker applications pdi: image: image_cenipa/pdi links: - biserver:edw_biserver volumes: - /data/stage:/tmp/stage environment: - PGHOST=172.17.42.1 - PGUSER=pgadmin - PGPASSWORD=pgadmin. - PENTAHO_DI_JAVA_OPTIONS=-Xmx2014m -XX:MaxPermSize=256m biserver: image: image_cenipa/biserver ports: - "80:8080" links: - db:edw_db environment: - PGUSER=pgadmin - PGPASSWORD=pgadmin. - INSTALL_PLUGIN=saiku - CUSTOM_LAYOUT=y db: image: wmarinho/postgresql:9.3 ports: - "5432:5432"
  • 12. Pentaho + Docker + Amazon $ SUBNET_ID= $ SGROUP_IDS= $ KEY_NAME= $ aws ec2 run-instances --image-id ami-e3106686 --instance-type c4.large --subnet-id ${SUBNET_ID} --security-group-ids ${SGROUP_IDS} --key-name ${KEY_NAME} --associate-public-ip-address --user-data "https://raw.githubusercontent.com/wmarinho/edw_cenipa/master/aws/user-data.sh" --count 1 With the following command and the appropriate credentials , you can run the project on Amazon Web Services. REMEMBER to replace the variables before running the command (check the parameters in the AWS console) .