Eskimo - Big Data 2.0 Management Platform
© 2019 eskimo.sh - All rights reserved
Hello World!
I am Eskimo, a Big Data Management Web Console to build, manage and operate Big Data 2.0 clusters using Docker and Mesos.
Eskimo is …
… in a certain way, the Operating System of your Big Data Cluster:
- an Administration Application aimed at drastically simplifying the deployment, administration and operation of your Big Data Cluster
- a state-of-the-art Big Data 2.0 platform based on Docker, Mesos and Systemd, packaging Gluster, Spark, Kafka, Flink and ElasticSearch
- a collection of ready-to-use Docker containers packaging fine-tuned and highly customized plug-and-play services
- a framework for building and deploying Big Data and NoSQL services based on Docker
Eskimo Key Features
Abstraction of location
Just define which services should run where and let Eskimo take care of everything.
Move services between nodes or install new services in just a few clicks.
Don’t bother remembering where you installed Web consoles and UI applications: Eskimo wraps them all in a single UI.
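As a minimal sketch of what abstraction of location means in practice, the Python below models a cluster as a mapping from service names to node addresses and moves one service to another node. The function names, the data model and the IP addresses are assumptions made for illustration; the slides do not describe how Eskimo stores this information.

```python
# Hypothetical placement model: service name -> list of node addresses.
# None of these names or addresses come from Eskimo itself.

def move_service(placement, service, source, target):
    """Return a new placement with `service` moved from `source` to `target`."""
    nodes = placement.get(service, [])
    if source not in nodes:
        raise ValueError(f"{service} is not running on {source}")
    if target in nodes:
        raise ValueError(f"{service} is already running on {target}")
    updated = dict(placement)  # shallow copy, the original mapping stays untouched
    updated[service] = [target if node == source else node for node in nodes]
    return updated


def services_on(placement, node):
    """List, in alphabetical order, the services placed on `node`."""
    return sorted(name for name, nodes in placement.items() if node in nodes)


placement = {
    "elasticsearch": ["192.168.10.11", "192.168.10.12"],
    "kafka": ["192.168.10.11"],
    "zeppelin": ["192.168.10.11"],
}

new_placement = move_service(placement, "zeppelin", "192.168.10.11", "192.168.10.12")
print(services_on(new_placement, "192.168.10.12"))
```

The point of the sketch is that the user only edits the mapping; everything needed to make the move effective on the nodes is left to the platform.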
Eskimo Web Console
The tip of the Eskimo iceberg is its flagship web console.
The Eskimo Console is the single entry point to all your cluster operations, from service installation to accessing Kibana, Zeppelin and other UI applications.
The Eskimo Console also provides SSH consoles, file browser access and monitoring for your cluster.
Services Framework
Eskimo is a development and integration framework for Big Data component services, based on Docker and Systemd.
Eskimo provides ready-to-use components out of the box, such as Spark, ElasticSearch, Kafka, Mesos, Zeppelin, etc.
Eskimo also enables users to develop their own services very easily.
Why is Eskimo cool? 4 major reasons.
- Taking care of it!
Making Mesos, Kafka, ElasticSearch, Flink, Spark, etc. work perfectly together is difficult and tedious. Eskimo takes care of everything.
- Big Data 2.0
Most if not all private-cloud Big Data Platforms, such as Hortonworks, Cloudera, MapR, etc., are based on Hadoop, HDFS, YARN, etc., which are quite old components and technologies. Eskimo is based on Mesos, ElasticSearch, Kafka and Spark, cutting-edge components from a newer generation.
- Leveraging Docker
Most if not all private-cloud Big Data Platforms, such as Hortonworks, Cloudera, MapR, etc., install components natively, which imposes strong requirements and impacts on the underlying nodes. Eskimo uses Docker to isolate Eskimo components from the underlying hosts and vice versa.
- Eskimo is an open platform.
Eskimo works out of the box, but users can customize and extend it however they like.
Big Data 2.0 Technologies
Apache Mesos, Apache Kafka, Cerebro, Elastic Kibana, Elastic Logstash, Apache Zeppelin, ElasticSearch, Gluster FS, Apache Zookeeper
Why Eskimo? (1/2)
Big Data 2.0
Contrary to popular Hadoop-based and other Big Data Platforms, Eskimo is based on cutting-edge technologies:
- GlusterFS instead of HDFS
- Spark instead of Hive and Pig
- Mesos instead of YARN
- Docker instead of native deployment
- ElasticSearch instead of HBase
- Flink instead of Storm
One ring to rule them all
Making Docker, Gluster, ElasticSearch, Kafka, Spark, Zeppelin, etc. all work together flawlessly is very tedious and difficult.
Eskimo takes care of everything and fine-tunes all these services so that they understand each other and work together.
Eskimo gives you one-click administration of all of them: moving services, provisioning nodes, etc.
Yet it is open: open-source and built on standards.
One size fits all
Do you want to build a production-grade Big Data Processing cluster with thousands of nodes to analyze the internet?
Or do you want to build a small AI laboratory on your own laptop?
Eskimo is made for you in both cases.
Lightweight in DNA
MapR, Hortonworks, Cloudera and every other Hadoop-based Big Data Platform are behemoths.
Eskimo leverages Gluster, Mesos, Spark, ElasticSearch, Logstash, Kibana, Zeppelin, etc.: simple and extremely lightweight components that cover a broad range of use cases while simplifying administration, operation and usage.
Why Eskimo? (2/2)
Open platform, extensible and customizable
Eskimo works out of the box, taking on the burden of making all this software work together flawlessly.
Eskimo is not a black box; it’s an open platform. One can fine-tune and adapt everything exactly as desired, from the building of the Docker containers to the setup of services on the platform.
Want to use Eskimo to integrate other services such as Apache Flink or Cassandra? Declare your own services and import your own containers, built as you like!
Universal Platform
Eskimo is built entirely on top of Docker.
Only Mesos agents need to be compiled and adapted to the host Linux OS running your cluster nodes.
All the other components, from Kafka to Zeppelin through Spark, run on Docker.
Eskimo has been successfully tested on Ubuntu, Debian, CentOS and Fedora nodes so far; more are coming.
Cloud Friendly
Build your own Big Data Cloud.
Eskimo is VM friendly.
Do you have a bunch of VMs somewhere on Amazon or Google Cloud?
Turn them into a state-of-the-art Big Data cluster your way, not Amazon's or Google's predefined, fixed and constraining way.
Choose your services and let Eskimo take care of everything.
Simplicity as a core value
Eskimo relies on simple approaches and technologies.
No fancy scripting language, just plain old shell scripts.
No fancy container management middleware, just plain old Docker and systemd.
Eskimo doesn’t require you to learn anything other than standard Linux tools.
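The "plain old Docker and systemd" approach can be pictured with a short sketch. The Python below renders a systemd unit that runs a Docker container as a service. The unit layout, the `render_unit` helper and the image name `example/kafka:latest` are assumptions made for illustration, not Eskimo's actual unit files.

```python
def render_unit(service, image, ports=()):
    """Render a systemd unit that runs `image` as the container `service`."""
    # Build the docker command as a list first to keep spacing predictable.
    run = ["/usr/bin/docker", "run", "--rm", "--name", service]
    for port in ports:
        run += ["-p", f"{port}:{port}"]  # publish the same port on the host
    run.append(image)
    lines = [
        "[Unit]",
        f"Description={service} (Docker container)",
        "After=docker.service",
        "Requires=docker.service",
        "",
        "[Service]",
        # The leading '-' tells systemd to ignore a failure of this cleanup step.
        f"ExecStartPre=-/usr/bin/docker rm -f {service}",
        "ExecStart=" + " ".join(run),
        f"ExecStop=/usr/bin/docker stop {service}",
        "Restart=always",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical service and image names, used only for the example.
unit = render_unit("kafka", "example/kafka:latest", ports=[9092])
print(unit)
```

With a unit like this, starting, stopping and checking a containerized service only needs the standard `systemctl` and `journalctl` tools, which matches the simplicity argument made on the slide.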
Eskimo Platform Overview
(Diagram)
- Functional areas: Data Loading, Data Visualization, Data Processing, System Admin.
- Components shown: Logstash, Kibana, Mesos Console, Flink App Master, Prometheus, Grafana, Spark History, Cerebro, Python, Apache Spark, Apache Kafka, Apache Flink, Zeppelin, Kafka Manager, ElasticSearch, Gluster FS, Eskimo
- Base layers: Apache Mesos; Docker / SystemD / Linux
Eskimo Platform Architecture
(Diagram)
- Data Ingestion: Logstash
- Data Storage: ElasticSearch, Gluster FS
- Low Level and System Orchestration: Docker, Zookeeper
- Resources Management: Mesos
- Data Processing: Kafka, Spark, Flink
- Monitoring: Prometheus, Grafana
- Data Science: Kibana, Zeppelin
- Administration and Management: Flink Dashboard, Cerebro, Kafka Manager, Spark History, Mesos Console
Eskimo Platform Technical Architecture
(Diagram)
- Tiers: User tier, Processing tier, Storage layer
- Access: Web Browser, HTTP / JDBC / APIs
- Elements shown: Eskimo, Mesos UI, Spark History, Kafka Manager, Cerebro, Kibana, Zeppelin, Grafana, Flink Dashboard, Kafka, Logstash, Flink, Mesos Master, Mesos Agent, Zookeeper, Prometheus, ElasticSearch, ES Hadoop, Data frames, Parquet
- Storage: Gluster FS, Gluster Shares - /var/lib/spark/*, /run & /var/log (temporary files)
- Runtime: Docker, Node.js, Java VM, Linux Operating System
Eskimo Typical Application architecture
(Diagram)
- Layers: Data capture & extraction layer, Raw data storage & dispatching layer, Data processing and Analytics layer, Result visualization & management layer
- Flows: Data capture flow, Data Processing flow, Data Analytics and Visualization flow
- Source: External World / Source Data
- Processing modes: STP, STP / RT, Real-Time (RT), Batch (B.), File Copy scripts
- Components: Eskimo, Logstash, Kafka, Gluster FS, Mesos, ElasticSearch, Flink, Prometheus
- UI applications: Zeppelin, Kibana, Cerebro, Spark UI, Mesos UI, Kafka Manager, Grafana, Flink Dashboard
Eskimo Example System architecture
(Diagram)
- Main Node: Eskimo
- Master Node 1, Master Node 2: Mesos Master, Mesos UI, Spark History, Kafka Manager, Cerebro, Kibana, Zeppelin, Grafana, Flink Dashboard
- Slave Node 1: Gluster FS, Mesos Agent, Logstash, ElasticSearch
- Slave Node 2: Gluster FS, Mesos Agent, Logstash, ElasticSearch
- Slave Node 3: Gluster FS, Mesos Agent, Flink, ElasticSearch
- Slave Node 4: Gluster FS, Mesos Agent, Flink, ElasticSearch
- Also shown: Zookeeper, Prometheus (four instances, labeled "Prom.")
SSH Tunneling
(Diagram) The Web Browser talks to the Eskimo HTTP Proxy on the Main Node; Eskimo reaches the UI applications on Master Node 1 and Master Node 2 through SSH tunnels.
UI applications shown: Mesos UI, Spark History, Kafka Manager, Cerebro, Kibana, Zeppelin, Grafana, Flink Dashboard.
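To make the tunneling idea concrete, here is a minimal sketch, assuming the standard OpenSSH client, of the command that forwards a local port to a UI application running on a cluster node. The `tunnel_command` helper, the `eskimo` user name, the node address and the local port are assumptions for illustration; the slides do not show how Eskimo actually opens its tunnels. Port 5601 is the default Kibana port.

```python
import shlex


def tunnel_command(node, remote_port, local_port, user="eskimo"):
    """Build an OpenSSH command forwarding local_port to remote_port on `node`.

    -N : do not run a remote command, only forward ports
    -L : local port forwarding, written as local_port:host:remote_port
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{node}"]


# Hypothetical node address and local port, used only for the example.
kibana_tunnel = tunnel_command("192.168.10.11", remote_port=5601, local_port=15601)
print(shlex.join(kibana_tunnel))
```

Once such a tunnel is open, an HTTP proxy on the main node can serve the remote UI from `http://localhost:15601`, so the browser never needs direct network access to the cluster nodes.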
Gluster Infrastructure
Gluster FS is used to share service folders across nodes, e.g.
- /var/lib/spark/data and /var/lib/spark/event_log
- /var/lib/flink/data
- /var/lib/logstash/data
- Etc.
(Diagram) Master Node 1, Slave Node 1 and Slave Node 2 each run:
- a Gluster Docker Container with a command server (gluster_remote.py) exposing an HTTP REST API
- the Eskimo Gluster Infrastructure: a command client (gluster_call_remote.sh) and the Gluster Toolbox (gluster_mount.sh)
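The command server / command client pairing can be sketched in a few lines of Python using only the standard library. This is not the code of gluster_remote.py or gluster_call_remote.sh, which the slides do not show: the JSON request format, the command names and the allowlist are assumptions, and the server below only reports the command it would run instead of executing it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical allowlist: command name -> argv the server would run.
ALLOWED_COMMANDS = {
    "volume-list": ["gluster", "volume", "list"],
    "peer-status": ["gluster", "peer", "status"],
}


class CommandHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        argv = ALLOWED_COMMANDS.get(request.get("command"))
        if argv is None:
            self._reply(400, {"error": "unknown command"})
        else:
            # Dry run: report the argv instead of executing it.
            self._reply(200, {"argv": argv})

    def _reply(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def call_remote(base_url, command):
    """Client side: POST a command name and return the decoded JSON reply."""
    data = json.dumps({"command": command}).encode("utf-8")
    request = urllib.request.Request(
        base_url, data=data, headers={"Content-Type": "application/json"}
    )
    # Ignore any proxy configured in the environment; the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


# Start the server on a free local port, call it once, then stop it.
server = HTTPServer(("127.0.0.1", 0), CommandHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
result = call_remote(f"http://127.0.0.1:{server.server_port}/", "volume-list")
server.shutdown()
server.server_close()
print(result)
```

Restricting the server to a fixed set of named commands, rather than accepting arbitrary command lines, is a design choice of this sketch; it keeps a remotely reachable endpoint from becoming a general remote shell.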
Screenshots:
- Eskimo System Status
- Zeppelin Application in Eskimo
- Kibana Application in Eskimo
- Eskimo cluster nodes configuration
- Spark UI in Eskimo
- Kafka Manager in Eskimo
- Flink App Master in Eskimo
- Eskimo SFTP File Manager
- Eskimo Web SSH Console
- Mesos Console in Eskimo
- Cerebro (ElasticSearch admin) in Eskimo
- Grafana (Cluster monitoring) in Eskimo
Community Edition vs. Enterprise Edition
Features available in both the Community Edition and the Enterprise Edition:
- Setup of the cluster and installation of services
- Support for Red Hat-based and Debian-based Linux nodes
- Support for single-node deployments or clusters of hundreds of nodes
- Prepackaged and effective distributions of Docker containers and systemd configurations for supported services and UI applications (see below)
- Automatic installation and setup of packaged services: ntp, zookeeper, gluster, mesos, kafka, spark, flink, elasticsearch, logstash
- Automatic installation, setup and embedding of UI services: Kibana, Grafana, Gdash, Mesos Console, Kafka Manager, Spark History, Flink App Master, Cerebro, Zeppelin
- Service customization and extension with the service development framework
- Management of services and abstraction of location (easy moving)
- Monitoring of services and cluster status
- Web SSH Terminals
- Web Cluster SFTP File Management
- Documentation and guides
Features available in the Enterprise Edition only:
- Advanced settings management for services
- High Availability for service masters and UI services
- Backup and restore functions
- Encryption (security feature: data encryption and intra-cluster communications encryption)
- User management and user impersonation when accessing services through Eskimo
- File Editor integrated in the SFTP File Manager
- Support
Thank you.