Dynamic Provisioning of Data Intensive Computing Middleware Frameworks: A Case Study
Linh B. Ngo1
Michael E. Payne1
Flavio Villanustre2
Richard Taylor2
Amy W. Apon1
1School of Computing, Clemson University
2LexisNexis® Risk Solutions
Contents
1.  Overview of Clemson University's Cyberinfrastructure Resource
2.  Demand for Dynamic Data-Intensive Computing Middleware Frameworks
3.  Dynamic Provisioning of Data-Intensive Computing Framework
4.  Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®
5.  Lessons Learned
Cyberinfrastructure Resource at Clemson University
Condominium model
2,007 compute nodes (21,400 cores), including 276 GPU nodes
Sustained 551 Tflops (benchmarked on GPU nodes only)
1289 active users, 12 academic departments across 36 fields of research
Facilities
Cyberinfrastructure Resource at Clemson University
•  1G/10G/Myrinet-10G/Infiniband-40G/Infiniband-56G
•  Local storage between 100-200GB (majority) and 400-900GB (since 2013)
•  Shared 233TB OrangeFS scratch space and more than 3PB archival space
Demand for Dynamic Data-Intensive Computing Middleware Frameworks
•  Genome Sequencing (Hadoop MapReduce/GPGPU)
•  Molecular Dynamic Forward Flux Sampling (Hadoop Streaming/LAMMPS)
•  Streaming Data Infrastructure for Connected Vehicle System (Hadoop Distributed File System/Spark/Kafka)
•  Big Scholarly Data (HPCC Systems)
•  CS Course in Distributed and Cluster Computing (MPI/MapReduce, Hadoop/Spark/HPCC Systems® …)
Demand for Dynamic Data-Intensive Computing Middleware Frameworks
•  Changes in cyberinfrastructure support model for data infrastructure:
–  Beyond a traditional remote distributed file system model
–  From static and dedicated resource to dynamic resource
–  Data management processes co-locate with computing processes
•  Challenges for system administrators:
–  Accommodating different frameworks for different research
–  Complying with existing administrative policy and scheduling priority
•  What can users do?
–  Deploying dynamic data-intensive computing frameworks within the limits of user privilege and without the intervention of administrators
Dynamic Provisioning of Data-Intensive Computing Framework:
Installation
•  Where to install
1.  Home directory: Persistent, limited in storage
2.  Shared distributed storage: Fast, semi-persistent, "unlimited" storage
3.  Local storage on compute node: Fast, non-persistent, requires reinstallation
•  How to handle dependencies
1.  Ideally in home or shared distributed storage (persistency)
2.  Dynamic loading mechanisms via environment paths
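The second dependency option above, dynamic loading via environment paths, might look like the following minimal sketch. The prefix `~/software/hpcc-deps`, its `bin/` and `lib/` layout, and the helper name `with_user_prefix` are hypothetical; the slides do not say where the dependencies were installed.

```python
import os


def with_user_prefix(env, prefix):
    """Return a copy of env with a user-owned install prefix searched first.

    PATH and LD_LIBRARY_PATH are the usual Linux search paths for
    executables and shared libraries. The bin/ and lib/ layout under the
    prefix is an assumption for illustration.
    """
    new_env = dict(env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        entry = os.path.join(prefix, subdir)
        current = new_env.get(var, "")
        # Prepend so the user-built version is found before any system copy.
        new_env[var] = entry + (os.pathsep + current if current else "")
    return new_env


if __name__ == "__main__":
    prefix = os.path.expanduser("~/software/hpcc-deps")  # hypothetical location
    env = with_user_prefix(os.environ, prefix)
    print(env["LD_LIBRARY_PATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching a framework daemon, so nothing outside the user's own process tree is modified.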
  
Dynamic Provisioning of Data-Intensive Computing Framework:
Deployment
[Figure: deployment workflow with steps 1-4. Labels: user.palmetto.clemson.edu, Deployment/Configuration Scripts, PBS_NODEFILE, Target deployment directories on local disks]
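As a rough illustration of how a deployment script could consume PBS_NODEFILE, the sketch below reads the list of allocated hosts and derives per-node target directories. The function names, the `/local_scratch` base path, and the directory names are assumptions, not taken from the authors' scripts.

```python
import os


def read_allocated_nodes(nodefile_path):
    """Return the allocated host names in first-seen order, without repeats.

    A host may appear on several lines of the node file (for example once
    per requested process slot), so duplicates are dropped.
    """
    nodes = []
    seen = set()
    with open(nodefile_path) as handle:
        for line in handle:
            host = line.strip()
            if host and host not in seen:
                seen.add(host)
                nodes.append(host)
    return nodes


def target_directories(base_dir, user):
    """Per-node directories to create on local disk (names are hypothetical)."""
    root = os.path.join(base_dir, user)
    return {name: os.path.join(root, name) for name in ("log", "storage", "pid")}


if __name__ == "__main__":
    # PBS sets PBS_NODEFILE inside a running job; outside a job this does nothing.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        for host in read_allocated_nodes(nodefile):
            print(host, target_directories("/local_scratch", "someuser"))
```

Keeping the first-seen order matters because the later slides assign roles by position in PBS_NODEFILE.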
  
Deploying Hadoop Ecosystem vs. deploying HPCC Systems®:
Overview
•  Hadoop Ecosystem: Open source alternatives based on the conceptual architecture of a data-intensive computing infrastructure developed by Google
•  HPCC Systems®: Comprehensive data-intensive computing system targeting enterprise users, developed in early 2000, open source since 2011
Deploying Hadoop Ecosystem vs. deploying HPCC Systems®:
Installation: Hadoop
•  Self-contained, pre-compiled jar files
•  No installation is needed, relies on shell scripts to launch component daemons
•  Dependencies: JDK
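Because Hadoop only needs an unpacked distribution and a JDK, a user-level launch mostly comes down to pointing its shell scripts at user-writable locations. The sketch below builds such an environment. The variable names are the ones Hadoop's `hadoop-env.sh` commonly reads, but they should be checked against the release in use; the function name and directory layout are hypothetical.

```python
import os


def hadoop_launch_env(hadoop_home, java_home, work_dir, base_env=None):
    """Environment for Hadoop's own launch scripts, with user-writable paths.

    work_dir would typically sit on node-local storage; its conf/log/pid
    layout is an assumption for illustration.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "JAVA_HOME": java_home,
        "HADOOP_HOME": hadoop_home,
        "HADOOP_CONF_DIR": os.path.join(work_dir, "conf"),
        "HADOOP_LOG_DIR": os.path.join(work_dir, "log"),
        "HADOOP_PID_DIR": os.path.join(work_dir, "pid"),
    })
    # Put the Hadoop and JDK binaries ahead of whatever PATH was inherited.
    env["PATH"] = os.pathsep.join(
        [os.path.join(hadoop_home, "bin"),
         os.path.join(java_home, "bin"),
         env.get("PATH", "")]
    ).rstrip(os.pathsep)
    return env
```

Passing this dictionary as `env` to `subprocess.run` keeps the configuration local to the job instead of relying on system-wide settings.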
  
Deploying Hadoop Ecosystem vs. deploying HPCC Systems®:
Installation: HPCC Systems
•  Standard configure/make/make install
–  Assumption about an industrial production environment (with administrative privileges)
–  Modification to avoid hard-coded system installation paths
–  Modification of template XML configuration files to avoid default HPCC Systems-specific user creation and administrative check
•  Dependencies:
–  Not on Palmetto: ICU, Xalan, Xerces, APR …
–  On Palmetto but not in the correct version: Binutils
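For a missing dependency that ships a conventional configure script, installing without administrative privileges usually means redirecting the install root to a user-owned prefix. The sketch below only generates the commands. The prefix, the source directory name, and the helper name are hypothetical, and packages with a different build system need their own equivalent of `--prefix`.

```python
import os
import shlex


def build_commands(source_dir, prefix, jobs=4):
    """Commands for a configure/make/make install build into a user prefix.

    Each entry is (working directory, argument list). --prefix is the
    standard configure option for choosing the install root.
    """
    return [
        (source_dir, ["./configure", "--prefix=" + prefix]),
        (source_dir, ["make", "-j", str(jobs)]),
        (source_dir, ["make", "install"]),
    ]


if __name__ == "__main__":
    prefix = os.path.expanduser("~/software/hpcc-deps")  # hypothetical location
    for cwd, argv in build_commands("apr-src", prefix):  # directory name is illustrative
        quoted = " ".join(shlex.quote(arg) for arg in argv)
        print("(cd {} && {})".format(shlex.quote(cwd), quoted))
```

Printing the commands first makes it easy to review them before running them with `subprocess.run(argv, cwd=cwd, check=True)`.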
  
Deploying Hadoop Ecosystem vs. deploying HPCC Systems:
Deployment: Hadoop
•  Component placement determination
•  Cleanup target directories from previous deployment
•  Create target directories (log, storage, pid …)
•  Synchronize order of component start-up
•  Additional components (HBase, Hive, Kafka …) can be added to this deployment model
[Figure: component placement across the nodes listed in PBS_NODEFILE (1st through nth node). Master components: NameNode, ResourceManager, SparkMaster. Worker components on each remaining node: DataNode, NodeManager, SparkExecutor]
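The placement and start-up ordering steps above can be sketched as follows. The mapping of the first three nodes to one master component each is an inference from the figure labels, and the specific start order is an assumption; neither is confirmed by the slides. Component names follow the slide.

```python
def place_components(nodes):
    """Map each node (in PBS_NODEFILE order) to the components it runs.

    Assumption: the first three nodes host one master each and every
    remaining node is a worker.
    """
    if len(nodes) < 4:
        raise ValueError("this sketch expects at least four nodes")
    masters = ["NameNode", "ResourceManager", "SparkMaster"]
    placement = {node: [role] for node, role in zip(nodes[:3], masters)}
    for node in nodes[3:]:
        placement[node] = ["DataNode", "NodeManager", "SparkExecutor"]
    return placement


# Each master is started before the workers that register with it, and
# storage comes up before resource management and Spark (assumed order).
START_ORDER = ["NameNode", "DataNode", "ResourceManager", "NodeManager",
               "SparkMaster", "SparkExecutor"]


def startup_plan(placement):
    """Return (component, node) pairs in the order they should be started."""
    plan = []
    for component in START_ORDER:
        for node, components in placement.items():
            if component in components:
                plan.append((component, node))
    return plan
```

Additional components such as HBase or Hive would extend `START_ORDER` and the per-node lists without changing the overall approach.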
  
Deploying Hadoop Ecosystem vs. deploying HPCC Systems:
Deployment: HPCC Systems
•  Determine node allocation and internal IP addresses
•  HPCC Systems is configured via its own deployment programs (configmgr, configgen, hpcc-init)
[Figure: HPCC Systems component placement across the nodes listed in PBS_NODEFILE (1st through nth node)]
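Determining the internal IP addresses of the allocated nodes could be done along these lines. This is a sketch under assumptions: the function names and the one-address-per-line output file are hypothetical, and how the resulting list is handed to the HPCC Systems configuration programs is not covered by the slides.

```python
import socket


def internal_addresses(nodes, resolve=socket.gethostbyname):
    """Resolve each allocated host name to an IP address.

    resolve is injectable so the lookup can be replaced in tests, or on
    systems where the internal interface is reached under a different name.
    """
    return [(node, resolve(node)) for node in nodes]


def write_ip_list(addresses, path):
    """Write one IP address per line (file format assumed for illustration)."""
    with open(path, "w") as handle:
        for _node, ip in addresses:
            handle.write(ip + "\n")
```

The node list would come from PBS_NODEFILE, so the order of the addresses matches the node order used for component placement.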
  
Deploying Hadoop Ecosystem vs. deploying HPCC Systems:
Deployment: HPCC Systems
•  Node memory constraints
•  HPCC Systems reserves 75% of available memory for thor by default
•  Palmetto does not allow unlimited memory reservation
•  As a result, thor_master cannot launch new jobs via fork()
•  Resolved by lowering the memory reservation
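The arithmetic behind lowering the reservation can be illustrated with a small sketch. The slides do not say which system limit applied or what value was finally chosen, so the limit is passed in as a plain number and the `headroom` factor is a hypothetical safety margin.

```python
def capped_reservation(total_bytes, limit_bytes=None,
                       default_fraction=0.75, headroom=0.5):
    """Pick a memory reservation that stays under a per-user limit.

    default_fraction mirrors the 75% default mentioned above. headroom is
    an assumed factor that leaves part of the limit free for new processes.
    limit_bytes=None means no limit is enforced.
    """
    wanted = int(total_bytes * default_fraction)
    if limit_bytes is None:
        return wanted
    return min(wanted, int(limit_bytes * headroom))


if __name__ == "__main__":
    GiB = 1024 ** 3
    # Illustrative numbers only: a 64 GiB node with a 32 GiB limit.
    print(capped_reservation(64 * GiB, limit_bytes=32 * GiB) // GiB, "GiB")
```

The chosen value would then be written into the framework's configuration in place of the default.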
  
Lessons Learned
•  A common approach can be adapted for both Hadoop Ecosystem and HPCC Systems
•  Limitations on non-administrative accounts can impact the deployment and performance via system resource constraints
–  Unable to utilize all available memory on allocated node (HPCC Systems)
•  Dynamic deployment via non-administrative accounts lets users take the initiative to experiment with and utilize new large-scale frameworks without additional burden for administrators
Lessons Learned
•  Experience in deploying as users is, in turn, extremely applicable to the process of deployment with administrative privileges.
•  E.g.: CloudLab cloud computing experimental testbed with non-persistent, ephemeral, and short-term (15 hours) allocation
–  Script-based installation and deployment are needed, even with administrative rights, to automate the deployment of the experiment
•  Experience in deploying as administrators is helpful in debugging user-based deployment:
–  Identification and resolution of the memory allocation issue in HPCC Systems were done by changing system limits using administrative commands.
QUESTIONS?
Linh B. Ngo1 Michael E. Payne1 Flavio Villanustre2 Richard Taylor2 Amy W. Apon1
{lngo,mpayne3,aapon}@clemson.edu
1School of Computing, Clemson University
{flavio.villanustre,richard.taylor}@lexisnexis.com
2LexisNexis Risk Solutions
More information about HPCCSystems can be found at http://hpccsystems.com

Weitere ähnliche Inhalte

Was ist angesagt?

HDFS Federation++
HDFS Federation++HDFS Federation++
HDFS Federation++Hortonworks
 
Hadoop, Evolution of Hadoop, Features of Hadoop
Hadoop, Evolution of Hadoop, Features of HadoopHadoop, Evolution of Hadoop, Features of Hadoop
Hadoop, Evolution of Hadoop, Features of HadoopDr Neelesh Jain
 
Panasas ® University of Cologne Success Story
Panasas ® University of Cologne Success StoryPanasas ® University of Cologne Success Story
Panasas ® University of Cologne Success StoryPanasas
 
IRJET- A Novel Approach to Process Small HDFS Files with Apache Spark
IRJET- A Novel Approach to Process Small HDFS Files with Apache SparkIRJET- A Novel Approach to Process Small HDFS Files with Apache Spark
IRJET- A Novel Approach to Process Small HDFS Files with Apache SparkIRJET Journal
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2hdhappy001
 
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...DataWorks Summit
 
Dynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersDynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersIRJET Journal
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 Chris Almond
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path ForwardAlluxio, Inc.
 
Hadoop migration and upgradation
Hadoop migration and upgradationHadoop migration and upgradation
Hadoop migration and upgradationShashwat Shriparv
 
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Spark Summit
 
Schedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop clusterSchedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop clusterShivraj Raj
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase clientShashwat Shriparv
 

Was ist angesagt? (20)

HDFS Federation++
HDFS Federation++HDFS Federation++
HDFS Federation++
 
Hadoop, Evolution of Hadoop, Features of Hadoop
Hadoop, Evolution of Hadoop, Features of HadoopHadoop, Evolution of Hadoop, Features of Hadoop
Hadoop, Evolution of Hadoop, Features of Hadoop
 
Panasas ® University of Cologne Success Story
Panasas ® University of Cologne Success StoryPanasas ® University of Cologne Success Story
Panasas ® University of Cologne Success Story
 
IRJET- A Novel Approach to Process Small HDFS Files with Apache Spark
IRJET- A Novel Approach to Process Small HDFS Files with Apache SparkIRJET- A Novel Approach to Process Small HDFS Files with Apache Spark
IRJET- A Novel Approach to Process Small HDFS Files with Apache Spark
 
Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2Nicholas:hdfs what is new in hadoop 2
Nicholas:hdfs what is new in hadoop 2
 
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...
NonStop Hadoop - Applying the PaxosFamily of Protocols to make Critical Hadoo...
 
Upgrading hadoop
Upgrading hadoopUpgrading hadoop
Upgrading hadoop
 
Tutorial Haddop 2.3
Tutorial Haddop 2.3Tutorial Haddop 2.3
Tutorial Haddop 2.3
 
Dynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using ContainersDynamic Resource Allocation Algorithm using Containers
Dynamic Resource Allocation Algorithm using Containers
 
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014 WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
WANdisco Non-Stop Hadoop: PHXDataConference Presentation Oct 2014
 
Apache Hudi: The Path Forward
Apache Hudi: The Path ForwardApache Hudi: The Path Forward
Apache Hudi: The Path Forward
 
Unit 2.pptx
Unit 2.pptxUnit 2.pptx
Unit 2.pptx
 
Next generation technology
Next generation technologyNext generation technology
Next generation technology
 
Hadoop migration and upgradation
Hadoop migration and upgradationHadoop migration and upgradation
Hadoop migration and upgradation
 
Gfs vs hdfs
Gfs vs hdfsGfs vs hdfs
Gfs vs hdfs
 
Big data- HDFS(2nd presentation)
Big data- HDFS(2nd presentation)Big data- HDFS(2nd presentation)
Big data- HDFS(2nd presentation)
 
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
 
Schedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop clusterSchedulers optimization to handle multiple jobs in hadoop cluster
Schedulers optimization to handle multiple jobs in hadoop cluster
 
Configure h base hadoop and hbase client
Configure h base hadoop and hbase clientConfigure h base hadoop and hbase client
Configure h base hadoop and hbase client
 
Enterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on HadoopEnterprise Grade Streaming under 2ms on Hadoop
Enterprise Grade Streaming under 2ms on Hadoop
 

Andere mochten auch

Valoración apache
Valoración apache Valoración apache
Valoración apache liomd3
 
HPCC Systems 6.0.0 Highlights
HPCC Systems 6.0.0 HighlightsHPCC Systems 6.0.0 Highlights
HPCC Systems 6.0.0 HighlightsHPCC Systems
 
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...HPCC Systems
 
HPCC Systems Presentation to TDWI Chicago Chapter
HPCC Systems Presentation to TDWI Chicago ChapterHPCC Systems Presentation to TDWI Chicago Chapter
HPCC Systems Presentation to TDWI Chicago ChapterHPCC Systems
 
Big data processing using HPCC Systems Above and Beyond Hadoop
Big data processing using HPCC Systems Above and Beyond HadoopBig data processing using HPCC Systems Above and Beyond Hadoop
Big data processing using HPCC Systems Above and Beyond HadoopHPCC Systems
 
Studies of HPCC Systems from Machine Learning Perspectives
Studies of HPCC Systems from Machine Learning PerspectivesStudies of HPCC Systems from Machine Learning Perspectives
Studies of HPCC Systems from Machine Learning PerspectivesHPCC Systems
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaHPCC Systems
 
HPCC Systems - Open source, Big Data Processing & Analytics
HPCC Systems - Open source, Big Data Processing & AnalyticsHPCC Systems - Open source, Big Data Processing & Analytics
HPCC Systems - Open source, Big Data Processing & AnalyticsHPCC Systems
 
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistHPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistFujio Turner
 
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECL
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECLMeetup - Exabyte Big Data - HPCC Systems - SQL to ECL
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECLFujio Turner
 
2016 HPCC Systems Poster Presentation Competition
2016 HPCC Systems Poster Presentation Competition2016 HPCC Systems Poster Presentation Competition
2016 HPCC Systems Poster Presentation CompetitionHPCC Systems
 
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...HPCC Systems
 
HPCC Systems vs Hadoop
HPCC Systems vs HadoopHPCC Systems vs Hadoop
HPCC Systems vs HadoopFujio Turner
 
Patologías del Sistema Nervioso
Patologías del Sistema NerviosoPatologías del Sistema Nervioso
Patologías del Sistema NerviosoHector Martínez
 
HPCC Systems - Using Big Data to Help Feed the World
HPCC Systems - Using Big Data to Help Feed the WorldHPCC Systems - Using Big Data to Help Feed the World
HPCC Systems - Using Big Data to Help Feed the WorldHPCC Systems
 
Apache Server Tutorial
Apache Server TutorialApache Server Tutorial
Apache Server TutorialJagat Kothari
 
HPCC Platform + Visualization
HPCC Platform + VisualizationHPCC Platform + Visualization
HPCC Platform + VisualizationGordon Smith
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Slim Baltagi
 

Andere mochten auch (20)

Valoración apache
Valoración apache Valoración apache
Valoración apache
 
HPCC Systems 6.0.0 Highlights
HPCC Systems 6.0.0 HighlightsHPCC Systems 6.0.0 Highlights
HPCC Systems 6.0.0 Highlights
 
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...
Optimizing Supervised and Implementing Unsupervised Machine Learning Algorith...
 
HPCC Presentation
HPCC PresentationHPCC Presentation
HPCC Presentation
 
HPCC Systems Presentation to TDWI Chicago Chapter
HPCC Systems Presentation to TDWI Chicago ChapterHPCC Systems Presentation to TDWI Chicago Chapter
HPCC Systems Presentation to TDWI Chicago Chapter
 
Big data processing using HPCC Systems Above and Beyond Hadoop
Big data processing using HPCC Systems Above and Beyond HadoopBig data processing using HPCC Systems Above and Beyond Hadoop
Big data processing using HPCC Systems Above and Beyond Hadoop
 
Studies of HPCC Systems from Machine Learning Perspectives
Studies of HPCC Systems from Machine Learning PerspectivesStudies of HPCC Systems from Machine Learning Perspectives
Studies of HPCC Systems from Machine Learning Perspectives
 
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaIntroduction to the Open Source HPCC Systems Platform by Arjuna Chala
Introduction to the Open Source HPCC Systems Platform by Arjuna Chala
 
HPCC Systems - Open source, Big Data Processing & Analytics
HPCC Systems - Open source, Big Data Processing & AnalyticsHPCC Systems - Open source, Big Data Processing & Analytics
HPCC Systems - Open source, Big Data Processing & Analytics
 
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data ScientistHPCC Systems - ECL for Programmers - Big Data - Data Scientist
HPCC Systems - ECL for Programmers - Big Data - Data Scientist
 
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECL
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECLMeetup - Exabyte Big Data - HPCC Systems - SQL to ECL
Meetup - Exabyte Big Data - HPCC Systems - SQL to ECL
 
2016 HPCC Systems Poster Presentation Competition
2016 HPCC Systems Poster Presentation Competition2016 HPCC Systems Poster Presentation Competition
2016 HPCC Systems Poster Presentation Competition
 
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...
HPCC Systems Engineering Summit Presentation - Leveraging HPCC Systems with V...
 
Hpcc
HpccHpcc
Hpcc
 
HPCC Systems vs Hadoop
HPCC Systems vs HadoopHPCC Systems vs Hadoop
HPCC Systems vs Hadoop
 
Patologías del Sistema Nervioso
Patologías del Sistema NerviosoPatologías del Sistema Nervioso
Patologías del Sistema Nervioso
 
HPCC Systems - Using Big Data to Help Feed the World
HPCC Systems - Using Big Data to Help Feed the WorldHPCC Systems - Using Big Data to Help Feed the World
HPCC Systems - Using Big Data to Help Feed the World
 
Apache Server Tutorial
Apache Server TutorialApache Server Tutorial
Apache Server Tutorial
 
HPCC Platform + Visualization
HPCC Platform + VisualizationHPCC Platform + Visualization
HPCC Platform + Visualization
 
Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink Step-by-Step Introduction to Apache Flink
Step-by-Step Introduction to Apache Flink
 

Ähnlich wie Dynamic Provisioning of Data Intensive Computing Middleware Frameworks

MOD-2 presentation on engineering students
MOD-2 presentation on engineering studentsMOD-2 presentation on engineering students
MOD-2 presentation on engineering studentsrishavkumar1402
 
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemCSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemHendrik van Run
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptxRATISHKUMAR32
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintrothomasrconnor
 
Module-2_HADOOP.pptx
Module-2_HADOOP.pptxModule-2_HADOOP.pptx
Module-2_HADOOP.pptxShreyasKv13
 
BIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxBIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxVishalBH1
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentationAmrut Patil
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyPeter Clapham
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSSteve Wong
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceeakasit_dpu
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...David Wallom
 
Self-Organisation as a Cloud Resource Management Strategy
Self-Organisation as a Cloud Resource Management StrategySelf-Organisation as a Cloud Resource Management Strategy
Self-Organisation as a Cloud Resource Management StrategyCloudLightning
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...Govt.Engineering college, Idukki
 
DOE Magellan OpenStack user story
DOE Magellan OpenStack user storyDOE Magellan OpenStack user story
DOE Magellan OpenStack user storylaurabeckcahoon
 
Optimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsOptimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsYathiraj Udupi, Ph.D.
 
Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Yahoo Developer Network
 
Shaping the Future: To Globus Compute and Beyond!
Shaping the Future: To Globus Compute and Beyond!Shaping the Future: To Globus Compute and Beyond!
Shaping the Future: To Globus Compute and Beyond!Globus
 

Ähnlich wie Dynamic Provisioning of Data Intensive Computing Middleware Frameworks (20)

MOD-2 presentation on engineering students
MOD-2 presentation on engineering studentsMOD-2 presentation on engineering students
MOD-2 presentation on engineering students
 
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication SystemCSD-2881 - Achieving System Production Readiness for IBM PureApplication System
CSD-2881 - Achieving System Production Readiness for IBM PureApplication System
 
Lecture 3.31 3.32.pptx
Lecture 3.31  3.32.pptxLecture 3.31  3.32.pptx
Lecture 3.31 3.32.pptx
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintro
 
Module-2_HADOOP.pptx
Module-2_HADOOP.pptxModule-2_HADOOP.pptx
Module-2_HADOOP.pptx
 
BIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptxBIg Data Analytics-Module-2 vtu engineering.pptx
BIg Data Analytics-Module-2 vtu engineering.pptx
 
Big data processing using hadoop poster presentation
Big data processing using hadoop poster presentationBig data processing using hadoop poster presentation
Big data processing using hadoop poster presentation
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
Introduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduceIntroduction to Hadoop and MapReduce
Introduction to Hadoop and MapReduce
 
CloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use CaseCloudLightning and the OPM-based Use Case
CloudLightning and the OPM-based Use Case
 
Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...Supporting Research through "Desktop as a Service" models of e-infrastructure...
Supporting Research through "Desktop as a Service" models of e-infrastructure...
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Self-Organisation as a Cloud Resource Management Strategy
Self-Organisation as a Cloud Resource Management StrategySelf-Organisation as a Cloud Resource Management Strategy
Self-Organisation as a Cloud Resource Management Strategy
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
 
DOE Magellan OpenStack user story
DOE Magellan OpenStack user storyDOE Magellan OpenStack user story
DOE Magellan OpenStack user story
 
Optimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack CloudsOptimized NFV placement in Openstack Clouds
Optimized NFV placement in Openstack Clouds
 
Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010Hadoop for Scientific Workloads__HadoopSummit2010
Hadoop for Scientific Workloads__HadoopSummit2010
 
Shaping the Future: To Globus Compute and Beyond!
Shaping the Future: To Globus Compute and Beyond!Shaping the Future: To Globus Compute and Beyond!
Shaping the Future: To Globus Compute and Beyond!
 


Dynamic Provisioning of Data Intensive Computing Middleware Frameworks

  • 1. Dynamic Provisioning of Data Intensive Computing Middleware Frameworks: A Case Study. Linh B. Ngo (1), Michael E. Payne (1), Flavio Villanustre (2), Richard Taylor (2), Amy W. Apon (1). (1) School of Computing, Clemson University; (2) LexisNexis® Risk Solutions
  • 2. Contents
      1. Overview of Clemson University’s Cyberinfrastructure Resource
      2. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
      3. Dynamic Provisioning of Data-Intensive Computing Framework
      4. Deploying Hadoop Ecosystem vs. Deploying HPCC Systems®
      5. Lessons Learned
  • 3. Cyberinfrastructure Resource at Clemson University
      •  Condominium model
      •  2,007 compute nodes (21,400 cores), including 276 GPU nodes
      •  Sustained 551 Tflops (benchmarked on GPU nodes only)
      •  1,289 active users, 12 academic departments across 36 fields of research
      •  Facilities
  • 4. Cyberinfrastructure Resource at Clemson University
      •  1G/10G/Myrinet-10G/Infiniband-40G/Infiniband-56G
      •  Local storage between 100-200GB (majority) and 400-900GB (since 2013)
      •  Shared 233TB OrangeFS scratch space and more than 3PB archival space
  • 5. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
      •  Genome Sequencing (Hadoop MapReduce/GPGPU)
      •  Molecular Dynamic Forward Flux Sampling (Hadoop Streaming/LAMMPS)
      •  Streaming Data Infrastructure for Connected Vehicle System (Hadoop Distributed File System/Spark/Kafka)
      •  Big Scholarly Data (HPCC Systems)
      •  CS Course in Distributed and Cluster Computing (MPI/MapReduce, Hadoop/Spark/HPCC Systems® …)
  • 6. Demand for Dynamic Data-Intensive Computing Middleware Frameworks
      •  Changes in cyberinfrastructure support model for data infrastructure:
          –  Beyond a traditional remote distributed file system model
          –  From static and dedicated resource to dynamic resource
          –  Data management processes co-locate with computing processes
      •  Challenges for system administrators:
          –  Accommodating different frameworks for different research
          –  Complying with existing administrative policy and scheduling priority
      •  What can users do?
          –  Deploying dynamic data-intensive computing frameworks within the limits of user privilege and without the intervention of administrators
  • 7. Dynamic Provisioning of Data-Intensive Computing Framework: Installation
      •  Where to install
          1.  Home directory: Persistent, limited in storage
          2.  Shared distributed storage: Fast, semi-persistent, “unlimited” storage
          3.  Local storage on compute node: Fast, non-persistent, requires reinstallation
      •  How to handle dependencies
          1.  Ideally in home or shared distributed storage (persistency)
          2.  Dynamic loading mechanisms via environment paths
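One way to picture "dynamic loading mechanisms via environment paths" is the sketch below. It assumes dependencies were installed under a user-owned prefix with conventional `bin` and `lib` subdirectories; the function name `env_with_prefix` and the example prefix are hypothetical and do not come from the slides.

```python
import os


def env_with_prefix(prefix, base_env=None):
    """Return a copy of the environment with a user-owned install prefix
    placed first on the executable and shared-library search paths."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumed layout: <prefix>/bin for executables, <prefix>/lib for libraries.
    additions = {
        "PATH": os.path.join(prefix, "bin"),
        "LD_LIBRARY_PATH": os.path.join(prefix, "lib"),
    }
    for var, new_dir in additions.items():
        existing = env.get(var, "")
        # Prepend so the user-installed version wins over any system copy.
        env[var] = new_dir + os.pathsep + existing if existing else new_dir
    return env
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching a framework daemon, so nothing outside the user's own directories needs to change.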
  • 8. Dynamic Provisioning of Data-Intensive Computing Framework: Deployment
      [Diagram, steps 1-4. Labels: user.palmetto.clemson.edu; Deployment/Configuration Scripts; PBS_NODEFILE; target deployment directories on local disks]
  • 9. Deploying Hadoop Ecosystem vs. deploying HPCC Systems®: Overview
      •  Hadoop Ecosystem: Open source alternatives based on the conceptual architecture of a data-intensive computing infrastructure developed by Google
      •  HPCC Systems: Comprehensive data-intensive computing system targeting enterprise users, developed in early 2000, open source since 2011
  • 10. Deploying Hadoop Ecosystem vs. deploying HPCC Systems®: Installation: Hadoop
      •  Self-contained, pre-compiled jar files
      •  No installation is needed, relies on shell scripts to launch component daemons
      •  Dependencies: JDK
  • 11. Deploying Hadoop Ecosystem vs. deploying HPCC Systems®: Installation: HPCC Systems
      •  Standard configure/make/make install
          –  Assumption about an industrial production environment (with administrative privileges)
          –  Modification to avoid hard-coded system installation paths
          –  Modification of template XML configuration files to avoid default HPCC Systems-specific user creation and administrative check
      •  Dependencies:
          –  Not on Palmetto: ICU, Xalan, Xerces, APR …
          –  On Palmetto, but not in the correct version: Binutils
  • 12. Deploying Hadoop Ecosystem vs. deploying HPCC Systems: Deployment: Hadoop
      •  Component placement determination
      •  Cleanup target directories from previous deployment
      •  Create target directories (log, storage, pid …)
      •  Synchronize order of component start-up
      •  Additional components (HBase, Hive, Kafka …) can be added to this deployment model
      [Diagram. Components: Namenode; ResourceManager; SparkMaster; DataNode/NodeManager/SparkExecutor (three groups). Node labels: 1st, 2nd, 3rd, 4th, 5th … nth node in PBS_NODEFILE]
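The component placement step can be sketched as follows. This is one reading of the slide's diagram, not a confirmed layout: it assumes the first three distinct hosts in PBS_NODEFILE each take one master role and every remaining host runs the three worker daemons. It also assumes the node file may repeat a hostname (one line per allocated slot), so hosts are de-duplicated in order. All function names are hypothetical.

```python
MASTER_ROLES = ["NameNode", "ResourceManager", "SparkMaster"]
WORKER_ROLES = ["DataNode", "NodeManager", "SparkExecutor"]


def unique_hosts(nodefile_lines):
    """Return hostnames in first-seen order, dropping blanks and repeats."""
    seen = set()
    hosts = []
    for line in nodefile_lines:
        host = line.strip()
        if host and host not in seen:
            seen.add(host)
            hosts.append(host)
    return hosts


def place_components(hosts):
    """Map each host to the list of daemons it should run.

    Assumed layout: one master role per host for the first three hosts,
    all worker roles on every host after that.
    """
    if len(hosts) <= len(MASTER_ROLES):
        raise ValueError("need at least %d nodes" % (len(MASTER_ROLES) + 1))
    placement = {}
    for role, host in zip(MASTER_ROLES, hosts):
        placement[host] = [role]
    for host in hosts[len(MASTER_ROLES):]:
        placement[host] = list(WORKER_ROLES)
    return placement
```

Inside a PBS job, the input lines would come from the file named by the `PBS_NODEFILE` environment variable, for example `open(os.environ["PBS_NODEFILE"]).readlines()`.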
  • 13. Deploying Hadoop Ecosystem vs. deploying HPCC Systems: Deployment: HPCC Systems
      •  Determine node allocation and internal IP addresses
      •  HPCC Systems is configured via its own deployment programs (configmgr, configgen, hpcc-init)
      [Diagram. Node labels: 1st, 2nd, 1st, 3rd, 4th, 5th … nth node in PBS_NODEFILE]
  • 14. Deploying Hadoop Ecosystem vs. deploying HPCC Systems: Deployment: HPCC Systems
      •  Node memory constraints
      •  HPCC Systems reserves 75% of available memory for thor by default
      •  Palmetto does not allow unlimited memory reservation
      •  As a result, thor_master cannot launch new jobs via fork()
      •  Resolved by lowering the memory reservation
      [Diagram. Node labels: 1st, 2nd, 1st, 3rd, 4th, 5th … nth node in PBS_NODEFILE]
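The arithmetic behind "lowering the memory reservation" might look like the sketch below. Only the 75% default comes from the slide. The function name, the idea of capping the reservation at a fraction of the user's memory limit, the 0.5 headroom factor, and the example numbers are all illustrative assumptions, not HPCC Systems settings or Palmetto policy.

```python
def thor_memory_mb(node_memory_mb, user_limit_mb,
                   default_fraction=0.75, headroom_fraction=0.5):
    """Choose a memory reservation (in MB) for a user-level deployment.

    default_fraction: share of node memory reserved by default (75% per the slide).
    headroom_fraction: assumed share of the user's memory limit that the
        reservation may occupy, leaving room for child processes.
    """
    default_reservation = int(node_memory_mb * default_fraction)
    capped_reservation = int(user_limit_mb * headroom_fraction)
    # Never reserve more than the default, and never more than the cap.
    return min(default_reservation, capped_reservation)


# Hypothetical node: 64 GB of RAM with a 32 GB per-user limit.
print(thor_memory_mb(65536, 32768))  # prints 16384
```

When the user limit is generous, the function falls back to the 75% default, so the lower value only applies on systems that restrict memory for non-administrative accounts.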
  • 15. Lessons Learned
      •  A common approach can be adapted for both Hadoop Ecosystem and HPCC Systems
      •  Limitations on non-administrative accounts can impact the deployment and performance via system resource constraints
          –  Unable to utilize all available memory on allocated node (HPCC Systems)
      •  Dynamic deployment via non-administrative accounts provides initiative for users to experiment with and utilize new large-scale frameworks without additional burden for administrators
  • 16. Lessons Learned
      •  Experience in deploying as users is, in turn, extremely applicable to the process of deployment with administrative privileges.
      •  E.g.: CloudLab cloud computing experimental testbed with non-persistent, ephemeral, and short-term (15 hours) allocation
          –  Script-based installation and deployment are needed, even with administrative rights, to automate the deployment of the experiment
      •  Experience in deploying as administrators is helpful in debugging user-based deployment:
          –  The memory allocation issue in HPCC Systems was identified and resolved by changing system limits using administrative commands.
  • 17. QUESTIONS?
      •  Linh B. Ngo (1), Michael E. Payne (1), Flavio Villanustre (2), Richard Taylor (2), Amy W. Apon (1)
      •  (1) School of Computing, Clemson University: {lngo,mpayne3,aapon}@clemson.edu
      •  (2) LexisNexis Risk Solutions: {flavio.villanustre,richard.taylor}@lexisnexis.com
      •  More information about HPCC Systems can be found at http://hpccsystems.com