Open Science Grid

BoscoR: Transforming Your R Desktop into an R Super Desktop

Dan Fraser
Open Science Grid
University of Chicago
Argonne National Laboratory

UseR! 2013, Albacete, Spain, July 12, 2013

BoscoR Team
bosco-discuss@opensciencegrid.org
- Dan Fraser – Team Lead
  - Open Science Grid, UChicago, ANL
- Derek Weitzel – Lead Developer
  - University of Nebraska, Lincoln
- Marco Mambelli – Support / Development
  - University of Chicago
- Miha Ahronovitz – Product Manager
  - University of Wisconsin
- Jaime Frey, Todd Tannenbaum – Condor Development Support
  - University of Wisconsin

Approaching the Limit of Desktop Computing
- Solution time >> Time you want to wait :(
  - Multiple runs
  - Some answers require 100,000+ iterations
  - Larger datasets
  - Complex analysis path, …
- Perhaps I can use a faster computer somewhere on campus? But …

Painful Transition for R Users
- Remote Login
- Learn Batch CL Environment
- Set up R on Cluster
- Move Data to Cluster
- Parallelize R Script (tool)
- Use Batch Environment
- Transfer Data to Desktop
- Analyze Data
Available Cluster: Condor, PBS, LSF/Platform, Grid Engine, Slurm, …
“A long, nonlinear process” --Margarita Rincón Hidalgo (CSIC)

Open Source BoscoR
- Install BoscoR
- Connect to Cluster
- Parallelize R Script (GridR)
- Analyze Data
Empowered User
Clusters: Condor, PBS, LSF/Platform, Grid Engine, Slurm, …
BoscoR transforms your desktop into a “Super Desktop”

What BoscoR Does for You
- Transforms your Desktop into a “Super-Desktop”
  - By connecting to and managing your server / cluster
- Straightforward Path to Exploit R Parallelism
  - Download and Install “Bosco” and “GridR” on your desktop
  - Connect your cluster (Username, password) with Bosco
  - Change “apply” to “grid.apply” OR “lapply” to “grid.lapply”
    - No shortcuts for intelligence and thought in this step!!!
  - Run your script :)
http://bosco.opensciencegrid.org/boscor/
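
The step of changing “lapply” to “grid.lapply” is only safe when each iteration is independent of the others, which is where the thought goes. The following is a minimal base-R sketch of one way to check this locally; it does not use GridR or a Bosco connection, and the names `inputs`, `shuffled_order`, and `running` are hypothetical, chosen only for illustration.

```r
# Independent iterations: each call depends only on its own input,
# so the calls can run in any order (or on different machines).
mult2 <- function(s) { return(s * 2) }

inputs <- 1:5
forward <- lapply(inputs, mult2)

# Run the same calls in an arbitrary order, then restore the original order.
shuffled_order <- c(3, 1, 5, 2, 4)
shuffled <- lapply(inputs[shuffled_order], mult2)
restored <- shuffled[order(shuffled_order)]

# If the iterations are independent, the results match.
print(identical(forward, restored))

# Dependent iterations: each step needs the previous total, so this loop
# is NOT a candidate for a simple lapply-style substitution.
running <- numeric(length(inputs))
total <- 0
for (i in seq_along(inputs)) {
  total <- total + inputs[i]
  running[i] <- total
}
print(running)
```

A loop of the first kind is a plausible candidate for distribution; a loop of the second kind would need to be restructured first.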

BoscoR Simplifies and Automates
- Managed connection to the Server / Cluster / Supercomputer
- Installs & manages the R package on the remote cluster
  - May include specialized CRAN packages
- Integrates with an R parallelization tool
  - Currently GridR (e.g. apply -> grid.apply)
  - There are other R parallelization tools
- Executes the parallelized R script on the cluster
- Auto data movement from desktop and back to desktop!

A Few Details
- BoscoR is a Beta release
- Requires a Linux (or Mac) desktop
  - Mac laptops are a work in progress
- Cluster access requires a batch scheduler (PBS, Condor, ...)
  - Users must have an account (access is via SSH)
- Bosco integration with GridR is only a starting point
  - We welcome other Integrations & Collaborations
  - Special thank you to the GridR development team

Sample GridR code in BoscoR
- Loads the GridR Library
- Initializes GridR with Bosco connection
- Creates a simple function “mult2”
- Applies the function to 13, stores the result in x

> library("GridR")
> grid.init(service="bosco.direct", localTmpDir="tmp")
> mult2 <- function(s) { return (s*2) }
> grid.apply("x", mult2, 13)
> x
[1] 26
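
Before sending a function to a cluster, it can help to run it locally on a small input and confirm the result. The sketch below does this for the `mult2` function from the sample using base R only; no GridR or Bosco connection is assumed, and the names `x_local` and `results` are hypothetical.

```r
# Same function as in the sample above.
mult2 <- function(s) { return(s * 2) }

# Local dry run on the single input used in the sample.
x_local <- mult2(13)
print(x_local)   # [1] 26

# The same function applied to several inputs with base lapply;
# a call of this shape is what the slides suggest changing to grid.lapply.
results <- lapply(c(13, 20, 50), mult2)
print(unlist(results))
```

If the local result matches what is expected, the remote run should differ only in where the function is executed.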

Summary
- Transform your R desktop into a “Super Desktop” :)
- Straightforward process empowers the R researcher
- Understanding the parallel parts of your R script is beneficial
- Minimal infrastructure support required
- Collaborate with us!
  - We welcome academic and commercial collaborations
- We’re here to help:
  - bosco-discuss@opensciencegrid.org

Useful Links
- Short summary page:
  http://bosco.opensciencegrid.org/boscor/
- Wiki Page
  https://twiki.grid.iu.edu/bin/view/CampusGrids/BoscoR
- Descriptive Blog Post
  http://derekweitzel.blogspot.com/2013/07/the-next-step-for-bosco-boscor.html
- Contact us: bosco-discuss@opensciencegrid.org
