High Performance GPU Computing with Ruby
Prasun Anand
About me
● Modak Analytics
● Genenetwork project
● SciRuby Contributor
● Google Summer of Code 2016, 2017
● Ruby Grant 2017
● Fukuoka Ruby Award 2018
● Projects:
○ JRuby port of NMatrix
○ ArrayFire gem
○ RbCUDA
Data is the new Oil!
Highlights
Modak Analytics is helping implement one of the largest life sciences
platforms in the world.
Platform Details
• 2,100 structured data sources
• 500k tables
• 1,350 unstructured sources
• 1.3 billion files
• 1,200 data nodes
• 6 petabytes of usable information
• 1000+ clinical trials being standardized to the CDISC (SDTM) model for cross-study analysis, placebo baseline, etc.
• Single integrated data platform comprising compound, activity results, assay protocol, and project information
• “Like-minded” data has been grouped into Data Domains by business area, e.g. Clinical, Assay, Gene, Regulatory
• More than 17 solutions have been developed and deployed for the business
Awarded at the prestigious ‘Strata
Data Conference 2017’ for building
this platform in record time
Governed Data Lake approach
• Modak is providing end-to-end service for the platform, including automated ingestion, curation, and innovative solutions
• Modak is also providing 24x7 support for the massive platform
[Architecture diagram: Governed Data Lake]
• Source data: originators of data that serve as “authoring” systems to support business processes
• Structured data: Postgres, SQL Server, Oracle, MySQL, SAS data sets
• Unstructured data: file shares, SharePoint, Documentum
• Automated data discovery (Data Spider) and automated data ingestion (bots)
• Layers: Foundation Layer, Integration Layer, Solutions Layer, Semantic Layer
• Data states and processing steps: ingested raw data, data tagging, data masking, data cleansing, data lineage, data profiling, augmented data, mapping/standardization, data fingerprinting
• A replica of the data is ingested into the Integration Layer
• Outputs: data analytics, visualisation, dashboards, and reports
• Metadata catalog (KOSH), flow controller; StreamSets pipelines are generated automatically
• Cross-cutting: data governance, data security, system/application management
• Layer annotations: optimized for computing and distribution of data; optimized for strategic BI product development; optimized for business users; optimized for analysts and data scientists
Genome-Wide Association Studies (GWAS)
Matrix Multiplication?
Arrays / Matrices
BLAS and LAPACK
GPU Computing is not easy!
CUDA and OpenCL
Af_Array
[1] pry(main)> a = ArrayFire::Af_Array.new 2, [2,2],[1,2,3,4]
No Name Array
[2 2 1 1]
Offsets: [0 0 0 0]
Strides: [1 2 4 4]
1.0000 3.0000
2.0000 4.0000
=> #<ArrayFire::Af_Array:0x000000020aeab8>
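The printed array above reads 1, 3 on the first row although the input was `[1,2,3,4]`, which is consistent with column-major storage. The following is a minimal plain-Ruby sketch of that layout, with no GPU or ArrayFire gem involved; `column_major_rows` and `column_major_strides` are hypothetical helper names introduced here for illustration only.

```ruby
# Hypothetical helper: rebuild the printed rows from column-major flat data.
# Element (r, c) of a rows x cols array lives at flat[r + c * rows].
def column_major_rows(dims, flat)
  rows, cols = dims
  Array.new(rows) { |r| Array.new(cols) { |c| flat[r + c * rows] } }
end

# Hypothetical helper: strides of a column-major array padded to four
# dimensions, matching the "Strides:" line in the session above.
def column_major_strides(dims)
  padded = dims + [1] * (4 - dims.size)
  strides = [1]
  padded[0...-1].each { |d| strides << strides.last * d }
  strides
end

rows = column_major_rows([2, 2], [1, 2, 3, 4])
rows.each { |row| puts row.map { |v| format('%.4f', v) }.join(' ') }
p column_major_strides([2, 2])
```

Under this assumption the first dimension varies fastest, so consecutive input values fill a column before moving to the next one.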
[2] pry(main)> b = a + a
No Name Array
[2 2 1 1]
Offsets: [0 0 0 0]
Strides: [1 2 4 4]
2.0000 6.0000
4.0000 8.0000
=> #<ArrayFire::Af_Array:0x000000020625c8>
[1] pry(main)> left = ArrayFire::Af_Array.new 2 , [3,3] , [1, 4, 6, 4, 11 , 2 ,-5, 8, 10]
No Name Array
[3 3 1 1]
1.0000 4.0000 -5.0000
4.0000 11.0000 8.0000
6.0000 2.0000 10.0000
=> #<ArrayFire::Af_Array:0x000000014e56c8>
[2] pry(main)> right = ArrayFire::Af_Array.new 2 , [3,2] , [1, 0, 8, 10, -11, 8]
No Name Array
[3 2 1 1]
1.0000 10.0000
0.0000 -11.0000
8.0000 8.0000
=> #<ArrayFire::Af_Array:0x00000001591db0>
[3] pry(main)> result = ArrayFire::BLAS.matmul(left, right, :AF_MAT_NONE, :AF_MAT_NONE)
No Name Array
[3 2 1 1]
-39.0000 -74.0000
68.0000 -17.0000
86.0000 118.0000
=> #<ArrayFire::Af_Array:0x000000016136f8>
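As a cross-check on the session above, here is a CPU reference in plain Ruby that multiplies the same two inputs, assuming both flat arrays are in column-major order. `matmul_column_major` is a hypothetical name used only for this sketch; it is not part of the ArrayFire gem.

```ruby
# Naive matrix product on column-major flat arrays (CPU only, for checking).
# left is m x k, right is k x n, and the result is m x n, all column-major.
def matmul_column_major(left, left_dims, right, right_dims)
  m, k = left_dims
  k2, n = right_dims
  raise ArgumentError, 'inner dimensions must match' unless k == k2

  result = Array.new(m * n, 0.0)
  n.times do |col|
    m.times do |row|
      sum = 0.0
      k.times { |i| sum += left[row + i * m] * right[i + col * k] }
      result[row + col * m] = sum
    end
  end
  result
end

left  = [1, 4, 6, 4, 11, 2, -5, 8, 10]
right = [1, 0, 8, 10, -11, 8]
result = matmul_column_major(left, [3, 3], right, [3, 2])
p result.first(3) # first column of the product
p result.last(3)  # second column of the product
```

The two printed columns can be compared against the columns of the `matmul` result shown in the session.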
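/* Initialize an Af_Array from Ruby arguments: (ndims, dimensions, elements). */
VALUE arf_init(int argc, VALUE* argv, VALUE self)
{
  afstruct* afarray;
  Data_Get_Struct(self, afstruct, afarray);

  /* argv[0]: number of dimensions; argv[1]: Ruby array of dimension sizes */
  dim_t ndims = (dim_t)NUM2LONG(argv[0]);
  dim_t* dimensions = (dim_t*)malloc(ndims * sizeof(dim_t));
  dim_t count = 1;
  for (dim_t index = 0; index < ndims; index++) {
    dimensions[index] = (dim_t)NUM2LONG(RARRAY_AREF(argv[1], index));
    count *= dimensions[index];
  }

  /* argv[2]: flat Ruby array of elements, converted into a host buffer of doubles */
  double* host_array = (double*)malloc(count * sizeof(double));
  for (dim_t index = 0; index < count; index++) {
    host_array[index] = NUM2DBL(RARRAY_AREF(argv[2], index));
  }

  /* af_create_array copies the host data to the device, so the host buffers are released afterwards */
  af_create_array(&afarray->carray, host_array, (unsigned)ndims, dimensions, f64);
  free(host_array);
  free(dimensions);
  return self;
}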
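/* Multiply two Af_Arrays on the device and wrap the result in a new Ruby object. */
static VALUE arf_matmul(VALUE self, VALUE left_val, VALUE right_val,
                        VALUE left_prop_val, VALUE right_prop_val)
{
  afstruct* left;
  afstruct* right;
  afstruct* result = ALLOC(afstruct);

  /* Unwrap the C structs held by the Ruby operands */
  Data_Get_Struct(left_val, afstruct, left);
  Data_Get_Struct(right_val, afstruct, right);

  /* Translate Ruby symbols such as :AF_MAT_NONE into af_mat_prop values */
  af_mat_prop left_mat_prop = arf_mat_type_from_rbsymbol(left_prop_val);
  af_mat_prop right_mat_prop = arf_mat_type_from_rbsymbol(right_prop_val);

  af_matmul(&result->carray, left->carray, right->carray, left_mat_prop, right_mat_prop);

  /* arf_free is registered so the array is released when the Ruby object is collected */
  return Data_Wrap_Struct(CLASS_OF(left_val), NULL, arf_free, result);
}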
BLAS functionalities
● Matmul
● Transpose
LAPACK functionalities
● Det
● Inverse
● Norm
● Qr
● Cholesky
● Svd
● Lu
Statistics
● Mean
● Median
● Variance
Benchmarks
● AMD FX 8350 octa-core processor
● NVIDIA GTX 750 Ti GPU
● Double dtype
● 10x faster than NMatrix-Ruby-Lapack
● 10,000x faster than NMatrix-Ruby
● 100,000x faster than NMatrix-Ruby-BLAS
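Speedup figures like these come from timing the same operation on each backend. Below is a minimal sketch of such a timing harness using Ruby's standard `benchmark` library and a naive pure-Ruby matrix product as the workload. The helper names (`naive_matmul`, `random_matrix`, `time_matmul`) and the matrix sizes are assumptions made for illustration; this is not the benchmark code behind the numbers above and it does not reproduce them.

```ruby
require 'benchmark'

# Naive O(n^3) product of two n x n matrices stored as nested arrays.
def naive_matmul(a, b)
  n = a.size
  Array.new(n) do |i|
    Array.new(n) do |j|
      sum = 0.0
      n.times { |k| sum += a[i][k] * b[k][j] }
      sum
    end
  end
end

# Build an n x n matrix of pseudo-random doubles from a seeded generator.
def random_matrix(n, rng)
  Array.new(n) { Array.new(n) { rng.rand } }
end

# Wall-clock seconds for one n x n multiplication.
def time_matmul(n, rng = Random.new(42))
  a = random_matrix(n, rng)
  b = random_matrix(n, rng)
  Benchmark.realtime { naive_matmul(a, b) }
end

[16, 32, 64].each do |n|
  puts format('n=%-4d %.6f s', n, time_matmul(n))
end
```

To compare backends, the block passed to `Benchmark.realtime` would be swapped for the corresponding library call while keeping the same inputs and dtype.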
RbCUDA
GPU Array
● Generic pointer used to handle an array of elements on the GPU.
● Memory copying from CPU to GPU and vice-versa.
● Interfaced with NMatrix and NArray
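vadd_kernel_src = <<-EOS
  extern "C" {
    // Element-wise sum c = a + b, one element per block
    __global__ void matSum(int *a, int *b, int *c)
    {
      int tid = blockIdx.x;   // the block index selects the element
      if (tid < 100)          // guard: only the first 100 elements are touched
        c[tid] = a[tid] + b[tid];
    }
  }
EOS

# Compile the CUDA source, then launch the compiled kernel
f = compile(vadd_kernel_src)
RbCUDA::Driver.run_kernel(f.path)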
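To make the kernel's indexing concrete, here is a CPU reference in plain Ruby that mimics it: each iteration stands in for one CUDA block, and the same `tid < 100` guard is applied. `mat_sum_reference` is a hypothetical name for this sketch and does not use RbCUDA.

```ruby
# CPU stand-in for the matSum kernel: one loop iteration per CUDA block.
def mat_sum_reference(a, b, num_blocks, limit = 100)
  c = Array.new(a.size, 0)
  num_blocks.times do |block_idx|            # stands in for blockIdx.x
    tid = block_idx
    c[tid] = a[tid] + b[tid] if tid < limit  # same guard as the kernel
  end
  c
end

a = (0...100).to_a
b = a.map { |x| x * 2 }
c = mat_sum_reference(a, b, 128)             # extra "blocks" are ignored by the guard
p c.first(5)
```

Launching more blocks than elements is harmless here because the guard skips every index at or beyond the limit.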
● CuBLAS
● CuSolver
● CuRand
Benchmarks
● AMD FX 8350 octa-core processor
● NVIDIA GTX 750 Ti GPU
● Double dtype
● 1,000,000x faster than NMatrix-Ruby-BLAS
Fastest Matrix Multiplication Library in Ruby!
Future Work
● Image Processing APIs and Indexers
● Multiple dtypes
● RbCUDA is under development.
● https://github.com/arrayfire/arrayfire-rb
● https://github.com/prasunanand/rbcuda
Contributions are Welcome!
Acknowledgements
1. Pjotr Prins
2. Pradeep Garigipati
3. Kenta Murata
4. Ruby Science Foundation
5. Ruby Association
6. Modak Analytics
Thanks!
Github: prasunanand
Twitter: @prasun_anand
Blog: prasunanand.com
Questions

High Performance GPU Computing with Ruby, RubyKaigi 2018
