Genomic Analysis in Scala
Scala/Splash 2017
October 22, 2017
Ryan Williams
1 / 17
Overview
Intro
Genomic applications
General Scala libraries
Design-pattern deep-dive
"fun" with implicits
Slides: hammerlab.org/splash-2017
Everything discussed in this talk open source / Apache 2.0
2 / 17
Hammer Lab
Mt. Sinai School of Medicine, NYC
Research
Personal Genome Vaccine pipeline / clinical trial
Checkpoint blockade biomarkers, mutational signatures
http://www.hammerlab.org/
Tools
Genome biofx using Spark + Scala
Biofx workflows and tools in OCaml
The usual suspects: Python, R, …
3 / 17
coverage-depth
4 / 17
  
coverage-depth
 
5 / 17
spark-bam
Splitting genomic BAM files
6 / 17
Related Libraries
Non-genomics-specific
Maybe you want to use them
7 / 17
magic-rdds
Collection-operations implemented for Spark RDDs
scans
  {left,right}
  {elements, values of tuples}
.runLengthEncode, group consecutive elements by predicate / Ordering
.reverse
reductions: .maxByKey, .minByKey
sliding/windowed traversals
.size - smart count
  multiple counts in one job:
  val (count1, count2) = (rdd1, rdd2).size
  smart partition-tracking: reuse counts for UnionRDDs
zips
  lazy partition-count, eager partition-number check
sameElements, equals
group/sample by key: first elems or reservoir-sampled
HyperGeometric distribution handling Longs: hammerlab/math-utils
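To make the run-length-encoding operation above concrete, here is a minimal sketch on a plain Scala Iterator rather than a Spark RDD. The object name RunLengthSketch and the method signature are assumptions for illustration and are not the magic-rdds API; the real library also has to handle runs that span partition boundaries, which this sketch ignores.

```scala
object RunLengthSketch {
  /** Collapse consecutive equal elements into (element, run-length) pairs. */
  def runLengthEncode[T](it: Iterator[T]): Iterator[(T, Int)] = {
    // A buffered iterator lets us peek at the next element without consuming it.
    val buf = it.buffered
    new Iterator[(T, Int)] {
      def hasNext: Boolean = buf.hasNext
      def next(): (T, Int) = {
        val elem = buf.next()
        var count = 1
        // Consume the rest of the current run.
        while (buf.hasNext && buf.head == elem) {
          buf.next()
          count += 1
        }
        (elem, count)
      }
    }
  }
}
```

For example, RunLengthSketch.runLengthEncode("aaabccd".iterator).toList yields one pair per run, in order.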
8 / 17
hammerlab/iterators
scans (in terms of cats.Monoid)
sliding/windowed traversals
eager drops/takes
  by number
  while
  until
sorted/range zips
SimpleBufferedIterator
  iterator in terms of _advance(): Option[T]
  hasNext lazily buffers/caches head
etc.
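The SimpleBufferedIterator idea above can be sketched as follows: subclasses implement only _advance(), and hasNext lazily caches the head. This is an illustrative reconstruction under assumptions, not the hammerlab/iterators source; the names SimpleBuffered, cached, and countdown are hypothetical.

```scala
object BufferedIteratorSketch {
  abstract class SimpleBuffered[T] extends Iterator[T] {
    /** Produce the next element, or None once the iterator is exhausted. */
    protected def _advance(): Option[T]

    private var cached: Option[T] = None
    private var done = false

    /** Lazily computes and caches the head; repeated calls do not advance. */
    def hasNext: Boolean = {
      if (cached.isEmpty && !done) {
        cached = _advance()
        if (cached.isEmpty) done = true
      }
      cached.nonEmpty
    }

    /** Peek at the next element without consuming it. */
    def head: T =
      if (hasNext) cached.get
      else throw new NoSuchElementException("head of empty iterator")

    def next(): T = {
      val h = head
      cached = None
      h
    }
  }

  /** Example subclass: counts down from `from` to 1. */
  def countdown(from: Int): SimpleBuffered[Int] = new SimpleBuffered[Int] {
    private var n = from
    protected def _advance(): Option[Int] =
      if (n > 0) { val r = n; n -= 1; Some(r) } else None
  }
}
```

The design choice here is that implementers never write hasNext/next bookkeeping by hand; they describe only how to produce one more element.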
9 / 17
spark-commands: command-line interfaces
args4j vs. case-app:
statically-checked/typed handlers
implicit resolution
inheritance vs. composition
mutable vs. immutable
case-app positional-arg support: #58
class Opts {
  @args4j.Option(
    name = "--in-path",
    aliases = Array("-i"),
    handler = classOf[PathOptionHandler],
    usage = "Input path to read from"
  )
  var inPath: Option[Path] = None
  @args4j.Option(
    name = "--out-path",
    aliases = Array("-o"),
    handler = classOf[PathOptionHandler],
    usage = "Output path to write to"
  )
  var outPath: Option[Path] = None
  @args4j.Option(
    name = "--overwrite",
    aliases = Array("-f"),
    usage = "Whether to overwrite an existing ou
  )
  var overwrite: Boolean = false
}
case class Opts(
  @Opt("-i")
  @Msg("Input path to read from")
  inPath: Option[Path] = None,
  @Opt("-o")
  @Msg("Output path to write to")
  outPath: Option[Path] = None,
  @Opt("-f")
  @Msg("Whether to overwrite an existing output
  overwrite: Boolean = false
)
10 / 17
Design Patterns
Down the typelevel / implicit rabbit-hole
11 / 17
Deep case-class hierarchy:
case class A(n: Int)
case class B(s: String)
case class C(a: A, b: B)
case class D(b: Boolean)
case class E(c: C, d: D, a: A, a2: A)
case class F(e: E)
Instances:
val a = A(123)
val b = B("abc")
val c = C(a, b)
val d = D(true)
val e = E(c, d, A(456), A(789))
val f = F(e)
Pull out fields by type and/or name:
f.find('c) // f.e.c
f.findT[C] // f.e.c
f.field[C]('c) // f.e.c
f.field[A]('a2) // f.e.a2
f.field[B]('b) // f.e.c.b
As evidence parameters:
def findAandB[T](t: T)(
  implicit
  findA: Find[T, A],
  findB: Find[T, B]
): (A, B) =
  (findA(t), findB(t))
shapeless-utils
"recursive structural types"
12 / 17
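As a rough illustration of the evidence-parameter usage on the slide above, the sketch below hand-writes a Find type-class and its instances. shapeless-utils derives such instances generically; this sketch does not, and the names FindSketch, Find.instance, Outer, and viaC are assumptions introduced only for the example.

```scala
object FindSketch {
  // Hypothetical stand-in for the talk's Find type-class: extract an F from a T.
  trait Find[T, F] { def apply(t: T): F }
  object Find {
    def instance[T, F](f: T => F): Find[T, F] =
      new Find[T, F] { def apply(t: T): F = f(t) }
  }

  case class A(n: Int)
  case class B(s: String)
  case class C(a: A, b: B)
  object C {
    // Hand-written instances; a generic derivation would produce these for us.
    implicit val findA: Find[C, A] = Find.instance[C, A](_.a)
    implicit val findB: Find[C, B] = Find.instance[C, B](_.b)
  }

  case class Outer(c: C)
  object Outer {
    // Anything findable in C is findable in Outer, by going through Outer.c.
    implicit def viaC[F](implicit inner: Find[C, F]): Find[Outer, F] =
      Find.instance[Outer, F](outer => inner(outer.c))
  }

  // Same shape as the slide's findAandB: the evidence says "T contains an A and a B".
  def findAandB[T](t: T)(
    implicit
    findA: Find[T, A],
    findB: Find[T, B]
  ): (A, B) =
    (findA(t), findB(t))
}
```

Calling FindSketch.findAandB on either a C or an Outer compiles because the compiler can assemble the required Find instances from the companions; a type with no A or B inside fails at compile time.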
Nesting/Mixing implicit contexts
Minimal boilerplate Spark CLI apps:
input Path
output Path (or: just a PrintStream)
SparkContext
select Broadcast variables
other argument-input objects
How to make all of these things implicitly available with minimal boilerplate?
13 / 17
Ideally:
def app1() = {
  // call methods that want implicit
  // input Path, SparkContext
}
def app2() = {
  // call methods that want implicit
  // Path, SparkContext, PrintStream
}
13 / 17
def run(
  implicit
  inPath: Path,
  printStream: PrintStream,
  sc: SparkContext,
  ranges: Broadcast[Ranges],
  …
): Unit = {
  // do thing
}
case class Context(
  inPath: Path,
  printStream: PrintStream,
  sc: SparkContext,
  ranges: Broadcast[Ranges],
  …
)
def run(implicit ctx: Context): Unit = {
  implicit val Context(
    inPath, printStream, sc, ranges, …
  ) = ctx
  // do thing
}
14 / 17
Nesting/Mixing implicit contexts
How to make many implicits available with minimal boilerplate?
// `val args` so that the self-typed traits below can read it
abstract class HasArgs(val args: Array[String])
trait HasInputPath { self: HasArgs ⇒
  implicit val inPath = Path(args(0))
}
trait HasOutputPath { self: HasArgs ⇒
  val outPath = Path(args(1))
}
trait HasPrintStream extends HasOutputPath { self: HasArgs ⇒
  implicit val printStream = new PrintStream(newOutputStream(outPath))
}
trait HasSparkContext {
  implicit val sc: SparkContext = new SparkContext(…)
}
class MinimalApp(args: Array[String])
  extends HasArgs(args)
  with HasInputPath
  with HasPrintStream
  with HasSparkContext
object Main {
  def main(args: Array[String]): Unit =
    new MinimalApp(args) {
      // all the implicits!
    }
}
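The mixin approach above can be tried without Spark. The following is a minimal sketch under assumptions: it uses java.nio.file.Path in place of the talk's Path type, writes to an in-memory buffer instead of an output file, and omits SparkContext entirely. The names ImplicitContextSketch, describe, buffer, and run are hypothetical.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}
import java.nio.file.{Path, Paths}

object ImplicitContextSketch {
  // `val args` so that self-typed traits can read the arguments.
  abstract class HasArgs(val args: Array[String])

  trait HasInputPath { self: HasArgs =>
    implicit val inPath: Path = Paths.get(args(0))
  }

  // Assumption: an in-memory buffer stands in for the output file.
  trait HasPrintStream {
    val buffer = new ByteArrayOutputStream()
    implicit val printStream: PrintStream = new PrintStream(buffer, true)
  }

  // A method that "wants" an implicit input Path and PrintStream.
  def describe()(implicit in: Path, out: PrintStream): Unit =
    out.println(s"reading from $in")

  class MinimalApp(args: Array[String])
    extends HasArgs(args)
      with HasInputPath
      with HasPrintStream

  def run(args: Array[String]): String = {
    val app = new MinimalApp(args) {
      // Inherited implicit members are in scope here; no explicit passing needed.
      describe()
    }
    app.buffer.toString.trim
  }
}
```

Each trait contributes one implicit, and an application opts in to exactly the ones it needs by listing traits in its extends clause.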
15 / 17
{to,from}String: invertible syntax
Miscellaneous tools output "reports":
466202931615 uncompressed positions
156G compressed
Compression ratio: 2.78
1236499892 reads
22489 false positives, 0 false negatives
This is basically toString / the Show type-class
That comes from a data structure like:
case class Result(
  numPositions     : Long,
  compressedSize   : Bytes,
  compressionRatio : Double,
  numReads         : Long,
  numFalsePositives: Long,
  numFalseNegatives: Long
)
or better yet:
case class Result(
  numPositions    : NumPositions,
  compressedSize  : CompressedSize,
  compressionRatio: CompressionRatio,
  numReads        : NumReads,
  falseCounts     : FalseCounts
)
Twist: downstream tools want to parse these reports
want to re-hydrate Result instances
implicit val _iso: Iso[FalseCounts] =
  iso"${'numFPs} false positives, ${'numFNs} false negatives"
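To show what an invertible show/parse pair might look like for one line of the report, here is a hand-written sketch. It does not reproduce the iso"…" interpolator from the slide, which derives both directions from a single template; the Iso trait shape, the regex, and the name IsoSketch are assumptions for illustration.

```scala
object IsoSketch {
  case class FalseCounts(numFPs: Long, numFNs: Long)

  // Hypothetical two-way type-class: render to a String and parse it back.
  trait Iso[T] {
    def show(t: T): String
    def parse(s: String): Option[T]
  }

  implicit val falseCountsIso: Iso[FalseCounts] = new Iso[FalseCounts] {
    // Must stay in sync with `show` by hand; a template-derived Iso avoids that duplication.
    private val Pattern = """(\d+) false positives, (\d+) false negatives""".r

    def show(fc: FalseCounts): String =
      s"${fc.numFPs} false positives, ${fc.numFNs} false negatives"

    def parse(s: String): Option[FalseCounts] = s match {
      case Pattern(fps, fns) => Some(FalseCounts(fps.toLong, fns.toLong))
      case _                 => None
    }
  }
}
```

Writing both directions by hand, as here, is exactly the duplication that a single invertible template is meant to remove.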
16 / 17
Thanks!
17 / 17
