Gene Extrapolation Models for Toxicogenomic Data
Daniel Gusenleitner
Nacho Caballero

Testing for carcinogenicity is costly

Genes show clustered responses
Expression correlates between platforms

We want to extrapolate the expression of regular genes

[Figure: expression matrix of 2K arrays by 11000 genes, split into 10K regular genes and 1K landmark genes]

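We fit a linear model to each regular gene

[Figure: predictor matrix X, 2K arrays by 1K landmark genes]

Predicted Expression = Xβ
Expression Gene 1 = X1β1 + X2β2 + … + X1Kβ1K
Expression Gene 2 = X1β1 + X2β2 + … + X1Kβ1K
                    …
Expression Gene 10K = X1β1 + X2β2 + … + X1Kβ1K

Xj is the expression of landmark gene j across the arrays, so each sum has one term per landmark gene (1K terms). Each regular gene has its own coefficients β.
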
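The per-gene fit can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: it uses plain Python, ordinary least squares through the normal equations, no intercept, and made-up toy numbers (5 arrays, 2 landmark genes, 2 regular genes). The names solve, fit_linear, predict and the gene labels are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(X, y):
    """Least-squares coefficients for y ~ X beta (assumption: no intercept)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(X, beta):
    """Predicted expression X beta, one value per array."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# Toy data (made up): 5 arrays (rows) x 2 landmark genes (columns).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]

# Two hypothetical regular genes, one expression value per array.
regular = {
    "gene_1": [5.0, 4.0, 11.0, 10.0, 15.0],  # built as 1*x1 + 2*x2
    "gene_2": [1.0, 5.0, 5.0, 9.0, 10.0],    # built as 3*x1 - 1*x2
}

# One independent model (its own beta) per regular gene.
models = {gene: fit_linear(X, y) for gene, y in regular.items()}
for gene, beta in models.items():
    print(gene, [round(b, 6) for b in beta])
```

With 1K landmark genes and 2K arrays the same loop would run once per regular gene, which is why the number of models equals the number of regular genes.
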
Elastic Net

[Figure: mean error versus number of variables]

glmnet: Lasso and elastic-net regularized generalized linear models
http://cran.r-project.org/web/packages/glmnet/index.html

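To make the link between the penalty and the number of variables concrete, here is a small coordinate-descent sketch of the elastic net. It is a stand-in written in plain Python for illustration, not glmnet itself and not the authors' code. It assumes no intercept, no standardization, and the objective (1/2n)·||y - Xβ||² + λ·(α·||β||₁ + (1-α)/2·||β||²). The function names and the toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the elastic net (sketch, no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam * (1.0 - alpha)
            if denom == 0.0:
                continue  # all-zero column under a pure lasso penalty
            # Covariance of column j with the residual that ignores beta[j].
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * alpha) / denom
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Toy data (made up): 6 arrays x 3 landmark genes; y = 2*x1 + 0.5*x2.
X = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, -0.5],
    [1.0, 1.0, 0.0],
    [-1.0, 0.0, 0.5],
    [0.0, -1.0, -0.5],
    [-1.0, -1.0, 0.0],
]
y = [2.0, 0.5, 2.5, -2.0, -0.5, -2.5]

# A larger penalty keeps fewer variables in the model.
for lam in (2.0, 0.6, 0.1):
    beta = elastic_net(X, y, lam=lam, alpha=1.0)
    n_used = sum(b != 0.0 for b in beta)
    print(f"lambda={lam}: {n_used} variables, beta={[round(b, 4) for b in beta]}")
```

Setting alpha to 1 gives the lasso and alpha to 0 gives ridge regression; values in between mix the two penalties.
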
Neural Networks

[Figure: feed-forward network with landmark genes as inputs, a hidden layer, and regular genes as outputs]

nnet: Feed-forward Neural Networks and Multinomial Log-Linear Models
http://cran.r-project.org/web/packages/nnet/index.html

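The network shape on this slide (landmark genes in, one hidden layer, regular genes out) corresponds to a forward pass like the one below. This is only a sketch: the logistic hidden units and linear outputs are assumptions, the weights are arbitrary untrained numbers, and the names logistic and forward are hypothetical. It does not reproduce nnet or the authors' trained models.

```python
import math


def logistic(z):
    """Logistic activation, assumed here for the hidden units."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(landmark, w_hidden, b_hidden, w_out, b_out):
    """Map landmark-gene expression to predicted regular-gene expression."""
    hidden = [
        logistic(sum(w * x for w, x in zip(ws, landmark)) + b)
        for ws, b in zip(w_hidden, b_hidden)
    ]
    # Linear output units (assumption): one predicted value per regular gene.
    return [sum(w * h for w, h in zip(ws, hidden)) + b for ws, b in zip(w_out, b_out)]


# Toy sizes (made up): 3 landmark genes, 2 hidden units, 2 regular genes.
landmark = [0.2, -1.0, 0.5]
w_hidden = [[0.1, 0.4, -0.3], [-0.2, 0.0, 0.6]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0], [0.5, 0.5]]
b_out = [0.0, 2.0]

predicted = forward(landmark, w_hidden, b_hidden, w_out, b_out)
print(predicted)
```
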
Signal-to-noise ratio

[Figure: intensity variation; axis: mean fluorescent intensity]

SNR = intensity standard deviation / extrapolation mean error

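As a worked example of the ratio, the sketch below computes the SNR for one gene from measured and extrapolated intensities. Two details are assumptions, since the slide does not specify them: the standard deviation is the population form, and "mean error" is read as the mean absolute error. The function name snr and the numbers are made up.

```python
import statistics


def snr(measured, predicted):
    """Intensity standard deviation divided by the extrapolation mean error."""
    spread = statistics.pstdev(measured)  # assumption: population std. dev.
    # Assumption: "mean error" means mean absolute error.
    mean_error = sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)
    return spread / mean_error  # undefined when predictions are perfect


# Toy intensities (made up) for one gene across four arrays.
measured = [8.0, 10.0, 12.0, 14.0]
predicted = [8.5, 9.5, 12.5, 13.5]
print(round(snr(measured, predicted), 3))
```

A high value means the gene varies much more across arrays than the model errs when predicting it.
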
Building 10451 models takes a long time…

                     total runtime   runtime per model   single CPU runtime
linear regression    120 x 3 h       2 min               360 h
elastic net          120 x 16 h      11 min              1920 h
neural network       50 x 0.75 h     45 min              7800 h ?

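The single-CPU figures can be checked against the per-model runtimes with simple arithmetic. The sketch below multiplies the number of models by the minutes per model. The per-model times appear to be rounded, so the products only approximate the 360 h, 1920 h, and 7800 h in the table. Reading the total runtime column as "number of jobs x hours per job" is an assumption.

```python
N_MODELS = 10451  # number of models, from the slide

# Runtime per model in minutes, from the slide.
MIN_PER_MODEL = {
    "linear regression": 2,
    "elastic net": 11,
    "neural network": 45,
}


def single_cpu_hours(minutes_per_model, n_models=N_MODELS):
    """Hours needed to fit every model one after another on one CPU."""
    return minutes_per_model * n_models / 60


for name, minutes in MIN_PER_MODEL.items():
    print(f"{name}: about {single_cpu_hours(minutes):.0f} h on a single CPU")

# Assumed reading of "120 x 3 h": 120 jobs of 3 hours each.
print(120 * 3, 120 * 16)
```
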
Signal-to-Noise Comparison
                      E-net    LM      NN
 ENSRNOG00000013133   135.58   19.62   13.21
 ENSRNOG00000011861   209.82   28.82   12.08
 ENSRNOG00000033466   190.81   26.58   11.86
 ENSRNOG00000036816   197.82   23.09   9.93
 ENSRNOG00000003515   273.62   29.35   8.68
 ENSRNOG00000002254   53.43    8.83    7.21
 ENSRNOG00000031266   76.19    8.19    6.70
 ENSRNOG00000005963   145.06   6.99    6.49
 ENSRNOG00000008613   38.86    3.97    6.07
 ENSRNOG00000023095   13.57    2.70    5.98
 ENSRNOG00000020947   17.27    2.41    5.04
 ENSRNOG00000007258   103.77   13.71   4.91
 ENSRNOG00000019813   16.53    3.01    4.68
 ENSRNOG00000014232   61.69    9.17    4.05
 ENSRNOG00000002454   50.71    5.58    3.80
 ENSRNOG00000018201    5.04    1.64    3.39
The elastic net outperforms standard linear regression

[Figure: signal-to-noise ratio, elastic net versus linear regression]

Additional feature selection
Performance of extrapolation models on carcinogenicity classifiers
Correlation between Luminex and Affymetrix chips
