RAPID PROTOTYPING DATA
PRODUCTS USING SHINY
rstudio::conf 2018
2018-02-02
SPEAKER PROFILE
TANYA CASHORALI
@TANYACASH21
[Timeline figure: 2004, 2005, 2006, 2007, 2012, 2013, 2014, 2015]
HUMANS ARE NOT FORTUNE TELLERS
Missing Data
Outliers
Nonlinearity
Collinearity
Delimiters!!
Of course I knew there wouldn’t be enough data in Oglala Lakota County when I wrote the 25-page requirements doc!
WE’RE NOT BUILDING ASTON MARTINS
“Laugh at perfection. It’s boring and keeps you from being done.”
THE DONE MANIFESTO
• http://www.manifestoproject.it/bre-pettis-and-kio-stark/
• https://www.bakadesuyo.com/2015/09/impostor-syndrome/
“Pretending you know what you’re doing
is almost the same as knowing what you
are doing, so just accept that you know
what you’re doing even if you don’t
and do it.”
There are three states of being:
1. Not knowing
2. Action
3. Completion.
CASE STUDIES
1.6 BILLION DOCUMENTS
Problem
Need to enable scientists to query 1.6 billion
“documents” (SNP + phenotype combinations)
quickly and filter based on significance and
various other filters.
CUSTOM RMONGO PACKAGE
The RMongo package, built in Scala, did not support authentication for Mongo 3.0.
So we built an RJMongo package using Java = ACTION!
That same issue, originally reported in June 2015, still isn’t resolved.
PERFORMANCE?
# Register an Ajax endpoint: DataTables calls the filter function for every page, sort, or search.
# qs, recordCount, and dataFromMongo() are defined elsewhere in the app.
action <- dataTableAjax(session, result, rownames = FALSE,
  filter = function(data, params) {
    q <- params
    # Fetch only the rows needed for the requested page from Mongo
    data <- dataFromMongo(qs, q$search, q$start, q$length, q$column, q$order)
    list(
      draw = as.integer(q$draw),
      recordsTotal = recordCount,
      recordsFiltered = recordCount,
      data = unname(as.matrix(data)),
      DT_rows_all = 5
    )
  })
# Point the table widget at that endpoint and turn on server-side processing
widget <- datatable(result,
  rownames = FALSE,
  class = 'display cell-border compact',
  selection = 'none',
  options = list(
    ajax = list(url = action), scrollX = TRUE, serverSide = TRUE, stateSave = TRUE,
    escape = FALSE, filter = FALSE, processing = TRUE,
    language = list(processing = "<img src='spin.gif'>"),
    columnDefs = list(list(targets = c(0:4, 6:25), sortable = FALSE)),
    order = list(list(5, 'asc'))
  )
)
* https://www.rdocumentation.org/packages/DT/versions/0.2/topics/dataTableAjax
In order to improve query performance… dataTableAjax() to the rescue!
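The idea behind the server-side approach above can be sketched in base R: for each request, return only the rows for the requested page plus the counts DataTables needs. This is a minimal illustration, not the code from the talk. The `docs` data frame stands in for the Mongo collection, and `page_response()`, its arguments, and the column names are all hypothetical.

```r
# Hypothetical stand-in for the Mongo collection: 1,000 SNP/phenotype rows.
docs <- data.frame(
  snp = sprintf("rs%04d", 1:1000),
  phenotype = rep(c("height", "bmi"), each = 500),
  p_value = seq(1e-8, 0.05, length.out = 1000),
  stringsAsFactors = FALSE
)

# Build the response for one DataTables request.
# `start` is a zero-based row offset and `page_len` the page size.
page_response <- function(docs, draw, start, page_len,
                          search = "", order_col = "p_value") {
  hits <- docs
  if (nzchar(search)) {
    # Keep rows whose phenotype contains the search term
    hits <- hits[grepl(search, hits$phenotype, fixed = TRUE), , drop = FALSE]
  }
  # Sort ascending on the requested column
  hits <- hits[order(hits[[order_col]]), , drop = FALSE]
  first <- start + 1
  last <- min(start + page_len, nrow(hits))
  page <- if (first <= last) hits[first:last, , drop = FALSE] else hits[0, , drop = FALSE]
  list(
    draw = as.integer(draw),
    recordsTotal = nrow(docs),      # rows before filtering
    recordsFiltered = nrow(hits),   # rows after filtering
    data = unname(as.matrix(page))  # only the current page is sent
  )
}

resp <- page_response(docs, draw = 1, start = 20, page_len = 10, search = "bmi")
```

Only ten rows travel to the browser per request, which is why the same pattern can sit in front of a very large collection when the database does the filtering and sorting.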
FIRST VERSION
“Accept that everything is a draft.
It helps to get done.”
CURRENT PRODUCT
LET’S ADD 2.5 BILLION MORE!
• One node cluster w/ 512GB of RAM
• Current data size ~3 terabytes in JSON format
“Done is the engine of more.”
CMR API
Problem – API access to data from
Centre for Medicines Research (CMR)
International, which provides pharmaceutical
industry metrics and trends analysis.
Issues:
• Clunky API
• Tons of parameter combinations and
results returned in aggregate
• Time-consuming
• IT dumped some of the data
• Slow
• Poor usability on their GUI (filters are
clunky)
• Ineffective visualizations
• Data extracts contain limited details and
were difficult to use
CMR API
The first iteration was just ggplots, iterating with the client on the necessary parameters.
We don’t need thousands of indications.
AUTHENTICATION (PYTHON! GASP!)
“The point of being done
is not to finish but
to get other things done.”
HOW IT WORKS
auth.py: get_token()
reticulate (calls auth.py from R)
cmr_api.R: fetch_data(token, endpt, params)
server.R, ui.R
“Once you’re done you
can throw it away.”
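The flow on this slide can be sketched roughly as follows. This is an assumption-heavy illustration rather than the app's code: the base URL, the bearer-token header, the endpoint and parameter names, and the extra `http_get` argument (added so the function can be stubbed) are all hypothetical. Only the names `get_token()` and `fetch_data(token, endpt, params)` come from the slide.

```r
# In the app, get_token() would come from auth.py via reticulate, e.g.
#   reticulate::source_python("auth.py")
#   token <- get_token()

# Build the URL and headers for one endpoint; `base_url` is a made-up placeholder.
build_request <- function(token, endpt, params,
                          base_url = "https://api.example.com") {
  values <- vapply(
    params,
    function(p) utils::URLencode(as.character(p), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  list(
    url = paste0(base_url, "/", endpt, "?", query),
    headers = c(Authorization = paste("Bearer", token))  # assumed auth scheme
  )
}

# The HTTP function is passed in so the call can be replaced by a stub.
fetch_data <- function(token, endpt, params, http_get) {
  req <- build_request(token, endpt, params)
  http_get(req$url, req$headers)
}

# Stub that echoes the request instead of calling a real API
fake_get <- function(url, headers) {
  list(url = url, auth = headers[["Authorization"]])
}
res <- fetch_data("abc123", "pipeline",
                  list(area = "oncology", year = 2017), fake_get)
```

Keeping token retrieval in Python and the data calls in R means each side can be swapped out later without touching server.R or ui.R.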
CURRENT PRODUCT
DRUG MANUFACTURING
• Many combinations of raw materials in
specific order used to create final drug
substance
• Time-consuming
• Costly
• One problematic substance = lost
batches = millions of dollars
• Single user was running 100s of SQL
queries manually
Throw out massive
requirements docs
NETWORKD3
“People without dirty
hands are wrong.
Doing something makes
you right.”
FIRST VERSION – CORE FUNCTIONALITY
“There is no editing stage.”
DETAILS COME LATER
“Failure counts as done. So do mistakes.”
SHINY AND D3 COMMUNICATION
server.R: session$sendCustomMessage(type = "jsondata", var_json)
www/: main.js
Shiny.addCustomMessageHandler("jsondata", function (message) {
  // var_json arrives as a JSON string; parse it before drawing the two trees
  if (typeof message !== 'undefined') {
    var json_data = JSON.parse(message);
    initTree(json_data.left);
    initSide(json_data.right);
  }
});
ui.R: tags$script(src = "main.js")
• http://myinspirationinformation.com/visualisation/d3-js/integrating-d3-js-into-r-shiny/
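On the R side, the message body could be assembled along these lines. This is a sketch under assumptions: it uses the jsonlite package, and `build_payload()` and the example tree contents are hypothetical. Only the `left` and `right` keys and the `"jsondata"` message type come from the slide.

```r
library(jsonlite)

# Serialize two trees under the `left` / `right` keys that main.js reads.
build_payload <- function(left_tree, right_tree) {
  # auto_unbox = TRUE writes length-one vectors as scalars rather than arrays.
  # as.character() yields a plain string, which main.js parses with JSON.parse().
  as.character(
    toJSON(list(left = left_tree, right = right_tree), auto_unbox = TRUE)
  )
}

# Made-up example data: one upstream tree and one downstream tree
upstream <- list(
  name = "raw material A",
  children = list(list(name = "intermediate 1"))
)
downstream <- list(name = "drug substance", children = list())

var_json <- build_payload(upstream, downstream)

# Inside server.R this string would then be sent with:
#   session$sendCustomMessage(type = "jsondata", var_json)
```

Nested lists with `name` and `children` fields map directly onto the hierarchical JSON that D3 tree layouts commonly consume.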
“FINAL” PRODUCT
Previously: 6 months and full team to identify problematic substance
Now: 1-2 users and 1 day to identify problematic substance
OVERVIEW OF RAPID PROTOTYPING PROCESS
IF WE WERE MAKING DONUTS
THANK YOU
Patrick Brophy
Daron Carlson
Mike Fitzpatrick
Roland Zhou
Olivia Brode-Roger
Rajesh Mikkilineni
Jason Tetrault
Marianna Foos


Editor's Notes

  1. R 2005 story,
  2. Number 1 of the done manifesto
  3. Single nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single DNA building block, called a nucleotide. For example, a SNP may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a certain stretch of DNA. SNPs occur normally throughout a person’s DNA. They occur once in every 300 nucleotides on average, which means there are roughly 10 million SNPs in the human genome. Most commonly, these variations are found in the DNA between genes.
  4. We had authentication issues with RMongo and Mongo 3.0. The package was built in Scala, so we rebuilt it in Java. The issue still wasn’t resolved 1 year later (I reported it in June 2015; it is still open today).
  5. It is basically an implementation of server-side processing of DataTables in R. We also set up auth using the company’s single sign-on.
  6. Full web dev team would take much longer
  7. 2 years later! It is still being used and they want to expand on it. The Shiny infrastructure is there, though.
  8. What are the latest trends in R&D productivity across the industry? What are the key factors that influence R&D productivity? How do different companies compare — with the industry, with competitors? What are the latest trends in industry pipeline volumes, cycle times and success rates – by therapeutic area and granular indications? What are the most effective and useful metrics for measuring and comparing R&D productivity across the global pharmaceutical industry? Are the timelines and success rates by therapy area being experienced by my company competitive with the rest of the industry and what are the drivers for above or below average performance?
  9. Add more charts
  10. Fastest way to get the data, python auth code example in their docs
  11. Refactor, not throw away
  12. networkD3 wasn’t enough; we needed more customization
  13. Need a bi-directional tree. Colors showed up that the client didn’t know existed!
  14. Send a custom message to the front end. The handler listens for custom messages of the type “jsondata”. It then takes the contents of the message and assigns them to a JavaScript variable, in this case json_data.