18. Sage Bionetworks
A non-profit organization with a vision to enable networked team
approaches to building better models of disease
BIOMEDICINE INFORMATION COMMONS INCUBATOR
Building Disease Maps; Data Repository; Commons Pilots; Discovery Platform
Sagebase.org
19. Networked Approaches within a Commons
BioMedicine Information Commons
[Diagram: Patients/Citizens, Data Generators, Data Analysts, Clinicians and Experimentalists exchange raw data, curated data, tools/methods and analyses/models through SYNAPSE]
20. Existing Barriers to Networked Approaches
BioMedical Information Commons
[Diagram: Patients/Citizens, Data Generators, Data Analysts, Clinicians and Experimentalists exchange raw data, curated data, tools/methods and analyses/models through SYNAPSE, overlaid with five numbered barriers: usable data; rewards and recognition; how to distribute tasks; privacy barriers; rewards for sharing]
21. COMPONENTS NEEDED FOR NETWORKED APPROACHES TO BUILDING EVOLVING MODELS OF DISEASE: RESEARCH 2.0
SYNAPSE: GEEKS AND SCIENTISTS SANDBOX, PLACE TO BUILD MODELS OF DISEASE
22. Two approaches to building common scientific and technical knowledge
Text summary of the completed project; assembled after the fact
Social Coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
23. Synapse is GitHub for Biomedical Data
Social Coding: every code change versioned; every issue tracked; every project the starting point for new work; all evolving and accessible in real time
Social Science: data and code versioned; analysis history captured in real time; work anywhere, and share the results with anyone
26. Data Analysis with Synapse
Run Any Tool
On Any Platform
Record in Synapse
Share with Anyone
27. COMPONENTS NEEDED FOR NETWORKED APPROACHES TO BUILDING EVOLVING MODELS OF DISEASE: RESEARCH 2.0
THE FEDERATION: SETS RULES FOR SHARING DATA, ALLOWS INTERLAB DYNAMIC RELATIONS
SYNAPSE: GEEKS AND SCIENTISTS SANDBOX, PLACE TO BUILD MODELS OF DISEASE
29. sage federation:
model of biological age
[Figure: predicted age (liver expression) plotted against chronological age (years), with regions labeled Faster Aging and Slower Aging. Further labels: Age Differential; Clinical Association (gender, BMI, disease); Genotype Association; Gene Pathway Expression]
30. COMPONENTS NEEDED FOR NETWORKED APPROACHES TO BUILDING EVOLVING MODELS OF DISEASE: RESEARCH 2.0
PORTABLE LEGAL CONSENT: ALLOWS PATIENT TO REQUEST DATA BACK, GIVES CONTROL OF DATA TO PATIENT WHO CAN THEN SAY I WANT TO SHARE IT
THE FEDERATION: SETS RULES FOR SHARING DATA, ALLOWS INTERLAB DYNAMIC RELATIONS
SYNAPSE: GEEKS AND SCIENTISTS SANDBOX, PLACE TO BUILD MODELS OF DISEASE
32. COMPONENTS NEEDED FOR NETWORKED APPROACHES TO BUILDING EVOLVING MODELS OF DISEASE: RESEARCH 2.0
BRIDGE: INCLUDING CITIZENS (DEMOCRATIZATION OF MEDICINE), ENGAGES CITIZENS AS PARTNERS (PATIENTS, RESEARCHERS, FUNDERS)
PORTABLE LEGAL CONSENT: ALLOWS PATIENT TO REQUEST DATA BACK, GIVES CONTROL OF DATA TO PATIENT WHO CAN THEN SAY I WANT TO SHARE IT
THE FEDERATION: SETS RULES FOR SHARING DATA, ALLOWS INTERLAB DYNAMIC RELATIONS
SYNAPSE: GEEKS AND SCIENTISTS SANDBOX, PLACE TO BUILD MODELS OF DISEASE
33. DEMOCRATIZATION OF MEDICINE
CLINICAL INFORMATION
MOLECULAR DATA
RESEARCH RESOURCES
(Social Value Chain)
ASHOKA
35. Crowdsourcing projects to build models of disease through use of Challenges hosted in the cloud, found on websites for Synapse and BRIDGE
36. Novel aspects of our competitions
Transparency, reproducibility
Validation in novel dataset
Publication in Science Translational Medicine
Donation of Google-scale compute space
Sign up at synapse.sagebase.org
Organization of drug sensitivity competitions to follow.
37. REAL NAMES DISCOVERY PROJECT
LONGITUDINAL COHORT STUDY
PatientsLikeMe
ParkinsonNet
Sage Bionetworks
RESEARCH 2.0
38. THE MELANOMA HUNT
Melanoma CROWD-SOURCING PROJECT
www.melanomahunt.org
Charles Ferté and Andrew Trister
Sage Bionetworks
39. CONTEXT
• Melanoma is one of the most life-threatening forms of cancer
• A difficult clinical question is whether a suspicious skin lesion
represents a melanoma or a benign process
• The ABCDE mnemonic and the Ugly Duckling are the current
standard approaches to describe suspicious skin lesions, assign risk
and decide further workup (a biopsy is eventually performed)
• Advances in computer-aided image manipulation and in scientific
crowd-sourcing (e.g. Foldit, EteRNA: hundreds of thousands of
contributors, Nature journal papers) could improve the
assessment of skin lesions
40. OBJECTIVES
• To capture image features of skin lesions that are predictive of
melanoma (to improve the diagnosis of malignant lesions) with an
emphasis on sets of multiple lesions per patient over time
• To describe associations between quantitative imaging
characteristics of skin lesions and clinical, molecular and
pathological traits in melanoma
• To educate the public on risks of melanoma and methods of
prevention and early detection
41. [Diagram with three numbered steps. Labels: data input; user interface; powered by Synapse; single contributors; super contributors; ABCDE; Ugly Duckling; Computer Vision; Challenges]
42. USER INTERFACE
• Presents images and allows the user to modify them
with the image adjustment tools, capturing each trial
• Incentives and performance assessments are provided
(gamification and adaptive replication)
• Empowers the user to complete jobs and participate in
challenges
• Will be accessible on web and mobile devices
44. DISTRIBUTED THINKING TOOL
• Enables the use of volunteers on the Internet to perform tasks that require human
intelligence, knowledge, or cognitive skills (e.g. Stardust@home, GalaxyZoo, and Amazon's
Mechanical Turk)
• Provides adaptive replication
• Some users do the same job, only better, and are subsequently given harder jobs
• Some experts do more sophisticated jobs and are given even harder jobs
• Simple, powerful, and open source tools already exist (e.g. Bossa)
• Applied to MELANOMA HUNT, participants are invited to perform both ABCDE and Ugly
Duckling scoring of skin lesions
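One way to picture adaptive replication is sketched below in Python. This is a minimal illustration under stated assumptions, not the Bossa API: the function name, the `get_answer` callback, and the `min_agree` and `max_replicas` parameters are all hypothetical. The same job is handed to one volunteer after another and stops being replicated as soon as enough answers agree.

```python
from collections import Counter


def adaptive_replication(get_answer, volunteers, min_agree=2, max_replicas=5):
    """Send one job to volunteers one at a time until enough of them agree.

    get_answer(volunteer) is a hypothetical callback standing in for showing
    a lesion image to a volunteer and collecting a label.
    Returns (consensus_label_or_None, number_of_replicas_used).
    """
    votes = Counter()
    used = 0
    for volunteer in volunteers[:max_replicas]:
        votes[get_answer(volunteer)] += 1
        used += 1
        label, count = votes.most_common(1)[0]
        if count >= min_agree:
            # Enough agreement: stop replicating this job.
            return label, used
    # No consensus within the replica budget.
    return None, used


# Toy usage: canned answers stand in for real volunteers (made-up data).
canned = {"ann": "suspicious", "bob": "benign", "cho": "suspicious", "dee": "benign"}
label, used = adaptive_replication(canned.get, ["ann", "bob", "cho", "dee"])
print(label, used)
```

In this toy run the job stops after the third volunteer, because two of the three answers agree. A real system would presumably also weight volunteers by past performance, which this sketch leaves out.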
45. IMAGE ADJUSTMENT TOOL
• Based on tools offered in GraphicsMagick with Python or Java
interface: www.graphicsmagick.org/
• Quantitative transformation of the images is trivial
• Easily adaptable to gamification solutions (e.g. using adaptive
replication open-source platforms) through Python API
• Open source and distributed with MIT-style license
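To make "quantitative transformation" concrete, here is a small pure-Python sketch. It deliberately does not use the GraphicsMagick bindings; the `adjust` and `mean_intensity` functions and the 2x2 image are hypothetical stand-ins that only illustrate the idea of a parameterized adjustment followed by a numeric readout.

```python
def adjust(pixels, gain=1.0, bias=0):
    """Apply a linear contrast (gain) and brightness (bias) change to a
    grayscale image given as a list of rows of 0-255 integers."""
    def clamp(value):
        return max(0, min(255, int(round(value))))
    # Contrast is scaled around mid-gray (128), then brightness is shifted.
    return [[clamp(gain * (p - 128) + 128 + bias) for p in row] for row in pixels]


def mean_intensity(pixels):
    """One example of a quantitative feature: the mean pixel intensity."""
    flat = [p for row in pixels for p in row]
    return sum(flat) / len(flat)


lesion = [[100, 120], [140, 160]]      # made-up 2x2 grayscale image
stretched = adjust(lesion, gain=2.0)   # doubles contrast around mid-gray
print(stretched, mean_intensity(stretched))
```

Because each adjustment is just a pair of numbers (gain, bias), every trial a user makes can be recorded and replayed, which is what makes the captured trials usable as data.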
46. SCIENTIFIC CHALLENGES
• The scientific challenges will create a community-based
effort to provide an unbiased assessment of models
and methodologies for the prediction of melanoma.
• Imaging feature sets are translated into quantitative
variables (numeric or categorical)
• A common dataset will be provided to all participants,
with a validation dataset held out for model evaluation
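The held-out evaluation described above can be sketched as follows. This is an illustrative Python outline only: the records, the 25% hold-out fraction, the accuracy metric, and the toy diameter rule are all assumptions, not the challenge's actual data or scoring.

```python
import random


def split_holdout(records, holdout_fraction=0.25, seed=0):
    """Shuffle reproducibly and return (shared_training_set, held_out_set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def accuracy(model, held_out):
    """Fraction of held-out lesions whose label the model predicts correctly."""
    correct = sum(1 for features, label in held_out if model(features) == label)
    return correct / len(held_out)


# Made-up records: (lesion diameter in mm, label). Not real data.
records = [(2, "benign"), (3, "benign"), (4, "benign"), (5, "benign"),
           (7, "melanoma"), (8, "melanoma"), (9, "melanoma"), (10, "melanoma")]
train, held_out = split_holdout(records)


def toy_model(diameter):
    # Hypothetical rule a participant might submit: threshold on diameter.
    return "melanoma" if diameter > 6 else "benign"


print(len(train), len(held_out), accuracy(toy_model, held_out))
```

Participants would only ever see `train`; the organizers keep `held_out` and score every submitted model against it, so no model can be tuned to the evaluation data.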
47. SCIENTIFIC CHALLENGES
• Synapse will enable transparent, reproducible model
building and analysis workflows, as well as the sharing of
data, tools, and models with the Scientific Challenges
community
• Participants can apply their best ideas in a high
performance compute environment
• All models, including computationally intensive ones, can be
shared and re-run on a common platform, enabling
transparency of the process
48. DATA DEFINITION
• Data are:
• Anonymized images of suspicious skin lesions
- collection of sets of multiple skin lesions per patient
- augmentation of the database over time, since the
evolution of the lesions is also recorded
• Anonymized demographics (age, sex, race, etc.) for each
patient and anonymized pathological, clinical and molecular
features (TNM, Breslow score, BRAF, c-kit, etc.) for each
lesion
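The data definition above could be represented along these lines. This is a hedged sketch using Python dataclasses; every class and field name here is hypothetical and chosen only to mirror the slide (per-patient demographics, per-lesion features, multiple images per lesion over time), not an actual Synapse schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LesionImage:
    image_id: str            # hypothetical identifier of the stored image
    days_since_first: int    # capture time relative to the lesion's first image


@dataclass
class Lesion:
    lesion_id: str
    images: List[LesionImage] = field(default_factory=list)
    tnm_stage: Optional[str] = None        # pathological staging, if known
    breslow_mm: Optional[float] = None     # Breslow thickness, if biopsied
    braf_mutated: Optional[bool] = None    # molecular feature, if tested


@dataclass
class Patient:
    patient_id: str          # anonymized key, never a real name
    age: int
    sex: str
    lesions: List[Lesion] = field(default_factory=list)


# Made-up example: one patient, one lesion photographed twice, 90 days apart.
patient = Patient("P001", age=54, sex="F")
mole = Lesion("L1")
mole.images.append(LesionImage("img-0001", 0))
mole.images.append(LesionImage("img-0002", 90))
patient.lesions.append(mole)
print(len(patient.lesions), [img.days_since_first for img in mole.images])
```

Keeping clinical fields optional reflects that most uploaded lesions will never be biopsied, so pathological and molecular features would be missing for many records.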
49. POTENTIAL SOURCES OF DATA
• Who provides the data?
• Citizens and patients upload directly from mobile
applications (iOS, Android) and through the web
• Medical research institutions and cooperative groups
(e.g. International Dermoscopy Society, etc.)
50. DATA STORAGE (SYNAPSE)
• Both images and metadata are hosted on the Synapse
platform (https://synapse.sagebase.org/)
• Synapse is a collaborative compute space that allows
scientists to share and analyze data together
• Synapse allows for both public and private projects
• Synapse enables the data to be loaded directly into
analytical tools like R, and the resulting analysis to be stored
51. REPRESENTATIVE EXAMPLE
[Workflow figure: raw data stored in Synapse (single or multiple images of skin lesions), then computer-aided image transformation, then quantitative feature extraction, then model building and correlation with endpoint]
User performance: 87.5%
52. STRENGTHS
• Generation of an unlimited and tremendous database of
paired images and metadata of suspicious skin lesions
• Crowd-sourcing will facilitate a community of citizens/patients
interested in an important public health concern
• Innovation combining recent advances in image adjustment,
crowd-sourcing and predictive modeling
53. OPPORTUNITIES
• Few, if any, applications exist that allow a user to easily
generate models to predict the malignancy of a skin lesion
• Engaging citizens/patients to directly provide a complete
record of multiple skin lesions over the entire body, which can
be tracked over time, will generate an unparalleled dataset
• Brute force of crowd-sourcing both to populate a database
and to generate predictive models in an unlimited manner
54. [Timeline chart, Jul to Dec. Tasks: Crowdsourcing expert; Engage experts; Obtain data from experts; Python client for Synapse; Py client for deID; Engage development partners; Image adjustment tool; Mobile/web interface; Adaptive replication for images; Engage strategic partners. Key: PLC; Data; Collaboration; Development; Potential Congress]