Open Data in Bioinformatics and Required Infrastructure towards achieving the SDGs
www.h3abionet.org
9th BioVisionAlexandria Conference, Alexandria, Egypt 2018
Prof. Samar Kassim
samar_kassim@med.asu.edu.eg
  
Introduction
•  Major technological advances in molecular biology have increased the sophistication, diversity and scale of the data being generated, e.g. by high-throughput platforms, while decreasing its cost
•  First human genome sequence:
–  Throughput 2.8 million bases per 24 hours on AB3730xl sequencers
–  13 years to sequence 3 billion bases at x10 coverage
–  Cost ~500 million USD (lower-bound estimate)
•  Next (now) generation sequencing:
–  Throughput 1 million bases per second
–  ~10 hours to sequence 3 billion bases at x10 coverage
–  Cost ~4,000 USD per genome
https://www.genome.gov/sequencingcosts/
http://en.wikipedia.org/wiki/File:Historic_cost_of_sequencing_a_human_genome.svg
Author = Ben Moore
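As a rough check of the sequencing figures quoted on the Introduction slide, the arithmetic can be sketched as follows. The constant and function names are illustrative. The throughput, coverage and cost numbers are the ones quoted on the slide, and the single-instrument estimate for the AB3730xl is an assumption made only for comparison (the original project is not described here as running on one instrument).

```python
GENOME_BASES = 3_000_000_000   # approximate human genome size quoted on the slide
COVERAGE = 10                  # x10 coverage

def sequencing_seconds(bases_per_second: float) -> float:
    """Seconds needed to read GENOME_BASES at COVERAGE at a constant throughput."""
    return GENOME_BASES * COVERAGE / bases_per_second

# Next-generation sequencing: 1 million bases per second.
ngs_hours = sequencing_seconds(1_000_000) / 3600

# One AB3730xl at 2.8 million bases per 24 hours (hypothetical single instrument).
sanger_years = sequencing_seconds(2_800_000 / 86_400) / 86_400 / 365

# Ratio between the two quoted cost estimates.
cost_ratio = 500_000_000 / 4_000

print(f"NGS: {ngs_hours:.1f} hours")
print(f"Single AB3730xl: {sanger_years:.1f} years")
print(f"Cost ratio: {cost_ratio:,.0f}x")
```

The next-generation figure comes out at a little over 8 hours, the same order as the ~10 hours quoted on the slide.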
  
Data driven biological science - bioinformatics
•  Decreasing data generation costs have shifted the biological sciences towards data-driven science, with bioinformatics playing a major role
Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, et al. (2015) Big Data: Astronomical or Genomical? PLOS Biology 13(7): e1002195.
https://doi.org/10.1371/journal.pbio.1002195
http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002195
  
Genomics and Africa - H3Africa
•  “The Human Heredity and Health in Africa (H3Africa) Initiative aims to facilitate a contemporary research approach to the study of genomics and environmental determinants of common diseases with the goal of improving the health of African populations.” (http://h3africa.org/)
•  “The vision of H3Africa is to create and support a pan-continental network of laboratories that will be equipped to apply leading-edge research to the study of the complex interplay between environmental and genetic factors which determines disease susceptibility and drug responses in African populations.” (http://h3africa.org/about/vision)
  
H3Africa Phase I overview
•  25 research projects in Africa
•  > 500 investigators
•  Covers 27 African countries
•  Up to 75,000 research participants
•  > USD 76 million invested in Phase I
The H3Africa Consortium: 8 Collaborative Centers, 7 Research Projects, 3 Biorepositories, 6 Ethics Grants, Bioinformatics Network
http://h3africa.org/consortium/projects
  
H3Africa Bioinformatics Network (H3ABioNet)
•  Pan African Bioinformatics Network to develop bioinformatics capacity in Africa and support the H3Africa research projects
•  28 nodes in 17 African countries
•  PI: Prof. Nicky Mulder, CBIO-UCT
•  Education, infrastructure, research
•  Archive African genomics data
  
H3Africa data being collected (Phase I)
•  Phenotype data (associated with genotype data)
–  Demographic information
–  Anthropometric data
–  Disease and health related phenotype data
•  Genetic variation data (human and pathogen)
–  Sequence data (whole genome, exome, targeted)
•  Genotyping chip array data
–  ~55,000 samples to be run on an H3Africa African custom chip
•  Microbiome sequence data
–  Patient/sample phenotypes
–  Non-human 16S rRNA sequence data for microbiome
–  Non-human full genome sequence data for microbiome
–  Possible human sequence contamination
•  Biospecimens to be deposited at the H3Africa biorepositories
Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  
Lack of repository for African Genomics data
•  1,759 datasets with the query “African” – none in Africa
https://discover.repositive.io/
  
H3Africa Data Archive
•  Assist H3Africa projects as data coordination center:
–  Transfer
–  Validate
–  Store
–  Submit to EGA
–  Obtain EGA accessions for publications
•  0.5 petabytes storage size including offsite replication
H3Africa Catalogue
•  Online catalogue with meta-data to search and apply for datasets and biospecimens (under development)
  
Human genetic data privacy
•  H3Africa is a rich source of meta-data (phenotypes):
(1) Age
(2) Sex
(3) Country of birth
(4) Current residence
(5) Native language
(6) Ethno-linguistic/tribal affiliation
(7) Country of birth of father and mother
(8) Native language of father and mother
(9) Ethno-linguistic/tribal affiliation of mother and father
(10) Height
(11) Weight
(12) Current medications
(13) Smoking history
(14) Alcohol history
•  The combination of phenotype and genetic data makes it possible to identify different populations and individuals, hence restricted access
Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  
Sharing of research data and outputs
•  Funders’ data sharing policies
“The Wellcome Trust is committed to ensuring that the outputs of the research it funds, including research data, are managed and used in ways that maximise public benefit. Making research data widely available to the research community in a timely and responsible manner ensures that these data can be verified, built upon and used to advance knowledge and its application to generate improvements in health.”
https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-sharing
“The National Institutes of Health (NIH) Genomic Data Sharing Policy expects that genomic research data from NIH-supported studies involving human specimens as well as non-human and model organisms will be submitted to an NIH-designated data repository. The list below provides examples of relevant databases.”
https://gds.nih.gov/02dr2.html
  
Limits to sharing human genetic data
•  Ethics:
–  Digital data (genomes) can be stored indefinitely, biobank specimens can be stored for up to 20 years – secondary use
–  Rapid innovation with 'omics technologies
•  H3Africa: “Seven projects used broad consent, five projects used tiered consent and one used specific consent” §
•  History of vulnerable populations, low education levels and exploitation
•  Blood sample collection and visits to clinics associated with disease and treatment – even if a healthy control
•  “All but one of the consent forms that we reviewed included a statement about data sharing.” §
§ Munung NS, Marshall P, Campbell M, et al. Obtaining informed consent for genomics research in Africa: analysis of H3Africa consent documents. Journal of Medical Ethics 2016;42:132-137.
Ethical considerations: informed consent, participant identification, stigmatisation, benefit sharing
  
Limits to sharing human genetic data
•  Non-harmonized national / regional laws and policies for ethics and genome data sharing within Africa
Image credits: https://en.wikipedia.org/wiki/African_Economic_Community
  
H3Africa data sharing and access policy
•  Balance between ensuring adequate safeguards to protect participants and not being a barrier for scientists to advance research:
-  Maximizing the availability of research data, in a timely and responsible manner.
-  Protecting the rights and privacy of human subjects who participated in research studies.
-  Recognizing the scientific contribution of researchers who generated the data.
-  Considering the nature and ethics of the research proposed in establishing the timely release of data, and mechanisms of data sharing.
-  Promoting deposition of genomic data in existing community data repositories whenever possible.
http://h3africa.org/images/DataSARWG_folders/FinalDocsDSAR/H3Africa%20Consortium%20Data%20Access%20%20Release%20Policy%20Aug%202014.pdf
  
Challenges in sharing data – metadata standards
•  Meta-data (phenotype data) is collected via case report forms (CRFs)

Project 1 CRF    Project 2 CRF    Project 3 CRF
Female           Woman            1
Daily units      Weekly units     User defined time period

•  Same question – data coded in different ways
•  Similar measure – collected in different ways
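One way to picture the harmonization problem is a small mapping layer that translates each project's coding into a shared vocabulary. The sketch below is illustrative only: the project keys, the `SEX_CODES` table and the `harmonize_sex` and `to_weekly_units` helpers are hypothetical names, and it assumes that Project 3's code `1` means female, which the slide implies but does not state.

```python
# Hypothetical per-project codings for the same question (sex of participant).
SEX_CODES = {
    "project1": {"Female": "female"},
    "project2": {"Woman": "female"},
    "project3": {"1": "female"},   # assumption: 1 encodes female
}

def harmonize_sex(project: str, raw_value) -> str:
    """Translate a project-specific value into the shared term."""
    codes = SEX_CODES[project]
    key = str(raw_value).strip()
    if key not in codes:
        raise ValueError(f"Unmapped value {raw_value!r} for {project}")
    return codes[key]

def to_weekly_units(amount: float, period_days: float) -> float:
    """Convert an amount reported over period_days into units per week."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return amount * 7 / period_days

print(harmonize_sex("project1", "Female"))
print(harmonize_sex("project3", 1))
print(to_weekly_units(2, 1))    # 2 units per day
print(to_weekly_units(30, 30))  # 30 units over a user-defined 30-day period
```

Raising an error on unmapped values, rather than passing them through, is a deliberate choice in this sketch so that new codings are noticed and mapped explicitly.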
  
Use established standards - Ontologies
•  “An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them.”*
*http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
  
Options to aid data sharing
•  Make data Findable, Accessible, Interoperable and Reusable (FAIR compliant)
•  Beacons answer one question: do you see a genetic variant at a specific position within your dataset? Yes / No, as is the case for the South African Human Genome Program (SAHGP)
Global Alliance for Genomics and Health: http://ga4gh.org/#/beacon
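The yes/no idea behind a beacon can be sketched in a few lines. This is not the GA4GH Beacon API. It is a toy in-memory lookup, and the `Variant` and `ToyBeacon` names, the method name and the example variants are all hypothetical. The point is that a query reveals only whether a matching allele exists, not which participants carry it.

```python
from typing import Iterable, NamedTuple

class Variant(NamedTuple):
    chromosome: str
    position: int      # 1-based coordinate (an assumption of this sketch)
    alternate: str     # alternate allele observed at that position

class ToyBeacon:
    """Answers only whether a variant is present in the dataset."""

    def __init__(self, variants: Iterable[Variant]) -> None:
        # Store variants without any link to samples or phenotypes.
        self._variants = frozenset(variants)

    def has_variant(self, chromosome: str, position: int, alternate: str) -> bool:
        return Variant(chromosome, position, alternate.upper()) in self._variants

# Made-up example data, not real SAHGP variants.
beacon = ToyBeacon([Variant("1", 12345, "T"), Variant("X", 67890, "G")])
print("Yes" if beacon.has_variant("1", 12345, "t") else "No")
print("Yes" if beacon.has_variant("2", 12345, "T") else "No")
```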
  
H3Africa genotyping chip
•  Current genotyping technologies are designed for European populations
•  African populations are under-represented, although they have the most diversity
Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  
Designing the H3Africa genotyping chip
•  Collaboration between H3ABioNet and the National Center for Supercomputing Applications (NCSA, US based) via a US partner at the University of Illinois
•  Utilized the Blue Waters supercomputer facilities and CHPC facilities
–  212,000 node computing hours used at Blue Waters
–  600 TB of storage needed
•  The chip has undergone assessment and is in use, with positive results
https://twitter.com/billgates/status/800800954790465536?lang=en
Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  	
  
Connectivity for data transfers

GO endpoints              Transfer speeds in Mbps (min, max)
Baylor <-> Blue Waters    340, 1900
Blue Waters -> UCT        204, 322
CHPC <-> Blue Waters      81, 243
UCT <-> CHPC              34, 406
Sanger <-> UCT            38, 76

GO source and destination   Files to transfer and size per sample   Total size of transfer for 350 samples   Min transfer speed   Time to transfer
Baylor to Blue Waters       Baylor FASTQ.gzs / 100GB                75TB                                     340Mbps              21 days
Blue Waters to UCT          Baylor FASTQ.gzs / 100GB                75TB                                     200Mbps              35 days
Blue Waters to UCT          BW BAMs / 100GB                         40TB                                     200Mbps              19 days
UCT to CHPC                 BW BAMs / 100GB                         40TB                                     34Mbps               109 days
CHPC to UCT                 Union set / VCFs                        1TB                                      34Mbps               3 days
UCT to Sanger               Union set / VCFs                        1TB                                      34Mbps               3 days

Globus Online installed at nodes
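The transfer times in the connectivity table can be reproduced with a short calculation. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), a sustained rate equal to the minimum transfer speed, and rounding up to whole days. `transfer_days` is an illustrative helper, not part of Globus.

```python
import math

def transfer_days(size_tb: float, speed_mbps: float) -> int:
    """Whole days needed to move size_tb terabytes at speed_mbps megabits per second."""
    bits = size_tb * 1e12 * 8
    seconds = bits / (speed_mbps * 1e6)
    return math.ceil(seconds / 86_400)

# (route, total size in TB, minimum speed in Mbps) as listed in the table.
routes = [
    ("Baylor to Blue Waters", 75, 340),
    ("Blue Waters to UCT (FASTQ)", 75, 200),
    ("Blue Waters to UCT (BAM)", 40, 200),
    ("UCT to CHPC", 40, 34),
    ("CHPC to UCT", 1, 34),
]

for route, size_tb, mbps in routes:
    print(f"{route}: {transfer_days(size_tb, mbps)} days")
```

Under these assumptions the helper returns 21, 35, 19, 109 and 3 days, matching the table, which illustrates how strongly the slowest link dominates the overall schedule.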
  
Challenge of unequal infrastructures
•  Diverse levels of expertise and infrastructure between different countries
www.project-redcap.org/map_fullscreen.php
•  Software and hardware sanctions exacerbate existing inequalities, e.g. Sudan Node
http://mgafrica.com/article/2015-01-14-17-startling-facts-about-the-state-of-science-and-research-in-africa
  
Bioinformatics education
Aim:
•  Basic bioinformatics training for interested H3Africa members (bioinformatics users – Introduction to Bioinformatics Training)
•  Web-based bioinformatics tools and resources and how to use them
Course logistics:
•  3 months, 2 days contact time per week (3 hours per session)
•  Distance learning model – physical classrooms connected to virtual classroom
•  Mconf – video conferencing
•  Vula – course management
  
IBT_2017 classroom sites
•  27 in total (vs. 20 classrooms in 2016)
•  Countries that have joined IBT in 2017: Ethiopia, Burkina Faso
•  Some participants from the first course are going to be TAs
•  Over 580 enrolled participants and over 130 volunteer staff
•  Paper published on course design
Map: IBT 2017 Classrooms (legend: virtual classroom; classroom site 2016; new classroom site 2017; classroom site 2016 and 2017)
Conclusion
•  Bioinformatics = big data and needs computational power, storage, and fast read and write for processing
•  Well-defined meta-data standards are vital for interoperability and sharing of data
•  Cyber infrastructure for moving and sharing large datasets is needed to foster open data and open science
•  Education and skills development are essential for African citizens to take advantage of the data revolution
•  Perceptions and attitudes – no amount of infrastructure will drive open data and open science if the sentiment is absent
  
Acknowledgements
•  Prof Nicky Mulder and H3ABioNet members
•  Ina Smith and the Academy of Science of South Africa
•  BioVisionAlexandria 2018 organizers
H3ABioNet Consortium Members 2017
  
Conclusions
Provide a data archiving solution for H3Africa projects to ensure that a local copy of the data remains on the continent
  
Communication – H3Africa
Image credit: https://commons.wikimedia.org/wiki/File:UTC_hue4map_X_world_Robinson.png
•  H3Africa working groups meet every fortnight
•  Regular meetings are challenging due to diversity of timezones (most funders in the US) and daylight saving hours
  
Communication – H3Africa
•  H3Africa funders and project members meet face to face every six months to provide reports, and for working groups to wrap up deliverables
  
Communication – H3ABioNet
•  Within H3ABioNet the nodes are located in Africa, so time differences are not a hindrance
•  Working groups meet once a month and the network meets annually for SAB review and network business
•  Only some countries have toll-free access to a booked conference call; calls are costly
•  Challenges: communication platforms
http://mconf.org/
  
Biomedical science becoming “data rich”
OECD – WDS Workshop, Brussels 2017
https://www.nlm.nih.gov/about/2017CJ_NLM.pdf
Mapping internet connectivity
  
Bioinformatics SOPs - Reproducible
•  Developed SOPs and practice datasets for:
–  NGS Variant calling
–  Genome Wide Association Studies (GWAS)
–  16S rRNA diversity analysis
•  SOPs and practice datasets under development:
–  RNA Seq
–  Variant prioritization and annotation
•  Guidelines on compute and storage
  
Archive dashboard
Ontologies work
•  Adapting OMIABIS ontology to H3Africa data
•  Mapping CRFs to ontologies, e.g. phenotype or disease ontology
•  Mapping genomics data to Experimental Factor ontology
•  Developing Sickle Cell Disease Ontology
Beacons in Africa
https://beacon-network.org//#/directory
•  First Beacon in Africa “lit” in October 2016 for the SAHGP

Weitere ähnliche Inhalte

Was ist angesagt?

NCI Support for Cancer Data Sharing
NCI Support for Cancer Data SharingNCI Support for Cancer Data Sharing
NCI Support for Cancer Data SharingWarren Kibbe
 
Lombardi comprehensive cancer center Georgetown
Lombardi comprehensive cancer center GeorgetownLombardi comprehensive cancer center Georgetown
Lombardi comprehensive cancer center GeorgetownEd Dodds
 
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...Warren Kibbe
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR Warren Kibbe
 
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveChris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveGigaScience, BGI Hong Kong
 
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...California Ocean Science Trust " Building a Sustainable Knowledge Base for ...
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...Tom Moritz
 
NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016Warren Kibbe
 
Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016Warren Kibbe
 
Developing a national strategy to bring pathogen genomics into practice
Developing a national strategy to bring pathogen genomics into practiceDeveloping a national strategy to bring pathogen genomics into practice
Developing a national strategy to bring pathogen genomics into practiceExternalEvents
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsWarren Kibbe
 
NLBIF_NIOO_2017v3
NLBIF_NIOO_2017v3NLBIF_NIOO_2017v3
NLBIF_NIOO_2017v3Jan Kuiper
 
Health Policy and Management as it Relates to Big Data
Health Policy and Management as it Relates to Big DataHealth Policy and Management as it Relates to Big Data
Health Policy and Management as it Relates to Big DataPhilip Bourne
 
Kibbe One Voice Against Cancer 20170605
Kibbe One Voice Against Cancer 20170605Kibbe One Voice Against Cancer 20170605
Kibbe One Voice Against Cancer 20170605Warren Kibbe
 
Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017Warren Kibbe
 
Towards a Platform for Global Health
Towards a Platform for Global HealthTowards a Platform for Global Health
Towards a Platform for Global HealthPhilip Bourne
 
The Global Biodiversity Information Facility and Africa Rising
The Global Biodiversity Information Facility and Africa RisingThe Global Biodiversity Information Facility and Africa Rising
The Global Biodiversity Information Facility and Africa RisingFatima Parker-Allie
 
Research integrity and data management
Research integrity and data managementResearch integrity and data management
Research integrity and data managementARDC
 
Research Ethics and Use of Restricted Access Data
Research Ethics and Use of Restricted Access DataResearch Ethics and Use of Restricted Access Data
Research Ethics and Use of Restricted Access Datalibbiestephenson
 
The Biodiversity Informatics Landscape
The Biodiversity Informatics LandscapeThe Biodiversity Informatics Landscape
The Biodiversity Informatics LandscapeVince Smith
 

Was ist angesagt? (20)

NCI Support for Cancer Data Sharing
NCI Support for Cancer Data SharingNCI Support for Cancer Data Sharing
NCI Support for Cancer Data Sharing
 
Lombardi comprehensive cancer center Georgetown
Lombardi comprehensive cancer center GeorgetownLombardi comprehensive cancer center Georgetown
Lombardi comprehensive cancer center Georgetown
 
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com...
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR
 
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveChris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
 
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...California Ocean Science Trust " Building a Sustainable Knowledge Base for ...
California Ocean Science Trust " Building a Sustainable Knowledge Base for ...
 
NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016NCI Cancer Genomic Data Commons for NCAB September 2016
NCI Cancer Genomic Data Commons for NCAB September 2016
 
Day 2_Global Health Workshop_Wimbush
Day 2_Global Health Workshop_WimbushDay 2_Global Health Workshop_Wimbush
Day 2_Global Health Workshop_Wimbush
 
Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016Nci clinical genomics data sharing ncra sept 2016
Nci clinical genomics data sharing ncra sept 2016
 
Developing a national strategy to bring pathogen genomics into practice
Developing a national strategy to bring pathogen genomics into practiceDeveloping a national strategy to bring pathogen genomics into practice
Developing a national strategy to bring pathogen genomics into practice
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data Commons
 
NLBIF_NIOO_2017v3
NLBIF_NIOO_2017v3NLBIF_NIOO_2017v3
NLBIF_NIOO_2017v3
 
Health Policy and Management as it Relates to Big Data
Health Policy and Management as it Relates to Big DataHealth Policy and Management as it Relates to Big Data
Health Policy and Management as it Relates to Big Data
 
Kibbe One Voice Against Cancer 20170605
Kibbe One Voice Against Cancer 20170605Kibbe One Voice Against Cancer 20170605
Kibbe One Voice Against Cancer 20170605
 
Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017Precision oncology ncabr kibbe oct 2017
Precision oncology ncabr kibbe oct 2017
 
Towards a Platform for Global Health
Towards a Platform for Global HealthTowards a Platform for Global Health
Towards a Platform for Global Health
 
The Global Biodiversity Information Facility and Africa Rising
The Global Biodiversity Information Facility and Africa RisingThe Global Biodiversity Information Facility and Africa Rising
The Global Biodiversity Information Facility and Africa Rising
 
Research integrity and data management
Research integrity and data managementResearch integrity and data management
Research integrity and data management
 
Research Ethics and Use of Restricted Access Data
Research Ethics and Use of Restricted Access DataResearch Ethics and Use of Restricted Access Data
Research Ethics and Use of Restricted Access Data
 
The Biodiversity Informatics Landscape
The Biodiversity Informatics LandscapeThe Biodiversity Informatics Landscape
The Biodiversity Informatics Landscape
 

Open Data in Bioinformatics and Required Infrastructure towards achieving the SDGs/Samar Kassim

  • 1. Open Data in Bioinformatics and Required Infrastructure towards achieving the SDGs
    www.h3abionet.org
    9th BioVisionAlexandria Conference, Alexandria, Egypt, 2018
    Prof. Samar Kassim
    samar_kassim@med.asu.edu.eg
  • 2. Introduction
    •  A major technological advance in molecular biology is the sophistication, diversity, scale and decreasing cost of the data being generated, i.e. by high-throughput platforms
    •  First human genome sequence:
       –  Throughput: 2.8 million bases per 24 hours on AB3730xl sequencers
       –  13 years to sequence 3 billion bases at 10x coverage
       –  Cost: ~500 million USD (lower-bound estimate)
    •  Next (now) generation sequencing:
       –  Throughput: 1 million bases per second
       –  ~10 hours to sequence 3 billion bases at 10x coverage
       –  Cost: ~4,000 USD per genome
    https://www.genome.gov/sequencingcosts/
    http://en.wikipedia.org/wiki/File:Historic_cost_of_sequencing_a_human_genome.svg (author: Ben Moore)
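The throughput and cost figures on the introduction slide can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the constants are the numbers quoted on the slide, while the function and constant names are invented here and do not come from the presentation.

```python
# Figures quoted on the slide; all names below are illustrative assumptions.
GENOME_BASES = 3_000_000_000         # 3 billion bases
COVERAGE = 10                        # 10x coverage
SANGER_BASES_PER_DAY = 2_800_000     # AB3730xl throughput per 24 hours
NGS_BASES_PER_SECOND = 1_000_000     # next-generation throughput
FIRST_GENOME_COST_USD = 500_000_000  # lower-bound estimate
NGS_GENOME_COST_USD = 4_000


def total_bases(genome_bases: int = GENOME_BASES, coverage: int = COVERAGE) -> int:
    """Number of bases that must be read to cover the genome `coverage` times."""
    return genome_bases * coverage


def ngs_hours() -> float:
    """Hours needed at the quoted next-generation throughput."""
    return total_bases() / NGS_BASES_PER_SECOND / 3600


def sanger_instrument_days() -> float:
    """Days a single instrument would need at the quoted AB3730xl throughput."""
    return total_bases() / SANGER_BASES_PER_DAY


def cost_ratio() -> float:
    """How many times cheaper a genome is at the quoted next-generation cost."""
    return FIRST_GENOME_COST_USD / NGS_GENOME_COST_USD


if __name__ == "__main__":
    print(f"NGS run time: {ngs_hours():.1f} hours")
    print(f"Single-instrument Sanger time: {sanger_instrument_days():.0f} days")
    print(f"Cost ratio: {cost_ratio():.0f}x")
```

With these inputs the next-generation run works out to roughly 8.3 hours, in line with the "~10 hours" on the slide, and the cost falls by a factor of 125,000. The single-instrument figure of about 10,700 days is longer than 13 years, so the 13-year figure presumably reflects more than one instrument; that reading is an inference and is not stated on the slide.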
  • 3. Data driven biological science - bioinformatics
    •  Decreasing data generation costs shifted the biological sciences to a data-driven science, with bioinformatics playing a major role
    Stephens ZD, Lee SY, Faghri F, Campbell RH, Zhai C, et al. (2015) Big Data: Astronomical or Genomical? PLOS Biology 13(7): e1002195.
    https://doi.org/10.1371/journal.pbio.1002195
    http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002195
  • 4. Genomics and Africa - H3Africa
    •  “The Human Heredity and Health in Africa (H3Africa) Initiative aims to facilitate a contemporary research approach to the study of genomics and environmental determinants of common diseases with the goal of improving the health of African populations.” (http://h3africa.org/)
    •  “The vision of H3Africa is to create and support a pan-continental network of laboratories that will be equipped to apply leading-edge research to the study of the complex interplay between environmental and genetic factors which determines disease susceptibility and drug responses in African populations.” (http://h3africa.org/about/vision)
  • 5. H3Africa Phase I overview
    •  25 research projects in Africa
    •  > 500 investigators
    •  Covers 27 African countries
    •  Up to 75,000 research participants
    •  > USD 76 million invested in Phase I
    The H3Africa Consortium (diagram): 8 Collaborative Centers, 7 Research Projects, 3 Biorepositories, 6 Ethics Grants, Bioinformatics Network
    http://h3africa.org/consortium/projects
  • 6. H3Africa Bioinformatics Network (H3ABioNet)
    •  Pan-African Bioinformatics Network to develop bioinformatics capacity in Africa and support the H3Africa research projects
    •  28 nodes in 17 African countries
    •  PI: Prof. Nicky Mulder, CBIO-UCT
    •  Education, infrastructure, research
    •  Archive African genomics data
  • 7. H3Africa data being collected (Phase I)
    •  Phenotype data (associated with genotype data)
       –  Demographic information
       –  Anthropometric data
       –  Disease and health related phenotype data
    •  Genetic variation data, human and pathogen
       –  Sequence data (whole genome, exome, targeted)
    •  Genotyping chip array data
       –  ~55,000 samples to be run on an H3Africa African custom chip
    •  Microbiome sequence data
       –  Patient/sample phenotypes
       –  Non-human 16S rRNA sequence data for microbiome
       –  Non-human full genome sequence data for microbiome
       –  Possible human sequence contamination
    •  Biospecimens to be deposited at the H3Africa biorepositories
    Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  • 8. Lack of repository for African Genomics data
    •  1,759 datasets with the query “African”, none in Africa
    https://discover.repositive.io/
  • 9. H3Africa Data Archive • Assist H3Africa projects as data coordination center: Transfer → Validate → Store → Submit to EGA → Obtain EGA accessions for publications • 0.5 petabytes storage size including offsite replication
  • 10. H3Africa Catalogue • Online catalogue with meta-data to search and apply for datasets and biospecimens (under development)
  • 11. Human genetic data privacy • H3Africa is a rich source of meta-data (phenotypes): (1) Age & (2) Sex (3) Country of birth (4) Current residence (5) Native language (6) Ethno-linguistic/tribal affiliation (7) Country of birth of father and mother (8) Native language of father and mother (9) Ethno-linguistic/tribal affiliation of mother and father (10) Height (11) Weight (12) Current medications (13) Smoking history (14) Alcohol history • Combination of phenotype and genetic data makes it possible to identify different populations and individuals – restricted access • Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
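The re-identification concern on this slide can be illustrated with a small sketch: count how many participants share each combination of metadata fields. A combination held by a single participant is unique and therefore potentially identifying. The records and field names below are invented for illustration and are not H3Africa data; this is one simple way to look at the risk, not the consortium's own method.

```python
from collections import Counter

# Hypothetical participant records, invented for illustration only.
records = [
    {"age": 34, "sex": "F", "country_of_birth": "Ghana", "native_language": "Twi"},
    {"age": 34, "sex": "F", "country_of_birth": "Ghana", "native_language": "Twi"},
    {"age": 34, "sex": "F", "country_of_birth": "Ghana", "native_language": "Ewe"},
    {"age": 51, "sex": "M", "country_of_birth": "Kenya", "native_language": "Luo"},
    {"age": 47, "sex": "M", "country_of_birth": "Kenya", "native_language": "Kikuyu"},
]


def smallest_group(rows, fields):
    """Size of the smallest group of rows sharing the same values for `fields`."""
    groups = Counter(tuple(row[field] for field in fields) for row in rows)
    return min(groups.values())


# With one coarse field, every participant is hidden in a group of at least 2.
k_coarse = smallest_group(records, ["sex"])

# With all four fields combined, at least one participant is unique.
k_fine = smallest_group(
    records, ["age", "sex", "country_of_birth", "native_language"]
)

print(k_coarse, k_fine)
```

The more fields are released together, the smaller the groups become, which is one reason access to the combined phenotype and genetic data is restricted.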
  • 12. Sharing of research data and outputs • Funders’ data sharing policies • “The Wellcome Trust is committed to ensuring that the outputs of the research it funds, including research data, are managed and used in ways that maximise public benefit. Making research data widely available to the research community in a timely and responsible manner ensures that these data can be verified, built upon and used to advance knowledge and its application to generate improvements in health.” https://wellcome.ac.uk/funding/managing-grant/policy-data-management-and-sharing • “The National Institutes of Health (NIH) Genomic Data Sharing Policy expects that genomic research data from NIH-supported studies involving human specimens as well as non-human and model organisms will be submitted to an NIH-designated data repository. The list below provides examples of relevant databases.” https://gds.nih.gov/02dr2.html
  • 13. Limits to sharing human genetic data • Ethics: – Digital data (genomes) can be stored indefinitely, biobank specimens can be stored for up to 20 years – secondary use – Rapid innovation with ‘omics technologies • H3Africa: “Seven projects used broad consent, five projects used tiered consent and one used specific consent”§ • History of vulnerable populations, low education levels and exploitation • Blood sample collection and visits to clinics associated with disease and treatment – even if a healthy control • “All but one of the consent forms that we reviewed included a statement about data sharing.”§ • Ethical considerations: informed consent; participant identification; stigmatisation; benefit sharing • § Munung NS, Marshall P, Campbell M, et al. Obtaining informed consent for genomics research in Africa: analysis of H3Africa consent documents. Journal of Medical Ethics 2016;42:132-137.
  • 14. Limits to sharing human genetic data • Non-harmonized national / regional laws and policies for ethics and genome data sharing within Africa • Image credits: https://en.wikipedia.org/wiki/African_Economic_Community
  • 15. H3Africa data sharing and access policy • Balance between ensuring adequate safeguards to protect participants while not being a barrier for scientists to advance research: – Maximizing the availability of research data, in a timely and responsible manner. – Protecting the rights and privacy of human subjects who participated in research studies. – Recognizing the scientific contribution of researchers who generated the data. – Considering the nature and ethics of the research proposed in establishing the timely release of data, and mechanisms of data sharing. – Promoting deposition of genomic data in existing community data repositories whenever possible • http://h3africa.org/images/DataSARWG_folders/FinalDocsDSAR/H3Africa%20Consortium%20Data%20Access%20%20Release%20Policy%20Aug%202014.pdf
  • 16. Challenges in sharing data – metadata standards • Meta-data (phenotype data) is collected via case report forms (CRFs) • Project 1 CRF | Project 2 CRF | Project 3 CRF – Female | Woman | 1 – Daily units | Weekly units | User defined time period • Same question – data coded in different ways • Similar measure – collected in different ways
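The coding mismatch on this slide can be sketched in a few lines: each project's answers are translated to one shared vocabulary, and each consumption measure is rescaled to one common period. The project names, mapping tables and the choice of "units per week" as the common measure are assumptions made for illustration, not the H3Africa harmonization rules.

```python
# Hypothetical per-project code tables, taken from the slide's example row
# (Female / Woman / 1). A real project would map to an agreed ontology term.
SEX_CODES = {
    "project1": {"Female": "female"},
    "project2": {"Woman": "female"},
    "project3": {1: "female"},
}


def harmonize_sex(project, value):
    """Translate a project-specific code into the shared vocabulary."""
    try:
        return SEX_CODES[project][value]
    except KeyError:
        # Refuse to guess: unknown codes must be resolved by a curator.
        raise ValueError(f"No mapping for {value!r} in {project}") from None


def units_per_week(units, period_days):
    """Rescale units reported over `period_days` days to units per week."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return units * 7 / period_days


# Same question, three codings, one harmonized answer.
print(harmonize_sex("project1", "Female"))
print(harmonize_sex("project2", "Woman"))
print(harmonize_sex("project3", 1))

# Similar measure, three reporting periods, one comparable value.
print(units_per_week(2, 1))    # daily units
print(units_per_week(14, 7))   # weekly units
print(units_per_week(60, 30))  # user defined period of 30 days
```

Keeping the tables per project avoids assuming that a bare code such as 1 means the same thing in every case report form.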
  • 17. Use established standards - Ontologies • “An ontology defines a common vocabulary for researchers who need to share information in a domain. It includes machine-interpretable definitions of basic concepts in the domain and relations among them.”* • *http://protege.stanford.edu/publications/ontology_development/ontology101-noy-mcguinness.html
  • 18. Options to aid data sharing • Make data Findable, Accessible, Interoperable and Reusable (FAIR compliant) • Beacon: “Do you see a genetic variant at a specific position within your dataset?” – Yes / No, as in the case of the South African Human Genome Program (SAHGP) • Global Alliance for Genomics and Health: http://ga4gh.org/#/beacon
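The yes / no idea behind a beacon can be sketched as a lookup that reveals only whether a variant is present, never which samples carry it. This is a local illustration with an invented variant set and a hypothetical function name; it is not the GA4GH Beacon API or the SAHGP implementation.

```python
# Hypothetical in-memory variant set: (chromosome, position, alternate allele).
# The entries are invented for illustration.
VARIANTS = frozenset({
    ("1", 123456, "A"),
    ("7", 987654, "T"),
})


def beacon_query(chromosome, position, alt_allele, variants=VARIANTS):
    """Return True if the variant is in the dataset, False otherwise.

    Only presence or absence is disclosed; no genotypes, counts or
    sample identifiers leave the dataset.
    """
    return (chromosome, position, alt_allele) in variants


print(beacon_query("1", 123456, "A"))
print(beacon_query("1", 123456, "G"))
```

Because the answer is a single boolean, a dataset can be made findable without releasing the restricted individual-level data.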
  • 19. H3Africa genotyping chip • Current genotyping technologies are designed for European populations • African populations are under-represented, although they have the most diversity • Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  • 20. Designing the H3Africa genotyping chip • Collaboration between H3ABioNet and the National Center for Supercomputing Applications (NCSA, US based) via US partner at University of Illinois • Utilized the Blue Waters supercomputer facilities and CHPC facilities – 212,000 node computing hours used at Blue Waters – 600 TB of storage needed • Chip has undergone assessment and is in use with positive results • https://twitter.com/billgates/status/800800954790465536?lang=en • Image credits: National Human Genome Research Institute (https://www.genome.gov/imagegallery/)
  • 21. Connectivity for data transfers • Globus Online installed at Nodes
    Transfer speeds between Globus Online (GO) endpoints, in Mbps (min, max):
    Baylor <-> Blue Waters: 340, 1900
    Blue Waters -> UCT: 204, 322
    CHPC <-> Blue Waters: 81, 243
    UCT <-> CHPC: 34, 406
    Sanger <-> UCT: 38, 76
    Transfers for 350 samples (GO source and destination; files to transfer and size per sample; total size of transfer for 350 samples; min transfer speed; time to transfer):
    Baylor to Blue Waters; Baylor FASTQ.gzs / 100GB; 75TB; 340Mbps; 21 days
    Blue Waters to UCT; Baylor FASTQ.gzs / 100GB; 75TB; 200Mbps; 35 days
    Blue Waters to UCT; BW BAMs / 100GB; 40TB; 200Mbps; 19 days
    UCT to CHPC; BW BAMs / 100GB; 40TB; 34Mbps; 109 days
    CHPC to UCT; Union set / VCFs; 1TB; 34Mbps; 3 days
    UCT to Sanger; Union set / VCFs; 1TB; 34Mbps; 3 days
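The transfer times on this slide follow from dividing the total size by the minimum sustained speed. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and rounds up to whole days; those are assumptions of this example, although they do reproduce the figures in the table.

```python
import math

SECONDS_PER_DAY = 86_400


def transfer_days(size_tb, speed_mbps):
    """Days needed to move `size_tb` terabytes at a sustained `speed_mbps`.

    Assumes decimal units: 1 TB = 10**12 bytes, 1 Mbps = 10**6 bits/second.
    """
    total_bits = size_tb * 10**12 * 8
    seconds = total_bits / (speed_mbps * 10**6)
    return seconds / SECONDS_PER_DAY


def transfer_days_rounded(size_tb, speed_mbps):
    """Same estimate, rounded up to whole days as a planning figure."""
    return math.ceil(transfer_days(size_tb, speed_mbps))


# (route, total size in TB, minimum speed in Mbps), copied from the slide.
routes = [
    ("Baylor to Blue Waters", 75, 340),
    ("Blue Waters to UCT (FASTQ)", 75, 200),
    ("Blue Waters to UCT (BAM)", 40, 200),
    ("UCT to CHPC", 40, 34),
    ("CHPC to UCT", 1, 34),
]

for route, size_tb, mbps in routes:
    print(f"{route}: {transfer_days_rounded(size_tb, mbps)} days")
```

The slowest link dominates: the same 40 TB that crosses from Blue Waters to UCT in under three weeks needs more than three months between UCT and CHPC at 34 Mbps.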
  • 22. Challenge of unequal infrastructures • Diverse levels of expertise and infrastructure between different countries • www.project-redcap.org/map_fullscreen.php • Software and hardware sanctions exacerbate existing inequalities, e.g. Sudan Node • http://mgafrica.com/article/2015-01-14-17-startling-facts-about-the-state-of-science-and-research-in-africa
  • 23. Bioinformatics education • Aim: – Basic bioinformatics training for interested H3Africa members (bioinformatics users – Introduction to Bioinformatics Training) – Web-based bioinformatics tools and resources and how to use them • Course logistics: – 3 months, 2 days contact time per week (3 hours per session) – Distance learning model – physical classrooms connected to virtual classroom – Mconf – video conferencing – Vula – course management
  • 24. IBT 2017 Classrooms • IBT_2017 classroom sites: 27 in total (vs. 20 classrooms in 2016) • Countries that have joined IBT in 2017: Ethiopia, Burkina Faso • Some participants from first course are going to be TAs • Over 580 enrolled participants and over 130 volunteer staff • Paper published on course design • Map legend: virtual classroom; classroom site 2016; new classroom site 2017; classroom site 2016 and 2017
  • 25. Conclusion • Bioinformatics = big data and needs computational power, storage, fast read and write for processing • Well defined meta-data standards are vital for interoperability and sharing of data • Cyber infrastructure for moving and sharing large datasets is needed to foster open data and open science • Education and skills development essential for African citizens to take advantage of the data revolution • Perceptions and attitudes – no amount of infrastructure will drive Open data and Open science if the sentiment is absent
  • 26. Acknowledgements • Prof Nicky Mulder and H3ABioNet members • Ina Smith and the Academy of Science of South Africa • BioVisionAlexandria 2018 organizers • H3ABioNet Consortium Members 2017
  • 27. Conclusions • Provide a data archiving solution for H3Africa projects to ensure that a local copy of the data remains on the continent
  • 28. Communication – H3Africa • H3Africa working groups meet every fortnight • Regular meetings are challenging due to diversity of timezones (most funders in the US) and daylight saving hours • Image credit: https://commons.wikimedia.org/wiki/File:UTC_hue4map_X_world_Robinson.png
  • 29. Communication – H3Africa • H3Africa funders and project members meet face to face every six months to provide reports and for working groups to also wrap up deliverables
  • 30. Communication – H3ABioNet • Within H3ABioNet the nodes are located in Africa so time differences are not a hindrance • Working groups meet once a month and network meets annually for SAB review and network business • Only some countries have toll free access to a booked conference call, costly • Challenges: communication platforms • http://mconf.org/
  • 31. Biomedical science becoming “data rich” • OECD – WDS Workshop, Brussels 2017 • https://www.nlm.nih.gov/about/2017CJ_NLM.pdf
  • 32. Mapping internet connectivity
  • 33. Bioinformatics SOPs - Reproducible • Developed SOPs and practice datasets for: – NGS Variant calling – Genome Wide Association Studies (GWAS) – 16S rRNA diversity analysis • SOPs and practice datasets under development: – RNA Seq – Variant prioritization and annotation • Guidelines on compute and storage
  • 34. Archive dashboard
  • 35. Ontologies work • Adapting OMIABIS ontology to H3Africa data • Mapping CRFs to ontologies, e.g. phenotype or disease ontology • Mapping genomics data to Experimental Factor ontology • Developing Sickle Cell Disease Ontology
  • 36. Beacons in Africa • First Beacon in Africa “lit” in October 2016 for the SAHGP • https://beacon-network.org//#/directory