Peter WeiDer Li
1027 Wicklow Ln SW, Rochester, MN 55902
Phone: 301-335-8688 | Fax: 507-292-0833 | E-Mail: jenli.peter@gmail.com
Director | Informatician | Architect | Innovator
Experienced informatician/architect, with over 20 years of informatics, software, and scientific
expertise, seeks to improve clinical care and research through creative informatics approaches
that are robust, scalable, and accurate.
➢ Conceived, architected, and implemented a full software stack from secure cloud
deployment to intuitive UI for surgical clinical decision support.
➢ Conceived and developed massively threaded graph-based solutions for genome assembly
and “semantic web”-based solutions for clinical applications on a supercomputer.
➢ Conceived and developed a highly extensible and evolvable patient and sample database
architecture for oncology and epidemiology.
➢ Developed grid-based genome and network analysis tools for oncology and vaccinology.
➢ Directed 60+ bioinformaticians to analyze and annotate the first genomes of fruit fly, human,
mouse, and rat.
➢ Directed 10+ engineers to develop a novel 3-tier, distributed genome mapping database
(GDB) and a novel XML-based medical genetics database (OMIM).
A very broad background enables me to: a) communicate and interact effectively with
multiple stakeholders and users; b) quickly understand a domain and its problems and
synthesize novel solutions; and c) assume different roles to move a project forward.
Professional Experience
Independent Consultant (TargetCW, Apervita, PrecisionBioinformatics) 5/2014-present
■ Clients: Mayo Clinic. Information architect for real-time visual analytics. Architected and
implemented a complete vertical stack delivering an interactive data mining interface for
clinical decision support; a model-driven, specification code library for ElasticSearch query
generation and result mapping; RIM/FHIR/SQL data conversion; and a HIPAA-compliant
deployment framework for PHI data in the public Azure cloud.
■ Technologies: Azure, Linux, ElasticSearch, Node.js, and Angular.
Mayo Clinic, Rochester, MN 1/2007-4/2014
■ Assistant Professor, Biomedical Informatics.
Consultant, Division of Biostatistics and Informatics.
■ Developed grid-based software for high-dimensional genomic data (GWAS, next-gen
sequencing, mRNA-seq, methylation, miRNA) for Parkinson's, vaccine, and cancer research.
Supercomputer-based graph mining for genome assembly graphs and clinical semantic
web/RDF data mining. Temporal EAV SQL-based database tools for medium-scale clinical
registries and ElasticSearch-based tools with advanced querying for large-scale clinical
registries.
■ Taught graduate-level bioinformatics and clinical informatics courses. Mentored
post-doctoral fellows. Program chair for Mayo’s inaugural Individualizing Medicine Conference.
Served on institutional committees for Research Core Resource Oversight, Enterprise Data
Modeling, Univ-Illinois Urbana-Champaign Strategic Alliance, and Department of Lab
Medicine and Pathology IT Management Program Committees.
■ Technologies: Multi-threaded programming, Cray XMT2, AWS cloud-based computing,
NoSQL and graph database architectures. Perl, C/C++, and JavaScript.
Applied Biosystems Inc, Rockville, MD 3/2004-12/2006
■ Director of Content Bioinformatics.
Scientific Advisory Council.
■ Directed, managed, and developed analysis pipelines for bioinformatics annotations for
commercial genotyping and gene expression assays. Architected and developed new
pipelines for microarray probes, SNP-mining, resequencing reagents, and gene
annotations. Developed business plans and planned corporate-wide scientific conferences.
■ Technology: Genome analysis, informatics for PCR, microarray, and mass spec proteomics.
Celera Genomics, Rockville, MD 9/1998-2/2004
■ Director for Chromosome Databases.
Director for Advanced Solutions and Pipelines.
■ Built a 60-person world-class informatics team. Directed, managed, and developed
large-scale analysis pipelines for genome assembly, quality control, gene annotations, and
comparative genomics (Drosophila, human, mouse, and rat).
■ Led a consulting team to provide comprehensive and novel informatics solutions for
customers. Negotiated and drafted custom project contracts.
■ Technology: Cluster and grid-based computing, large-scale systems development,
deployment, and migration.
Johns Hopkins University, Baltimore, MD 5/1992-8/1998
■ Assistant Professor of Biomedical Informatics.
Associate Director of OMIM, Director of GDB.
■ Architected and directed XML-based Online Mendelian Inheritance in Man databases for
print and web distribution; object schema-driven, 3-tier system for Genome Database with
group ownership/privacy and a novel communication protocol stack. Managed 15-person
development and curation team. Deployed clinical registries using GDB-based software.
■ Technology: SunOS, Sybase and Postgres DBMS, web servers. C/C++, Perl, Java.
Interactive Development Environment, Inc., San Francisco, CA 1987-1992
■ Software Engineer.
■ Developed GUI-based computer-aided software engineering (CASE) tool suite,
“Software-Through-Pictures”.
■ Technology: Sun, Apollo, DEC, HP, and X11 windowing systems. SCCS, RCS, CMS, and
DSEE code revision and management systems. UNIX, X11, C/C++.
Education
■ Ph.D. – Medical Information Sciences, Univ of California, San Francisco, CA 12/1991
Thesis: Sequence Modeling and an Extensible Data Model for Genomic Databases.
Designed and built an interpreted high-level object-DBMS language.
■ M.S. – Biochemistry, Univ of Colorado, Denver, CO 5/1983
Project: Sequencing the Substance-P gene.
Used synthetic, degenerate-codon probes to clone the complete gene.
■ Medical School – Univ of Colorado, Denver, CO 1978-1983
Withdrew after 3rd year.
■ B.S. – Biology, B.A. – Mathematics, Colorado State Univ, Pueblo, CO 5/1978
References Available on Request
Publications
Genome Database (GDB) and Online Mendelian Inheritance in Man (OMIM)
1. Pearson P, Francomano C, Foster P, Bocchini C, Li P, McKusick V. The status of online
Mendelian inheritance in man (OMIM) medio 1994. Nucleic Acids Res. 1994 Sep;
22(17):3470-3. PMID:793704. PMCID:1878955.
2. Li P, Kramer L, Pineo S, Kulp D. Evolving a legacy system: restructuring the Mendelian
Inheritance in Man database. JAMIA Symposium Supplement: SCAMC Proceedings.
Washington, DC. 1994 Nov:344-8.
3. Li P, Fasman K. Toolkit technology for database federation. 1995 Annual Meeting of
CODATA Task Group on Biological Macromolecules. George Mason University, Fairfax,
Virginia, 1995 Jun.
4. Li P, Waldo D, Pineo S, Foster P. An efficient delivery historical information for the
Mendelian Inheritance in Man database. JAMIA Symposium Supplement: 1995 SCAMC
Proceedings. New Orleans, LA, 1995 Nov:127-31.
5. Fasman KH, Letovsky SI, Li P, Cottingham RW, Kingsbury DT. The GDB Human Genome
Database Anno 1997. Nucleic Acids Res. 1997 Jan 1; 25(1):72-81. PMID:9016507.
PMCID:146370.
6. Letovsky S, Cottingham R, Porter C, Li P. GDB: the Human Genome Database. Nucleic
Acids Res. 1998; 26(1):94-9. PMID:9399808. PMCID:147203.
Celera Genomes and Genomic Analysis
7. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE,
Li PW, et al. The Genome Sequence of Drosophila melanogaster. Science. 2000 Mar 24;
287(5461):2185-95. PMID:10731132.
8. Rubin GM, Yandell MD, Wortman JR, Gabor Miklos GL, Nelson CR, Hariharan IK, Fortini
ME, Li PW, et al. Comparative genomics of the eukaryotes. Science. 2000 Mar 24;
287(5461):2204-15. PMID:10731134. PMCID:2754258.
9. Venter JC, Adams MD, Myers EW, Li PW, et al. The sequence of the human genome.
Science. 2001 Feb 16; 291(5507):1304-51. PMID:11181995.
DOI:10.1126/science.1058040.
10. Kerlavage A, Bonazzi V, di Tommaso M, Lawrence C, Li P, Mayberry F, Mural R, Nodell M,
Yandell M, Zhang J, Thomas P. The Celera Discovery System. Nucleic Acids Res. 2002
Jan 1; 30(1):129-36. PMID:11752274. PMCID:99167.
11. Mural RJ, Adams MD, Myers EW, Smith HO, Miklos GL, Wides R, Halpern A, Li PW, et al.
A comparison of whole-genome shotgun-derived mouse chromosome 16 and the human
genome. Science. 2002 May 31; 296(5573):1661-71. PMID:12040188.
12. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li
PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002 Aug
9; 297(5583):1003-7. PMID:12169732. DOI:10.1126/science.1072047.
13. Li P. Biological Data Extinction. OMICS. 2003 Spr; 7(1):49-50. PMID:12831556.
14. Scherer SW, Cheung J, Venter JC, Li PW, Mural RJ, Adams MD, Tsui LC. Human
chromosome 7: DNA sequence and biology. Science. 2003 May 2; 300(5620):767-72.
PMID:12690205. PMCID:2882961.
15. Florea L, Di Francesco V, Miller J, Turner R, Yao A, Harris M, Walenz B, Mobarry C,
Merkulov GV, Charlab R, Dew I, Deng Z, Istrail S, Li P, Sutton G. Gene and alternative
splicing annotation with AIR. Genome Res. 2005 Jan; 15(1):54-66. PMID:15632090.
PMCID:540277.
Assay Bioinformatics
16. Brzoska PM, Brown C, Cassel M, Ceccardi T, Di Francisco V, Dubman A, Evans J, Fang R,
Harris M, Hoover J, Hu F, Larry C, Li P, Malicdem M, Maltchenko S, Shannon M, Perkins S,
Poulter K, Webster-Laig M, Xiao C, Young S, Spier G, Guegler K, Gilbert D, Samaha RR.
An efficient and high-throughput approach for experimental validation of novel human gene
predictions. Genomics. 2006 Apr; 87(4):437-45. Epub 2006 Jan 09. PMID:16406193.
DOI:10.1016/j.ygeno.2005.11.016.
17. Yao A, Charlab R, Li P. Systematic identification of pseudogenes through whole genome
expression evidence profiling. Nucleic Acids Res. 2006; 34(16):4477-85. Epub 2006 Aug
31. PMID:16945953. PMCID:1636364. DOI:10.1093/nar/gkl591.
18. Hellmann I, Mang Y, Gu Z, Li P, de la Vega FM, Clark AG, Nielsen R. Population genetic
analysis of shotgun assemblies of genomic sequences from multiple individuals. Genome
Res. 2008 Jul; 18(7):1020-9. Epub 2008 Apr 14. PMID:18411405. PMCID:2493391.
DOI:10.1101/gr.074187.107.
Association Rule Mining
19. Simon GJ, Li PW, Jack CR Jr, Vemuri P. Understanding atrophy trajectories in Alzheimer's
disease using association rules on MRI images. Proceedings of the ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining. 2011:369-76.
20. Simon GJ, Kumar V, Li PW. A simple statistical model and association rule filtering for
classification. Proceedings of the ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining. 2011:823-31.
Oberg AL, Kennedy RB, Li P, Ovsyannikova IG, Poland GA. Systems biology approaches
to new vaccine development. Curr Opin Immunol. 2011 Jun; 23(3):436-43. Epub 2011
May 11. PMID:21570272. PMCID:3129601. DOI:10.1016/j.coi.2011.04.005.
21. Simon GJ, Schrom J, Castro MR, Li PW, Caraballo PJ. Survival Association Rule Mining
Towards Type 2 Diabetes Risk Assessment. Proceedings of AMIA Annual Symposium
2013, Washington DC. pp 1293-1302. 2013 Nov.
22. Simon GJ, Caraballo PJ, Therneau TM, Cha SS, Castro MR, Li PW. Extending Association
Rule Summarization Techniques to Assess Risk of Diabetes Mellitus. IEEE Transactions
On Knowledge And Data Engineering. 2015 Jan; 27(1):130-141.
DOI:10.1109/TKDE.2013.76.
Clinical Research
23. Kolbert CP, Feddersen RM, Rakhshan F, Grill DE, Simon G, Middha S, Jang JS, Simon V,
Schultz DA, Zschunke M, Lingle W, Carr JM, Thompson EA, Oberg AL, Eckloff BW,
Wieben ED, Li P, Yang P, Jen J. Multi-platform analysis of microRNA expression
measurements in RNA from fresh frozen and FFPE tissues. PLoS One. 2013; 8(1):e52517.
Epub 2013 Jan 31. PMID:23382819. PMCID:3561362.
DOI:10.1371/journal.pone.0052517.
24. McKinney BA, White BC, Grill DE, Li PW, Kennedy RB, Poland GA, Oberg AL. ReliefSeq:
A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene
Interactions and Main Effects in mRNA-Seq Gene Expression Data. PLoS One. 2013;
8(12):e81527. Epub 2013 Dec 10. PMID:24339943. PMCID:3858248.
DOI:10.1371/journal.pone.0081527.
25. Techentin R, Foti D, Al-Saffar S, Li P, Daniel E, Gilbert B, Holmes D. Characterization of
Semi-Synthetic Dataset for Big Data Semantic Analysis. IEEE High Performance Extreme
Computing Conference 2014. Waltham, MA.
26. Jang JS, Lee A, Li J, Liyanage H, Yang Y, Guo L, Asmann Y, Li P, et al. Common
Oncogene Mutations and Novel SND1-BRAF Transcript Fusion in Lung Adenocarcinoma
from Never Smokers. Nature Scientific Reports (in press).
27. Zhu Y, Tchkonia T, Stout MB, Giorgadze N, Wang L, Li PW, Heppelmann CJ, Bouloumie A,
Jensen MD, Bergen HR 3rd, Kirkland JL. Inflammation and the depot-specific secretome of
human preadipocytes. Obesity (Silver Spring). 2015 May; 23(5):989-99. Epub 2015 Apr 10.