2. Typical Scientific Data Workflow
Acquisition, Exploration, Analysis
1. Data Acquisition
• Excel, Access, homebrews
• (Electronic?!) forms, notes
• LIMS & instruments output
• Labmatrix forms & records
• Other enterprise resources
• etc…
2. Data Exploration
• Easy, graphical queries
• ETL & data cleaning tools
• Formulas & calculations
• Visualize charts & graphs
3. Data Analysis
• SAS, R
• Spotfire
• Tableau
• Statisticians
• etc…
3. Once you have:
1. Collected…
2. Standardized… (Not yet? Use built-in data cleaning tools)
3. Normalized… (Not yet? Use built-in formula calculation tools)
…some, or all of your project data,
how do you best make use of them?
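As a rough illustration of what the "standardized" and "normalized" steps above mean in practice, here is a minimal Python sketch. It is not the product's built-in cleaning or formula tooling; the records, the vocabulary table, and the function names are all hypothetical. Standardizing maps free-text values onto one controlled vocabulary, and normalizing applies a formula so every value shares one unit.

```python
# Hypothetical raw records as they might arrive from different sources.
RAW = [
    {"patient_id": "P001", "sex": " Male ", "weight": "176", "unit": "lb"},
    {"patient_id": "P002", "sex": "F", "weight": "61.5", "unit": "kg"},
    {"patient_id": "P003", "sex": "female", "weight": "", "unit": "kg"},
]

# Hypothetical controlled vocabulary for the "sex" field.
SEX_VOCAB = {"m": "male", "male": "male", "f": "female", "female": "female"}
LB_TO_KG = 0.45359237  # exact definition of the international pound


def standardize(record):
    """Map free-text values onto the controlled vocabulary."""
    cleaned = dict(record)
    cleaned["sex"] = SEX_VOCAB.get(record["sex"].strip().lower(), "unknown")
    return cleaned


def normalize(record):
    """Apply a formula so every weight is expressed in kilograms."""
    out = dict(record)
    raw_weight = record["weight"].strip()
    if not raw_weight:
        out["weight_kg"] = None  # keep missing values explicit
    else:
        value = float(raw_weight)
        if record["unit"] == "lb":
            value *= LB_TO_KG
        out["weight_kg"] = round(value, 1)
    return out


prepared = [normalize(standardize(r)) for r in RAW]
for row in prepared:
    print(row["patient_id"], row["sex"], row["weight_kg"])
```

Keeping the two steps as separate functions mirrors the slide: a data set can be standardized without yet being normalized, and missing values stay visible rather than being silently dropped.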
4. The Problem: subject matter experts having to go through a
(limited) pipeline of IT expertise to answer complex questions
about their domain-specific data.
[Diagram: piles of project data from various sources (multiple DBs) pass through IT / programmers to reach domain experts with many complex data questions.]
5. Clashing of Expertise
Domain Experts / Researchers
DNA! Biomarkers! Transcription!
• Can’t access data by myself
• My data inquiries are taking too long to process
• I have many more inquiries but afraid to ask
• IT misinterprets my inquiries
• Changed my mind about inquiries in process already
• Data result doesn’t look right
• Didn’t IT know I need to relate A with B in this specific way?
• …
IT / Programmers
Primary key! Data type! Object model!
• Too many throw-away or one-off project requests
• They keep changing their minds about how to cut the data
• Nothing is standardized
• No prioritization: using brute force approach to grind through all data instead of critical path
• Could use more domain expertise when processing piles of complex data
• …
6. The Solution:
1. Common workspace
2. Shared “language”
All raw & prepared data can be centralized here. The data processes and data queries are shown graphically, so they are easily understood by both IT and domain experts.
[Diagram: multiple DBs are centralized into one workspace shared by IT / programmers and domain experts.]
7. Symbiotic Expertise
Domain Experts / Researchers
• Can explore data by myself
• Get results from complex questions in minutes instead of weeks
• Gain actionable insights even from rough or messy data (within institutional guidelines)
• Visually share interesting data queries with colleagues
• Visually share data workflows and issues with IT personnel
• Help IT identify data issues and prioritize fixes
• …
IT / Programmers
• Centralized environment to prepare and present data sets
• Built-in import, data cleaning, standardization & ontology tools
• Centrally manage data access and audit all changes and activities
• Prepare and fix data issues with guided priority from end-users
• Develop & reuse code for projects via programmatic interface
• Self-serve model allows IT to work on other things
• …
8. Symbiotic Expertise = smarter & less IT effort,
faster & better data access for domain experts
SEA OF DATA
With the ability to explore data easily, domain experts can quickly
identify relevant data, gain actionable insights, and better drive efforts
9. How does Qiagram work?
Step 1. Drag & drop a set of data on top of another.
Step 2. Data sets are intelligently and automatically connected to each other.
Step 3. Expand the scope and detail of your question with additional data sets, filter conditions, calculations, or other kinds of transformations as necessary.
[Diagram: the Patients and Meds data sets are dropped together to form “Patients on Meds”; Combine, Filter, and Pivot nodes lead to Result Set 1 and Result Set 2.]
Each “node” is live, so you can retrieve and review the results from each step as you build a complex query.
You are now trained in using Qiagram.
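For readers who think in SQL, the three steps above correspond roughly to a join, a filter, and a summary. The sketch below is only an analogy written with Python's standard sqlite3 module, not how Qiagram itself is implemented; the tables, column names, age threshold, and view names are hypothetical, and the per-drug count is a simple stand-in for the Pivot node. Each step is stored as a view, so any intermediate "node" can be queried on its own, in the spirit of the live nodes described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (patient_id TEXT PRIMARY KEY, age INTEGER);
CREATE TABLE meds (patient_id TEXT, drug TEXT);
INSERT INTO patients VALUES ('P001', 54), ('P002', 37), ('P003', 68);
INSERT INTO meds VALUES
    ('P001', 'metformin'), ('P001', 'lisinopril'),
    ('P002', 'ibuprofen'), ('P003', 'metformin');

-- Steps 1 and 2: "drop" Meds onto Patients; the shared key connects them.
CREATE VIEW patients_on_meds AS
    SELECT p.patient_id, p.age, m.drug
    FROM patients AS p
    JOIN meds AS m ON m.patient_id = p.patient_id;

-- Step 3a: add a filter condition (hypothetical age threshold).
CREATE VIEW older_patients_on_meds AS
    SELECT * FROM patients_on_meds WHERE age >= 50;

-- Step 3b: summarize per drug (a simple stand-in for a pivot).
CREATE VIEW patients_per_drug AS
    SELECT drug, COUNT(DISTINCT patient_id) AS n_patients
    FROM older_patients_on_meds
    GROUP BY drug;
""")

# Only these view names may be inspected; this also keeps the f-string safe.
NODES = ("patients_on_meds", "older_patients_on_meds", "patients_per_drug")


def node(name):
    """Return the current rows of one step of the query, sorted for display."""
    if name not in NODES:
        raise ValueError(f"unknown node: {name}")
    return sorted(conn.execute(f"SELECT * FROM {name}").fetchall())


for name in NODES:
    print(name, node(name))
```

Because views are evaluated at query time, inserting another row into `meds` changes what every downstream node returns without redefining the query.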
10. Current Client Application Areas:
• Clinical & Translational Research
• Biomarker Discovery
• Healthcare Data Utilization/Consumption
• In silico Clinical Trial Feasibility
• Consortium Collaborations
• Cheminformatics Research
• …
11. Case Study: Common Problem in Translational Research
Cryptic DB you’ll never have easy access to
12. The Solution
Qiagram: our award-winning “draw-your-question” interface. SQL or programming training NOT required!
Just drag & drop, and run your query!
13. Qiagram: a visual data query tool
Example 1: “reporting & operational statistics” data query
14. Qiagram: a visual data query tool
Example 2: hypothesis-driven data exploration
15. Qiagram: a better BI tool for translational research (TR)
Aspect                            Traditional BI                    TR Informatics
Budget                            $$$                               $
Purpose                           Operational                       Exploratory
Questions                         Simple                            Complex
Data Cleaning & Standardization   Precursor to meaningful queries   Parallel to meaningful queries
Data Sources                      Well understood                   Ever-changing
Data Organization                 Hierarchical                      Ad hoc
Perspective                       Static                            Individualized
Collaboration                     Limited                           Extensive
... the exploratory & discovery nature of TR requires tools specifically designed
for TR endeavors, instead of shoe-horning traditional BI technologies.
16. Many ways to get data into the system:
An enterprise, scalable solution that communicates with all data sources
[Diagram: data sources and interfaces connected to the Qiagram Core API. Labels in the diagram include: Large Flat Files, DB, Federation Engine, SQL Scripts, tab-delimited text, Data Transformer, ETL, SOAP, Framework, Web Forms, Data Files (.TXT), HTTP, Web UI, Java Objects, RMI, RMI API, Enterprise System, XML, Custom Web Services.]
17. KEY FRAMEWORK FEATURES
Centralize Data: web-accessible system enables immediate data staging, multi-site collaboration, data/site management, data QA/review/reports, and instant data querying results; scalable enterprise deployment
Clean & Standardize: improve data quality via built-in data cleaning and standardization tools; establish or import vocabularies & standardized data models
Enforce User Roles & Permissions: flexible configurations of how different users/groups/TAs can access specific data sets in collaborative settings
Maintain Security & Compliance: transmit data securely, facilitate regulatory compliance, and track all data changes via detailed audit logs in this HIPAA/PHI-compliant system; customizable data backup & recovery plans
Integration & Interoperability: multiple interfaces to communicate with other data systems in your IT infrastructure; vocabulary & ontology definitions
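To make the role/permission and audit-log features above a little more concrete, here is a minimal Python sketch of the general idea: access is checked per group and per data set, and every change is recorded with who made it, when, and the old and new values. This is an illustration only and does not reflect the framework's actual design or API; the class, field names, groups, and data set names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataStore:
    """Hypothetical store that enforces group permissions and logs changes."""
    permissions: dict                                # group -> set of data set names
    records: dict = field(default_factory=dict)      # (data set, key) -> value
    audit_log: list = field(default_factory=list)    # one entry per change

    def _check(self, group, dataset):
        if dataset not in self.permissions.get(group, set()):
            raise PermissionError(f"group {group!r} may not access {dataset!r}")

    def read(self, group, dataset, key):
        self._check(group, dataset)
        return self.records.get((dataset, key))

    def update(self, user, group, dataset, key, value):
        self._check(group, dataset)
        old = self.records.get((dataset, key))
        self.records[(dataset, key)] = value
        # Record who changed what, when, and the before/after values.
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "dataset": dataset,
            "key": key,
            "old": old,
            "new": value,
        })


store = DataStore(permissions={"analyst": {"clinical"}, "guest": set()})
store.update("alice", "analyst", "clinical", "P001/weight_kg", 79.8)
print(store.read("analyst", "clinical", "P001/weight_kg"))
print(len(store.audit_log), "audited change(s)")
```

In this sketch a member of the `guest` group attempting the same update would get a `PermissionError`, and nothing would be written or logged.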