Big Data Conference 2013: Analytics and Applications for Federal Big Data

Data Tactics Corp: A Blended Approach to Big Data Analytics

Richard Heimann, Data Scientist at Data Tactics Corporation

Data Tactics Analytics Practice
The Team:
(Nathan D., Shrayes R., David P., Adam VE., Geoffrey B., Rich H.)

Graduates from top universities...

Advanced degrees include: mathematics, computer science, astrophysics, electrical engineering, mechanical engineering, statistics, social sciences.

Base competencies (horizontals): clustering, association rules, regression, naive Bayesian classifier, decision trees, time-series, text analysis.

Going beyond the base (verticals)...
[Figure: specialized methods (verticals) stacked above the base competencies (horizontals)]

Horizontals & Verticals

Clustering || Regression || Decision Trees || Text Analysis

Association Rules || Naive Bayesian Classifier || Time Series Analysis
Data Tactics Analytics Practice
Hierarchy of Data Scientists
Why Analytics [Business]???
Why are analytics important? 

(Business, Analytics, Practical)

"We need to stop reinventing the cloud
and start using it!"
(Dave Boyd)
Why Analytics [Analytics]???
Why are analytics important? 

(Business, Analytics, Practical)

No Free Lunch (NFL): no algorithm performs better than any other when performance is averaged uniformly over all possible problems of a particular type. Algorithms must therefore be designed for a particular domain or style of problem; there is no such thing as a general-purpose algorithm.

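The averaging argument behind NFL can be illustrated with a toy sketch in R. This is a minimal illustration under stated assumptions, not a proof: the three unseen points, the two fixed predictors (`always_zero`, `mixed_guess`), and the helper `mean_accuracy` are all hypothetical names introduced here.

```r
# All 2^3 possible binary labelings of three unseen points
# (one row per possible target function).
targets <- as.matrix(expand.grid(x1 = 0:1, x2 = 0:1, x3 = 0:1))

# Two hypothetical predictors, each a fixed guess for the three points.
always_zero <- c(0, 0, 0)
mixed_guess <- c(1, 0, 1)

# Accuracy of a fixed guess against every target,
# averaged uniformly over all targets.
mean_accuracy <- function(guess) {
  per_target <- apply(targets, 1, function(truth) mean(truth == guess))
  mean(per_target)
}

nfl_scores <- c(always_zero = mean_accuracy(always_zero),
                mixed_guess = mean_accuracy(mixed_guess))
print(nfl_scores)
```

Both predictors score the same once every possible labeling is weighted equally. An advantage appears only when the set of problems is restricted to a particular domain.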
Why Analytics [Practical]???
[Figure: N against t at three scales: academic publications, web, and IC]

If this guy doesn’t scale - none of us do.

algo to users > algo to data
Development || Deployment
Machine || User
Parallel || Distributed
Objective || Subjective
M/R || HDFS
Valid || Useful
MPP || SOA
Nontrivial || Novel
Accurate || Comprehensible
GPU
Shiny
Open sourced by RStudio in November 2012

Not the first to wrap R in the browser, but perhaps the easiest for R developers

No need to know HTML, CSS, or JavaScript to get started

Reactive programming model

WebSockets for communication
server.R
library(shiny)

# Define server logic required to generate and plot a random distribution
shinyServer(function(input, output) {

  # Expression that generates a plot of the distribution.
  # renderPlot:
  #   1: is "reactive", so it is automatically re-executed when inputs change
  #   2: has a plot as its output type
  output$distPlot <- renderPlot({

    # draw input$obs values from a standard normal distribution and plot them
    dist <- rnorm(input$obs)
    hist(dist)
  })
})
ui.R
library(shiny)

# Define UI for application that plots random distributions
shinyUI(pageWithSidebar(

  # Application title:
  headerPanel("My Shiny App!"),

  # Sidebar with a slider input for number of observations:
  sidebarPanel(
    sliderInput("obs",
                "Number of observations:",
                min = 1,   # at least one draw, so hist() always has data
                max = 1000,
                value = 500)
  ),

  # Show a plot of the generated distribution:
  mainPanel(
    plotOutput("distPlot")
  )
))
ui.R
headerPanel()

sidebarPanel()

mainPanel()
server.R + ui.R = microscope
adjustable parameters (knobs): 0 < knobs < small k
knobs = lighting, varying objectives, focusing (fine and coarse)

knobs:

fine and coarse filtering:

geography

time

variable of interest 

observations of interest

promote significant (objective) patterns

change model parameters
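The filtering knobs listed above can be sketched in base R as arguments to a single function. This is a minimal sketch, assuming a small synthetic data frame; the name `apply_knobs`, its arguments, and the columns `region`, `day`, `speed`, and `volume` are hypothetical and not from the talk. In a Shiny app, each argument would typically be fed by an `input$...` value.

```r
# Synthetic, purely illustrative observations (not from the talk).
obs <- data.frame(
  region = c("north", "north", "south", "south", "east"),
  day    = as.Date("2013-11-01") + 0:4,
  speed  = c(31, 28, 45, 52, 39),
  volume = c(120, 140, 90, 75, 110)
)

# Hypothetical helper: each argument plays the role of one knob
# (geography, time window, variable of interest).
apply_knobs <- function(data, regions, from, to, variable) {
  keep <- data$region %in% regions & data$day >= from & data$day <= to
  data[keep, c("region", "day", variable), drop = FALSE]
}

view <- apply_knobs(obs,
                    regions  = c("north", "south"),
                    from     = as.Date("2013-11-02"),
                    to       = as.Date("2013-11-04"),
                    variable = "speed")
print(view)
```

Keeping the number of arguments small mirrors the constraint 0 < knobs < small k: each extra knob is one more thing the analyst has to set.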
BDE + Shiny
Overlapping Solutions
Multiple models allow more nuanced learning from data.

Convergent results serve as cross-validation.

Points of divergence provide additional insights and allow models to be calibrated further.

Different models can provide answers to different questions, or answers to the same question for different analysts.

Multi-method analysis lends itself to diverse teams with mutable missions.

smooth + rough = data

A new paradigm in which the question "Are there multiple, overlapping ways to solve this problem?" dominates.

[Figure: Latent Spatial Traffic Patterns (labels 1, 2, 3)]
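Two ideas from this slide can be sketched in a few lines of R: "smooth + rough = data" and convergent results from different methods. This is a minimal sketch on synthetic data. The choice of a linear model and a lowess smoother as the two overlapping methods is an assumption made for illustration, not the pairing used in the talk.

```r
set.seed(1)

# Synthetic data: a linear trend plus noise (illustrative only).
x <- seq(0, 10, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50, sd = 0.3)

# Method A: linear model. Fitted values are the "smooth",
# residuals are the "rough".
fit_lm <- lm(y ~ x)
smooth <- fitted(fit_lm)
rough  <- resid(fit_lm)

# smooth + rough recovers the data (up to floating-point tolerance).
recombined <- unname(smooth + rough)
decomposition_ok <- isTRUE(all.equal(recombined, y))

# Method B: a lowess smoother fitted to the same data.
fit_lo <- lowess(x, y)

# Agreement between the two smooths: convergence acts as a sanity check,
# while large gaps would flag regions worth a closer look.
agreement  <- cor(smooth, fit_lo$y)
divergence <- abs(unname(smooth) - fit_lo$y)

print(decomposition_ok)
print(agreement)
```

Where `divergence` is large, the two methods disagree about the structure of the data, which is the kind of point the slide suggests using for further calibration.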
Overlapping Solutions
Are there multiple, overlapping ways to solve this problem?

[Figure: Venn diagram of Analytic A, Analytic B, and Analytic C, with overlap regions including A+B and A+B+C]
Summary:

# our blended approach
dt.philosophy <- lm(analytics ~ bigdata + smalldata + objective +
                      subjective:overlapping.solutions,
                    data = data)
Data Science for Government (DS4G)
About (DS4G):

1: Improve on definitions of analytics.

2: Outline optimal interactions with Data Scientists.

3: Provide a life-cycle for Data Science.

4: Most importantly, share a taxonomy to identify analytical questions one
could ask of data (Causal Effects, Classification, Outlier Detection, Big Data and
Analytics, Measurement Models, & Text Analysis)

Presented by Data Tactics Analytics Team

Location: TBD 

Time: 1Q 2014

Duration: ~ 5 hrs.

Cost: FREE

Audience: Government managers and Data Tactics partners with their
customers.
LUBAP goes wild!
421 attending!

http://www.meetup.com/Data-Science-DC/events/146953142/
Thank you...	

Questions?
Homepage: http://www.data-tactics.com
Blog: http://datatactics.blogspot.com
Twitter: @DataTactics
Slideshare: http://www.slideshare.net/DataTactics/presentations
Or, me (Rich Heimann): rheimann@data-tactics-corp.com

A Blended Approach to Analytics at Data Tactics Corporation

  • 1. Big Data Conference 2013: Analytics and Applications for Federal Big Data. Data Tactics Corp: A Blended Approach to Big Data Analytics. Richard Heimann, Data Scientist at Data Tactics Corporation
  • 2. Data Tactics Analytics Practice. The Team: (Nathan D., Shrayes R., David P., Adam VE., Geoffrey B., Rich H.) Graduates from top universities... Advanced degrees include: mathematics, computer science, astrophysics, electrical engineering, mechanical engineering, statistics, social sciences. Base competencies (horizontals): clustering, association rules, regression, naive Bayesian classifier, decision trees, time-series, text analysis. Going beyond the base (verticals)...
  • 3. Horizontals & Verticals. [Figure: diagram of vertical specialties; the individual labels are not recoverable from the extraction.] Horizontals: Clustering || Regression || Decision Trees || Text Analysis || Association Rules || Naive Bayesian Classifier || Time Series Analysis
  • 4. Data Tactics Analytics Practice Hierarchy of Data Scientists
  • 5. Why Analytics [Business]??? Why are analytics important? (Business, Analytics, Practical) "We need to stop reinventing the cloud and start using it!" (Dave Boyd)
  • 6. Why Analytics [Analytics]??? Why are analytics important? (Business, Analytics, Practical) No Free Lunch (NFL): no algorithm performs better than any other when their performance is averaged uniformly over all possible problems of a particular type. Algorithms must therefore be designed for a particular domain or style of problem; there is no such thing as a general-purpose algorithm.
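The No Free Lunch statement on slide 6 can be illustrated with a toy calculation. The following is only a sketch under assumptions that are not in the slides: three unseen points, binary labels, and two made-up predictors (`always_one` and `alternating`). It averages each predictor's accuracy uniformly over every possible labeling of the unseen points.

```r
# All 2^3 possible labelings of three unseen points (one labeling per row).
labelings <- as.matrix(expand.grid(rep(list(0:1), 3)))

# Two fixed, hypothetical predictors for those three points.
always_one  <- c(1, 1, 1)
alternating <- c(0, 1, 0)

# Accuracy of a prediction vector, averaged uniformly over all labelings.
avg_accuracy <- function(pred) {
  mean(apply(labelings, 1, function(truth) mean(pred == truth)))
}

acc_a <- avg_accuracy(always_one)
acc_b <- avg_accuracy(alternating)
```

Both averages come out the same, because each prediction matches exactly half of the possible labelings at every point. A predictor only pulls ahead once the set of plausible problems is narrowed to a domain, which is the point the slide makes.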
  • 7. Why Analytics [Practical]??? Academic publications scale. The Web scales. The IC scales. If this guy doesn’t scale, none of us do.
  • 8. algo to users > algo to data Development Deployment Machine User Parallel Distributed Objective Subjective M/R HDFS Valid Useful MPP SOA Nontrivial Novel Accurate Comprehensible GPU
  • 9. Shiny: open sourced by RStudio in November 2012. Not the first to wrap R in the browser, but perhaps the easiest for R developers. No need to know HTML, CSS, and JavaScript to get started. Reactive programming model. Web sockets for communication.
  • 10. server.R
        library(shiny)

        # Define server logic required to generate and plot a random
        # distribution
        shinyServer(function(input, output) {

          # Expression that generates a plot of the distribution.
          # renderPlot:
          #
          #  1: Is "reactive" and is therefore automatically
          #     re-executed when inputs change.
          #  2: Its output type is a plot.

          output$distPlot <- renderPlot({

            # draw input$obs values from a normal distribution and plot them
            dist <- rnorm(input$obs)
            hist(dist)
          })
        })
  • 11. ui.R
        library(shiny)

        # Define UI for application that plots random distributions
        shinyUI(pageWithSidebar(

          # Application title:
          headerPanel("My Shiny App!"),

          # Sidebar with a slider input for number of observations:
          sidebarPanel(
            sliderInput("obs",
                        "Number of observations:",
                        min = 0,
                        max = 1000,
                        value = 500)
          ),

          # Show a plot of the generated distribution:
          mainPanel(
            plotOutput("distPlot")
          )
        ))
  • 13. server.R + ui.R = microscope. Adjustable parameters (knobs): 0 < knobs < small k. Knobs = lighting, varying objectives, focusing (fine and coarse). Knobs: fine and coarse filtering (geography, time, variable of interest, observations of interest); promote significant (objective) patterns; change model parameters.
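The knob idea on slide 13 can be sketched in plain R before any Shiny wiring is added. Everything here is hypothetical and for illustration only: the `obs` data frame, its column names, and the `apply_knobs` helper do not come from the deck. In a Shiny app, each argument would presumably be bound to an input widget such as the slider on slide 11.

```r
# Hypothetical observations; column names are illustrative only.
obs <- data.frame(
  region = c("north", "north", "south", "south", "east"),
  year   = c(2010, 2012, 2011, 2013, 2012),
  value  = c(3.2, 4.1, 2.7, 5.0, 3.9)
)

# Each argument plays the role of one knob on the "microscope":
# `regions` filters by geography, `years` filters by time (inclusive range).
apply_knobs <- function(data,
                        regions = unique(data$region),
                        years = range(data$year)) {
  keep <- data$region %in% regions &
    data$year >= years[1] & data$year <= years[2]
  data[keep, , drop = FALSE]
}

# Coarse filtering: turn one knob only.
coarse <- apply_knobs(obs, regions = c("north", "south"))

# Fine filtering: turn both knobs to narrow the view further.
fine <- apply_knobs(obs, regions = "south", years = c(2012, 2013))
```

Keeping the filter as an ordinary function means the same logic can be exercised from the console and reused inside a reactive expression, with the number of knobs kept small as the slide suggests.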
  • 15. Overlapping Solutions. Multiple models allow more nuanced learning from data. [Figure: Latent Spatial Traffic Patterns] Convergent results serve as cross-validation. Points of divergence provide additional insights and allow models to be calibrated further. Different models can provide answers to different questions, or answers to the same question for different analysts. Multi-method analysis suits diverse teams with mutable missions. smooth + rough = data. A new paradigm in which the question "Are there multiple, overlapping ways to solve this problem?" dominates.
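A small base-R sketch can make "smooth + rough = data" and the idea of convergent versus divergent models concrete. The simulated data, the choice of a LOWESS smoother and a polynomial regression, and all variable names are assumptions made for illustration; the deck does not say which models were used.

```r
set.seed(1)

# Simulated data: a smooth signal plus noise (illustrative only).
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.3)

# Model A: LOWESS smoother. Model B: degree-5 polynomial regression.
# x is already sorted, so lowess() returns fitted values in the same order.
smooth_a <- lowess(x, y, f = 0.3)$y
smooth_b <- unname(fitted(lm(y ~ poly(x, 5))))

# For each model, the rough is whatever the smooth leaves behind,
# so smooth + rough reproduces the data.
rough_a <- y - smooth_a
rough_b <- y - smooth_b

# Convergence: how closely the two smooths agree overall.
agreement <- cor(smooth_a, smooth_b)

# Divergence: where the two smooths disagree most, a candidate
# region for a closer look or for recalibrating one of the models.
divergence <- abs(smooth_a - smooth_b)
worst_x <- x[which.max(divergence)]
```

High agreement between the two smooths plays the role of the informal cross-check described on the slide, while `worst_x` points to the place where the models tell different stories.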
  • 16. Overlapping Solutions. Are there multiple, overlapping ways to solve this problem? [Figure: Venn diagram of Analytic A, Analytic B, and Analytic C, with overlaps A+B, A+C, B+C, and A+B+C]
  • 17. Summary:
        # our blended approach
        dt.philosophy <- lm(analytics ~ bigdata + smalldata + objective + subjective:overlapping.solutions, data = data)
  • 19. Data Science for Government (DS4G). About DS4G: 1: Improve on definitions of analytics. 2: Outline optimal interactions with Data Scientists. 3: Provide a life-cycle for Data Science. 4: Most importantly, share a taxonomy to identify analytical questions one could ask of data (Causal Effects, Classification, Outlier Detection, Big Data and Analytics, Measurement Models, & Text Analysis). Presented by: Data Tactics Analytics Team. Location: TBD. Time: 1Q 2014. Duration: ~5 hrs. Cost: FREE. Audience: Government managers and Data Tactics partners with their customers.
  • 20. LUBAP goes wild! 421 attending! http://www.meetup.com/Data-Science-DC/events/146953142/
  • 21. Thank you... Questions? Homepage: http://www.data-tactics.com Blog: http://datatactics.blogspot.com Twitter: @DataTactics Slideshare: http://www.slideshare.net/DataTactics/presentations Or, me (Rich Heimann): rheimann@data-tactics-corp.com