STATISTICAL PROGRAMMING
IN JAVASCRIPT
David Simons
@SwamWithTurtles
slides:
www.tinyurl.com/stats-js
demos:
swamwithturtles.github.io/js-statistics
code:
github.com/SwamWithTurtles/js-statistics
WHO AM I?
Freelance
Software
Developer
@SwamWithTurtles
Java and
JavaScript
Afraid of goats?
WHO AM I?
DATA
NERD
CONTENTS
LEARNING THROUGH: THEORY, CASE STUDIES, JAVASCRIPT APPLICATION
• WHAT IS DATA?
• GAINING INSIGHTS
• RANDOMNESS
• SIMULATION
Reward: What shape is the internet?
Data
BEHIND THE HOOD
API
DB
ADMIN INTERFACE
SCHEDULED TASKS
3RD PARTY APIS
SO…
WHAT DATA WAS THERE?
WHAT DATA WAS THERE?
• Counts of lists (e.g. brands, products etc.)
• Stock levels and prices of products
• Days an item has been out of stock
WHAT DATA WAS THERE?
• Non-functional data
  • Numbers of users
  • Performance for users
  • Performance of third-party APIs
  • Robustness of system (uptime, status codes, frequency of errors)
THE LESSON?
THERE IS DATA EVERYWHERE
What is data?
What is good data?
WHAT DATA SHOULD I CARE ABOUT?
• Data you get repeatedly
• Data you can extract ‘information’ from
  • Normally this means numerical data, though NLP is getting big!
• Data that answers valuable questions
Gaining Insights
A dataset:
Identification WIND CEILING TEMP DEWPT RHX
USAF NCDC Date HrMn I Type QCP Dir Q I Spd Q Hgt Q I I Temp Q Dewpt Q RHx
865300,99999,19860401,0000,4,FM-12, ,110,1,N, 7.2,1,22000,1,C,N, 21.6,1, 19.2,1, 86,
865300,99999,19860401,0300,4,FM-12, ,110,1,N, 5.1,1,22000,1,C,N, 19.4,1, 18.5,1, 95,
865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,
865300,99999,19860401,0900,4,FM-12, ,070,1,N, 6.2,1,00120,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860401,1200,4,FM-12, ,070,1,N, 7.7,1,03600,1,C,N, 21.6,1, 18.3,1, 82,
865300,99999,19860401,1500,4,FM-12, ,040,1,N, 9.8,1,03600,1,C,N, 23.0,1, 18.8,1, 77,
865300,99999,19860401,1800,4,FM-12, ,030,1,N, 6.2,1,03600,1,C,N, 19.6,1, 19.0,1, 96,
865300,99999,19860401,2100,4,FM-12, ,050,1,N, 6.7,1,03600,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,0000,4,FM-12, ,340,1,N, 7.2,1,03600,1,C,N, 20.0,1, 19.4,1, 96,
865300,99999,19860402,0300,4,FM-12, ,360,1,N, 4.1,1,03600,1,C,N, 19.4,1, 19.1,1, 98,
865300,99999,19860402,0600,4,FM-12, ,999,1,C, 0.0,1,03600,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860402,0900,4,FM-12, ,999,1,C, 0.0,1,00210,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,1200,4,FM-12, ,200,1,N, 2.6,1,00210,1,C,N, 20.4,1, 20.1,1, 98,
865300,99999,19860402,1500,4,FM-12, ,210,1,N, 5.1,1,00750,1,C,N, 23.2,1, 19.3,1, 79,
865300,99999,19860402,1800,4,FM-12, ,200,1,N, 3.1,1,00750,1,C,N, 26.4,1, 18.4,1, 62,
865300,99999,19860402,2100,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 26.2,1, 17.1,1, 57,
865300,99999,19860403,0000,4,FM-12, ,140,1,N, 4.1,1,22000,1,C,N, 19.2,1, 17.0,1, 87,
865300,99999,19860403,0300,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.8,1, 15.2,1, 96,
865300,99999,19860403,0600,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.4,1, 14.0,1, 91,
865300,99999,19860403,1200,4,FM-12, ,060,1,N, 5.1,1,22000,1,C,N, 21.0,1, 19.8,1, 93,
865300,99999,19860403,1500,4,FM-12, ,060,1,N, 4.1,1,00900,1,C,N, 24.8,1, 21.3,1, 81,
865300,99999,19860403,1800,4,FM-12, ,050,1,N, 7.7,1,09000,1,C,N, 28.0,1, 21.4,1, 67,
865300,99999,19860403,2100,4,FM-12, ,040,1,N, 5.1,1,09000,1,C,N, 25.4,1, 21.4,1, 79,
865300,99999,19860404,0000,4,FM-12, ,060,1,N, 6.2,1,03600,1,C,N, 22.2,1, 21.3,1, 95,
865300,99999,19860404,0300,4,FM-12, ,050,1,N, 5.1,1,09000,1,C,N, 21.0,1, 20.7,1, 98,
865300,99999,19860404,0600,4,FM-12, ,060,1,N, 6.2,1,22000,1,C,N, 20.2,1, 19.9,1, 98,
865300,99999,19860404,1200,4,FM-12, ,040,1,N, 5.1,1,00120,1,C,N, 20.4,1, 19.5,1, 95,
865300,99999,19860404,1500,4,FM-12, ,020,1,N, 7.7,1,00420,1,C,N, 24.2,1, 20.4,1, 79,
865300,99999,19860404,1800,4,FM-12, ,250,1,N, 4.1,1,00750,1,C,N, 25.6,1, 20.7,1, 74,
865300,99999,19860404,2100,4,FM-12, ,250,1,N, 5.1,1,00750,1,C,N, 23.6,1, 20.4,1, 82,
865300,99999,19860405,0000,4,FM-12, ,180,1,N, 6.2,1,00420,1,C,N, 20.2,1, 19.6,1, 96,
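Before any statistics can be computed, rows like these have to be split into fields. The following is a minimal sketch, assuming the column order shown in the header line (wind speed is the 11th field, temperature the 17th, dew point the 19th) and assuming that a dew point of 999.9 marks a missing reading. `parseRow` is a hypothetical helper written for illustration; the talk lists Papa Parse for data import, whereas this sketch uses plain string splitting to stay self-contained.

```javascript
// Two rows copied from the dataset above.
const rows = [
  "865300,99999,19860401,0000,4,FM-12, ,110,1,N, 7.2,1,22000,1,C,N, 21.6,1, 19.2,1, 86,",
  "865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,",
];

// Hypothetical helper: pick out a few columns by position.
// Assumption: a dew point of 999.9 means "no reading available".
function parseRow(line) {
  const fields = line.split(",").map((s) => s.trim());
  const dewPoint = Number(fields[18]);
  return {
    date: fields[2],
    time: fields[3],
    windSpeed: Number(fields[10]),
    temperature: Number(fields[16]),
    dewPoint: dewPoint === 999.9 ? null : dewPoint,
  };
}

const readings = rows.map(parseRow);
console.log(readings);
```

Turning the sentinel value into `null` early keeps it from silently inflating any average computed later.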
Summary Statistics
SUMMARY STATISTICS
• A statistic is a function of the data we have input
• It aims to capture information about the values to make them more understandable
THE FAMOUS ONE:
• Mean (‘average’)
  • Sum all of the data and divide by the number of items
  • Gives a sense of ‘size’
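The definition of the mean translates almost word for word into code. A minimal sketch, where `mean` and the sample array are illustrative names rather than anything from the talk's repository:

```javascript
// Mean: sum all of the data and divide by the number of items.
function mean(values) {
  if (values.length === 0) {
    throw new Error("mean of an empty list is undefined");
  }
  const total = values.reduce((sum, x) => sum + x, 0);
  return total / values.length;
}

// First four temperature readings from the dataset above.
const temperatures = [21.6, 19.4, 19.2, 19.2];
console.log(mean(temperatures));
```

The empty-list check is a design choice: without it the function would quietly return `NaN`.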
[Figure: Group 1 and Group 2]
OTHER STATISTICS
• “Location”
  • Mean, Mode, Median
• “Spread”
  • Standard Deviation
• “Shape”
  • Skew, Kurtosis
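The other location and spread statistics are similarly short to write by hand. This is a sketch under two stated assumptions: ties for the mode are broken in favour of the value that reaches the top count first, and the standard deviation is the population version (dividing by n, not n - 1). The function names are illustrative.

```javascript
// Median: the middle value once the data is sorted
// (the average of the two middle values for an even count).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Mode: the most frequent value.
// Assumption: on a tie, the value that reached the top count first wins.
function mode(values) {
  const counts = new Map();
  let best = values[0];
  for (const v of values) {
    counts.set(v, (counts.get(v) || 0) + 1);
    if (counts.get(v) > counts.get(best)) best = v;
  }
  return best;
}

// Population standard deviation: root of the mean squared distance from the mean.
function standardDeviation(values) {
  const m = values.reduce((s, x) => s + x, 0) / values.length;
  const variance =
    values.reduce((s, x) => s + (x - m) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

const sample = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(median(sample), mode(sample), standardDeviation(sample));
```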
DEMO
Distributions
What is a random variable?
Discrete Variables
Can be any of a list of values, each with its own probability
HEADS  0.5
TAILS  0.5

2   1/36
3   2/36
4   3/36
5   4/36
6   5/36
7   6/36
8   5/36
9   4/36
10  3/36
11  2/36
12  1/36
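The second table matches the distribution of the total of two fair six-sided dice, and it can be reproduced by counting outcomes. A small sketch, assuming that interpretation; `twoDiceDistribution` is an illustrative name:

```javascript
// Count how many of the 36 equally likely (die1, die2) pairs give each total.
function twoDiceDistribution() {
  const counts = {};
  for (let a = 1; a <= 6; a++) {
    for (let b = 1; b <= 6; b++) {
      counts[a + b] = (counts[a + b] || 0) + 1;
    }
  }
  return counts; // counts[7] is the number of ways to roll a total of 7
}

const distribution = twoDiceDistribution();
for (const total of Object.keys(distribution)) {
  console.log(`${total}: ${distribution[total]}/36`);
}
```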
This makes sense:
X = Result of a coin flip
HEADS  0.5
TAILS  0.5
But:
X won’t always have the same value
RANDOM VARIABLES
X = Result of a coin flip
HEADS  0.5
TAILS  0.5
X is a Random Variable
This is its distribution
DEMO…
Continuous
A numerical variable that can be any number (sometimes within a range)
Examples: height, weight, Math.random()
HOW DO WE DEFINE THE DISTRIBUTION?
[Figure: distributions of Math.random() and height]
DEMO
ERRR…
SO WHAT?
• When we do data analysis, we’re really looking at the range of values a random variable can be…
• … and asking questions about its distribution.
IMAGINE…
YOU’RE AN AUDITOR
AUDITING A LEDGER
• Make a list of all ingoing and outgoing transactions
• These are random variables.
• What is their distribution? Does it deviate from what we expect?
BENFORD’S LAW
http://www.journalofaccountancy.com/Issues/1999/May/nigrini
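Benford's law says that in many naturally occurring sets of amounts the leading digit d appears with probability log10(1 + 1/d), so an auditor can compare a ledger's leading digits against that expectation. A minimal sketch of the comparison: the function names are illustrative and the ledger amounts are made up for the example, not taken from the talk.

```javascript
// Expected share of each leading digit d under Benford's law: log10(1 + 1/d).
function benfordExpected(d) {
  return Math.log10(1 + 1 / d);
}

// Leading (first non-zero) digit of a non-zero amount.
function leadingDigit(amount) {
  const text = Math.abs(amount).toExponential(); // e.g. "1.205e+2"
  return Number(text[0]);
}

// Observed share of each leading digit in a list of amounts.
// Index 0 only collects amounts that are exactly zero.
function leadingDigitShares(amounts) {
  const shares = Array(10).fill(0);
  for (const a of amounts) {
    shares[leadingDigit(a)] += 1 / amounts.length;
  }
  return shares;
}

// Made-up ledger amounts, for illustration only.
const ledger = [120.5, 13.99, 1875, 240, 19.2, 3100, 45, 0.07, 162, 980];
const observed = leadingDigitShares(ledger);
for (let d = 1; d <= 9; d++) {
  console.log(d, observed[d].toFixed(2), benfordExpected(d).toFixed(3));
}
```

A real audit would use far more transactions and a formal goodness-of-fit test; printing the two columns side by side is only the first look.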
DESIGNING
INTUITIVE USER INPUTS
OUR TASK…
• Designing a system that tries to understand what happens under financial system “shocks”
• So: a user would input a shock, its impacts would propagate and we would see our bottom line.
OUR FIRST ATTEMPT
• Shock ‘sliders’ that scaled linearly
0%
25%
BOOM
90%
BUST
DISTRIBUTION OF FINANCIAL CHANGES
SO…
• Shock ‘sliders’ that scaled linearly
0%
8%
BOOM
105%
BUST
Change that happens with 75% chance
Change that happens with 10% chance
Randomness
MAKING RANDOM VARIABLES
SOME WARNINGS
• Exactly what randomness means is a fuzzy question.
• These numbers are not ‘cryptographically’ random.
JAVASCRIPT’S ENTRY TO RANDOMNESS
Math.random()
• Different runtimes can implement it differently.
• V8 implements Multiply-With-Carry:
  • Take a sequence of ‘seed’ values
  • Iteratively perform modular arithmetic-based operations
  • Extend the initial seed values to a longer sequence.
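To make the three steps concrete, here is a generic multiply-with-carry sketch. It is not V8's source code: the multipliers 36969 and 18000 are the ones from George Marsaglia's widely circulated two-stream generator, and `makeMwc` and its seed parameters are names invented for this illustration.

```javascript
// Generic multiply-with-carry generator (illustrative, not V8's implementation).
// Each 32-bit state keeps its current value in the low 16 bits and the
// carry from the previous multiplication in the high 16 bits.
function makeMwc(seedZ, seedW) {
  let z = seedZ >>> 0; // seeds should be non-zero, or the stream stays at zero
  let w = seedW >>> 0;
  return function next() {
    // Multiply the low half, add the carry, keep the result as a 32-bit value.
    z = (36969 * (z & 0xffff) + (z >>> 16)) >>> 0;
    w = (18000 * (w & 0xffff) + (w >>> 16)) >>> 0;
    // Combine the two streams into one unsigned 32-bit integer.
    const combined = ((z << 16) + w) >>> 0;
    return combined / 0x100000000; // scale into [0, 1)
  };
}

const random = makeMwc(12345, 67890);
console.log(random(), random(), random());
```

Because the whole sequence is determined by the seeds, two generators created with the same seeds produce identical output, which is exactly why such numbers are not cryptographically random.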
BUT…
WHAT ABOUT OTHER DISTRIBUTIONS?
THE SHORT ANSWER
Math.random() = f( )
THE SHORT ANSWER
=
HEADS  0.5
TAILS  0.5
=
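One way to read the coin-flip slide: a draw from any discrete distribution can be obtained by applying a function to a single uniform draw from Math.random(). A minimal sketch of that idea, walking the cumulative probabilities; `sampleDiscrete` is an illustrative name, and the optional `u` parameter exists only so the behaviour can be checked with fixed inputs.

```javascript
// Sample from a discrete distribution given as [value, probability] pairs,
// by walking the cumulative probabilities until they exceed the uniform draw u.
function sampleDiscrete(distribution, u = Math.random()) {
  let cumulative = 0;
  for (const [value, probability] of distribution) {
    cumulative += probability;
    if (u < cumulative) return value;
  }
  // Guard against rounding error leaving the total slightly below 1.
  return distribution[distribution.length - 1][0];
}

const coin = [["HEADS", 0.5], ["TAILS", 0.5]];
console.log(sampleDiscrete(coin));
```

The same function handles the two-dice table from earlier by passing pairs such as `[2, 1 / 36]`, `[3, 2 / 36]` and so on.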
WHAT’S THE FUNCTION?
jStat
beta
centralF
cauchy
chi-squared
exponential
gamma
inverse gamma
kumaraswamy
lognormal
normal
pareto
student t
uniform
weibull
binomial
negative binomial
hypergeometric
poisson
triangular
OR
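For the "OR" route, writing the function by hand, two textbook constructions cover a lot of ground: the inverse CDF for the exponential distribution and the Box-Muller transform for the normal distribution. The sketch below is one possible hand-written version, not the talk's code and not jStat's internals; the function names and the optional uniform-draw parameters are assumptions made for illustration.

```javascript
// Exponential(rate) via the inverse CDF: -ln(1 - u) / rate.
function sampleExponential(rate, u = Math.random()) {
  return -Math.log(1 - u) / rate;
}

// Normal(mean, sd) via the Box-Muller transform of two uniform draws.
function sampleNormal(mean, sd, u1 = Math.random(), u2 = Math.random()) {
  // 1 - u1 lies in (0, 1], so the logarithm is always finite.
  const radius = Math.sqrt(-2 * Math.log(1 - u1));
  return mean + sd * radius * Math.cos(2 * Math.PI * u2);
}

// Heights in centimetres with an assumed mean of 170 and spread of 10.
const heights = Array.from({ length: 5 }, () => sampleNormal(170, 10));
console.log(heights);
console.log(sampleExponential(0.5));
```

In practice a tested library is usually the safer choice; the hand-written versions are mainly useful for seeing what the function f looks like.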
USING RANDOMNESS
Why would I want to use
RANDOMNESS?
STUBBED TEST DATA
• Avoid coupling yourself to specific test implementations
• Spin-up life-like environments for load testing
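A stubbed-data generator can be as small as a function that fills an object from random draws. The sketch below assumes a hypothetical product shape loosely based on the stock data described earlier; the brand names, field names and value ranges are all invented for illustration. Passing the random source in as a parameter is a deliberate choice so that tests can swap in a fixed one.

```javascript
// Hypothetical product shape; all names and ranges here are made up.
function randomProduct(rng = Math.random) {
  const brands = ["Acme", "Globex", "Initech"];
  return {
    brand: brands[Math.floor(rng() * brands.length)],
    // Price between 1.00 and 100.00, rounded to two decimal places.
    price: Math.round((1 + rng() * 99) * 100) / 100,
    // Whole number of items in stock, from 0 to 49.
    stockLevel: Math.floor(rng() * 50),
  };
}

// Build a list of n random products, e.g. to seed a load-test environment.
function randomProducts(n, rng = Math.random) {
  return Array.from({ length: n }, () => randomProduct(rng));
}

console.log(randomProducts(3));
```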
NON-DETERMINISTIC ALGORITHMS
• Modelling underlying or random data
• Solving a problem that is expensive or impossible to solve perfectly
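A classic toy illustration of trading exactness for random sampling is the Monte Carlo estimate of pi: scatter random points over a unit square and count how many land inside the quarter circle. This example is not from the talk; it is included only as a small, hedged instance of a non-deterministic algorithm, and `estimatePi` is an illustrative name.

```javascript
// Estimate pi from the fraction of random points in the unit square
// that fall inside the quarter circle of radius 1.
function estimatePi(samples, rng = Math.random) {
  let inside = 0;
  for (let i = 0; i < samples; i++) {
    const x = rng();
    const y = rng();
    if (x * x + y * y <= 1) inside++;
  }
  // Quarter-circle area is pi / 4, so scale the fraction by 4.
  return (4 * inside) / samples;
}

console.log(estimatePi(100000));
```

The answer changes slightly on every run, and more samples narrow the spread of the estimates.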
PITFALLS
CHOOSING THE DISTRIBUTION
• What if a ‘uniform’ distribution isn’t enough?
• What if we want random data that isn’t just numbers?
EXAMPLE: SOCIAL NETWORK
EXAMPLE: SOCIAL NETWORK
11 Traversals
DEMO
Barabasi-Albert Random Model
BARABASI-ALBERT RANDOM MODEL
• Start with two linked objects
• Add one new object at a time
• Link that object to one existing object, with already ‘popular’ objects more likely to be chosen.
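The three steps above can be sketched in a few lines. This version assumes that "popular" means "has more links", so an existing object is chosen with probability proportional to its number of links; `barabasiAlbert` and the edge-list representation are choices made for this illustration rather than the demo's actual code.

```javascript
// Grow a network by preferential attachment and return its list of links.
// Assumes totalNodes is at least 2.
function barabasiAlbert(totalNodes, rng = Math.random) {
  // Start with two linked objects, numbered 0 and 1.
  const edges = [[0, 1]];
  // Each node appears in `endpoints` once per link it has, so a uniform pick
  // from this array selects a node with probability proportional to its links.
  const endpoints = [0, 1];
  for (let node = 2; node < totalNodes; node++) {
    const target = endpoints[Math.floor(rng() * endpoints.length)];
    edges.push([node, target]);
    endpoints.push(node, target);
  }
  return edges;
}

console.log(barabasiAlbert(10));
```

Keeping the flat `endpoints` array avoids recomputing link counts on every step, at the cost of storing two entries per link.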
THIS MODELS…
• Academic citations
• Actor filmographies
• Spread of infectious diseases
• Social networks
CONTENTS
LEARNING THROUGH: THEORY, CASE STUDIES, JAVASCRIPT APPLICATION
• WHAT IS DATA?
• GAINING INSIGHTS
• RANDOMNESS
• SIMULATION
Reward: What shape is the internet?
We’re OUT of TIME
• WHAT IS DATA? Data is any information we collect. Not all data is valuable.
• GAINING INSIGHTS: Seeing trends in lots of numbers is hard. Summary statistics and charts help us unpick their meaning.
• RANDOMNESS: Data can be treated as random ‘realisations’ from a backing distribution.
• SIMULATION: Making random variables is easy, and can be done in different shapes for different purposes.
LIBRARIES WE USED
GENERAL LIBRARIES
• Knockout.js
• RequireJS
• Bootstrap
DATA MANIPULATION
• Lodash
• jStat
DATA IMPORT
• Papa Parse
CHARTING
• D3
• Chart.js
THANK YOU
David Simons
@SwamWithTurtles
