In this webinar, we show how CARTO can be used in site planning applications to analyze multivariate geolocated data and derive data-driven insights when opening, relocating or consolidating location sites.
Watch it now at: https://go.carto.com/how-spatial-data-science-site-planning-webinar-recorded
How to Use Spatial Data Science in your Site Planning Process? [CARTOframes]
1. How to Use Spatial Data
Science in Your Site Planning
Process
FOLLOW @CARTO ON TWITTER
2. The Sum of Our Parts
Today’s Speakers
Giulia Carella, Data Scientist
Steve Isaac, Content Marketing Manager
3. CARTO — Turn Location Data into Business Outcomes
CARTO is the platform to build
powerful Location Intelligence apps
with the best data streams available.
5. The Complete Journey
1. Data Ingestion & Management
2. Enrichment
3. Analysis
4. Solutions & Visualization
5. Integration
7. Enrichment
● Save time in gathering spatial data,
augmenting your existing data with
demographics from across the globe
● Create locations from addresses and
understand travel time all from within
CARTO
● Develop robust ETL processes and update
mechanisms so your data is always enriched
● Premium data to understand and analyze
deeper trends and behavior
Components: Data Observatory, ETL Processing, CARTO Grid, Data Services API, Routing & Traffic, Geocoding
8. Analysis
● Bring maps and data into your Data Science
workflows and the Python data science
ecosystem with CARTOframes
● Machine learning embedded in CARTO as
simple SQL calls for clustering, outlier analysis,
time series predictions, and geographically
weighted regression
● Use the power of PostGIS and our APIs to
productionize analysis workflows in your
CARTO platform
Components: CARTOframes, Analysis API, SQL API, Python SDK
10. Financial: Merchant and ATM transaction data from leading banks and credit card companies
Human Mobility: Mobile device and GPS data provide insight into human movement patterns
Demographics: The most recent census data including age, income, household types and more
Housing: Property statistics, prices, and history to drive decisions in investment portfolios
Road Traffic: Data from routing apps and GPS to analyse traffic patterns and commuter behaviour
Points of Interest: Location data for business establishments, restaurants, schools, attractions, and more
11. The Age of Data Abundance?
12. AND ITS HIDDEN PITFALLS
Sampling Bias
Data may not be collected from
random samples, so it may need
extrapolation to the total
population
Anonymisation
Data needs to be anonymised
to meet regulations, and
vendors have different
approaches for that
Different Aggregations
Data comes in different spatial
aggregations such as grid cells
of different sizes or
administrative boundaries
17. Which spatial scale is correct?
How do we change from one spatial scale to another?
THE CHANGE OF SUPPORT PROBLEM
Statistical downscaling/upscaling models to
DISAGGREGATE/AGGREGATE
the data to different spatial resolutions
18. A PRELIMINARY SOLUTION
AREA WEIGHTING
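As an illustration of area weighting, here is a minimal sketch under simplifying assumptions: cells are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), the variable is a count (such as population) assumed to be spread uniformly inside each source cell, and the function names are hypothetical, not part of CARTO or CARTOframes.

```python
def overlap_area(a, b):
    """Area of the intersection of two rectangles (xmin, ymin, xmax, ymax)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0.0) * max(height, 0.0)


def area_weighted_transfer(source_cells, source_values, target_cells):
    """Move a count variable from source cells to target cells.

    Each source value is split in proportion to the share of the source
    cell's area that falls inside the target cell (uniform-density assumption).
    """
    result = []
    for target in target_cells:
        total = 0.0
        for cell, value in zip(source_cells, source_values):
            cell_area = (cell[2] - cell[0]) * (cell[3] - cell[1])
            total += value * overlap_area(cell, target) / cell_area
        result.append(total)
    return result


# Two 2x2 source cells with populations 100 and 200,
# and one target cell straddling both halves.
sources = [(0, 0, 2, 2), (2, 0, 4, 2)]
population = [100, 200]
targets = [(1, 0, 3, 2)]
estimate = area_weighted_transfer(sources, population, targets)
```

Here the target cell covers half of each source cell, so it receives 50 + 100 = 150 people. With real geometries the same idea is usually applied to polygon intersections rather than rectangles.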
20. WHAT IS CARTOframes?
● Python package
● To be used in Jupyter Notebooks
● Built for Data Scientists
● Part of CARTO Analysis stack
● Connector to CARTO platform
● Viz using vector maps
Components: CARTOframes, Analysis API, SQL API, Python SDK
27. TWIN AREA MODEL
WITH SOME CAVEATS:
1. Different variances?
2. Correlated variables?
3. Missing data?
4. When is a distance small enough? Or how to define similarity?
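The first caveat can be illustrated with a small sketch. It assumes the twin area search is a distance computation in variable space, and it handles different variances by z-scoring each variable before measuring Euclidean distance to the origin location. The function names are hypothetical, and the sketch does not address correlated variables or missing data.

```python
import numpy as np


def standardize(X):
    """Z-score each column so variables with large variances do not dominate."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (X - mean) / std


def distances_to_origin(X, origin_index):
    """Euclidean distance from every location (row) to the origin location."""
    Z = standardize(X)
    return np.linalg.norm(Z - Z[origin_index], axis=1)


# Three locations described by two variables on very different scales.
data = [[1, 100], [2, 200], [3, 300]]
d = distances_to_origin(data, origin_index=0)
```

Without the standardization step, the second variable would contribute almost all of the distance simply because of its units.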
31. Principal Component Analysis (PCA)
1. Eigen-decomposition of the sample covariance matrix
2. Rearrange the columns in the eigenvector matrix in order of decreasing eigenvalue
3. Keep only the eigenvectors that correspond to the p largest eigenvalues
4. Compute the principal components (PC)
5. Reconstruct the original data
How many PCs? Let’s use an ensemble!
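The five steps above can be sketched with NumPy as follows. This is a minimal illustration, not the code used in the webinar; the function name and return values are assumptions.

```python
import numpy as np


def pca_reconstruct(X, p):
    """Project X onto its first p principal components and reconstruct it."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    Xc = X - mu

    # 1. Eigen-decomposition of the sample covariance matrix
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues

    # 2. Reorder eigenvector columns by decreasing eigenvalue
    order = np.argsort(eigvals)[::-1]
    eigvecs = eigvecs[:, order]

    # 3. Keep the eigenvectors of the p largest eigenvalues
    V = eigvecs[:, :p]

    # 4. Compute the principal components (scores)
    scores = Xc @ V

    # 5. Reconstruct the data from the retained components
    X_hat = scores @ V.T + mu
    return scores, X_hat
```

One way to read the ensemble idea is to repeat this for several values of p and combine the resulting distances, rather than committing to a single number of components.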
33. Probabilistic PCA (PPCA)
1. PCA can also be described as the maximum-likelihood (ML) solution of a probabilistic latent variable model (PPCA)
2. Find the ML estimates of the model parameters using the EM algorithm
2.1. E-step:
2.2. M-step
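The slide's own E-step and M-step formulas are not reproduced here. As a hedged sketch, the following uses the standard textbook EM updates for PPCA (Tipping and Bishop) on complete data, with W the loading matrix and sigma2 the isotropic noise variance. Handling missing values, which is the usual motivation for PPCA, would need an extended E-step that this sketch omits; the function name is hypothetical.

```python
import numpy as np


def ppca_em(X, q, n_iter=200, seed=0):
    """Fit PPCA with q latent dimensions by EM (complete data only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    N, D = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    W = rng.normal(size=(D, q))  # random initial loadings
    sigma2 = 1.0                 # initial noise variance

    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ez = Xc @ W @ M_inv                        # rows are E[z_n]
        sum_Ezz = N * sigma2 * M_inv + Ez.T @ Ez   # sum over n of E[z_n z_n^T]

        # M-step: update loadings, then noise variance
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz)
        sigma2 = (
            np.sum(Xc ** 2)
            - 2.0 * np.sum(Ez * (Xc @ W))
            + np.trace(sum_Ezz @ W.T @ W)
        ) / (N * D)

    return W, mu, sigma2
```

At convergence the columns of W span the same subspace as the leading principal components, and sigma2 estimates the variance left in the discarded directions.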
34. Similarity Score
HOW TO DEFINE SIMILARITY
So far we have only computed distances in the variable space.
Actually, since we are computing a K-ensemble of distances...
Let’s instead compare the score for each target location to the score from the mean vector data.
Similarity score scale: 0 to 1
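The exact formula for the similarity score is not given on the slide. One plausible form, shown here purely as an assumption, is a skill score: average the K ensemble distances for each target location, compare them with the average distance obtained for the mean vector, and clip the result to the range 0 to 1. The function name and the clipping are illustrative choices.

```python
import numpy as np


def similarity_skill_score(target_distances, mean_distances):
    """Skill score in [0, 1] relative to the mean-vector reference.

    target_distances: shape (K, n_targets), one row per ensemble member.
    mean_distances:   shape (K,), distance from the origin location to the
                      mean vector for each ensemble member.
    """
    d = np.mean(np.asarray(target_distances, dtype=float), axis=0)
    d_ref = np.mean(np.asarray(mean_distances, dtype=float))
    # 1 means identical to the origin location, 0 means no better than the mean
    return np.clip(1.0 - d / d_ref, 0.0, 1.0)


scores = similarity_skill_score([[0, 1, 4], [0, 1, 4]], [2, 2])
```

Under this assumed definition, a target location scores above 0 only if it is closer to the origin location than an "average" location would be.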
35. Takeaways
CARTO Data Observatory
(DO) for data enrichment
CARTOframes as a connector
to the DO and for powerful
vector visualizations
Site-planning applications
require various sources of
location data streams
Easily derive data-driven
insights when opening,
relocating or consolidating
location sites
36. Thanks for listening! Any
questions?
Request a demo at CARTO.COM
Giulia Carella
Data Scientist // giulia@carto.com
Steve Isaac
Content Marketing Manager // sisaac@carto.com