Trusted analytics and predictive data models require accurate, consistent, and contextual data. In general, the more relevant attributes a model can draw on, the more accurate its results. However, building comprehensive models with trusted data is not easy: accessing data from multiple disparate sources, making spatial data consumable, and enriching models with reliable third-party data are all challenging.
In this webinar you will learn how to:
• Organize and manage address data and assign a unique and persistent identifier
• Enrich addresses with standard and dynamic attributes from our curated data portfolio
• Analyze enriched data to uncover relationships and create dashboard visualizations
• Understand the high-level solution architecture
Learn How to Turbocharge Your AI/ML Data Workflows with Data Enrichment
1. Learn How to Turbocharge Your AI/ML Data Workflows with Data Enrichment
Tim McKenzie | Director, Solution Architecture
2. Location data challenges
• Location is messy: addresses, lat/long, shapes, lines, formats
• Complexity of joining location-based data sources (3rd party and internal)
• Data sourcing challenges: many providers, many formats, many pricing and licensing differences
• Global extensibility: data sources tend to be regional, yet use cases are often global
• Need to identify and process multi-family and condo properties
• Decentralized repositories of data
• Complex properties can often have multiple valid addresses, parcels, and buildings
• Legal descriptions come in a variety of formats, leading to discrepancies, inefficiencies, errors, and non-compliance
"For every minute spent in organizing, an hour is earned."
Benjamin Franklin
Inventor, Statesman, Insurer
3. Data prep slows data science
Data preparation accounts for about 80% of the work of data scientists
[Chart: What data scientists spend the most time doing. Categories: building data sets; cleaning and organizing data; collecting datasets; mining data for patterns; refining algorithms; other. Values recovered from the chart, not matched to categories in this transcript: 3%, 19%, 9%, 4%, 5%.]
4. Location enabling strategies for data analytics
01. Organize
Assign a trusted ID that is unique and persistent to each address
02. Enrich
Leverage the trusted ID to join massive amounts of your own and 3rd party data sources
03. Analyze
Apply data science at scale to gain a competitive advantage
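The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Precisely's implementation: the `REFERENCE_IDS` lookup, the `ID-0001` style identifiers, the crude `normalize` function, and the sample customer and flood-zone data are all hypothetical stand-ins for what a geocoding service and a curated dataset would supply.

```python
from collections import defaultdict

# Hypothetical reference table: normalized address -> trusted ID.
# In practice a geocoding/addressing service would supply the identifier.
REFERENCE_IDS = {
    "12 MAIN ST SPRINGFIELD": "ID-0001",
    "98 OAK AVE SPRINGFIELD": "ID-0002",
}


def normalize(address: str) -> str:
    """Crude normalization: uppercase, drop punctuation, collapse whitespace."""
    cleaned = "".join(ch for ch in address.upper() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def organize(records):
    """Step 01: attach a trusted ID to each record (None if no match)."""
    return [{**r, "location_id": REFERENCE_IDS.get(normalize(r["address"]))} for r in records]


def enrich(records, third_party):
    """Step 02: join third-party attributes on the trusted ID."""
    return [{**r, **third_party.get(r["location_id"], {})} for r in records]


def analyze(records, key, value):
    """Step 03: a trivial aggregate, the mean of `value` grouped by `key`."""
    groups = defaultdict(list)
    for r in records:
        if key in r and value in r:
            groups[r[key]].append(r[value])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Hypothetical internal data: the same address written three different ways.
customers = [
    {"customer": "A", "address": "12 Main St., Springfield", "spend": 120.0},
    {"customer": "B", "address": "98 oak ave  springfield", "spend": 80.0},
    {"customer": "C", "address": "12 MAIN ST, SPRINGFIELD", "spend": 60.0},
]
# Hypothetical third-party attributes keyed by the same ID.
flood_risk = {"ID-0001": {"flood_zone": "high"}, "ID-0002": {"flood_zone": "low"}}

enriched = enrich(organize(customers), flood_risk)
spend_by_zone = analyze(enriched, "flood_zone", "spend")
```

Once every record carries the same identifier, the enrichment step reduces to a key lookup, which is why customers A and C end up in the same group despite their differently formatted addresses.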
5. Fast, easy, and consistent data enrichment
Precisely's Geo Addressing with hyper-accurate Master Location Data (MLD) reference data
International Coverage
• Belgium & Luxembourg
• Canada
• Finland
• France
• Germany
• Great Britain
• Ireland
• Netherlands
• Sweden
• Singapore
• United States
• More coming soon!
Data Sources
• Postal authorities
• Government datasets: local city, county, and state
• Global vendors
• Local players
• Open sources
• Proprietary sources
MLD Attributes
• Largest & best available
• Unparalleled &
• Parent-child relationship
• Unique and persistent identifier
• Multi-sourced
• Simplify data enrichment process
6. Data Enrichment: A global product portfolio
Expertly curated datasets containing thousands of attributes for faster, confident decisions
• Addresses & Property: Verified and validated address and property data for map display and analytics
• Boundaries: Administrative, community, and industry-specific boundaries for data enrichment and territory analysis
• Demographics: Demographic and consumer context data for better understanding people and behavior
• Points of Interest: Detailed business, leisure, and geographic features for location and competitive intelligence
• Streets: Robust street-level data for mapping, analysis, routing, and geocoding
• Risk: Natural hazard boundaries related to flood, fire, earthquakes, and weather
7. Uniquely positioned to address data enrichment needs
Global coverage location enrichment data. Our portfolio includes:
• 400+ datasets
• 250+ countries and territories
• Hundreds of millions of data points
Datasets that are interoperable and managed to a quality standard, with consistent documentation and support, e.g.
• Property Graph
• Market and Community Link
Ability to enrich with dynamic data (Dynamic Weather and Dynamic Demographics)
• Data that includes time as a dimension
• Creating insights from data that is updated at regular, short time intervals (e.g. 5 min)
Data experience through deep domain expertise
• Adding data through development, partnerships, and acquisitions
Best-in-class addressing and property datasets with a unique and persistent ID
• Link Precisely and customer address, building, demographic, risk, and other data using the PreciselyID, a unique and persistent location identifier
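To make "time as a dimension" concrete, the sketch below joins an event to a dynamic observation by rounding the event timestamp down to a 5-minute bucket and looking up the pair (location ID, bucket start). It is only an illustration under stated assumptions: the `ID-0001` identifier, the `temp_c` attribute, and the bucket-keyed dictionary are hypothetical and do not describe how the Dynamic Weather product is delivered.

```python
from datetime import datetime, timedelta, timezone

BUCKET = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def floor_to_bucket(ts: datetime, bucket: timedelta = BUCKET) -> datetime:
    """Round a timezone-aware timestamp down to the start of its bucket."""
    return EPOCH + ((ts - EPOCH) // bucket) * bucket


def enrich_event(event: dict, observations: dict) -> dict:
    """Attach the observation for the event's location and time bucket, if any."""
    key = (event["location_id"], floor_to_bucket(event["ts"]))
    return {**event, **observations.get(key, {})}


# Hypothetical observations, one per location per 5-minute interval.
weather = {
    ("ID-0001", datetime(2024, 5, 1, 14, 0, tzinfo=timezone.utc)): {"temp_c": 18.5},
    ("ID-0001", datetime(2024, 5, 1, 14, 5, tzinfo=timezone.utc)): {"temp_c": 19.0},
}

event = {"location_id": "ID-0001", "ts": datetime(2024, 5, 1, 14, 7, 42, tzinfo=timezone.utc)}
enriched_event = enrich_event(event, weather)
```

The event at 14:07:42 falls in the bucket starting at 14:05, so it picks up that interval's observation rather than the earlier one.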
8. Cloud-based location analytics technology
• Spatial Functions: 30+ common spatial processes
• Global Geocoding: Forward & reverse global geocoding and trusted ID
• Global Addressing: Validate, standardize, and parse global addresses
• Global Tax Jurisdiction: Assign highly granular tax jurisdictions globally
• Map Visualization: Visualize location data at scale
• Global Street Routing: Assign isochrones and isodistances anywhere in the world
9. Location-enabled analytics
All of your sources, any structure or frequency
• Bank Branch & ATM
• Call Center / Web
• Customers by Product
• Commercial & Mortgage
• Active Mortgages
• Historical Defaults
• Financial Transactions
Geocoding and location intelligence capabilities to organize and enrich your data
PreciselyID
• Admin boundaries
• Bank deposits
• Mobile movement
• Weather events
• Hazard & risk data
• Amenities & competition
• Every US/CAN address
• Business locations
• Property attributes
• Schools & neighborhoods
• Population demographics
• Parcels & buildings
Analytics Platform
Analytics capabilities for any use case or persona
• Ad Hoc Data Science: Low-cost, rapid experimentation with new data and models.
• Explainable Machine Learning: High-volume, fine-grained analysis at scale served in the tightest of service windows.
• BI Reporting & Dashboarding: Power real-time dashboarding directly, or feed data to a data warehouse for high-concurrency reporting.
• Real-time Applications: Provide real-time data to downstream applications or power applications via APIs.
10. Understanding the data challenge
Key data challenges that organizations face when productionizing ML systems
• Accessing the right raw data
• Keeping up with continuously changing data feeds
• Building features from raw data
• Combining features into training data
• Calculating and serving features in production
• Monitoring features in production
11. What is a "feature-based" architecture?
A feature is data used as an input signal to a predictive model
A feature store is an ML-specific data system that:
• Runs data pipelines that transform raw data into feature values
• Stores and manages the feature data itself, and
• Serves feature data consistently for training and inference purposes
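The three responsibilities listed above can be illustrated with a toy in-memory class. This is a sketch only: `MiniFeatureStore`, its method names, and the sample property attributes are hypothetical and are not the API of any particular feature store product.

```python
from typing import Any, Callable, Dict, List


class MiniFeatureStore:
    """Toy feature store: transform raw data, store values, serve them by name."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[dict], Any]] = {}
        self._values: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, fn: Callable[[dict], Any]) -> None:
        """Declare how a feature value is derived from a raw record."""
        self._transforms[name] = fn

    def ingest(self, entity_id: str, raw: dict) -> None:
        """Run every registered transform and store the results for the entity."""
        row = self._values.setdefault(entity_id, {})
        for name, fn in self._transforms.items():
            row[name] = fn(raw)

    def get_features(self, entity_id: str, names: List[str]) -> List[Any]:
        """Serve stored feature values in a fixed order (None if missing)."""
        row = self._values.get(entity_id, {})
        return [row.get(n) for n in names]


store = MiniFeatureStore()
store.register("price_per_sqft", lambda r: r["price"] / r["sqft"])
store.register("is_condo", lambda r: int(r["type"] == "condo"))

# Hypothetical raw record keyed by a location identifier.
store.ingest("ID-0001", {"price": 300000, "sqft": 1500, "type": "condo"})

FEATURES = ["price_per_sqft", "is_condo"]
training_row = store.get_features("ID-0001", FEATURES)   # used to build a training set
inference_row = store.get_features("ID-0001", FEATURES)  # used at prediction time
```

Because training and inference read through the same `get_features` call, both see values computed by the same transform, which is the consistency property the slide describes.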
12. [Architecture diagram; the recoverable labels follow]
Layers: Inputs, Processing, Storage
Inputs: location-specific records, shape files, streaming records
Address Fabric
Analytics Processing
• Model outputs
• Scores
• Computed columns
• Analysis outcomes
Batch Geocoding with the Operational Addressing SDKs
• Validate input addresses
• Validate other data
• Locate addresses
• Match inputs
• Assign PreciselyID
• Relate data around PreciselyID
Batch Spatial Processing with the Location Intelligence SDK
• Flatten shape files
• Compute PIP
• Compute D2P, D2L
• Compute basic scores
• Generate geohash
• Relate data around geohash (where applicable)
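Two of the spatial steps in the list above can be sketched with standard algorithms. The code assumes that PIP means point-in-polygon and D2P means distance-to-point, which the slide does not spell out; D2L (presumably distance-to-line) is left out. It uses plain ray casting and the haversine formula rather than the Location Intelligence SDK, whose API is not shown in this deck, and the square polygon is a hypothetical example.

```python
import math


def point_in_polygon(lon: float, lat: float, ring: list) -> bool:
    """Ray casting: toggle `inside` each time a ray heading east crosses an edge.

    `ring` is a list of (lon, lat) vertices; the last vertex connects to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def haversine_km(lon1: float, lat1: float, lon2: float, lat2: float,
                 radius_km: float = 6371.0088) -> float:
    """Great-circle distance between two lon/lat points on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical square territory in lon/lat degrees.
territory = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
in_territory = point_in_polygon(1.0, 1.0, territory)
out_of_territory = point_in_polygon(3.0, 1.0, territory)
one_degree_north_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Results like these become "computed columns" once they are stored against each record, for example a boolean territory flag or a distance to the nearest branch.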
Realtime Processing with the Precisely SDKs
• Operational Addressing APIs
• Assign PreciselyID
• Generate geohash
• Relate data
Message Bus
Feature Store
In-stream Analytics Layer: model outputs, scores, computed columns, analysis outcomes
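The "Generate geohash" and "Relate data" steps can be illustrated with the standard public geohash encoding: interleave longitude and latitude bisection bits, then emit one base-32 character per five bits. This is a generic sketch, not the Precisely SDK; `group_by_cell` and the sample points are hypothetical.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 7) -> str:
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def group_by_cell(points, precision: int = 5) -> dict:
    """Relate records by the geohash cell they fall in."""
    cells = defaultdict(list)
    for name, lat, lon in points:
        cells[geohash_encode(lat, lon, precision)].append(name)
    return dict(cells)


# Hypothetical points: "a" and "b" are close together, "c" is far away.
points = [("a", 42.6, -5.6), ("b", 42.61, -5.59), ("c", 57.64911, 10.40744)]
cells = group_by_cell(points)
```

Records that share a cell key can be joined or aggregated without a geometric comparison, which suits streaming records that have no address to match.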
PreciselyID | Address
P0000MK1IAAD | 287 E 300 S. Provo, UT 84606
P0000MK1DPRD | 410 N University Ave. Provo, UT 84601
Input sources: vendor data files, customer loyalty records, equipment inventories, franchise zones, pricing delivery territories, mobile trace data, POS/IoT data
Administration, Governance, Security, Connectivity, Schema, Catalog
Model Training
EDW
Data subscriptions with PreciselyID
PreciselyID | Address | Name | Type | Score | Location | MICode | PointCode | DemoRgn
P0000MK1IAAD | 287 E 300 S. Provo, UT 84606 | Empas LLC | REST | 91.529 | UT108 | 10020100 | 101067669 | 8926
P0000MK1DPRD | 410 N University Ave. Provo, UT 84601 | THAI HUT | REST | 65.981 | UT108 | 10020100 | 100854441 | 4144
… | … | … | … | … | … | … | … | …