Profiling US Restaurants from Billions
of Payment Card Transactions
Himel Dev, Hossein Hamooni
Payment Card Transactions
Across the globe, billions of
people regularly use payment
cards (such as debit and credit cards) to
pay for goods and services.
Payment processing companies
handle these transactions and
record their attributes in the
form of transaction data.
Transaction data contain rich
insights into the behavior of
cardholders and the business
of merchants.
Transactional insights can benefit various applications, such
as payment fraud detection and merchant recommendation.
However, utilizing transactional insights often requires
auxiliary information about merchants that is missing
from the payment company’s perspective.
Can’t We Just Use Google or Yelp?
Cost: It is expensive to acquire
merchant information, especially for
commercial purposes.
Unavailability: Payment cards are
used in many countries where
crowdsourced merchant data is unavailable.
Unreliability: Acquired information
becomes outdated, as new merchants
appear or old merchants disappear.
Our big goal is to infer latent merchant attributes from
transaction data, without using external sources.
Proof-of-Concept Use Case
Infer the cuisine types of restaurants by analyzing only transaction data.
Restaurant
Recommendation
Fraud
Detection
Transaction Data
Our dataset contains four billion debit and credit card transactions in
more than half a million US restaurants within three months.
We aim to develop a framework for inferring the cuisine types of
restaurants from the transaction data.
Cardholder  Merchant       Zip    Date & Time          $$     $$+
5r8g3d0u5q  Peking Cafe    55075  20-06-2017 18:03:05  12.65  13.65
5r8g3d0u5q  Health Junkie  55075  12-07-2017 19:21:17  17.81  19.81
5r8g3d0u5q  Pineda Tacos   55075  13-07-2017 13:06:04  10.99  12.00
Cuisine Inference Framework
[Figure: overview of the framework, applied to the transaction table above.]
1. Weakly Supervised Label Generation (Name and Label: Peking Cafe, Chinese; Pineda Tacos, Mexican; Health Junkie, ?)
2. Statistical Feature and Neural Embedding Extraction (Location, Price, Time, Group, Capacity, Tips, Loyalty; Embedding)
3. Deep Neural Network Based Classification
I. Weakly Supervised Label Generation
Cuisine Taxonomy Creation
Id  Cuisine Type      Subcategories
1   Latin American    Mexican, Cuban, Brazilian, Colombian
2   European          French, Italian, German, Polish, Irish
3   Mediterranean     Greek, Turkish
    Middle Eastern    Saudi Arabian, Lebanese, Persian, Afghan
    African           Moroccan, Ethiopian, Eritrean
4   South Asian       Indian, Pakistani, Nepalese, Bangladeshi
5   South East Asian  Thai, Vietnamese, Indonesian, Malaysian
6   East Asian        Chinese, Japanese, Korean, Mongolian
7   Grill and Steak   Grill, Steakhouse
8   Fastfood          Sandwich, Burger, Pizza
9   Bar               Bar, Pub, Tavern, Inn
10  Dessert           Ice Cream, Cafe, Bakery, Juice
We create a cuisine
taxonomy for the US
restaurants.
Our taxonomy contains
the ten most popular
cuisine types in the US.
Each of these major
cuisine types covers many
minor cuisine types.
Seed Word Compilation
Restaurant Name                Cuisine Type
Peking Garden                  Chinese
Golden Wok                     Chinese
Ambar India                    Indian
Himalayan Chimney              Indian
Biaggi's Ristorante Italiano   Italian
Pizzeria Antica                Italian
Maize Mexican Grill            Mexican
Burrito King                   Mexican
Garbanzo Mediterranean Fresh   Mediterranean
Jerusalem                      Mediterranean
Good Fella                     ???
We compile a set of seed words
for each cuisine type in our
taxonomy.
We use these words as common
patterns to generate cuisine labels
for restaurant names.
Currently, we have a list of 225
seed words that represent the ten
major cuisine types.
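The keyword matching described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the seed words in `SEED_WORDS` are hypothetical picks based on the example names on this slide (the actual list of 225 words is not shown), and the rule of labeling only when all matched seed words agree is an assumption.

```python
import re
from typing import Optional

# Hypothetical seed words, guessed from the example names on the slide.
SEED_WORDS = {
    "peking": "Chinese",
    "wok": "Chinese",
    "india": "Indian",
    "himalayan": "Indian",
    "ristorante": "Italian",
    "pizzeria": "Italian",
    "burrito": "Mexican",
    "tacos": "Mexican",
    "mediterranean": "Mediterranean",
}


def label_by_seed_words(name: str) -> Optional[str]:
    """Return the cuisine matched by the name's seed words, or None."""
    tokens = re.findall(r"[a-z']+", name.lower())
    labels = {SEED_WORDS[t] for t in tokens if t in SEED_WORDS}
    # Assumption: label only when the matched seed words agree on one cuisine.
    return labels.pop() if len(labels) == 1 else None
```

A name such as "Good Fella" contains no seed word and stays unlabeled, which is the gap the later steps try to close.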
Bootstrapped Label Expansion
We extract new (beyond seed) words from restaurant names and use them as highly
accurate patterns to increase the coverage of labeled restaurants. A new word must pass three tests:
Frequency: The word must appear in at least a θf fraction of all restaurant names.
Precision: If we use the word and its majority label as a labeling rule, the
rule must hold for at least a θp fraction of the labeled restaurants that contain the word.
Significance: The ratio of labeled to unlabeled restaurants that contain the
word must be at least θs.
Using seed and bootstrapped words, we
could label 35% of the restaurants in our dataset.
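The three tests for a bootstrapped word can be sketched as below. This is a rough illustration under stated assumptions: θf, θp, and θs are read as minimum thresholds, names are split on whitespace, and words with no unlabeled occurrences are skipped because they cannot add coverage. The function name and its arguments are hypothetical.

```python
from collections import Counter, defaultdict


def bootstrap_words(labeled, unlabeled, theta_f, theta_p, theta_s,
                    seed_words=frozenset()):
    """labeled: list of (name, cuisine); unlabeled: list of names.
    Returns {word: majority cuisine} for new words passing all three tests."""
    total = len(labeled) + len(unlabeled)
    by_word = defaultdict(Counter)  # word -> cuisine counts in labeled names
    unlabeled_count = Counter()     # word -> unlabeled names containing it
    for name, cuisine in labeled:
        for w in set(name.lower().split()):
            by_word[w][cuisine] += 1
    for name in unlabeled:
        for w in set(name.lower().split()):
            unlabeled_count[w] += 1

    rules = {}
    for w, counts in by_word.items():
        n_unl = unlabeled_count[w]
        if w in seed_words or n_unl == 0:
            continue  # assumption: only new words that can label something
        n_lab = sum(counts.values())
        cuisine, n_major = counts.most_common(1)[0]
        frequency = (n_lab + n_unl) / total
        precision = n_major / n_lab
        significance = n_lab / n_unl
        if (frequency >= theta_f and precision >= theta_p
                and significance >= theta_s):
            rules[w] = cuisine
    return rules
```

Each accepted word becomes a new labeling rule, so the newly labeled names can feed another round of the same procedure.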
Topic Modeling
To augment the keyword-based approach, we develop a custom topic model.
Issue      Description                                             Solution
Monolith   Many restaurant names consist of a single word          Sprinkling
Sparse     Sparse word co-occurrence patterns in restaurant names  BTM
Long tail  Long-tail distribution of words in restaurant names     Stratification
The resultant topics (cuisine types) are coherent and consistent
with the cuisine types generated by our keyword-based approach.
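To see why single-word names are a problem for a co-occurrence model, consider the sketch below. It assumes that BTM refers to the biterm topic model, which learns topics from unordered word pairs (biterms) within a short text; the slides do not spell this out, and the helper `biterms` is hypothetical.

```python
from itertools import combinations


def biterms(name):
    """Return the unordered word pairs (biterms) in a restaurant name."""
    words = sorted(set(name.lower().split()))
    return list(combinations(words, 2))
```

A three-word name yields three biterms, while a one-word name yields none, so it contributes nothing to such a model unless extra words are added to it.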
II. Statistical Feature and Neural Embedding Extraction
Statistical Features
Feature Type           Description
Pricing                The deciles of the authorized amount in transactions
Tipping culture        The deciles of (settlement amount - authorized amount) in transactions
Serving capacity       The deciles of the hourly transaction count
Party size             The proportion of transactions for different party sizes
Party pricing          The average authorized amount for different party sizes
Temporal pattern I     The distribution of the number of transactions over the days of the week
Temporal pattern II    The distribution of the number of transactions over the hours of weekdays
Temporal pattern III   The distribution of the number of transactions over the hours of weekends
Customer revisitation  The deciles of the number of revisits by the customers
Customer loyalty       The deciles of the number of restaurants visited by the customers
Location               The digits of the restaurant zip code and the corresponding location granularity
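Two of the feature types above can be sketched with the standard library alone. The functions below are illustrative and their names are hypothetical; the choice of the "inclusive" quantile method and the timestamp format (taken from the sample transaction rows) are assumptions.

```python
from datetime import datetime
from statistics import quantiles


def deciles(values):
    """Nine cut points that split the values into ten equal-sized groups."""
    return quantiles(values, n=10, method="inclusive")


def pricing_and_tipping(transactions):
    """transactions: list of (authorized_amount, settlement_amount) pairs
    for one restaurant. Returns (pricing deciles, tipping deciles)."""
    authorized = [a for a, _ in transactions]
    tips = [round(s - a, 2) for a, s in transactions]
    return deciles(authorized), deciles(tips)


def weekday_distribution(timestamps, fmt="%d-%m-%Y %H:%M:%S"):
    """Share of transactions per day of the week (index 0 is Monday)."""
    counts = [0] * 7
    for ts in timestamps:
        counts[datetime.strptime(ts, fmt).weekday()] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts
```

Each function returns a fixed-length vector (nine or seven values), so the outputs for one restaurant can be concatenated into a single feature vector.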
Statistical Feature Insights
Customer Restaurant Interaction
[Figure: interactions between customers U1 to U4 and restaurants R1 to R5.]
Micro and Macro Hypotheses
The distinction between the hypotheses lies in application: individual vs group.
Micro Hypothesis: The compatibility between a customer’s preferences and a
restaurant’s attributes is a good predictor of whether the customer will visit
the restaurant. For example, a vegetarian is likely to visit an Indian restaurant.
Macro Hypothesis: The type of customers who visit a given restaurant (as a
whole) is a good predictor of its attributes. For example, a restaurant is
unlikely to be a steakhouse if many of its customers are vegetarian.
Micro Embedding
[Figure: customers U1 to U4 linked to restaurants R1 to R5; sequences of restaurants are fed to word2vec.]
Micro: word2vec
Macro Embedding
[Figure: customers U1 to U4 linked to restaurants R1 to R5; sequences of customers are fed to doc2vec.]
Macro: doc2vec
Micro and Macro Embedding
[Figure: the micro and macro sequences shown together.]
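One plausible way to prepare the inputs for the two embeddings is sketched below. The slides only name the algorithms, so the framing is an assumption: for the micro embedding, each customer's restaurant sequence is treated as a "sentence" of restaurant "words" for word2vec; for the macro embedding, each restaurant is treated as a "document" made of its customers for doc2vec. The function `build_corpora` is hypothetical, and the training step itself is left to a word2vec or doc2vec implementation.

```python
from collections import defaultdict


def build_corpora(visits):
    """visits: iterable of (customer_id, restaurant_id) pairs in time order.
    Returns (micro, macro):
      micro maps each customer to the sequence of restaurants visited,
      macro maps each restaurant to the sequence of visiting customers."""
    micro = defaultdict(list)
    macro = defaultdict(list)
    for customer, restaurant in visits:
        micro[customer].append(restaurant)
        macro[restaurant].append(customer)
    return dict(micro), dict(macro)
```

Under this framing, restaurants visited by the same customers end up close in the micro space, while the macro vector of a restaurant summarizes its whole customer base.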
Name Embedding
We generate name embeddings to utilize the non-labeling words in names.
We first remove the labeling words from each restaurant name.
We then retrieve the pre-trained GloVe embedding for each remaining word in the name.
We finally combine the word embeddings via max pooling.
We generate three sets of restaurant embeddings
to represent the latent characteristics of restaurants.
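The name-embedding steps can be sketched as follows. The word vectors in the usage below are tiny made-up values standing in for pre-trained GloVe vectors, and the function name, its arguments, and the whitespace tokenization are hypothetical.

```python
def name_embedding(name, word_vectors, labeling_words):
    """Max-pool the vectors of the non-labeling words in a restaurant name.

    word_vectors: dict mapping a word to a list of floats (e.g. loaded from
    a pre-trained GloVe file); labeling_words: set of words used for labels.
    Returns None when no remaining word has a vector."""
    tokens = [t for t in name.lower().split() if t not in labeling_words]
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vectors:
        return None
    # Element-wise maximum across the word vectors.
    return [max(dims) for dims in zip(*vectors)]


# Toy three-dimensional vectors, for illustration only.
toy_vectors = {"golden": [0.2, -0.5, 0.1], "garden": [-0.3, 0.4, 0.0]}
pooled = name_embedding("Golden Garden Wok", toy_vectors, {"wok"})
```

Max pooling keeps the result at the dimensionality of a single word vector regardless of how many words the name contains.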
III. Deep Neural Network Based Classification
DNN Models
To demonstrate the effectiveness of our framework, we develop several DNN models.
Shallow Feedforward: A feedforward neural network with two hidden layers
Deep Feedforward: A feedforward neural network with four hidden layers
Deep Feedforward Res: A deep feedforward network with residual connections
Wide and Deep: The wide and deep network, which captures feature interactions
Deep and Cross: The deep and cross network, which applies feature crossing
Deep Feedforward with Residual
[Figure: statistical features (price, tips, location, ...) and embeddings (micro, macro, name) are joined in a concatenated layer, which feeds hidden layers 1, 2, ... with residual (+) connections and then the output layer.]
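The idea of a feedforward network with residual connections can be sketched in plain Python. This is a toy forward pass, not the authors' model: the ReLU activation, the placement of the skip connection around each hidden layer, and the equal layer widths are all assumptions, and the output layer is omitted.

```python
def relu(v):
    return [max(0.0, x) for x in v]


def dense(v, weights, bias):
    """Affine layer; weights is a list of rows, one row per output unit."""
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]


def residual_block(v, weights, bias):
    """Hidden layer with a skip connection: output = v + relu(W v + b)."""
    h = relu(dense(v, weights, bias))
    return [x + y for x, y in zip(v, h)]


def forward(stat_features, embeddings, blocks):
    """Concatenate the inputs, then apply each (weights, bias) block."""
    x = list(stat_features) + list(embeddings)  # concatenated layer
    for weights, bias in blocks:
        x = residual_block(x, weights, bias)
    return x
```

Because each block adds its input back to its output, a block whose weights are all zero passes the signal through unchanged, which is what makes deeper stacks easier to train.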
Experimental Evaluation
Performance Comparison
The Deep Feedforward Network outperforms both Wide and Deep and Deep and Cross.
Adding residual connections boosts the performance of the Deep Feedforward Network.
DNN Model                       Accuracy
Shallow Feedforward             0.743
Deep Feedforward                0.756
Deep Feedforward with Residual  0.762
Wide and Deep                   0.740
Deep and Cross                  0.746
Confusion Matrix
Ablation Study
Summary
We developed a framework for inferring the cuisine types of restaurants from debit
and credit card transactions.
Our proposed framework consists of three steps: 1) weakly-supervised label
generation, 2) statistical feature and neural embedding extraction, and 3) deep neural
network based classification.
The proposed framework achieved 76.2% accuracy in classifying US restaurants.
