Building a Large-Scale, Adaptive
Recommendation Engine with Apache
Flink and Spark
Zoltán Zvara
zoltan.zvara@ilab.sztaki.hu
Gábor Hermann
ghermann@ilab.sztaki.hu
This project has received funding from the European Union’s Horizon 2020
research and innovation program under grant agreement No 688191.
About us
• Institute for Computer Science and Control, Hungarian Academy of
Sciences (MTA SZTAKI)
• Informatics Laboratory
• "Big Data – Momentum" research group
• "Data Mining and Search" research group
• Research group with strong industry ties
• Ericsson, Rovio, Portugal Telecom, etc.
Agenda
1. Recommendation systems and matrix factorization
2. Batch vs. online
3. Matrix factorization
   1. Online
   2. Batch + online
4. Solution in Spark & Flink
5. Conclusions
Recommendation systems
Recommendation with matrix factorization
[Figure: rating matrix 𝑅, with users such as Zoltán and Gábor as rows and movies such as Rogue One and Interstellar as columns. Example entry: Zoltán rated Rogue One with 5 stars.]
[Figure: 𝑅 is approximated by the product of two smaller matrices, 𝑈 ∙ 𝐼 ≈ 𝑅. 𝑈 holds one user vector per user and 𝐼 holds one item vector per item. The coordinates of these vectors are latent factors, for example level of action, level of drama, and an "X factor".]
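$$
\min_{u_*,\, i_*} \sum_{(p,q)\in\kappa_R} \left(r_{pq} - \mu - b_p - b_q - u_p \cdot i_q\right)^2 + \lambda \sum_{p\in\kappa_U} \left(\lVert u_p\rVert^2 + b_p^2\right) + \lambda \sum_{q\in\kappa_I} \left(\lVert i_q\rVert^2 + b_q^2\right)
$$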
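A sketch of how one observed rating could drive a training step on this objective. The slide does not name the symbols, so the reading here is an assumption: 𝜇 is taken as the global rating average, 𝑏_p and 𝑏_q as user and item biases, and the objective is assumed to be minimized by stochastic gradient descent with a hypothetical learning rate 𝜂 (the constant factor 2 from differentiation is absorbed into 𝜂). For a rating 𝑟_pq the prediction error is

$$
e_{pq} = r_{pq} - \mu - b_p - b_q - u_p \cdot i_q
$$

and one gradient step, with all right-hand sides evaluated at the old values, would be

$$
\begin{aligned}
u_p &\leftarrow u_p + \eta\,(e_{pq}\, i_q - \lambda\, u_p), &
i_q &\leftarrow i_q + \eta\,(e_{pq}\, u_p - \lambda\, i_q), \\
b_p &\leftarrow b_p + \eta\,(e_{pq} - \lambda\, b_p), &
b_q &\leftarrow b_q + \eta\,(e_{pq} - \lambda\, b_q).
\end{aligned}
$$

Only the vectors and biases of user 𝑝 and item 𝑞 change, which is what makes per-rating (online) updates possible.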
Would Gábor like Interstellar?
[Figure: the unknown rating (marked "?") is predicted from the factors. Gábor's user vector (5, 4, -4) multiplied by Interstellar's item vector (3, 2, 5) gives 5·3 + 4·2 + (-4)·5 = 3, so the predicted rating is 3.]
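The prediction for "Would Gábor like Interstellar?" can be sketched as a plain dot product. This is a minimal illustration, not the authors' code: the object name `Predict` and the method `dot` are hypothetical, and the bias terms 𝜇, 𝑏_p, 𝑏_q from the objective are left out, as in the figure.

```scala
object Predict {
  // Dot product of a user vector and an item vector with the same
  // number of latent factors (hypothetical helper for illustration).
  def dot(user: Vector[Double], item: Vector[Double]): Double = {
    require(user.length == item.length, "vectors must have the same number of latent factors")
    user.zip(item).map { case (a, b) => a * b }.sum
  }

  def main(args: Array[String]): Unit = {
    val gabor = Vector(5.0, 4.0, -4.0)        // user vector from the slide
    val interstellar = Vector(3.0, 2.0, 5.0)  // item vector from the slide
    println(dot(gabor, interstellar))         // prints 3.0
  }
}
```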
Batch training
[Figure: rating events of the form [user; item; time; rating] are written to persistent storage. Batch training then computes the user vectors and item vectors (𝑈 and 𝐼) from the complete rating matrix 𝑅.]
Online training
[Figure: rating events of the form [user; item; time; rating] arrive as a stream (e.g. ratings 2, 5, 4, 2, 4). Each incoming rating is consumed once and updates the affected user vector and item vector.]
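Online training as described above can be sketched as one stochastic gradient step per incoming rating. This is an illustrative sketch under assumptions, not the authors' implementation: the names `OnlineSgd`, `update`, `processStream`, and the parameters `learningRate` and `lambda` are hypothetical, bias terms are omitted, and every user and item in the stream is assumed to already have a vector.

```scala
object OnlineSgd {
  type Vec = Vector[Double]
  final case class Rating(user: String, item: String, value: Double)

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (x, y) => x * y }.sum

  // One SGD step for a single rating. Both new vectors are computed
  // from the old ones, then returned as (user, item).
  def update(user: Vec, item: Vec, rating: Double,
             learningRate: Double, lambda: Double): (Vec, Vec) = {
    val err = rating - dot(user, item)
    val newUser = user.zip(item).map { case (u, i) => u + learningRate * (err * i - lambda * u) }
    val newItem = item.zip(user).map { case (i, u) => i + learningRate * (err * u - lambda * i) }
    (newUser, newItem)
  }

  // Consumes the ratings in arrival order; each rating touches only
  // the vectors of its own user and item.
  def processStream(ratings: Seq[Rating], users: Map[String, Vec], items: Map[String, Vec],
                    learningRate: Double, lambda: Double): (Map[String, Vec], Map[String, Vec]) =
    ratings.foldLeft((users, items)) { case ((us, its), r) =>
      val (u, i) = update(us(r.user), its(r.item), r.value, learningRate, lambda)
      (us.updated(r.user, u), its.updated(r.item, i))
    }
}
```

Because each step reads and writes exactly one user vector and one item vector, the model can follow changing tastes without revisiting stored ratings.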
Batch + online combination
But how to scale?
• Spotify streamed 20 billion hours of music in 2015
• YouTube has over a billion users and billions of video views every day
• Use distributed data-analytics frameworks
• How can we combine batch + online?
Apache Spark vs. Apache Flink
Distributed online matrix factorization
[Figure: the user vectors and item vectors are distributed across machines. To process a rating event [user; item; time; rating], the corresponding user vector and item vector first need to be co-located, then the update is computed, and finally the updates are sent back. Two ratings can be processed in parallel.]
• Concurrent modification
• Similar problem with batch SGD
• Distributed SGD
(Gemulla et al. 2011)
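One way to avoid concurrent modification, in the spirit of distributed SGD, is to split the rating matrix into k × k blocks and process only blocks that share no user partition and no item partition at the same time. The sketch below illustrates that scheduling idea only. It is not taken from the talk or from the paper's code: `DsgdStrata`, `blockOf`, `stratumOf`, and `schedule` are hypothetical names, IDs are assumed to be non-negative integers, and partitioning by modulo is an arbitrary choice.

```scala
object DsgdStrata {
  final case class Rating(user: Int, item: Int, value: Double)

  // Block of a rating: (user partition, item partition).
  def blockOf(r: Rating, k: Int): (Int, Int) = (r.user % k, r.item % k)

  // Stratum s consists of the blocks (p, (p + s) % k) for p in 0 until k,
  // so no two blocks of one stratum share a user or item partition.
  def stratumOf(r: Rating, k: Int): Int = {
    val (p, q) = blockOf(r, k)
    ((q - p) % k + k) % k
  }

  // Groups ratings by stratum, then by block inside each stratum.
  // Strata run one after another; blocks inside a stratum can run in parallel.
  def schedule(ratings: Seq[Rating], k: Int): Map[Int, Map[(Int, Int), Seq[Rating]]] =
    ratings.groupBy(stratumOf(_, k)).map { case (s, rs) => s -> rs.groupBy(blockOf(_, k)) }
}
```

With this layout, parallel workers inside a stratum never update the same user vector or the same item vector.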
Online MF in Spark
• updateStateByKey?
• Use batch DSGD for online updates! (discussion issue SPARK-6407)
// we have our input
val ratings: DStream[Rating] = ...
// need to represent factor matrices
var users: RDD[(UserId, Vector)] = ...
var items: RDD[(ItemId, Vector)] = ...
// would like to have output like this
val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
  // use transform to allow RDD operations
  ratings.transform { (rs: RDD[Rating]) =>
    // compute updates
    val updates = batchDSGD(rs, users, items)
    // apply updates to get updated matrices
    users = applyUserUpdates(users, updates)
    items = applyItemUpdates(items, updates)
    updates
  }
• Performance degrades over time
• Problem: Spark has to track an ever-growing lineage graph
• Solution: use checkpointing to truncate the lineage
Online MF in Flink
• Long-running operators with state: one holds the user vectors, one holds the item vectors
• Backward edge in dataflow (stream loop)
1. Rating event arrives at the user-vector operator
2. Rating event & user vector are sent to the item-vector operator
3. Apply update at the item-vector operator
4. User vector update is sent back along the backward edge
WARNING! Loops API (iterative streams) not mature enough yet, but there is ongoing effort
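The four-step message flow can be simulated in plain Scala to make the roles of the two stateful operators concrete. This sketch deliberately does not use any Flink API. All names (`StreamLoopSketch`, `UserOperator`, `ItemOperator`, the message classes, `lr`, `lambda`) are hypothetical, and the update rule is assumed to be a plain SGD step without bias terms.

```scala
import scala.collection.mutable

object StreamLoopSketch {
  type Vec = Vector[Double]
  final case class Rating(user: String, item: String, value: Double)
  final case class RatingWithUser(rating: Rating, userVec: Vec) // message of step 2
  final case class UserUpdate(user: String, userVec: Vec)       // message of step 4

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (x, y) => x * y }.sum

  // Long-running operator whose state is the user vectors.
  final class UserOperator(init: Map[String, Vec]) {
    private val state = mutable.Map.empty[String, Vec] ++= init
    // Steps 1 -> 2: attach the current user vector to the rating event.
    def onRating(r: Rating): RatingWithUser = RatingWithUser(r, state(r.user))
    // Receives step 4 over the backward edge.
    def onUpdate(u: UserUpdate): Unit = state.update(u.user, u.userVec)
    def vector(user: String): Vec = state(user)
  }

  // Long-running operator whose state is the item vectors.
  final class ItemOperator(init: Map[String, Vec], lr: Double, lambda: Double) {
    private val state = mutable.Map.empty[String, Vec] ++= init
    // Step 3: apply the update locally; step 4: emit the new user vector.
    def onRatingWithUser(m: RatingWithUser): UserUpdate = {
      val item = state(m.rating.item)
      val err = m.rating.value - dot(m.userVec, item)
      val newUser = m.userVec.zip(item).map { case (u, i) => u + lr * (err * i - lambda * u) }
      val newItem = item.zip(m.userVec).map { case (i, u) => i + lr * (err * u - lambda * i) }
      state.update(m.rating.item, newItem)
      UserUpdate(m.rating.user, newUser)
    }
    def vector(item: String): Vec = state(item)
  }
}
```

A single event would flow as `users.onUpdate(items.onRatingWithUser(users.onRating(rating)))`; in a real streaming job these calls would be messages between operators rather than direct method calls.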
Online MF: Spark vs. Flink
Combining batch + online in Spark
• Easy: can run batch training periodically on whole dataset
Combining batch + online in Flink
• Combining Flink Batch API with Streaming API
• Could only do it with an external system
• Batch with Streaming API
• Feasible!
• Asynchronous training
(Schelter et al. 2014)
• Batch + online
• Both with Streaming API
• Share matrices in common state
• Parameter Server approach
Lessons learned (Flink vs. Spark)
Implementation
• Flink: More complex solution, harder to implement
• Spark: Easier to use: could use batch for streaming
Generality
• Flink: Can express finer-grained updates
• Spark: Updates limited by mini-batch
Code stability
• Flink: Some parts are not mature enough (e.g. Loops API)
• Spark: More mature
Performance
• Flink: Optimal for online learning, can perform well on batch
• Spark: Not always optimal for online learning (e.g. online MF)
Handling data skew
• Flink: Currently hard to relocate long-running operators
• Spark: Periodic scheduling enables easier modification of partitioning
Machine learning
• Flink: Non-complete ML library and other efforts for ML in Flink
• Spark: Spark MLlib is mature and used in production
Thank you for your attention
Zoltán Zvara
zoltan.zvara@ilab.sztaki.hu
Gábor Hermann
ghermann@ilab.sztaki.hu
Source code:
https://github.com/gaborhermann/large-scale-recommendation
Measurements
Batch + online combination
• Last.fm dataset with 30M music listening events
• Weekly batch training
• Evaluation on every incoming listening event, reported as a weekly average
• Around 45,000 users
Online MF: Spark vs. Flink
• Last.fm dataset with 30M music listening events, read from 12 Kafka partitions
• Spark batch duration: 5 sec
• Time of processing X ratings
• DSGD algorithm
• Using 6 nodes, 4 cores each
• Spark 2.1.0, Flink 1.2.0
Batch on Flink Streaming
• MovieLens 1M movie rating dataset
• Using 6 nodes, 4 cores each

Recent advances in deep recommender systems
 
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State UniversityLSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
LSP ( Logic Score Preference ) _ Rajan_Dhabalia_San Francisco State University
 
Ego web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportEgo web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf export
 
Path Analyzer X-Files: How We Built the Ultimate xDB Forensic Tool
Path Analyzer X-Files: How We Built the Ultimate xDB Forensic ToolPath Analyzer X-Files: How We Built the Ultimate xDB Forensic Tool
Path Analyzer X-Files: How We Built the Ultimate xDB Forensic Tool
 

Mehr von DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesDataWorks Summit/Hadoop Summit
 

Mehr von DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 

Kürzlich hochgeladen

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 

Kürzlich hochgeladen (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 

Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and Spark

  • 1. Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and Spark Zoltán Zvara zoltan.zvara@ilab.sztaki.hu Gábor Hermann ghermann@ilab.sztaki.hu This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 688191.
  • 2. About us • Institute for Computer Science and Control, Hungarian Academy of Sciences (MTA SZTAKI) • Informatics Laboratory • "Big Data – Momentum" research group • "Data Mining and Search" research group • Research group with strong industry ties • Ericsson, Rovio, Portugal Telekom, etc.
  • 3. Agenda 1. Recommendation systems and matrix factorization 2. Batch vs. online 3. Matrix factorization (3.1 Online, 3.2 Batch + online) 4. Solution in Spark & Flink 5. Conclusions
  • 6. Recommendation with matrix factorization. [Figure: rating matrix 𝑅 for the users Zoltán and Gábor and the movies Rogue One and Interstellar.] Zoltán rated Rogue One with 5 stars.
  • 7. Recommendation with matrix factorization. [Figure: 𝑅 approximated as 𝑈 ∙ 𝐼 ≈ 𝑅, with user vectors in 𝑈 and item vectors in 𝐼. The latent factors are labeled "Level of action", "Level of drama", and "X factor".] Zoltán rated Rogue One with 5 stars.
  • 8. Recommendation with matrix factorization. [Figure: same factorization 𝑈 ∙ 𝐼 ≈ 𝑅 as on the previous slide.] Objective: $\min_{u_*, i_*} \sum_{(p,q) \in \kappa_R} \left( r_{pq} - \mu - b_p - b_q - u_p i_q \right)^2 + \lambda \sum_{p \in \kappa_U} \left( \lVert u_p \rVert^2 + b_p^2 \right) + \lambda \sum_{q \in \kappa_I} \left( \lVert i_q \rVert^2 + b_q^2 \right)$. Zoltán rated Rogue One with 5 stars.
  • 9. Recommendation with matrix factorization. [Figure: same factorization 𝑈 ∙ 𝐼 ≈ 𝑅, with one entry of 𝑅 marked "?".] Zoltán rated Rogue One with 5 stars. Would Gábor like Interstellar?
  • 10. Recommendation with matrix factorization. [Figure: same as slide 9.] Zoltán rated Rogue One with 5 stars. Would Gábor like Interstellar?
  • 11. Recommendation with matrix factorization. [Figure: same as slide 9, with the vectors (5, 4, -4) and (3, 2, 5) highlighted.] Would Gábor like Interstellar?
  • 12. Recommendation with matrix factorization. [Figure: same as slide 11, with the product of the highlighted vectors, 3, shown.] Would Gábor like Interstellar?
  • 13. Recommendation with matrix factorization. [Figure: same as slide 12, with the "?" entry of 𝑅 replaced by the predicted rating 3.] Would Gábor like Interstellar?
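The prediction on slides 11 to 13 is a dot product of a user vector and an item vector. A minimal Scala sketch of that step follows; it assumes that (5, 4, -4) is Gábor's user vector and (3, 2, 5) is the item vector of Interstellar, and the names `PredictRating` and `predict` are illustrative, not taken from the talk's source code. The bias terms from the objective on slide 8 are left out.

```scala
object PredictRating {
  // Predicted rating = dot product of a user vector and an item vector
  // (bias terms omitted in this sketch).
  def predict(user: Vector[Double], item: Vector[Double]): Double = {
    require(user.length == item.length, "vectors must have the same number of latent factors")
    user.zip(item).map { case (u, i) => u * i }.sum
  }

  def main(args: Array[String]): Unit = {
    val gabor = Vector(5.0, 4.0, -4.0)       // assumed: Gábor's user vector
    val interstellar = Vector(3.0, 2.0, 5.0) // assumed: Interstellar's item vector
    // 5*3 + 4*2 + (-4)*5 = 3, matching the value shown on slide 13
    println(predict(gabor, interstellar))
  }
}
```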
  • 14. Batch training. [Figure: rating records of the form [user; item; time; rating] in persistent storage, and the rating matrix 𝑅 for Zoltán, Gábor, Rogue One, and Interstellar, with empty user and item vectors.]
  • 15. Batch training. [Figure: same as slide 14.]
  • 16. Batch training. [Figure: same as slide 14, with the user and item vectors now filled in.]
  • 17. Online training. [Figure: a stream of [user; item; time; rating] records arrives next to the factor matrices and 𝑅.]
  • 18. Online training. [Figure: after a rating is processed, the vectors (3, 2, 5) and (5, -6, -1) become (3, 2, 6) and (5, -6, -2).]
  • 19. Online training. [Figure: after a further rating, these vectors become (1, 3, 5) and (4, -5, -1).]
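The online-training slides show one user vector and one item vector changing per incoming rating. A common way to do this is a single stochastic gradient descent (SGD) step on the regularized squared error; the sketch below assumes that update rule, omits the bias terms, and folds constant factors into the learning rate. The names `OnlineSgdStep`, `step`, `learningRate`, and `lambda` are illustrative and not taken from the talk.

```scala
object OnlineSgdStep {
  type Vec = Vector[Double]

  def dot(a: Vec, b: Vec): Double = a.zip(b).map { case (x, y) => x * y }.sum

  /** One SGD step for a single rating; returns the updated (user, item) vectors.
    * Both new vectors are computed from the old values, so the update is simultaneous. */
  def step(user: Vec, item: Vec, rating: Double,
           learningRate: Double, lambda: Double): (Vec, Vec) = {
    val err = rating - dot(user, item) // prediction error for this rating
    val newUser = user.zip(item).map { case (u, i) => u + learningRate * (err * i - lambda * u) }
    val newItem = item.zip(user).map { case (i, u) => i + learningRate * (err * u - lambda * i) }
    (newUser, newItem)
  }
}
```

Calling `OnlineSgdStep.step(user, item, rating, 0.1, 0.01)` for each incoming record touches only the two vectors involved, which is what makes the per-event update cheap.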
  • 20. Batch + online combination
  • 21. But how to scale? • Spotify streamed 20 billion hours of music in 2015 • YouTube has over a billion users and billions of video views every day • Use distributed data-analytics frameworks • How can we combine batch + online?
  • 22. Apache Spark vs. Apache Flink
  • 23. Distributed online matrix factorization. [Figure: user vectors, item vectors, and 𝑅 as before; a [user; item; time; rating] record arrives.]
  • 24. Distributed online matrix factorization. [Figure: same as slide 23.]
  • 25. Distributed online matrix factorization. [Figure: the rating 2 is shown together with the item vector (3, 2, 6) and the user vector (5, -6, -2).] Annotation: need to co-locate.
  • 26. Distributed online matrix factorization. [Figure: the co-located vectors are updated to (1, 3, 5) and (4, -3, -1).] Annotations: need to co-locate, then update.
  • 27. Distributed online matrix factorization. [Figure: the updated vectors are written back to the factor matrices.] Annotations: need to co-locate, then update, send updates.
  • 28. Distributed online matrix factorization. [Figure: two ratings are handled at the same time.] Annotation: process two ratings in parallel.
  • 29. Distributed online matrix factorization. [Figure: same as slide 28.] Annotation: process two ratings in parallel.
  • 30. Distributed online matrix factorization. [Figure: same as slide 28.] Process two ratings in parallel • Concurrent modification • Similar problem with batch SGD • Distributed SGD (Gemulla et al. 2011)
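Distributed SGD (DSGD) avoids concurrent modification by splitting users and items into blocks and letting workers process only rating blocks that share no user block and no item block. The sketch below shows the usual diagonal schedule for k workers; the modulo partitioning in `blockOf` and all names are assumptions of this sketch, not the talk's implementation.

```scala
object DsgdStrata {
  /** Block id of a user or item id, using simple modulo partitioning (an assumption). */
  def blockOf(id: Int, numBlocks: Int): Int = Math.floorMod(id, numBlocks)

  /** In sub-epoch s, worker w handles the rating block
    * (user block w, item block (w + s) mod k), so no two workers
    * touch the same user block or the same item block. */
  def stratum(subEpoch: Int, numBlocks: Int): Seq[(Int, Int)] =
    (0 until numBlocks).map(w => (w, (w + subEpoch) % numBlocks))

  def main(args: Array[String]): Unit = {
    val k = 3
    for (s <- 0 until k)
      println(s"sub-epoch $s: ${stratum(s, k).mkString(", ")}")
  }
}
```

After k sub-epochs every (user block, item block) pair has been processed exactly once, and within a sub-epoch the workers can run without locking.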
  • 31. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
  • 32. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
  • 33. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
      // updateStateByKey?
  • 34. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
      // updateStateByKey? Use batch DSGD for online updates! (discussion issue SPARK-6407)
  • 35. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // need to represent factor matrices
      var users: RDD[(UserId, Vector)] = ...
      var items: RDD[(ItemId, Vector)] = ...
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
  • 36. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // need to represent factor matrices
      var users: RDD[(UserId, Vector)] = ...
      var items: RDD[(ItemId, Vector)] = ...
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
        ratings.transform { (rs: RDD[Rating]) =>  // use transform to allow RDD operations
  • 37. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // need to represent factor matrices
      var users: RDD[(UserId, Vector)] = ...
      var items: RDD[(ItemId, Vector)] = ...
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
        ratings.transform { (rs: RDD[Rating]) =>  // use transform to allow RDD operations
          val updates = batchDSGD(rs, users, items)  // compute updates
  • 38. Online MF in Spark
      val ratings: DStream[Rating] = ...  // we have our input
      // need to represent factor matrices
      var users: RDD[(UserId, Vector)] = ...
      var items: RDD[(ItemId, Vector)] = ...
      // would like to have output like this
      val updateStream: DStream[Either[(UserId, Vector), (ItemId, Vector)]] =
        ratings.transform { (rs: RDD[Rating]) =>  // use transform to allow RDD operations
          val updates = batchDSGD(rs, users, items)  // compute updates
          // apply updates to get updated matrices
          users = applyUserUpdates(users, updates)
          items = applyItemUpdates(items, updates)
          updates
        }
  • 39. Online MF in Spark • Performance degrades over time
  • 40. Online MF in Spark • Performance degrades over time • Problem: tracking the lineage graph • Solution: use checkpointing
  • 41. Online MF in Spark • Performance degrades over time • Problem: tracking the lineage graph • Solution: use checkpointing
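The lineage problem arises because each mini-batch redefines the factor RDDs in terms of the previous ones, so the dependency chain keeps growing. A hedged sketch of periodic checkpointing with the Spark RDD API follows. `applyUpdates` is a hypothetical stand-in for applying one mini-batch of updates, and the checkpoint directory and the interval of 10 batches are arbitrary choices for illustration, not values from the talk.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CheckpointSketch {
  type Factors = RDD[(Int, Array[Double])]

  // Hypothetical stand-in for applying one mini-batch of updates to the user factors.
  def applyUpdates(users: Factors): Factors = users.mapValues(_.map(_ * 0.99))

  /** Applies numBatches updates, checkpointing every checkpointEvery batches
    * so that the lineage graph is truncated at regular intervals. */
  def run(sc: SparkContext, numBatches: Int, checkpointEvery: Int): Factors = {
    var users: Factors =
      sc.parallelize(Seq(1 -> Array(5.0, -6.0, -1.0), 2 -> Array(5.0, 4.0, -4.0)))
    for (batch <- 1 to numBatches) {
      users = applyUpdates(users)
      if (batch % checkpointEvery == 0) {
        users.cache()      // avoid recomputing the RDD when the checkpoint is written
        users.checkpoint() // mark for checkpointing; must be called before the first action
        users.count()      // an action triggers the actual checkpoint
      }
    }
    users
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    sc.setCheckpointDir("/tmp/mf-checkpoints") // hypothetical directory
    val users = run(sc, numBatches = 30, checkpointEvery = 10)
    println(users.isCheckpointed)
    sc.stop()
  }
}
```

A longer-running job would also want to unpersist older cached RDDs; that bookkeeping is left out to keep the sketch short.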
  • 42. Online MF in Flink. [Figure: two long-running operators with state, one for the user vectors and one for the item vectors.]
  • 43. Online MF in Flink. [Figure: same as slide 42, plus a backward edge in the dataflow (stream loop).]
  • 44. Online MF in Flink. Steps shown: 1. rating event (rating 2).
  • 45. Online MF in Flink. Steps shown: 1. rating event; 2. rating event & user vector (rating 2, user vector (5, -6, -2)).
  • 46. Online MF in Flink. Steps shown: 1. rating event; 2. rating event & user vector (rating 2, user vector (5, -6, -2)), shown next to the item vector (3, 2, 6).
  • 47. Online MF in Flink. Steps shown: 1. rating event; 2. rating event & user vector; 3. apply update (the vectors become (4, -3, -1) and (1, 3, 5)).
  • 48. Online MF in Flink. Steps shown: 1. rating event; 2. rating event & user vector; 3. apply update; 4. user vector update ((4, -3, -1) is sent back).
  • 49. Online MF in Flink. WARNING! The Loops API (iterative streams) is not mature enough yet, but there is an ongoing effort. Steps shown: 1. rating event; 2. rating event & user vector; 3. apply update; 4. user vector update.
  • 50. Online MF: Spark vs. Flink
  • 51. Combining batch + online in Spark • Easy: can run batch training periodically on whole dataset
  • 52. Combining batch + online in Flink • Combining Flink Batch API with Streaming API • Could only do it with an external system
  • 53. Combining batch + online in Flink • Combining Flink Batch API with Streaming API • Could only do it with an external system • Batch with Streaming API • Feasible! • Asynchronous training (Schelter et al. 2014)
  • 54. Combining batch + online in Flink • Combining Flink Batch API with Streaming API • Could only do it with an external system • Batch with Streaming API • Feasible! • Asynchronous training (Schelter et al. 2014) • Batch + online • Both with Streaming API • Share matrices in common state • Parameter Server approach
  • 56. Lessons learned (table build: Implementation row shown; full table on slide 61)
  • 57. Lessons learned (table build: rows up to Generality shown)
  • 58. Lessons learned (table build: rows up to Code stability shown)
  • 59. Lessons learned (table build: rows up to Performance shown)
  • 60. Lessons learned (table build: rows up to Handling data skew shown)
  • 61. Lessons learned
      Aspect | Flink | Spark
      Implementation | More complex solution, harder to implement | Easier to use: could use batch for streaming
      Generality | Can express finer-grained updates | Updates limited by mini-batch
      Code stability | Some parts are not mature enough (e.g. Loops API) | More mature
      Performance | Optimal for online learning, can perform well on batch | Not always optimal for online learning (e.g. online MF)
      Handling data skew | Currently hard to relocate long-running operators | Periodic scheduling enables easier modification of partitioning
      Machine learning | Incomplete ML library and other efforts for ML in Flink | Spark MLlib is mature and used in production
  • 62. Thank you for your attention Zoltán Zvara zoltan.zvara@ilab.sztaki.hu Gábor Hermann ghermann@ilab.sztaki.hu Source code: https://github.com/gaborhermann/large-scale-recommendation
  • 64. Batch + online combination • Last.fm dataset with 30M music listening events • Weekly batch training • Evaluation: weekly average, computed on every incoming listening event • Around 45,000 users
  • 65. Online MF: Spark vs. Flink • Last.fm dataset with 30M music listening events, read from 12 Kafka partitions • Spark batch duration: 5 sec • Time to process X ratings • DSGD algorithm • Using 6 nodes, 4 cores each • Spark 2.1.0, Flink 1.2.0
  • 66. Batch on Flink Streaming • MovieLens 1M movie rating dataset • Using 6 nodes, 4 cores each

Editor's notes

  1. Say that we focus on comparing the two systems for this use-case.
  2. Say that we focus on comparing the two systems for this use-case.
  3. Say that we focus on comparing the two systems for this use-case.
  4. Ratings in a sparse matrix
  5. Story: it turned out to be worth combining the two. Message: batch + online is better than batch alone or online alone. DCG (Discounted Cumulative Gain) measures ranking quality; higher is better. https://en.wikipedia.org/wiki/Discounted_cumulative_gain
  6. Sources: Spotify 2015 data: https://techcrunch.com/2015/12/01/spotify-claims-streaming-music-throne-worldwide-but-pandora-is-still-top-service-in-u-s/?ncid=rss#.uuccs9:VA8w YouTube: https://www.youtube.com/yt/press/en-GB/statistics.html
  7. Vs. mini-batch. Send records without global synchronization.
  8. Vs. mini-batch. Send records without global synchronization.
  9. TODO: animate 4 slides
  10. TODO: animate 4 slides
  11. TODO: animate 4 slides
  12. TODO: animate 4 slides
  13. TODO: animate 4 slides
  14. TODO: animate 4 slides