Simon Chan
simon@prediction.io
Data Science London - April 24, 2013
Big Data Week
Machine Learning is....
computers learning to predict
from data
putting
Machine Learning
into practice
challenge #1
Scalability
Big Data Bottlenecks
Machine Learning Processing
PredictionIO has a
horizontally scalable
architecture
Async SDK
Client client = new Client(appkey);
// Adding user behaviors
req = client.getUserRateItemRequestBuilder(uid, iid, rate);
client.userRateItemAsFuture(req);
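The pattern in the snippet above (build a request, send it, get a future back instead of blocking) can be sketched without the SDK. This is a minimal illustration using Python's standard library; `send_rating` and `rate_items_async` are hypothetical stand-ins for the network call and are not part of any PredictionIO SDK.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def send_rating(uid, iid, rate):
    """Hypothetical stand-in for a network call that records a rating."""
    time.sleep(0.05)  # simulate network latency
    return {"uid": uid, "iid": iid, "rate": rate, "status": "created"}


def rate_items_async(ratings, max_workers=4):
    """Submit every rating without waiting, then collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately with a Future for each request
        futures = [pool.submit(send_rating, u, i, r) for (u, i, r) in ratings]
        # result() blocks only when the caller actually needs the answer
        return [f.result() for f in futures]


results = rate_items_async([("user1", "itemA", 5), ("user1", "itemB", 3)])
```

The caller decides when to wait: requests are in flight concurrently, and blocking happens only at `result()`.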
Play
Framework
‣ stateless - no server session
‣ non-blocking web request
Play: A Non-blocking Example
def index = Action {
  val futureInt = scala.concurrent.Future { slowDataProcess() }
  Async {
    futureInt.map(i => Ok(views.html.result.render(i)))
  }
}
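The same non-blocking idea can be sketched in Python with `asyncio`: the slow work runs on a worker thread while the event loop stays free to serve other requests. This is only an analogy to the Play action above, not Play's implementation; `slow_data_process` and `index` are hypothetical names mirroring the slide.

```python
import asyncio
import time


def slow_data_process():
    """Hypothetical blocking computation, like slowDataProcess() above."""
    time.sleep(0.05)
    return 42


async def index():
    """Run the blocking work on a worker thread and await its result."""
    loop = asyncio.get_running_loop()
    # run_in_executor returns an awaitable; the event loop is not blocked
    value = await loop.run_in_executor(None, slow_data_process)
    return f"result: {value}"


async def main():
    # Two requests overlap instead of running back to back
    return await asyncio.gather(index(), index())


responses = asyncio.run(main())
```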
MongoDB
‣ Read scaling: Replica Sets
‣ Write scaling: Sharding
‣ Indexes (e.g. geospatial)
{ geoSearch : "places", near : [33, 33],
maxDistance : 6, search : { uid : "user1" } }
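What the query above asks for can be sketched in plain Python: keep the documents that match the extra `search` filter and lie within `maxDistance` of the `near` point. This is an illustration of the query's intent only. The `pos` field name, the sample documents, and the use of flat Euclidean distance are assumptions, not a description of how MongoDB evaluates the command.

```python
import math


def geo_search(places, near, max_distance, search):
    """Return docs matching `search` within `max_distance` of `near`, nearest first."""
    x0, y0 = near
    hits = []
    for doc in places:
        # Apply the extra filter first, e.g. {"uid": "user1"}
        if any(doc.get(k) != v for k, v in search.items()):
            continue
        x, y = doc["pos"]  # "pos" is a hypothetical location field
        dist = math.hypot(x - x0, y - y0)
        if dist <= max_distance:
            hits.append((dist, doc))
    hits.sort(key=lambda pair: pair[0])
    return [doc for _, doc in hits]


places = [
    {"uid": "user1", "pos": [34, 35]},
    {"uid": "user1", "pos": [60, 60]},
    {"uid": "user2", "pos": [33, 33]},
]
nearby = geo_search(places, near=[33, 33], max_distance=6, search={"uid": "user1"})
```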
Hadoop
Hadoop
Cascading (Java)
Scalding (Scala)
MapReduce
- Native Java
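import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
  // Mapper: emit (word, 1) for every token in the input line
  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }
  // Reducer: sum the counts collected for each word
  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) { sum += val.get(); }
      context.write(key, new IntWritable(sum));
    }
  }
  // Driver: args[0] is the input path, args[1] is the output path
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
  }
}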
MapReduce
- Scalding
import com.twitter.scalding._

class ScaldingTestJob(args: Args) extends Job(args) {
  Tsv(args(0), 'text)
    // split each line on runs of whitespace
    .flatMap('text -> 'word) { text: String => text.split("\\s+") }
    .groupBy('word) { _.size }
    .write(Tsv(args(1)))
}
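Both versions above express the same three steps: map each line to (word, 1) pairs, group the pairs by word, and reduce each group to a sum. A minimal in-memory Python sketch of those steps follows; it runs on a single machine and only illustrates the data flow, not Hadoop's distributed execution. All function names are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts for one word."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big", "data science"])
```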
Sample Code
### Sample PredictionIO Python SDK Code
import predictionio
client = predictionio.Client(appkey="<your app key>")
# Add Data
client.create_user(uid="user123")
client.create_item(iid="itemXYZ", itypes=(1,))
client.user_view_item(uid="user123", iid="itemXYZ")
# Get Prediction
rec = client.get_itemrec(engine="<engine name>", uid="user123", n=5)
Getting
Involved!
- @PredictionIO
- prediction.io - Newsletter
- github.com/predictionio
Q&A
Q: Selecting the right features is a big problem. Can PredictionIO solve this problem?
A: Not at this moment. That’s why we focus on collaborative filtering algorithms right now,
which don’t require the use of features. We also believe that the involvement of data
scientists is needed for many specific problems. PredictionIO is positioned as a tool to
make their work easier, not as a replacement.
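To illustrate why collaborative filtering needs no hand-picked features, here is a minimal item co-occurrence sketch in Python: recommendations come purely from which users viewed which items. It is one simple variant chosen for illustration, not the algorithm PredictionIO ships; the data and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(views):
    """Count how often each pair of items is viewed by the same user."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in views.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(views, uid, n=5):
    """Rank unseen items by co-occurrence with the user's viewed items."""
    counts = cooccurrence(views)
    seen = set(views.get(uid, []))
    scores = defaultdict(int)
    for item in seen:
        for other, c in counts[item].items():
            if other not in seen:
                scores[other] += c
    # Sort by score descending, then by item id for a stable order
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]


views = {
    "user1": ["a", "b"],
    "user2": ["a", "b", "c"],
    "user3": ["b", "c", "d"],
}
recs = recommend(views, "user1", n=2)
```

The only input is the user-item view history; no item or user attributes are involved.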
Q: How is PredictionIO different from Weka?
A: Weka, like Mahout, is an ML algorithm library. You can see PredictionIO as a layer on top
of it, which helps you put algorithms into a production environment by providing a
complete infrastructure.
Q: How do you compare PredictionIO with RapidMiner?
A: RapidMiner is a great product for defining data engineering workflows visually.
PredictionIO focuses on a different problem: deploying ML solutions into a production
environment.
Q: How do the algorithm evaluation metrics work in PredictionIO?
A: At this moment, you can evaluate algorithms with offline metrics, such as Mean
Average Precision, based on your existing data.
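Mean Average Precision can be computed offline from held-out data, as in the Python sketch below. It assumes one common convention (normalizing each user's average precision by the smaller of k and the number of relevant items); how PredictionIO computes the metric internally is not stated in the talk, and all names and data here are illustrative.

```python
def average_precision(recommended, relevant, k):
    """Average precision at k for one user's ranked recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this cut-off
    return total / min(len(relevant), k)


def mean_average_precision(recs_by_user, relevant_by_user, k=5):
    """Mean of per-user average precision over all users with recommendations."""
    scores = [
        average_precision(recs, relevant_by_user.get(uid, []), k)
        for uid, recs in recs_by_user.items()
    ]
    return sum(scores) / len(scores) if scores else 0.0


recs_by_user = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
relevant_by_user = {"u1": ["a", "c"], "u2": ["y"]}
map_at_3 = mean_average_precision(recs_by_user, relevant_by_user, k=3)
```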
Q: What’s the business model?
A: We focus on making PredictionIO a useful open source product at this moment.

Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 

Machine Learning and Big Data technologies discussed

  • 1. Simon Chan simon@prediction.io Data Science London - April 24, 2013 Big Data Week
  • 2. Machine Learning is.... computers learning to predict from data
  • 5. Big Data Bottlenecks Machine Learning Processing
  • 6. PredictionIO has a horizontally scalable architecture
  • 8. Async SDK
      Client client = new Client(appkey);
      // Adding user behaviors
      req = client.getUserRateItemRequestBuilder(uid, iid, rate);
      client.userRateItemAsFuture(req);
  • 9. Play Framework
      ‣ stateless - no server session
      ‣ non-blocking web request
  • 10. Play: A Non-blocking Example
      def index = Action {
        val futureInt = scala.concurrent.Future { slowDataProcess() }
        Async {
          futureInt.map(i => Ok(views.html.result.render(i)))
        }
      }
  • 11. MongoDB
      ‣ Read scaling: Replica Sets
      ‣ Write scaling: Sharding
      ‣ Indexes (e.g. geospatial)
      { geoSearch : "places", near : [33, 33],
        maxDistance : 6, search : { uid : "user1" } }
  • 13. MapReduce - Native Java
      public class WordCount {
        public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
          private final static IntWritable one = new IntWritable(1);
          private Text word = new Text();

          public void map(LongWritable key, Text value, Context context) throws .....{
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
              word.set(tokenizer.nextToken());
              context.write(word, one);
            }
          }
        }

        public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
          public void reduce(Text key, Iterable<IntWritable> values, Context context)
              throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) { sum += val.get(); }
            context.write(key, new IntWritable(sum));
          }
        }

        public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          Job job = new Job(conf, "wordcount");
          job.setOutputKeyClass(Text.class);
          job.setOutputValueClass(IntWritable.class);
          job.setMapperClass(Map.class);
          job.setReducerClass(Reduce.class);
          job.setInputFormatClass(TextInputFormat.class);
          job.setOutputFormatClass(TextOutputFormat.class);
          FileInputFormat.addInputPath(job, new Path(args[0]));
          FileOutputFormat.setOutputPath(job, new Path(args[1]));
          job.waitForCompletion(true);
        }
      }
  • 14. MapReduce - Scalding
      class ScaldingTestJob(args: Args) extends Job(args) {
        Tsv(args(0), 'text)
          .flatMap('text -> 'word) { text : String => text.split("\\s+") }
          .groupBy('word) { _.size }
          .write(Tsv(args(1)))
      }
  • 16. ### Sample PredictionIO Python SDK Code
      import predictionio

      client = predictionio.Client(appkey="<your app key>")
      # Add Data
      client.create_user(uid="user123")
      client.create_item(iid="itemXYZ", itypes=(1,))
      client.user_view_item(uid="user123", iid="itemXYZ")
      # Get Prediction
      rec = client.get_itemrec(engine="<engine name>", uid="user123", n=5)
  • 17. Getting Involved! - @PredictionIO - prediction.io - Newsletter - github.com/predictionio
  • 18. Q&A
      Q: Selecting the right features is a big problem. Can PredictionIO solve this problem?
      A: Not at this moment. That’s why we currently focus on collaborative filtering algorithms, which don’t require the use of features. We also believe that the involvement of data scientists is needed for many specific problems. PredictionIO is positioned as a tool to make their work easier, not as a replacement.
      Q: How is PredictionIO different from Weka?
      A: Weka, like Mahout, is an ML algorithm library. You can see PredictionIO as a layer on top of it, which helps you deploy algorithms into a production environment by providing a complete infrastructure.
      Q: How do you compare PredictionIO with RapidMiner?
      A: RapidMiner is a great product for defining data engineering workflows visually. PredictionIO focuses on a different problem: deploying ML solutions into production environments.
      Q: How do the algorithm evaluation metrics work in PredictionIO?
      A: At this moment, you can evaluate algorithms with some offline metrics, such as Mean Average Precision, based on your existing data.
      Q: What’s the business model?
      A: We focus on making PredictionIO a useful open source product at this moment.
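
Slide 10 runs a slow computation off the request path and maps its result into a response once it completes. The same idea can be sketched in Python with `asyncio`. This is an analogy rather than Play's API: `slow_data_process` and `index` are hypothetical names that mirror the slide, and the blocking work is simulated with a short sleep.

```python
import asyncio
import time


def slow_data_process() -> int:
    """Hypothetical blocking computation (stands in for slowDataProcess())."""
    time.sleep(0.1)  # simulate slow, blocking work
    return 42


async def index() -> str:
    """Handler that awaits the result without blocking the event loop."""
    # Run the blocking call in a worker thread; the event loop stays free
    # to serve other requests while this one waits.
    value = await asyncio.to_thread(slow_data_process)
    return f"result: {value}"


async def main() -> list:
    # Three "requests" are in flight at the same time.
    return await asyncio.gather(index(), index(), index())


responses = asyncio.run(main())
print(responses)
```

Because each handler only awaits, the three simulated requests overlap instead of queuing behind one another, which is the property the "non-blocking web request" bullet on slide 9 refers to.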
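
Slides 13 and 14 express the same word count in native Java MapReduce and in Scalding. To make the three underlying steps explicit, here is a minimal in-memory sketch in Python. It is not Hadoop code and runs on a single machine; the function names `map_phase`, `shuffle`, and `reduce_phase` are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated token, like Map.map()."""
    for word in line.split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, as the framework does between map and reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Sum the counts for one word, like Reduce.reduce()."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big", "data science"])
print(counts)
```

In the Scalding version, `flatMap` plays the role of `map_phase`, and `groupBy('word) { _.size }` covers both the grouping and the summation.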
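
The Q&A on slide 18 names Mean Average Precision as one of the offline evaluation metrics. The slides do not show how PredictionIO computes it, so the following is only a sketch of one common definition (MAP@k, normalised by min(k, number of relevant items)). The function names and the sample data are assumptions made for illustration.

```python
from typing import Dict, Sequence, Set


def average_precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """AP@k for one user; assumes `recommended` contains no duplicates."""
    if not relevant or k <= 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(
    recs: Dict[str, Sequence[str]], truth: Dict[str, Set[str]], k: int
) -> float:
    """Average AP@k over every user that has held-out ground truth."""
    if not truth:
        return 0.0
    total = sum(average_precision_at_k(recs.get(u, []), truth[u], k) for u in truth)
    return total / len(truth)


# Hypothetical held-out data: what each user actually interacted with.
truth = {"user1": {"a", "c"}, "user2": {"z"}}
# Hypothetical ranked recommendations produced by an engine.
recs = {"user1": ["a", "b", "c"], "user2": ["x", "y", "z"]}

score = mean_average_precision_at_k(recs, truth, k=3)
print(round(score, 4))
```

A metric like this is "offline" in the sense used on the slide: it needs only existing interaction data split into a training part and a held-out part, with no live traffic.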