Federated Machine Learning
Andreas Hellander
Co-founder and Lead Scientist, Scaleout Systems
Associate Professor in Scientific Computing, Uppsala University
scaleoutsystems.com it.uu.se
Main issues with the centralized paradigm
in machine learning:
● Private/Proprietary data — Sharing
valuable business data with someone
else is not an option.
● Regulated data — GDPR, HIPAA, etc.
● Practical blockers — data is too big,
the network connection is expensive,
slow or unreliable.
Also, large datasets relevant to AI
problems are controlled by a small number
of large organizations and there are no
great mechanisms for sharing that data
with the data science community.
The data centralization problem
1. Collect and centralize data from
different sources (data lake, cloud).
2. Create ML model using centralised
data (cluster computing)
How can parties come together to create joint
ML models without sharing/pooling data?
Federated Machine Learning
Federated Machine Learning (FedML) is a
distributed machine learning approach
which enables training on decentralised
data.
● Train local machine learning model on
local/private data.
● Combine local model updates into a
global, federated model.
Federated learning addresses the
fundamental problems of centralized AI
such as privacy, ownership, and locality of
data.
scaleoutsystems.com/federated-machine-learning
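The two steps above can be sketched in a few lines. This is a minimal illustration, not the Scaleout implementation: the linear model, the helper names (`local_update`, `federated_round`, `make_party`), the learning rate, and the synthetic data are all assumptions made for the example. Each party fits the model on data that never leaves it, and only the resulting weights are combined.

```python
import random

def local_update(w, data, lr=0.1, epochs=5):
    """Gradient descent on one party's private (x, y) pairs for y = w0*x + w1."""
    w0, w1 = w
    for _ in range(epochs):
        g0 = sum(2 * (w0 * x + w1 - y) * x for x, y in data) / len(data)
        g1 = sum(2 * (w0 * x + w1 - y) for x, y in data) / len(data)
        w0, w1 = w0 - lr * g0, w1 - lr * g1
    return [w0, w1]

def federated_round(global_w, parties):
    """Each party trains locally; only model weights are combined."""
    local_ws = [local_update(global_w, data) for data in parties]
    total = sum(len(data) for data in parties)
    # Weight every local model by the size of the dataset it was trained on.
    return [
        sum(len(data) * w[i] for data, w in zip(parties, local_ws)) / total
        for i in range(len(global_w))
    ]

def make_party(n):
    """Hypothetical private dataset drawn from y = 2x - 1 plus small noise."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, 2.0 * x - 1.0 + random.gauss(0, 0.01)) for x in xs]

random.seed(0)
parties = [make_party(n) for n in (50, 80, 120)]  # three parties, data stays local
global_w = [0.0, 0.0]
for _ in range(50):
    global_w = federated_round(global_w, parties)
print(global_w)  # approaches the shared underlying weights (2, -1)
```

Only `global_w` and the local weight vectors cross party boundaries in this sketch; the `(x, y)` pairs are used exclusively inside `local_update`.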
The key benefit of FedML
Lets parties form alliances/networks to
build stronger models than what could be
attained by the parties in isolation.
● Data security and privacy where data
never moves.
● Powerful data network effects in
industries where data cannot be
transferred.
● Reduced data transfer costs when
data is very large or networks
unreliable.
N. Gauraha, O. Spjuth, A. Hellander (2019), manuscript in preparation
Early example
FedML on Gboard:
● Local model for search suggestion,
with context and whether suggestion
was clicked
● The history is processed on the device,
and then only a model update is
sent to Google
● Based on Federated Averaging, a
scheme to aggregate weights from
locally trained neural nets:
https://arxiv.org/pdf/1602.05629.pdf
https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
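As a reminder of what Federated Averaging computes, the aggregation step can be written as below. The notation is ours and should be checked against the linked paper: w_t is the global model in round t, S_t the set of clients selected in that round, n_k the number of training samples on client k, and n the total sample count over the selected clients.

```latex
w_{t+1}^{k} \leftarrow \mathrm{ClientUpdate}(k,\, w_t) \quad \text{for each } k \in S_t,
\qquad
w_{t+1} \leftarrow \sum_{k \in S_t} \frac{n_k}{n}\, w_{t+1}^{k},
\qquad
n = \sum_{k \in S_t} n_k
```

Clients with more local data therefore contribute proportionally more to the new global model.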
Smart software on top of decentralized
infrastructure/instruments
● Lets an instrument/software vendor
build smarter software.
● Digital pathology, medical dosimetry,
predictive maintenance etc.
● Sensitive data does not need to be
shared.
● Powerful network effects possible.
[Diagram labels: Federated model, Software services, Federated learning system, Infrastructure vendor]
Integrity-preserving E-health
● Digital tools/video surveillance in
home care.
● Train and deploy models based on
homeowners’ private interactions
without collecting the data centrally.
Privacy-preservation features of FedML
● Input privacy is simplified since data
does not move (it is handled according
to local policies)
● Output privacy - depends on the
algorithm, how easy it is to invert the
model etc.
● What can be learned from the
coordination of computation?
○ Different for federated averaging
and ensemble methods
(algorithm dependent)
UN Handbook for Privacy-Preserving Techniques
Differential privacy & homomorphic encryption in FedML
Differential privacy: Add
carefully calibrated noise
(protects against inference
attacks)
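A common way to apply "carefully calibrated noise" to a model update is to bound the update's norm and then add Gaussian noise proportional to that bound. The sketch below assumes this clip-then-noise pattern; the function name `privatize_update` and the parameter values are illustrative, and the actual privacy guarantee depends on a privacy accounting step that is not shown here.

```python
import math
import random

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=random):
    """Clip an update to a maximum L2 norm, then add Gaussian noise."""
    norm = math.sqrt(sum(u * u for u in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    # The noise standard deviation is tied to the clipping bound, so that
    # no single update can dominate what an observer sees.
    sigma = noise_multiplier * clip_norm
    return [c + rng.gauss(0.0, sigma) for c in clipped]

random.seed(1)
raw_update = [3.0, 4.0]          # L2 norm 5, so it is scaled down to norm 1
noisy_update = privatize_update(raw_update)
print(noisy_update)
```

A larger `noise_multiplier` gives stronger protection against inference attacks at the cost of a noisier, less accurate aggregate.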
Homomorphic encryption: Methods
work on encrypted data
Secure multiparty computation:
Aggregate/compute without a
trusted third-party
provider/server.
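One building block often used for this kind of aggregation is pairwise additive masking: every pair of parties shares a random mask that one adds and the other subtracts, so the masks cancel in the sum while each individual submission looks random. The following is a toy sketch under strong simplifying assumptions: a single seeded generator stands in for pairwise key agreement, floats replace modular integer arithmetic, and dropouts are ignored. The name `masked_updates` is hypothetical.

```python
import random

def masked_updates(updates, seed=0):
    """Return each party's update with pairwise masks that cancel in the sum."""
    rng = random.Random(seed)  # stand-in for secrets agreed pairwise between parties
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = [rng.uniform(-100, 100) for _ in range(dim)]
            for d in range(dim):
                masked[i][d] += mask[d]  # party i adds the shared mask
                masked[j][d] -= mask[d]  # party j subtracts the same mask
    return masked

updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masked = masked_updates(updates)
total = [sum(col) for col in zip(*masked)]
print(masked[0])  # dominated by the masks, not by updates[0]
print(total)      # the true sum [9.0, 12.0], up to floating-point rounding
```

The aggregator only ever needs the `masked` vectors, yet it recovers the correct sum.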
R&D challenges
Scalability and ML
performance
How do we (re)design
algorithms and frameworks
to scale out to the fog and
edge?
Decentralized computation
How can we do FedML
without a trusted third-party
provider?
Adversarial ML
How can we make the
system robust to dishonest
members and external
threats?
FedML is a research area that integrates many different areas of
computer science and mathematics.
Backdooring federated learning
● A big threat to a FedML alliance comes
from within the alliance / from
compromised members.
● Large alliances can be expected to be
relatively robust to data poisoning
attacks.
● Bagdasaryan et al. show how their
proposed approach of model
replacement can efficiently introduce
backdoors in a global model.
● Secure aggregation/MPC hides individual
model updates, which makes it impossible
to detect a malicious update or to
identify who submitted it!
Bagdasaryan et al. How to backdoor federated learning (2019) https://arxiv.org/pdf/1807.00459.pdf
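The idea behind model replacement can be illustrated with a small sketch. It assumes, following our reading of the cited paper, that the server updates the global model G as G' = G + (eta / n) * sum_i (L_i - G), and that the attacker scales its submission by gamma = n / eta so that the averaging step does not dilute it. The function names, dimensions, and numbers are made up for illustration.

```python
import random

def aggregate(global_model, local_models, eta, n):
    """G' = G + (eta / n) * sum_i (L_i - G), applied per coordinate."""
    return [
        g + (eta / n) * sum(local[d] - g for local in local_models)
        for d, g in enumerate(global_model)
    ]

def replacement_update(global_model, target, eta, n):
    """Scale the attacker's model so it survives averaging: L = gamma*(X - G) + G."""
    gamma = n / eta
    return [gamma * (x - g) + g for x, g in zip(target, global_model)]

random.seed(0)
n, eta = 10, 1.0
G = [0.0, 0.0, 0.0]          # current global model
X = [1.0, -1.0, 2.0]         # backdoored model the attacker wants to install
# Nine honest members submit models close to G (the model has nearly converged).
benign = [[g + random.gauss(0, 0.01) for g in G] for _ in range(n - 1)]
malicious = replacement_update(G, X, eta, n)
new_G = aggregate(G, benign + [malicious], eta, n)
print(new_G)  # lands close to X after a single round
```

A single compromised member is enough in this sketch because the honest updates are small and roughly cancel, which is the situation near convergence.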
Federated learning in production
[Diagram labels: ML pipeline with API (three instances); Federated components (secure model communication, anomaly detection, etc.); Global model serving]
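As one hypothetical example of what an anomaly detection component could do, the sketch below discards model updates whose L2 norm is far above the median norm of the round. The function name `screen_updates` and the threshold `factor` are assumptions, not part of any platform described here, and a check like this requires that the aggregator can see individual updates.

```python
import math
import statistics

def screen_updates(updates, factor=3.0):
    """Keep updates whose L2 norm is at most `factor` times the median norm."""
    norms = [math.sqrt(sum(u * u for u in upd)) for upd in updates]
    limit = factor * statistics.median(norms)
    return [upd for upd, nrm in zip(updates, norms) if nrm <= limit]

updates = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [10.0, -10.0]]
kept = screen_updates(updates)
print(len(kept))  # the unusually large update is dropped
```

Norm screening is a coarse filter: it catches heavily scaled updates but not ones crafted to stay within the typical range.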
A problem that spans many complex areas
● Decentralized computing / fog computing
● Information security and systems security expertise
● Trust-mechanisms (third-party or decentralized protocol)
● Machine learning algorithms designed for/adapted to a decentralized setting
● Adversarial ML
○ Data poisoning
○ Inference attacks
○ …
A considerable increase in system and developer complexity
compared to the standard paradigm!
Scaleout Federated Platform
Scaleout Studio | Developing
Scaleout Store | Package & Deploying
Scaleout Serve | Serving
[Platform diagram; blocks are labeled as follows, with "API" labels between them]
ML studio
- Ingestion
- Prepare & Analyse Data
- Modeling & Testing
- Training
ML workflow automation
- Automated ML Studio Pipelines
Model management
- Versioning
- Annotation
- Storage
- Distribution
Model serving
- Traffic management
- Authentication/Authorization
- Policies
- Monitoring
Monitoring & Visualizations
Endpoint registry
Graphical User Interface (incl. Pipeline Visualization)
Authentication and Authorization
Model Sharing
Joint Training
Federation Orchestration
Federation Identity & Security
Federation Cross Validation & Holdout Set
scaleoutsystems.com
Thank you!
SCALEOUT
Bridging the gap between research and
production grade systems in machine
learning. Learn more about our Lean AI
framework, and our Federated Machine
Learning platform.
ANDREAS HELLANDER
andreas.hellander@it.uu.se
SALMAN TOOR
salman.toor@it.uu.se
Scaleout FedML platform demo at
Testa Center, GE Healthcare
https://www.youtube.com/watch?v=K-JUNkAYs-4

Big Data Stockholm v 7 | "Federated Machine Learning for Collaborative and Secure AI" - Andreas Hellander
