SlideShare ist ein Scribd-Unternehmen logo
1 von 12
Downloaden Sie, um offline zu lesen
Deploying Models
Eng Teong Cheah
Microsoft MVP
Inferencing?
In machine learning, inferencing refers to the use of a trained model to predict labels for
new data on which the model has not been trained. Often, the model is deployed as part
of a service that enables applications to request immediate, or real-time, predictions for
individual, or small numbers of data observations.
In Azure Machine Learning, you can create real-time inferencing solutions by deploying a
model as a service, hosted in a containerized platform, such as Azure Kubernetes Services
(AKS).
Deploying a Real-Time Inferencing Service
Machine learning inference during deployment
When deploying your AI model during production, you need to consider how it will make
predictions. The two main processes for AI models are:
•Batch inference: An asynchronous process that bases its predictions on a batch of
observations. The predictions are stored as files or in a database for end users or business
applications.
•Real-time (or interactive) inference: Frees the model to make predictions at any time
and trigger an immediate response. This pattern can be used to analyze streaming and
interactive application data.
Machine learning inference during deployment
Consider the following questions to evaluate your model, compare the two processes,
and select the one that suits your model:
•How often should predictions be generated?
•How soon are the results needed?
•Should predictions be generated individually, in small batches, or in large batches?
•Is latency to be expected from the model?
•How much compute power is needed to execute the model?
•Are there operational implications and costs to maintain the model?
Batch inference
Batch inference, sometimes called offline inference, is a simpler inference process that
helps models to run in timed intervals and business applications to store predictions.
Consider the following best practices for batch inference:
•Trigger batch scoring: Use Azure Machine Learning pipelines and
the ParallelRunStep feature in Azure Machine Learning to set up a schedule or event-
based automation.
•Compute options for batch inference: Since batch inference processes don't run
continuously, it's recommended to automatically start, stop, and scale reusable clusters
that can handle a range of workloads. Different models require different environments,
and your solution needs to be able to deploy a specific environment and remove it when
inference is over for the compute to be available for the next model.
Real-time inference
Real-time, or interactive, inference is architecture where model inference can be triggered
at any time, and an immediate response is expected. This pattern can be used to analyze
streaming data, interactive application data, and more. This mode allows you to take
advantage of your machine learning model in real time and resolves the cold-start
problem outlined above in batch inference.
The following considerations and best practices are available if real-time inference is right
for your model:
•The challenges of real-time inference: Latency and performance requirements make
real-time inference architecture more complex for your model. A system might need to
respond in 100 milliseconds or less, during which it needs to retrieve the data, perform
inference, validate and store the model results, run any required business logic, and
return the results to the system or application.
Real-time inference
•Compute options for real-time inference: The best way to implement real-time
inference is to deploy the model in a container form to Docker or Azure Kubernetes
Service (AKS) cluster and expose it as a web service with a REST API. This way, the model
runs in its own isolated environment and can be managed like any other web service.
Docker and AKS capabilities can then be used for management, monitoring, scaling, and
more. The model can be deployed on-premises, in the cloud, or on the edge. The
preceding compute decision outlines real-time inference.
Real-time inference
•Multiregional deployment and high availability: Regional deployment and high
availability architectures need to be considered in real-time inference scenarios, as latency
and the model's performance will be critical to resolve. To reduce latency in multiregional
deployments, it's recommended to locate the model as close as possible to the
consumption point. The model and supporting infrastructure should follow the business'
high availability and DR principles and strategy.
Create a real-time
inference service
https://ceteongvanness.wordpress.com/2022/11/01/
create-a-real-time-inference-service/
References
Microsoft Docs

Weitere ähnliche Inhalte

Ähnlich wie Deploying Models

Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
Tour de France Azure PaaS 2/7 Exécuter une application
Tour de France Azure PaaS 2/7 Exécuter une applicationTour de France Azure PaaS 2/7 Exécuter une application
Tour de France Azure PaaS 2/7 Exécuter une applicationAlex Danvy
 
cloud services and providers
cloud services and providerscloud services and providers
cloud services and providersKalai Selvi
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerProvectus
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformDATAVERSITY
 
A Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud ComputingA Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud ComputingIRJET Journal
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxKnoldus Inc.
 
Modern Architecture in the Cloud of 2018
Modern Architecture in the Cloud of 2018Modern Architecture in the Cloud of 2018
Modern Architecture in the Cloud of 2018Marius Zaharia
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated MLMark Tabladillo
 
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud
Kai Wähner
 
Hyf azure ml_1
Hyf azure ml_1Hyf azure ml_1
Hyf azure ml_1KatoK1
 
AWS ML Model Deployment
AWS ML Model DeploymentAWS ML Model Deployment
AWS ML Model DeploymentKnoldus Inc.
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureMark Tabladillo
 
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...Amazon Web Services
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaData Science Milan
 
Microservice message routing on Kubernetes
Microservice message routing on KubernetesMicroservice message routing on Kubernetes
Microservice message routing on KubernetesFrans van Buul
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-useltonrodriguez11
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringDATAVERSITY
 
Dynamic resource allocation using virtual machines for cloud computing enviro...
Dynamic resource allocation using virtual machines for cloud computing enviro...Dynamic resource allocation using virtual machines for cloud computing enviro...
Dynamic resource allocation using virtual machines for cloud computing enviro...IEEEFINALYEARPROJECTS
 

Ähnlich wie Deploying Models (20)

Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Tour de France Azure PaaS 2/7 Exécuter une application
Tour de France Azure PaaS 2/7 Exécuter une applicationTour de France Azure PaaS 2/7 Exécuter une application
Tour de France Azure PaaS 2/7 Exécuter une application
 
cloud services and providers
cloud services and providerscloud services and providers
cloud services and providers
 
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMakerMLOps and Reproducible ML on AWS with Kubeflow and SageMaker
MLOps and Reproducible ML on AWS with Kubeflow and SageMaker
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
A Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud ComputingA Survey on Heuristic Based Techniques in Cloud Computing
A Survey on Heuristic Based Techniques in Cloud Computing
 
MLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptxMLops on Vertex AI Presentation (AI/ML).pptx
MLops on Vertex AI Presentation (AI/ML).pptx
 
Modern Architecture in the Cloud of 2018
Modern Architecture in the Cloud of 2018Modern Architecture in the Cloud of 2018
Modern Architecture in the Cloud of 2018
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud
Unleashing Apache Kafka and TensorFlow in the Cloud

Unleashing Apache Kafka and TensorFlow in the Cloud

 
Hyf azure ml_1
Hyf azure ml_1Hyf azure ml_1
Hyf azure ml_1
 
Cloud computing
Cloud computing Cloud computing
Cloud computing
 
AWS ML Model Deployment
AWS ML Model DeploymentAWS ML Model Deployment
AWS ML Model Deployment
 
Big Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft AzureBig Data Adavnced Analytics on Microsoft Azure
Big Data Adavnced Analytics on Microsoft Azure
 
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...
Cloud-Native DevOps: Simplifying application lifecycle management with AWS | ...
 
Serverless machine learning architectures at Helixa
Serverless machine learning architectures at HelixaServerless machine learning architectures at Helixa
Serverless machine learning architectures at Helixa
 
Microservice message routing on Kubernetes
Microservice message routing on KubernetesMicroservice message routing on Kubernetes
Microservice message routing on Kubernetes
 
DevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-usDevOps for Machine Learning overview en-us
DevOps for Machine Learning overview en-us
 
Platforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern EngineeringPlatforming the Major Analytic Use Cases for Modern Engineering
Platforming the Major Analytic Use Cases for Modern Engineering
 
Dynamic resource allocation using virtual machines for cloud computing enviro...
Dynamic resource allocation using virtual machines for cloud computing enviro...Dynamic resource allocation using virtual machines for cloud computing enviro...
Dynamic resource allocation using virtual machines for cloud computing enviro...
 

Mehr von Eng Teong Cheah

Responsible Machine Learning
Responsible Machine LearningResponsible Machine Learning
Responsible Machine LearningEng Teong Cheah
 
Machine Learning Workflows
Machine Learning WorkflowsMachine Learning Workflows
Machine Learning WorkflowsEng Teong Cheah
 
Experiments & TrainingModels
Experiments & TrainingModelsExperiments & TrainingModels
Experiments & TrainingModelsEng Teong Cheah
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine LearningEng Teong Cheah
 
Getting Started with Azure Machine Learning
Getting Started with Azure Machine LearningGetting Started with Azure Machine Learning
Getting Started with Azure Machine LearningEng Teong Cheah
 
Hacking Containers - Container Storage
Hacking Containers - Container StorageHacking Containers - Container Storage
Hacking Containers - Container StorageEng Teong Cheah
 
Hacking Containers - Looking at Cgroups
Hacking Containers - Looking at CgroupsHacking Containers - Looking at Cgroups
Hacking Containers - Looking at CgroupsEng Teong Cheah
 
Hacking Containers - Linux Containers
Hacking Containers - Linux ContainersHacking Containers - Linux Containers
Hacking Containers - Linux ContainersEng Teong Cheah
 
Data Security - Storage Security
Data Security - Storage SecurityData Security - Storage Security
Data Security - Storage SecurityEng Teong Cheah
 
Application Security- App security
Application Security- App securityApplication Security- App security
Application Security- App securityEng Teong Cheah
 
Application Security - Key Vault
Application Security - Key VaultApplication Security - Key Vault
Application Security - Key VaultEng Teong Cheah
 
Compute Security - Container Security
Compute Security - Container SecurityCompute Security - Container Security
Compute Security - Container SecurityEng Teong Cheah
 
Compute Security - Host Security
Compute Security - Host SecurityCompute Security - Host Security
Compute Security - Host SecurityEng Teong Cheah
 
Virtual Networking Security - Network Security
Virtual Networking Security - Network SecurityVirtual Networking Security - Network Security
Virtual Networking Security - Network SecurityEng Teong Cheah
 
Virtual Networking Security - Perimeter Security
Virtual Networking Security - Perimeter SecurityVirtual Networking Security - Perimeter Security
Virtual Networking Security - Perimeter SecurityEng Teong Cheah
 
Access Security - Hybrid Identity
Access Security - Hybrid IdentityAccess Security - Hybrid Identity
Access Security - Hybrid IdentityEng Teong Cheah
 

Mehr von Eng Teong Cheah (20)

Monitoring Models
Monitoring ModelsMonitoring Models
Monitoring Models
 
Responsible Machine Learning
Responsible Machine LearningResponsible Machine Learning
Responsible Machine Learning
 
Training Optimal Models
Training Optimal ModelsTraining Optimal Models
Training Optimal Models
 
Machine Learning Workflows
Machine Learning WorkflowsMachine Learning Workflows
Machine Learning Workflows
 
Working with Compute
Working with ComputeWorking with Compute
Working with Compute
 
Working with Data
Working with DataWorking with Data
Working with Data
 
Experiments & TrainingModels
Experiments & TrainingModelsExperiments & TrainingModels
Experiments & TrainingModels
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Getting Started with Azure Machine Learning
Getting Started with Azure Machine LearningGetting Started with Azure Machine Learning
Getting Started with Azure Machine Learning
 
Hacking Containers - Container Storage
Hacking Containers - Container StorageHacking Containers - Container Storage
Hacking Containers - Container Storage
 
Hacking Containers - Looking at Cgroups
Hacking Containers - Looking at CgroupsHacking Containers - Looking at Cgroups
Hacking Containers - Looking at Cgroups
 
Hacking Containers - Linux Containers
Hacking Containers - Linux ContainersHacking Containers - Linux Containers
Hacking Containers - Linux Containers
 
Data Security - Storage Security
Data Security - Storage SecurityData Security - Storage Security
Data Security - Storage Security
 
Application Security- App security
Application Security- App securityApplication Security- App security
Application Security- App security
 
Application Security - Key Vault
Application Security - Key VaultApplication Security - Key Vault
Application Security - Key Vault
 
Compute Security - Container Security
Compute Security - Container SecurityCompute Security - Container Security
Compute Security - Container Security
 
Compute Security - Host Security
Compute Security - Host SecurityCompute Security - Host Security
Compute Security - Host Security
 
Virtual Networking Security - Network Security
Virtual Networking Security - Network SecurityVirtual Networking Security - Network Security
Virtual Networking Security - Network Security
 
Virtual Networking Security - Perimeter Security
Virtual Networking Security - Perimeter SecurityVirtual Networking Security - Perimeter Security
Virtual Networking Security - Perimeter Security
 
Access Security - Hybrid Identity
Access Security - Hybrid IdentityAccess Security - Hybrid Identity
Access Security - Hybrid Identity
 

Kürzlich hochgeladen

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 

Deploying Models

  • 3. Inferencing? In machine learning, inferencing refers to the use of a trained model to predict labels for new data on which the model has not been trained. Often, the model is deployed as part of a service that enables applications to request immediate, or real-time, predictions for individual, or small numbers of data observations. In Azure Machine Learning, you can create real-time inferencing solutions by deploying a model as a service, hosted in a containerized platform, such as Azure Kubernetes Services (AKS).
  • 4. Deploying a Real-Time Inferencing Service
  • 5. Machine learning inference during deployment When deploying your AI model during production, you need to consider how it will make predictions. The two main processes for AI models are: •Batch inference: An asynchronous process that bases its predictions on a batch of observations. The predictions are stored as files or in a database for end users or business applications. •Real-time (or interactive) inference: Frees the model to make predictions at any time and trigger an immediate response. This pattern can be used to analyze streaming and interactive application data.
  • 6. Machine learning inference during deployment Consider the following questions to evaluate your model, compare the two processes, and select the one that suits your model: •How often should predictions be generated? •How soon are the results needed? •Should predictions be generated individually, in small batches, or in large batches? •Is latency to be expected from the model? •How much compute power is needed to execute the model? •Are there operational implications and costs to maintain the model?
  • 7. Batch inference Batch inference, sometimes called offline inference, is a simpler inference process that helps models to run in timed intervals and business applications to store predictions. Consider the following best practices for batch inference: •Trigger batch scoring: Use Azure Machine Learning pipelines and the ParallelRunStep feature in Azure Machine Learning to set up a schedule or event- based automation. •Compute options for batch inference: Since batch inference processes don't run continuously, it's recommended to automatically start, stop, and scale reusable clusters that can handle a range of workloads. Different models require different environments, and your solution needs to be able to deploy a specific environment and remove it when inference is over for the compute to be available for the next model.
  • 8. Real-time inference Real-time, or interactive, inference is architecture where model inference can be triggered at any time, and an immediate response is expected. This pattern can be used to analyze streaming data, interactive application data, and more. This mode allows you to take advantage of your machine learning model in real time and resolves the cold-start problem outlined above in batch inference. The following considerations and best practices are available if real-time inference is right for your model: •The challenges of real-time inference: Latency and performance requirements make real-time inference architecture more complex for your model. A system might need to respond in 100 milliseconds or less, during which it needs to retrieve the data, perform inference, validate and store the model results, run any required business logic, and return the results to the system or application.
  • 9. Real-time inference •Compute options for real-time inference: The best way to implement real-time inference is to deploy the model in a container form to Docker or Azure Kubernetes Service (AKS) cluster and expose it as a web service with a REST API. This way, the model runs in its own isolated environment and can be managed like any other web service. Docker and AKS capabilities can then be used for management, monitoring, scaling, and more. The model can be deployed on-premises, in the cloud, or on the edge. The preceding compute decision outlines real-time inference.
  • 10. Real-time inference •Multiregional deployment and high availability: Regional deployment and high availability architectures need to be considered in real-time inference scenarios, as latency and the model's performance will be critical to resolve. To reduce latency in multiregional deployments, it's recommended to locate the model as close as possible to the consumption point. The model and supporting infrastructure should follow the business' high availability and DR principles and strategy.
  • 11. Create a real-time inference service https://ceteongvanness.wordpress.com/2022/11/01/ create-a-real-time-inference-service/