
Model deployment


Let's say that you have spent 1000 GPU hours and trained the most awesome model that you want to share with the world. One way to do this is, of course, to place all your code in a GitHub repository, upload a file with the trained model weights to your favorite online storage (assuming it is too big for GitHub to handle) and ask people to download the code and the weights and run everything themselves. This is a fine approach in a small research setting, but in production you need to be able to deploy the model to a fully self-contained environment, such that people can execute it without looking (too hard) at the code.



In this session we look at methods specialized for deploying models on your local machine, and also at how to deploy services in the cloud.

Learning objectives

The learning objectives of this session are:

  • Understand the basics of requests and APIs
  • Be able to create custom APIs using the framework fastapi and run them locally
  • Understand serverless deployment and how to deploy custom APIs using both serverless functions and serverless containers