Deploy ML models to k8s and SageMaker with a single line of code
Deploying ML models can be a challenge, especially to advanced platforms such as AWS SageMaker or Kubernetes. MLEM gives you a simple and powerful API to do just that, handling the complex machinery for you.
Header image generated by Dall-E 2
To deploy to a cloud platform, you have to learn how it works, its secrets, and its quirks. To keep up with your daily Swiss-army-knife ML duties, you end up writing complicated bash scripts, figuring out which arguments to supply to the platform's CLI tool or API methods, calling them in the right order, and endlessly extending your knowledge to yet another cloud platform tool.
But it doesn't always have to be that way. Tools like Terraform manage infrastructure in a cloud-agnostic way, so why can't we have the same for MLOps?
That's why we're releasing new deployment mechanics for MLEM, along with four supported deployment targets: Docker containers, Heroku, Kubernetes, and AWS SageMaker.
Deploying with a single command
MLEM strives to abstract away all the steps you'd otherwise do by hand for deployment. For Kubernetes, the only prerequisite is a kubectl configured with your cluster's address and credentials.
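A quick way to confirm the cluster is reachable before deploying (plain kubectl commands, nothing MLEM-specific; the context name and URL in the output are just examples):

$ kubectl config current-context
my-cluster
$ kubectl cluster-info
Kubernetes control plane is running at https://...

With that in place, you can deploy your model as simply as: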
$ mlem deployment run kubernetes app.mlem \
--model model --service_type loadbalancer
⏳️ Loading model from model.mlem
💾 Saving deployment to service_name.mlem
🛠 Creating docker image app
🛠 Building MLEM wheel file...
💼 Adding model files...
🛠 Generating dockerfile...
💼 Adding sources...
💼 Generating requirements file...
🛠 Building docker image app:4ee45dc33804b58ee2c7f2f6be447cda...
✅ Built docker image app:4ee45dc33804b58ee2c7f2f6be447cda
namespace created. status='{'conditions': None, 'phase': 'Active'}'
deployment created. status='{'available_replicas': None,
'collision_count': None,
'conditions': None,
'observed_generation': None,
'ready_replicas': None,
'replicas': None,
'unavailable_replicas': None,
'updated_replicas': None}'
service created. status='{'conditions': None, 'load_balancer': {'ingress': None}}'
✅ Deployment app is up in mlem namespace
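You can inspect or tear down the deployment later with the companion subcommands (output abbreviated as a sketch; check mlem deployment --help for the exact options in your MLEM version):

$ mlem deployment status app.mlem
running
$ mlem deployment remove app.mlem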
The app.mlem file captures everything we specified about the deployment. Later we can use it to deploy a new model version.
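For instance, when a retrained model is ready, one plausible flow is re-running the same command against the saved declaration, which rebuilds the image and rolls it out (model-v2 is a hypothetical path to the new model; only flags from the command above are used):

$ mlem deployment run kubernetes app.mlem \
    --model model-v2 --service_type loadbalancer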
This created deployment and service resources in the cluster. Let's check out the pods created by the deployment (all the resources are placed in the mlem namespace by default):
$ kubectl get pods --namespace mlem
NAME                  READY   STATUS    RESTARTS   AGE
app-cddbcc89b-zkfhx   1/1     Running   0          5m58s
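Because we asked for a LoadBalancer service type, the address to reach the model comes from the service resource. A sketch of how to find it (the IPs and ports below are illustrative placeholders, not real values):

$ kubectl get service --namespace mlem
NAME   TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)
app    LoadBalancer   10.0.171.39   203.0.113.10   8080:30080/TCP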
Getting predictions
Since the deployed model is reachable over HTTP, we can simply open its URL to browse the OpenAPI spec, or send requests to get predictions. We can also use built-in MLEM functionality to achieve the same:
$ mlem deployment apply app.mlem data.csv --json
[0, 1, 2]
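Under the hood this is a plain HTTP call, so any client works too. A hedged sketch with curl: the endpoint is named after the model method, and the exact payload schema depends on your model's signature, so the address, port, and values below are placeholders:

$ curl -X POST http://<external-ip>:<port>/predict \
    -H 'Content-Type: application/json' \
    -d '{"data": [[5.1, 3.5, 1.4, 0.2]]}'
[0]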
Extend your learning
That's it: deployment to cloud providers is as simple as it can be. MLEM simplifies your daily routine and helps you focus on developing models rather than getting into the DevOps weeds.
- To learn how MLEM can help you, try out the Get Started Tutorial or Use Cases.
- To see a full-scale tutorial for Kubernetes, SageMaker, or Heroku, check out our User Guide.
- To quickly get your questions answered, reach us in Discord or GitHub issues.
What’s next?
It’s been five months since we released MLEM on the 1st of June, and now it’s October 31st already. With all these big deployment targets, MLEM finally looks like a formidable little dog 🎃. What’s next on the agenda?
- We're going to work on an end-to-end Computer Vision scenario. Think training a neural network to classify images, saving it with MLEM, and deploying it to K8s or SageMaker.
- We are going to share how to use MLEM when your model consists of two parts: preprocessing and inference.
- Batch processing is something we've received many requests for. We'll set up an example of how to use MLEM with Airflow and publish it. 📚
Happy to hear your thoughts on this!
Machine Learning should be mlemming scary! Once a year only.