Model Deployment with AWS SageMaker

Write a Custom Container

We can deploy our model in AWS SageMaker using a custom container. Inside the container we create a folder ‘/opt/program’ that holds the following files:

serve: starts the server API
predictor.py: defines the Flask REST API

When SageMaker runs the container it invokes the command “serve”, which starts the REST API. The file predictor.py loads the pickled model and implements a Flask API with the two routes that SageMaker expects:

[GET] /ping
[POST] /invocations
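A minimal sketch of what predictor.py could look like is shown here. It assumes that the model object exposes a scikit-learn style predict method and that requests carry a JSON body of the form {"data": [[...], ...]}. The names predict_rows and create_app are illustrative and not taken from the original code.

```python
import json


def predict_rows(model, payload):
    """Run the model on payload["data"] and return a JSON-serialisable dict."""
    predictions = model.predict(payload["data"])
    # NumPy arrays are not JSON-serialisable, so convert them to plain lists.
    if hasattr(predictions, "tolist"):
        predictions = predictions.tolist()
    return {"predictions": list(predictions)}


def create_app(model):
    """Build a Flask app exposing the two routes SageMaker calls."""
    # Imported here so predict_rows stays usable without Flask installed.
    from flask import Flask, Response, request

    app = Flask(__name__)

    @app.route("/ping", methods=["GET"])
    def ping():
        # Health check: SageMaker expects HTTP 200 from a healthy container.
        return Response(response="\n", status=200, mimetype="application/json")

    @app.route("/invocations", methods=["POST"])
    def invocations():
        payload = request.get_json(silent=True)
        if not isinstance(payload, dict) or "data" not in payload:
            body = json.dumps({"error": 'expected JSON like {"data": [[...]]}'})
            return Response(response=body, status=400, mimetype="application/json")
        body = json.dumps(predict_rows(model, payload))
        return Response(response=body, status=200, mimetype="application/json")

    return app
```

The serve script would then hand the object returned by create_app to a WSGI server listening on port 8080.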

The pickled model can be copied directly into the container, to a folder of our choice. Alternatively, it can be stored in an S3 bucket as a tar.gz archive and passed to SageMaker as a model artifact. SageMaker then downloads the archive from S3 and extracts it into the folder ‘/opt/ml/model’. Therefore, if we pass the model as an artifact, the predictor module needs to unpickle the file from ‘/opt/ml/model’.
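The loading step can be sketched as follows. This assumes the pickled file is named model.pkl, as in the Dockerfile of this post; the helper names find_model_file and load_model are hypothetical. Unpickling executes code contained in the file, so this should only be done with artifacts we produced ourselves.

```python
import pickle
from pathlib import Path

# Folder into which SageMaker extracts the model artifact.
DEFAULT_MODEL_DIR = Path("/opt/ml/model")


def find_model_file(model_dir=DEFAULT_MODEL_DIR, filename="model.pkl"):
    """Return the path of the pickled model, failing early if it is missing."""
    path = Path(model_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"no model file at {path}")
    return path


def load_model(model_dir=DEFAULT_MODEL_DIR, filename="model.pkl"):
    """Unpickle the model found in model_dir."""
    with open(find_model_file(model_dir, filename), "rb") as f:
        return pickle.load(f)
```

If the model was copied to another folder inside the image, that folder can be passed as model_dir.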

The Dockerfile has the basic structure:

FROM ubuntu:latest

RUN apt-get -y update && apt-get install -y --no-install-recommends \
         wget \
         python3 \
         python3-pip \
         nginx \
         ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install Python libraries
COPY requirements.txt /opt/program/
RUN python3 -m pip install -r /opt/program/requirements.txt && \
        rm -rf /root/.cache

ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"

# Copy the model to /opt/ml/model or another folder
COPY model.pkl /opt/ml/model/
# Set up the program in the image
COPY model-files /opt/program
WORKDIR /opt/program
RUN chmod +x serve

CMD [ "serve" ]

We can run the container locally and test the API:

# Build the image
docker build -t sagemaker-model .
# Run the container and publish port 8080
docker run -p 8080:8080 sagemaker-model:latest

Now we can access the API at 127.0.0.1:8080:

curl --location --request POST 'http://localhost:8080/invocations' \
--header 'Content-Type: application/json' \
--data-raw '{"data": [[1,2],[3,4],[3,3],[10,1],[7,8]]}'
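The same request can also be sent from Python. The sketch below uses only the standard library; the function names build_invocation_request and invoke are illustrative, and the shape of the response depends on how predictor.py is written.

```python
import json
import urllib.request


def build_invocation_request(rows, base_url="http://localhost:8080"):
    """Build the POST request for /invocations without sending it."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(rows, base_url="http://localhost:8080", timeout=10):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_invocation_request(rows, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

With the container running locally, invoke([[1, 2], [3, 4], [3, 3], [10, 1], [7, 8]]) sends the same payload as the curl command.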

SageMaker Deployment

First we need to push our Docker image to our AWS ECR repository. Assuming that we have already created a repository with the URI "aws_account_id".dkr.ecr."region".amazonaws.com/"name-model", we tag the Docker image with the same repository URI, that is,

docker tag sagemaker-model:latest "aws_account_id".dkr.ecr."region".amazonaws.com/sagemaker-model:latest

and then push it to the ECR repository (this presupposes that Docker is already logged in to ECR)

docker push "aws_account_id".dkr.ecr."region".amazonaws.com/sagemaker-model:latest
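Because the tag and the push must use exactly the same URI, it can help to build that string in one place. The helper below is a small illustrative sketch; the function name and the example account ID and region are made up.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Assemble the ECR image URI from its parts."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def docker_commands(local_image, image_uri):
    """Return the tag and push commands for a given URI as strings."""
    return [
        f"docker tag {local_image} {image_uri}",
        f"docker push {image_uri}",
    ]
```

For example, docker_commands("sagemaker-model:latest", ecr_image_uri("123456789012", "eu-west-1", "sagemaker-model")) yields both commands with an identical target URI.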

Now that we have uploaded the Docker image, we can go to the SageMaker section and create a Model, an Endpoint Configuration, and finally deploy the model to an Endpoint.

Create Model

We give the model a name, then we choose “Provide model artifacts and image location”, since we want to use our own container. Last, we choose “single model” and enter the URI of the Docker image. Since our container already contains the pickled model, we do not need to write anything in the box “Location of model artifacts”.
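The same step can be scripted with boto3 instead of the console. The sketch below assumes a client created with boto3.client("sagemaker") and an IAM execution role that SageMaker may assume; the function name register_model and all argument values are placeholders.

```python
def register_model(client, model_name, image_uri, role_arn, model_data_url=None):
    """Create a SageMaker Model that points at our container image.

    client is expected to be a boto3 SageMaker client.
    """
    container = {"Image": image_uri}
    # Only needed when the pickled model is passed as an S3 artifact
    # instead of being baked into the image.
    if model_data_url:
        container["ModelDataUrl"] = model_data_url
    return client.create_model(
        ModelName=model_name,
        PrimaryContainer=container,
        ExecutionRoleArn=role_arn,
    )
```

Leaving model_data_url empty corresponds to leaving the “Location of model artifacts” box blank.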

Endpoint Configuration

We give it a name and then choose the model that we created in the previous step. At this point we also need to choose the instance type that will run the container.
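In boto3 this step could look roughly like the sketch below. It again assumes a client from boto3.client("sagemaker"); the function name, the variant name "AllTraffic" and the instance type ml.t2.medium are example values only.

```python
def register_endpoint_config(client, config_name, model_name,
                             instance_type="ml.t2.medium", instance_count=1):
    """Create an endpoint configuration with a single production variant."""
    return client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[
            {
                "VariantName": "AllTraffic",  # example name for the only variant
                "ModelName": model_name,      # the Model created beforehand
                "InitialInstanceCount": instance_count,
                "InstanceType": instance_type,
            }
        ],
    )
```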

Endpoint

Give the endpoint a name and then choose an existing endpoint configuration, namely the one we created previously.

Then choose “Create Endpoint”.
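Scripted with boto3, creating the endpoint and waiting until it is ready might look like this. The sketch assumes a client from boto3.client("sagemaker"); the function name deploy_endpoint is hypothetical.

```python
def deploy_endpoint(client, endpoint_name, config_name, wait=True):
    """Create the endpoint and optionally block until it is in service."""
    response = client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    if wait:
        # The waiter polls the endpoint status until it is in service
        # and raises an error if the deployment fails.
        client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
    return response
```

Provisioning the instance usually takes several minutes, so the waiting call can block for a while.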

Access Endpoint

Now that the model is deployed and the endpoint status is “InService”, we can build an API that calls the container endpoint. There are essentially two ways of doing this:

1) We can invoke the SageMaker endpoint directly. For this we need to create a role with permission to invoke the SageMaker endpoint.

2) We can create an API Gateway REST API with a Lambda function that calls the SageMaker endpoint.
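In both cases the actual call to the endpoint can be made with the boto3 runtime client. The following is a sketch under the assumption that the container accepts and returns JSON; the function name call_endpoint is illustrative, and the client would be created with boto3.client("sagemaker-runtime").

```python
import json


def call_endpoint(runtime_client, endpoint_name, rows):
    """Send rows to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"data": rows}).encode("utf-8"),
    )
    # The response body is a stream and has to be read before decoding.
    return json.loads(response["Body"].read())
```

Inside a Lambda function the same helper could be called from the handler, with the Lambda execution role carrying the permission to invoke the endpoint.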

1. Invoke SageMaker directly

In this case the AWS user must have permission to invoke the SageMaker endpoint. We also need this user's credentials, Access_Key_id and Secret_access_key. In Postman the request looks like