Deploying PyTorch in Python via a REST API with PESTO
Abstract
This tutorial is inspired by the official Deploying PyTorch in Python via a REST API with Flask tutorial.
In this walkthrough, you will be guided through using PESTO to package your deep learning model so that it is ready for deployment in production. You will be able to send processing requests to your newly created web service embedding your own inference model.
This model is a ResNet-50 CNN trained on ImageNet; it takes an image as input and returns one of the 1000 ImageNet classes.
During this tutorial you will learn to:
- Package a model using PESTO
- Define the input and output API of your web service
- Generate the web service Docker image
- Deploy your web service
- Send requests & get responses from the service
Install PESTO
First, ensure you have PESTO installed in a Python 3.6+ environment. Typically, you can use Miniconda as a virtual env.
Note
You should have Docker Community Edition installed and configured on your machine. Refer to the Docker documentation for more details.
To install PESTO with pip (see Get Started):
$ pip install processing-factory
Create PESTO project
Next, initialize your PESTO project in the desired directory.
$ pesto init {PESTO_root_projects_repository_path}
You are prompted for some information to fill the default template. Here's an example of the output:
---------------------------------------------------------------------------------------------------------------------------
____ _____ ____ _____ ___ ____ _ __ _
| _ \| ____/ ___|_ _/ _ \ _ | _ \ _ __ ___ ___ ___ ___ ___(_)_ __ __ _ / _| __ _ ___| |_ ___ _ __ _ _
| |_) | _| \___ \ | || | | | (_) | |_) | '__/ _ \ / __/ _ \/ __/ __| | '_ \ / _` | | |_ / _` |/ __| __/ _ \| '__| | | |
| __/| |___ ___) || || |_| | _ | __/| | | (_) | (_| __/\__ \__ \ | | | | (_| | | _| (_| | (__| || (_) | | | |_| |
|_| |_____|____/ |_| \___/ (_) |_| |_| \___/ \___\___||___/___/_|_| |_|\__, | |_| \__,_|\___|\__\___/|_| \__, |
|___/ |___/
----- ProcESsing facTOry : 1.4.3 -------------------------------------------------------------------------------------
Please fill necessary information to initialize your template
maintainer_fullname [pesto]: Computer Vision
maintainer_email [pesto@airbus.com]: computervision@airbus.com
project_name [algo-service]: pytorch-deployment-tutorial
project_sname [pytorch-deployment-tutorial]:
project_short_description [Pesto Template contains all the boilerplate you need to create a processing-factory project]: My first deployment with PESTO
project_version [1.0.0.dev0]: 1.0.0
[2022-12-13 17:44:24,345] 28731-INFO app::init():l44:
Service generated at /tmp/pesto/pytorch-deployment-tutorial
It generates the default template in a folder pytorch-deployment-tutorial with the following file structure:
pytorch-deployment-tutorial/
├── algorithm
│ ├── __init__.py
│ ├── input_output.py
│ └── process.py
├── __init__.py
├── Makefile
├── MANIFEST.in
├── pesto
│ ├── api
│ ├── build
│ └── tests
├── README.md
├── requirements.txt
└── setup.py
You can recognize a Python package, with a package named algorithm and a module algorithm.process. The main processing is defined here (in Python, using your custom libraries if you want to do so).
The folder pesto includes the necessary resources to build the docker image containing the service:
- pesto/api specifies the input/output of our process in terms of RESTful API
- pesto/build specifies resources, docker images, etc., so that PESTO can build the service with the correct dependencies
- pesto/tests contains resources to test & debug your service once built, as well as helper scripts
Your Custom Processing code
Tip
Due to the way pesto-defined files are loaded in our process.py (as well as custom dependencies unpacked at specific locations), it is hard to test the custom processing code locally without rewriting part of it. This is a known development difficulty; we recommend wrapping your codebase in a custom library or package and writing as little code as possible in process.py (loading models, calling the prediction library, then formatting the result properly).
First, we will specify our inference pipeline. Our objective is to use a pretrained convolutional neural network (a ResNet-50) from torchvision to predict classes for images fed to it.
The model was trained on ImageNet, so it should return one of 1000 classes when presented with an image.
We will load our model using the included checkpoint-loading function of torchvision, as well as a json file containing the mapping between class indexes and class names (which is stored in /etc/pesto/config.json, more on that later).
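A minimal sketch of this loading code (assuming config.json holds the raw ImageNet class index, as described later in this tutorial):

```python
import json

import torchvision.models as models

# Mapping from class index to [synset_id, class_name]; PESTO deploys
# the config file at /etc/pesto/config.json inside the container.
with open("/etc/pesto/config.json") as f:
    class_index = json.load(f)

# Pretrained ResNet-50: torchvision reads the checkpoint from
# $TORCH_HOME/checkpoints if present, and downloads it otherwise.
model = models.resnet50(pretrained=True)
model.eval()  # switch to inference mode
```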
The ResNet model requires a 3-channel RGB image of size 224 x 224. We will also normalize the image with the standard ImageNet statistics.
Info
Should you require more information, please refer to the original tutorial as well as the PyTorch documentation.
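The preprocessing can follow the original Flask tutorial; a sketch:

```python
import torchvision.transforms as transforms

def transform_image(pil_image):
    # Resize, center-crop to 224 x 224 and normalize with the
    # standard ImageNet mean and standard deviation.
    preprocessing = transforms.Compose([
        transforms.Resize(255),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    return preprocessing(pil_image).unsqueeze(0)  # add a batch dimension
```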
Now, getting predictions from this model is simple:
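For instance, along these lines (the helper name is ours):

```python
import torch

def get_prediction(model, class_index, image_tensor):
    with torch.no_grad():              # no gradients needed at inference
        outputs = model(image_tensor)  # raw class scores, shape (1, 1000)
    _, predicted_idx = outputs.max(1)  # best-scoring class index
    # class_index maps "idx" -> [synset_id, class_name]
    return class_index[str(predicted_idx.item())][1]
```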
Now we need to wire these functions up so that PESTO can properly call them.
Look at algorithm/process.py. This is the module that PESTO loads inside our server and calls during processing.
There is a Process class with on_start() and process() methods.
The on_start() method is called on the first processing request; it is typically used to load resources.
The process() function is called on each call to /api/v1/process, when we actually want to process input data.
We want to integrate our previous code into this structure, so your algorithm/process.py file should look like the listing below (replace the existing process.py file with this code, or write your own).
Note
We did not load the model in the Process class, so each method inside the Process class is static.
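A condensed sketch of such a process.py (the real listing is longer; it assumes the transform_image and get_prediction helpers from the previous snippets live in this module):

```python
import json

import numpy as np
import torchvision.models as models
from PIL import Image

from algorithm.input_output import Input, Output

# Module-level state: the model is deliberately not stored on the
# Process class, which is why all its methods can be static.
model = None
class_index = None


class Process:
    @staticmethod
    def on_start() -> None:
        # Called once, on the first processing request.
        global model, class_index
        model = models.resnet50(pretrained=True)
        model.eval()
        with open("/etc/pesto/config.json") as f:
            class_index = json.load(f)

    @staticmethod
    def process(input: Input) -> Output:
        # PESTO delivers images as (C, H, W) arrays: transpose to
        # (H, W, C) for PIL (uint8 pixels assumed).
        array = np.asarray(input.image).transpose((1, 2, 0))
        tensor = transform_image(Image.fromarray(array))
        return Output(category=get_prediction(model, class_index, tensor))
```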
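And a plausible input_output.py declaring the typed classes that pesto schemagen reads (the exact base types PESTO expects may differ):

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Input:
    image: np.ndarray  # one image, provided as a (C, H, W) array


@dataclass
class Output:
    category: str  # predicted ImageNet class name
```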
About Image Format
PESTO decodes input requests in a specific way: images are provided to the algorithm in (Channel, Height, Width) format, contrary to the usual (Height, Width, Channel) format. This means a transposition is required to wrap them in PIL format, for example.
The easiest way to do so is to call image = image.transpose((1, 2, 0))
Generating the input & output schemas
PESTO needs the input and output json schemas to specify the algorithm API to end users. They can be written by hand by editing the files. However, since we used the Input and Output classes in process()'s signature, we can benefit from pesto schemagen to generate the schema files:
pesto schemagen --force {PESTO_root_projects_repository_path}
The generated schemas are in api/input_schema.json and api/output_schema.json:
Input and output json schemas
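The input schema should look something like this sketch (the Image definition comes from PESTO's built-in definitions; exact contents may differ):

```json
{
  "image": {
    "$ref": "#/definitions/Image",
    "description": "image"
  }
}
```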
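And the output schema, a single string field (again a sketch):

```json
{
  "category": {
    "type": "string",
    "description": "category"
  }
}
```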
Visit the schemagen's checklist to understand how to benefit from the schemagen mechanism.
Configuring the Processing API for our service
Now that we have implemented our processing, we will configure the web service API to map the REST API to the processing API.
Let's have a look at the pesto/api folder:
pesto/api/
├── config.json
├── config_schema.json
├── description.json
├── description.stateful.json
├── input_schema.json
├── output_schema.json
├── user_definitions.json
└── version.json
- config.json is a static file which will be made available to the process (at /etc/pesto/config.json in the container)
- config_schema.json is a json schema file that specifies what config.json should look like
- description.json is a json file that contains information about our processing
- input_schema.json is the specification of the input payload necessary to run Process.process(). It will be used to specify what should be sent to the webserver
- output_schema.json is the specification of the output response of the processing
- user_definitions.json contains the user definitions (reusable JSON schema objects)
- description.stateful.json is an alternative description that will be used with a different profile. Later parts of the tutorial address this point specifically.
For more information on jsonschema please refer to the official documentation
config.json
In config.json you can put any information that will be used later to configure your algorithm. This is useful in conjunction with profiles, should you need different configurations for different profiles.
In our use case, we will simply put all the ImageNet classes in this file, so that they are readily accessible to the webservice.
Download this file and copy it as config.json.
The ImageNet classes can then be loaded in process.py as follows:
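For example (assuming config.json holds the class index directly):

```python
import json

# PESTO makes pesto/api/config.json available inside the container
# at /etc/pesto/config.json.
with open("/etc/pesto/config.json") as f:
    class_index = json.load(f)
```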
Now we have to define the json schema (config_schema.json) that validates it. Here it is:
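If config.json is the raw ImageNet class index (a mapping from an index to a [synset_id, class_name] pair), the schema could look like this sketch:

```json
{
  "$schema": "http://json-schema.org/schema#",
  "type": "object",
  "additionalProperties": {
    "type": "array",
    "items": {
      "type": "string"
    },
    "minItems": 2,
    "maxItems": 2
  }
}
```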
Tip
You can use the following code snippet to check json schema validity in Python:
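A minimal validation script using the jsonschema package:

```python
import json

from jsonschema import validate

with open("pesto/api/config.json") as f:
    config = json.load(f)

with open("pesto/api/config_schema.json") as f:
    schema = json.load(f)

# Raises jsonschema.exceptions.ValidationError if config does not
# match the schema.
validate(instance=config, schema=schema)
print("config.json is valid")
```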
description.json
The description.json file contains information that describes your processing. Feel free to fill in as much information as possible. Note that this information is informative only and is not used anywhere, except for the stateful key, which has to be set. For now, leave it at false; we will come back to it later.
Here is an example of a description.json file that you can copy:
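Something along these lines (all field names except stateful are illustrative):

```json
{
  "title": "pytorch-deployment-tutorial",
  "description": "Classify images with a ResNet-50 trained on ImageNet",
  "email": "computervision@airbus.com",
  "organization": "Airbus",
  "keywords": ["pytorch", "classification", "imagenet"],
  "stateful": false
}
```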
Defining our packaging & dependencies
Now that we have specified our API, let's move on to the building part of PESTO.
The principle of PESTO is that a Docker image with a webservice embedding your processing is constructed when you call pesto build. The next steps configure PESTO so that it builds a correct Docker image.
Python dependencies for the project & requirements.txt
The project we created is a Python package and will be installed as such inside our docker image. It is possible to specify the Python requirements directly in requirements.txt, as it will be parsed when doing pip install {our project}.
The alternative method would be to provide a docker base image with everything already installed, but this is a more advanced usage.
For now, the requirements.txt file at the root of our project should look like this:
numpy
Pillow
torch>=1.5.0
torchvision>=0.6.0
Now let's look at the pesto/build folder:
build/
├── build.json
├── requirements.gpu.json
└── requirements.json
Service Name & build.json
The build.json file contains automatically generated information that will be used by PESTO later. You should not have to modify it, except if you want to change the version.
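It should resemble the following (only name and version matter here; other generated fields may be present):

```json
{
  "name": "pytorch-deployment-tutorial",
  "version": "1.0.0.dev0"
}
```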
The docker image you will generate during build will be tagged name:version.
File Dependencies & requirements.json
There are two requirements.json files automatically generated. requirements.gpu.json defines a profile for GPU support; we will see later how to configure it.
The default requirements.json file looks like this:
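Roughly as follows (a sketch of the generated defaults):

```json
{
  "environments": {},
  "requirements": {},
  "dockerBaseImage": "python:3.6-stretch"
}
```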
dockerBaseImage points to a docker image that will be used to build the webservice. PESTO inherits from this base image to install itself as well as the process and its dependencies. For now, python:3.6-stretch is a good starting point, as it contains the necessary resources. You can pass a custom docker image to this step, provided your docker client can access it.
environments is used to set environment variables. We will set the $TORCH_HOME environment variable to ensure we know its location. The $TORCH_HOME variable is used by torchvision to download weights to specific locations; check the torch Hub documentation for more details.
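For example (an excerpt of requirements.json):

```json
"environments": {
  "TORCH_HOME": "/opt/torch"
}
```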
requirements is helpful for adding static resources such as model weights and configs, as well as custom Python packages. For now, requirements supports two types of resources:
- static resources inside .tar.gz archives, which will be uncompressed in the environment
- Python wheel (.whl) files, which can be pip-installed
To try this mechanism ourselves, we will download the weights for our model. Torchvision models do this automatically when called if the weights are not in $TORCH_HOME, but in our case we will provide the weights ourselves so that no download happens at runtime.
First, download this file
wget https://download.pytorch.org/models/resnet50-19c8e357.pth
Then put it into a .tar.gz archive accessible via either a URI (file://) or a URL (gs:// and http:// are supported for now). Note that if you deploy a model to production, we recommend uploading resources to a server or committing them alongside your project so that everything is 100% reproducible.
tar -zcvf checkpoint.tar.gz resnet50-19c8e357.pth
Now, note the URI of this checkpoint.tar.gz. We want to uncompress this file into /opt/torch/checkpoints/ in our docker image. So your requirements file will look like:
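Assuming the from/to layout of PESTO requirement entries, and with the file:// path replaced by your own URI:

```json
{
  "environments": {
    "TORCH_HOME": "/opt/torch"
  },
  "requirements": {
    "checkpoints": {
      "from": "file:///path/to/checkpoint.tar.gz",
      "to": "/opt/torch/checkpoints"
    }
  },
  "dockerBaseImage": "python:3.6-stretch"
}
```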
Building the Service
Now we have everything we need to build our service. The building part is simple:
pesto build {root of your project}/pytorch-deployment-tutorial
There will be a lot of logging informing you about what is happening. PESTO uses /home/$USER/.pesto/{process name}/{process version} to store the resources needed to build the docker image containing the service.
Example in our case:
pytorch-deployment-tutorial/
└── 1.0.0.dev0
├── checkpoints
│ └── resnet50-19c8e357.pth
├── dist
│ └── pesto_cli-1.0.0rc1-py3-none-any.whl
├── Dockerfile
├── pesto
│ └── api_geo_process_v1.0.yaml
├── pytorch-deployment-tutorial
│ ├── algorithm
│ │ ├── __init__.py
│ │ └── process.py
│ ├── __init__.py
│ ├── Makefile
│ ├── MANIFEST.in
│ ├── pesto
│ │ ├── api
│ │ │ ├── config.json
│ │ │ ├── config_schema.json
│ │ │ ├── description.json
│ │ │ ├── input_schema.json
│ │ │ ├── output_schema.json
│ │ │ ├── service.json
│ │ │ └── version.json
│ │ ├── build
│ │ │ ├── build.json
│ │ │ └── requirements.json
│ │ └── tests
│ │ ├── (...)
│ ├── README.md
│ ├── requirements.txt
│ └── setup.py
└── requirements
└── checkpoint.tar.gz
If the docker build fails, you can debug your service directly in this folder.
If the build succeeds, you should be able to see your image with docker image ls:
REPOSITORY TAG IMAGE ID CREATED SIZE
pytorch-deployment-tutorial 1.0.0.dev0 08342775a658 4 minutes ago 3.48GB
Testing and Usage
Now we want to test if everything goes well, which means:
- Launching the docker container and checking that it responds to http requests
- Checking that the process we just deployed is working correctly
Fortunately, PESTO features a test/usage framework, which is the purpose of the pesto/tests folder.
Booting up the container & first http requests
First, we can verify that we are able to start the container and send very basic requests to it.
Run docker run --rm -p 4000:8080 pytorch-deployment-tutorial:1.0.0.dev0 (check the docker documentation should you need help with the various arguments).
This starts the container so that it can be accessed from http://localhost:4000.
In your browser (or using curl) you can send basic GET requests to your container, such as:
- http://localhost:4000/api/v1/health (curl -X GET http://localhost:4000/api/v1/health), which should answer "OK"
- http://localhost:4000/api/v1/describe (curl -X GET http://localhost:4000/api/v1/describe), which should return a json file
It is recommended that you save this json file; it will be used later on: curl -X GET http://localhost:4000/api/v1/describe > description.json
Now the question is: how can I send a properly formatted processing request with the payload (my image) that I want to send?
Tip
If you already know all about base64 encoding and sending URIs with POST requests, feel free to skip this part.
For the next parts, you can safely stop your running container.
Defining Test Resources
Let's take a look at the pesto/tests directory:
tests
├── README.md
├── resources/
│ ├── expected_describe.json
│ ├── test_1/
│ └── test_2/
The resources folder is used by the PESTO test API: its contents are converted to processing requests that are sent to /api/v1/process with the right format. The responses are then compared to the expected responses, acting as unit tests.
Note
In the later parts of this tutorial we will showcase three different ways of generating processing payloads and getting responses / comparing them to expected responses. Each method can be used in a different context, at a different abstraction level.
The first file of interest is expected_describe.json. This file will be compared to the json document returned by the API at http://localhost:4000/api/v1/describe. This description file can be used to parse information about the API (input / output schemas, description, etc.).
You will learn in time how to manually create an expected_describe.json from the pesto/api folder; for now, it is best to copy the description.json file that we saved earlier and use it as expected_describe.json. You can compare this file to the default expected_describe.json and notice how the differences translate to the default processing.
Now, there are several folders named test_*. The purpose of these test folders is that the input payload files go in input and the expected response in output.
Let's take a look at the test folder:
test_1
├── input
│ ├── dict_parameter.json
│ ├── image.png
│ ├── integer_parameter.int
│ ├── number_parameter.float
│ ├── object_parameter.json
│ └── string_parameter.string
└── output
├── areas
│ ├── 0.json
│ └── 1.json
├── dict_output.json
├── geojson.json
├── image_list
│ ├── 0.png
│ └── 1.png
├── image.png
├── integer_output.integer
├── number_output.float
└── string_output.string
You can see that both input and output contain files with extensions corresponding to their types. The filenames are matched with the json payload keys.
Now, we are going to write two tests with those two images as input:
- We know that the input key is image and the output key is category; the model should predict {"category": "Egyptian_cat"}
- We know that the input key is image and the output key is category; the model should predict {"category": "mortar"}
So, your folder structure should now look like:
tests/
├── resources
│ ├── expected_describe.json
│ ├── test_1 (cat)
│ │ ├── input
│ │ │ └── image.png <- copy the cat image here
│ │ └── output
│ │ └── category.string <- this should be Egypcatian_cat
│ └── test_2 (pesto)
│ ├── input
│ │ └── image.jpg <- copy the pesto image here
│ └── output
│ └── category.string <- this should be mortar
Using the pesto test command
The first way of testing your service is to call the pesto test utility the same way you called pesto build.
In order, this command will:
- Run the docker container (the same way we did previously)
- Send requests to api/v1/describe and compare the response with expected_describe.json
- Send process payloads to api/v1/process and compare the responses to the desired outputs
In your project root, run pesto test . and check what happens. The logs should show the different steps being processed.
You can check the responses and differences between dictionaries in the .pesto workspace: /home/$USER/.pesto/tests/pytorch-deployment-tutorial/1.0.0.dev0
There you will find the results / responses of all the requests, including describe and processing requests. This is a useful folder for debugging potential differences.
Should everything go well, the results.json file should look like this:
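That is, a summary where every comparison comes back empty, meaning no difference with the expected outputs (a sketch; the exact layout may differ):

```json
{
  "describe": {},
  "test_1": {},
  "test_2": {}
}
```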
Note
pesto test is not designed to fail when a response differs from the expected output; instead, it simply compares dictionaries and displays / saves the differences as well as the responses, so that the user can inspect what happened and check whether it is correct. pesto test should be used for debugging purposes, not for unit testing. We will see later how to use the PESTO test API with pytest to actually run unit tests.
Bonus: Using Pytest & unit testing
Once you are sure and have debugged properly, you can write or edit unit tests in PESTO_PROJECT_ROOT/tests/ (check the autogenerated file tests/test_service.py) and run them with pytest tests in your project root.
This can be used to ensure non-regression on further edits, or if you want to do test-driven development.
Bonus: Using PESTO Python API to run tests & send requests to model
Should you want to use your services in a non-scalable way or test them further, you can have a look at the {PESTO_PROJECT_ROOT}/scripts/example_api_usage.py file, which exposes the low-level Python API used in pesto test:
- The ServiceManager class is a proxy for the Python Docker API, used to pull / run / attach / stop the containers
- The PayloadGenerator class translates files into the actual json payloads for the REST API
- The EndpointManager manages the various endpoints of the processes, and acts as a front for POST/GET requests
- The ServiceTester validates payloads & responses against their expected values
Note
This API is a simple example of how to use services packaged with PESTO in Python scripts. We encourage you to copy/paste and modify the classes should you feel the need for specific use cases, but neither this API nor pesto test is designed for robustness and scalability.
The target audience of the pesto test capabilities is the data scientist; integration testing & scalability should be handled at production level.
Adding a GPU profile
In order to create an image with GPU support, we can complete the proposed profile gpu.
The file requirements.gpu.json can be updated as follows:
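For instance, by switching to a CUDA-enabled base image (the exact image is your choice; this tag is only an example):

```json
{
  "dockerBaseImage": "pytorch/pytorch:1.5.1-cuda10.1-cudnn7-runtime"
}
```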
You can now build your GPU-enabled microservice:
pesto build {root of your project}/pytorch-deployment-tutorial -p gpu
PESTO Profiles
In order to accommodate different hardware targets, or slight variations of the same process to deploy, PESTO has a built-in capability called profiles.
Basically, PESTO profiles are an ordered list of strings (e.g. gpu stateless) whose .json files in the pesto/api and pesto/build folders sequentially update the base files.
To use them, simply add the list of profiles to your PESTO commands: pesto build {PESTO_PROJECT_ROOT} -p p1 p2 or pesto test {PESTO_PROJECT_ROOT} -p p1 p2.
The profile json files are named {original_file}.{profile}.json.
For example, the description.json for the gpu profile would be description.gpu.json.
Profile json files can be partial, as they only update the base values when the files are present.
Example:
- description.json: {"key":"value", "key2":"value2"}
- description.p1.json: {"key":"value3"}
Then calling pesto build . -p p1 will generate a description.json: {"key":"value3", "key2":"value2"} and take all the other files without additional profiles.
Warning
Due to the sequential nature of dictionary updates, the profiles are order-dependent.
If you have a computer with NVIDIA drivers & a GPU, you can try running pesto build . -p gpu and pesto test . -p gpu --nvidia, which should do the same as above but with GPU support (and actually run the process on the GPU).
Stateful & Stateless services
PESTO supports building stateless services as well as stateful services.
With stateless services, the processing is expected to reply directly to the processing request with the response. These services should have no internal state and should always return the same result when presented with the same payload.
Stateful services can have internal states, and store the processing results to be later queried.
The main difference is that sending a processing request to api/v1/process of a stateful service will not return the result but a jobID. The job state can be queried at GET api/v1/jobs/{jobID}/status, and results can be queried at GET api/v1/jobs/{jobID}/results when the job is done. The response to the latter request is a json matching the output schema, with URIs to the individual contents (each of which should be queried with its own GET request).
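For example, with the service running on localhost:4000 as before (payload.json is a processing payload you built; {jobID} comes from the first response):

```bash
# Submit a job: a stateful service answers with a jobID, not the result.
curl -X POST http://localhost:4000/api/v1/process \
     -H "Content-Type: application/json" \
     -d @payload.json

# Poll the job status until it reports DONE.
curl -X GET http://localhost:4000/api/v1/jobs/{jobID}/status

# Fetch the results: URIs pointing to the individual contents.
curl -X GET http://localhost:4000/api/v1/jobs/{jobID}/results
```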
Try building your previous service with pesto build . -p stateful and starting it as previously:
docker run --rm -p 4000:8080 pytorch-deployment-tutorial:1.0.0.dev0-stateful
Then, run the API usage script (python scripts/example_api_usage.py) after modifying the image name to the stateful one.
This script sends several requests (like pesto test), but the advantage is that it does not kill the service afterwards, so it is possible to look at what happened:
- Try doing a GET request on /api/v1/jobs/; you should see a list of jobs
- Grab a job ID, then do a GET request on /api/v1/jobs/{jobID}/status. It should be "DONE"
- Then do a GET request on /api/v1/jobs/{jobID}/results to get the results
You should get something like
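A response along these lines (the exact URL layout may differ):

```json
{
  "category": "http://localhost:4000/api/v1/jobs/{jobID}/results/category"
}
```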
A GET request on the aforementioned URL should return Egyptian_cat or mortar.
Next Steps
The rest of the documentation should be more accessible now that you have completed this tutorial.
Tip
You should version your PESTO project using git so that it is reproducible.
Feel free to send us feedback and ask any questions on GitHub.