Perhaps the following conversation sounds familiar, or, unfortunately, is something you have experienced before or are struggling with right now:
Frank (the ML person): Hey Jane, how are things going with getting
my ML project launched into production?
Jane (the DevOps person): (sighs) It's going. We have a lot on our plate
right now so it's going to take more time.
Frank: (trying not to get upset) But that's what
you said like a month or two ago...
Jane: (coolly) Well, there is a lot that needs
to happen before your project can be
production-ready. We have to set up
the build pipeline to package up your
python code, write tests, add continuous
integration, sort out logging and monitoring.
Oh, and there's error handling,
security audits, auto-scaling, load balancing.
After that, we go through integration and user
acceptance testing before we can launch.
Frank: (frustration creeping in) That sounds
like a lot of hoops to jump through and
I have no idea if all that is necessary to
get it launched. Can it be made simpler?
I was talking to an ML friend of mine and he
mentioned they launched his project using
<name some shiny object here>. Can't we _just_
use that and get mine launched also?
Jane: (frowns and replies tensely) Let's take this
offline. My team just needs more time to work
through the steps to get the project
Whether you have been on one side of the conversation or the other, interactions like this are unpleasant, and the worst part is that the ML project hasn’t gotten any closer to being deployed. The DevOps side has good reasons, while the ML side is frustrated at being stonewalled from delivering their work, and it feels like there’s little they can do.
Can such a situation be avoided or at least managed better? Is there anything the ML side can do to help the process along?
The reality is that the closer the two sides can come to their handoff point, the smoother and faster the process will be. We will get into more detail on where these rendezvous points may be, but conceptually they are the long list of steps that Jane said are needed to get an ML model (or any new project, for that matter) deployed. The more items the ML side can take care of (up to a certain point), the less work there is for the devops folks, which shortens the process.
This doesn’t mean the ML side needs to become devops in order to get projects deployed. It’s more about structuring your projects like other software packages out there, so that they “blend in” with the other software being deployed. This greatly reduces the overhead of making a new package conform, and lowers the anxiety of onboarding an unknown quantity.
The good news is that the work involved isn’t that much once you become familiar with the steps. In this and follow-on posts, we will walk you through examples of packaging ML models so that they sail through your deployment workflows just like any other package.
Let’s get started.
Note: The resulting code for this example ML package can be found here.
Step 0: The Plan
Essentially, we want to make Frank’s ML model as easy to use for his clients (Jane in this example) as this pseudocode:
create an instance of frank's-ml-model
load an image (or whatever the input might be)
get predictions of the image using frank's-ml-model
do something with the predictions
As a matter of fact, by the end of this tutorial we will have turned the above pseudocode into Python code that you can add to the README page of the ML model, so that your clients can try out your model by simply copying and pasting the code.
The larger goal is to make Frank’s ML model behave no differently from other Python packages, so that the devops team can deploy it like any other package they have handled.
Step 1: Gather the Ingredients
Let’s assume that you have reached the following point in researching and refining an ML model for a particular use case, such that you have:
- An ML model trained to an accuracy that meets your requirements
- An input-output format that supports the typical use case of your pipeline
- Any supporting code to pre- and post-process the input and output data
To give an example, an image classification model would consist of:
- A ResNet-50 model you trained/fine-tuned to classify 23 types of fruits
- Inputs that are RGB image matrices (vs. URLs or files), and outputs that are the top k classes and their prediction probabilities (vs. an array of 23 floats)
- Some Python functions to resize, crop, and normalize the input images
A different scenario would be predicting the home price (one integer) from the metadata of a home listing, formatted as a CSV row (or JSON, XML, a pandas DataFrame, etc.). The specifics don’t matter; the point is to decide what inputs, in what format, your ML model needs in order to function correctly, and how the output will be returned to the requesters so they can use the results.
The nice thing about ML is that this input-output pair is usually well understood, since it is already expressed in the training data. You just need to choose input and output formats that work for the consumers of your model. With that, you have defined an interface to the outside world, and everything else is internal to your domain. This allows you to update or even swap out your ML model later on without the outside world needing to change anything, or even noticing!
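To make the idea of an output format concrete, here is a minimal sketch of the fruit example: it turns a raw probability vector into the top k (class, probability) pairs. The function name `top_k_predictions` and the three class names are made up for illustration and are not part of Frank's actual model.

```python
from typing import List, Sequence, Tuple


def top_k_predictions(
    probabilities: Sequence[float],
    class_names: Sequence[str],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Turn a raw probability vector into the k most likely (class, probability) pairs."""
    if len(probabilities) != len(class_names):
        raise ValueError("expected exactly one probability per class name")
    # Pair each class with its probability and sort from most to least likely.
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical stand-ins for the 23 fruit classes.
fruits = ["apple", "banana", "cherry"]
best_two = top_k_predictions([0.1, 0.7, 0.2], fruits, k=2)
```

Clients only ever see the returned pairs, so the model behind the function can change without them having to adapt.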
Step 2: Pick a Box for the Package
This used to be the step that was a pain in the butt, but nowadays there’s a
de facto way to package Python software. If you have ever used pip to
install Python packages, including NumPy, TensorFlow, and PyTorch, you
know how easy they are to install. That’s what we will be creating for
Frank’s ML model.
Creating Python packages isn’t super complicated, but it does involve several manual steps. Fortunately, people have created templates that generate the boilerplate code, so we need not become packaging experts. For this tutorial, we will use a simple Cookiecutter-based template that I modified for PyTorch models, named cookiecutter-pytorch-basic, although it can be used for non-PyTorch projects without much adaptation. Cookiecutter asks for some essential details about the package, clones the template, and generates an empty project structure customized for packaging our model, a.k.a. the box. The steps are pretty simple:
# install the cookiecutter tool
pip install -U cookiecutter
# generate the Python package project
You’ll be asked for some details about the package, most importantly its
name. This should be a unique name, since in the future people will install it with
pip install <name_of_your_ml_package>. Here’s an example:
project_name [Name of your Pytorch package]: franks-ml-model
author_full_name : Frank Torch
author_email : firstname.lastname@example.org
github_username : frank_torch
project_short_description : Model for classifying fruits
1 - MIT license
2 - BSD license
3 - ISC license
4 - Apache Software License 2.0
5 - GNU General Public License v3
6 - Not open source
Choose from 1, 2, 3, 4, 5, 6 : 6
Cookiecutter will create the following directory structure:
├── franks_ml_model <-- this is the empty box
│ └── __init__.py
This is our “box”, ready to be populated with our ML model. We will customize it further later on, but by and large these files are preconfigured and ready to go, so Cookiecutter has saved us the time and energy of setting up the packaging project.
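Note that the project was named franks-ml-model at the prompt, while the generated directory is franks_ml_model. Hyphens are fine in a distribution name but not in a Python import name. The sketch below shows one plausible way the two names relate; the helper names are hypothetical, and the exact rule the template applies is an assumption. The first helper follows the PEP 503 normalization used by package indexes.

```python
import re


def distribution_name(project_name: str) -> str:
    """Normalize a project name the way PEP 503 does for package indexes."""
    return re.sub(r"[-_.]+", "-", project_name).lower()


def import_name(project_name: str) -> str:
    """Derive an importable module name: separators become underscores."""
    return re.sub(r"[-_.\s]+", "_", project_name.strip()).lower()


# The name typed at the prompt versus the name used in `import` statements.
dist = distribution_name("franks-ml-model")
module = import_name("franks-ml-model")
```

Picking a project name that survives both transformations unchanged (apart from the separator) avoids surprises later.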
Step 3: Place the Content into the Box
Now let’s copy or move your ML model definition into this project and place
it under franks_ml_model/franks_ml_model/. For this tutorial we will
use an example PyTorch model from the excellent EfficientNet-PyTorch project by Luke Melas-Kyriazi. It’s self-contained and has pretrained models, which makes it
an ideal project for this tutorial. It will stand in for the code of Frank’s
hypothetical ML model prior to being packaged.
git clone https://github.com/lukemelas/EfficientNet-PyTorch
cp EfficientNet-PyTorch/efficientnet_pytorch/*.py franks_ml_model/franks_ml_model/
With that, we technically have finished packaging our ML model. Quite easy, isn’t it? At this point you can run the build command, upload the package, and it’d be ready (kinda) for others like Jane to use.
However, this package would not meet the interface requirement for our input-output pair. Thus, let’s do a bit more work to make it as dead simple as possible.
Step 4: Add Some Bubble Wrap to Encapsulate the Content
What we have so far is a package that lets your clients do the following (this is EfficientNet’s original interface):
from efficientnet_pytorch import EfficientNet
model = EfficientNet.from_pretrained('efficientnet-b0')
outputs = model(img)
While that seems simple enough, the model requires the image to be a tensor
that’s been properly transformed first. However, we don’t necessarily want
every client to have to know what this transform entails, since this is
an internal detail we’d like to hide from them. Same thing goes for
outputs of the model. It’s a tensor of n floats, which we can convert
into a friendlier format. This is what “wrapping” refers to, to create
a simpler interface and hide internal details.
The process is relatively simple: we could either modify the original EfficientNet implementation and add convenience methods, or create a new (small) class for the wrapper. Without going into the details, it is usually cleaner to do the latter.
At a high level, we will create a new “infer” class (or “predict” if you
prefer) that gets initialized with the name of the architecture (e.g.,
efficientnet-b0). It would expose only a simple method the client can call to
get class predictions for an image, conforming to the interface we decided
earlier. Everything else is considered implementation details within the model.
The simplified code for this new class that we’d put into
franks_ml_model/infer.py would be along the lines of:
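# imports of EfficientNet and torchvision's transforms omitted for brevity
class EfficientNetInfer:
    def __init__(self, architecture_name='efficientnet-b0'):
        # load the pretrained weights for the requested architecture
        self.effnet_model = EfficientNet.from_pretrained(architecture_name)
        self.transforms = transforms.Compose(...)

    def infer_image(self, fn_image, topk=5):
        # load the image file, apply the transforms, then run the model
        batch_image_tensor = self.load_and_transform_image(fn_image)
        return self.infer_batch_image_tensor(batch_image_tensor, topk=topk)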
Omitting the internal details of the
EfficientNetInfer class, the point is that to the
outside world, they just need to figure out 1) how to instantiate the model,
2) get predictions, and 3) interpret the results.
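As an aside, the wrapping pattern itself does not depend on PyTorch. The following is a minimal, dependency-free sketch of the same idea, not the actual EfficientNetInfer implementation: the class name `InferWrapper`, its constructor arguments, and the stand-in model are all hypothetical. It hides preprocessing and postprocessing behind a single method and returns rows of [class index, label, probability].

```python
import math
from typing import Callable, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class InferWrapper:
    """Hypothetical wrapper that hides pre- and post-processing from clients."""

    def __init__(self, model: Callable, preprocess: Callable, class_names: Sequence[str]):
        self.model = model
        self.preprocess = preprocess
        self.class_names = class_names

    def infer(self, raw_input, topk: int = 5) -> List[list]:
        # Internal detail 1: transform the raw input into what the model expects.
        logits = self.model(self.preprocess(raw_input))
        # Internal detail 2: turn raw scores into ranked, labeled probabilities.
        probs = softmax(logits)
        ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:topk]
        return [[i, self.class_names[i], probs[i]] for i in ranked]


# A stand-in "model" that always returns the same scores, for illustration only.
wrapper = InferWrapper(
    model=lambda x: [1.0, 3.0, 2.0],
    preprocess=lambda x: x,
    class_names=["apple", "banana", "cherry"],
)
rows = wrapper.infer("any input", topk=2)
```

Because clients only call `infer`, the model and the transforms can be replaced without changing any client code.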
To make this super easy, we can add a demo command-line tool to infer.py
so people can try out the model without even copying and pasting code, by running
python franks_ml_model/infer.py <an_image>.jpg:
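if __name__ == '__main__':
    # sys.argv[0] is the script name; the image path is the first argument
    # (assumes `import sys` at the top of infer.py)
    fn_image = sys.argv[1]
    model = EfficientNetInfer()
    top_predictions = model.infer_image(fn_image)
    for row in top_predictions:
        print(row)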
Can’t get much simpler than this for the clients! The desired interface for our ML model has now been implemented by the wrapper class EfficientNetInfer, so we can move on to packaging it.
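If the demo tool ever needs more than a single argument, the standard library's argparse is a common alternative to reading sys.argv directly. The sketch below is an assumption about how that could look, not part of the original demo; the `--topk` flag and the `build_parser` helper are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser for a hypothetical demo tool."""
    parser = argparse.ArgumentParser(
        description="Classify an image with the packaged model."
    )
    parser.add_argument("image", help="path to the image file")
    parser.add_argument(
        "--topk", type=int, default=5, help="number of predictions to print"
    )
    return parser


# Parse an explicit argument list here instead of the real command line.
args = build_parser().parse_args(["fruit.jpg", "--topk", "3"])
```

In the demo, the parsed values would then be passed along as `model.infer_image(args.image, topk=args.topk)`, and users get a `--help` message for free.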
Step 5: Seal and Add Shipping Labels
The last step is to build the package so others can install it via
pip. Fortunately, all of the hard work was taken care of by Cookiecutter, so
we just need to build the package via:
python setup.py bdist_wheel
and you’ll see output similar to the following:
copying franks_ml_model/__init__.py -> build/lib/franks_ml_model
and the end result would be a .whl file under the dist directory, such as:
ls -l dist/
-rw-r--r-- 1 gerald staff 1692 Feb 23 09:13 franks_ml_model-0.0.1-py2.py3-none-any.whl
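The wheel's filename is itself a kind of shipping label: it encodes the distribution name, the version, and the Python, ABI, and platform tags, separated by hyphens. The helper below is a small illustrative sketch (the function name is made up, and the optional build tag that the wheel format allows is deliberately not handled).

```python
from typing import Dict


def parse_wheel_filename(filename: str) -> Dict[str, str]:
    """Split a wheel filename into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # Wheels without a build tag have exactly five hyphen-separated fields.
    if len(parts) != 5:
        raise ValueError("expected name-version-python-abi-platform")
    keys = ["name", "version", "python", "abi", "platform"]
    return dict(zip(keys, parts))


info = parse_wheel_filename("franks_ml_model-0.0.1-py2.py3-none-any.whl")
```

Here the tags py2.py3, none, and any indicate a pure-Python wheel that is not tied to a specific interpreter ABI or operating system.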
An optional but nice touch is to add sample code to the README file of this project so your clients can test out the package without digging through code. This effectively converts the pseudocode we outlined in Step 0 into functional Python code for this model:
from franks_ml_model import EfficientNetInfer
model = EfficientNetInfer() # defaults to 'efficientnet-b0'
top_predictions = model.infer_image( fn_image ) # defaults to topk=5
Step 6: Test out the Package Locally
To be thorough, an optional step is to simulate your clients installing and trying out your packaged ML model. This will not be foolproof since everyone’s computer can be slightly (or very) different, but it had better work on yours!
To test this out, let’s create a clean virtualenv, install our package and make sure our sample code works:
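cd .. # or somewhere else outside the project source
virtualenv -p python3 ~/.virtualenvs/test_franks_model
# activate the new environment so that pip installs into it
source ~/.virtualenvs/test_franks_model/bin/activate
pip install franks_ml_model/dist/franks_ml_model-0.0.1-py2.py3-none-any.whl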
If you run pip freeze, you should see that our package was successfully installed:
# pip freeze
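The same check can be done from Python, which is handy in an automated smoke test. This is a sketch under the assumption of Python 3.8 or newer, where importlib.metadata is part of the standard library; the helper name `installed_version` is made up.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# In the test virtualenv this should report the version of the installed wheel.
found = installed_version("franks_ml_model")
```

A value of None means the package is not installed in the environment that ran the check, which often points to a virtualenv that was not activated.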
Let’s try out the sample code we added to our README page:
>>> from franks_ml_model import EfficientNetInfer
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/gerald/.virtualenvs/test_franks_model/lib/python3.7/site-packages/franks_ml_model/__init__.py", line 3, in <module>
from .infer import EfficientNetInfer
File "/Users/gerald/.virtualenvs/test_franks_model/lib/python3.7/site-packages/franks_ml_model/infer.py", line 3, in <module>
from PIL import Image
ModuleNotFoundError: No module named 'PIL'
Oops, looks like we forgot some of our dependencies. Good thing we tested
the package before shipping it! To fix this, we just need to update the package
requirements so that they look like the following:
requirements = [
Then rebuild the wheel package, pip install it (make sure you add the --upgrade flag), and try out the Python code.
python setup.py bdist_wheel
pip install --upgrade franks_ml_model/dist/franks_ml_model-0.0.1-py2.py3-none-any.whl
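A missing dependency like the PIL error above can also be caught with a quick check in the clean environment before handing the package to anyone. The sketch below uses the standard library's importlib.util.find_spec; the helper name and the list of modules are illustrative assumptions, so substitute whatever your package actually imports.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in module_names if find_spec(name) is None]


# Hypothetical list of top-level modules the package imports.
not_found = missing_modules(["PIL", "torch", "torchvision"])
```

An empty list means every listed module is importable; anything else names a dependency that still needs to be added to the requirements.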
Let’s try again:
>>> from franks_ml_model import EfficientNetInfer
>>> model = EfficientNetInfer()
>>> top_predictions = model.infer_image( fn_image )
[[207, 'golden retriever', 0.5610832571983337], [213, 'Irish setter, red setter', 0.22866328060626984],...
Works! We have achieved what we set out to do in our plan, which is to package Frank’s ML model so it can be pip-installed and invoked with 3 lines of Python code. The improved packaging and interface help Frank’s model “blend in” with other software packages, which makes deployment that much simpler.
As a quick note, we will not worry about the output format in this post, but as a rule of thumb, it’s generally not a good idea to pass arrays back to the clients. For now, let’s assume that’s fine with them.
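For readers curious about what a friendlier alternative to bare arrays might look like, one option is to return small named records. This is only a sketch of that idea, not the format this package uses; the `Prediction` class and its field names are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Prediction:
    """One named result row instead of a positional [id, label, probability] list."""
    class_id: int
    label: str
    probability: float


def to_predictions(rows: Sequence[Sequence]) -> List[Prediction]:
    """Convert positional rows into records with named fields."""
    return [Prediction(int(r[0]), str(r[1]), float(r[2])) for r in rows]


preds = to_predictions([[207, "golden retriever", 0.56]])
as_dict = asdict(preds[0])  # plain dict, convenient for JSON serialization
```

Named fields let clients write `pred.label` rather than remembering that the label sits at index 1, and new fields can be added later without breaking existing code.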
Step 7: Ship it!
This step will be unique to your situation. The most common route is to upload the package to PyPI, the public repository for Python packages. However, unless your model is open source like this example EfficientNet model, you will most likely not want to make your package publicly available. You have two options:
- Set up a private PyPI server for your organization (if one is not already there)
- As a fallback, distribute the .whl file directly
Obviously, if your organization already has a private PyPI server, use that and follow the procedure for uploading your package there. If not, and you want to help your devops team set one up, Nexus is a good option. This will be well worth the work, especially if you plan on adding many more ML model packages, as well as releasing new versions of existing ones.
Short of that, you might be able to get away with handing out the .whl file directly, so your clients have something to work with. It’s a stopgap, since distributing the file again every time you build a new version of your package can become a huge headache. With a PyPI server set up, it’d be as simple as a single twine command, which is beyond the scope of this post.
There you have it: we have taken an ML model, settled on an interface, and packaged the model so that its clients can treat it much like any other Python package they are used to working with and deploying. As you saw, the work involved isn’t monumental, especially once you know how to save time using tools like Cookiecutter.
Frank (the ML person): Hey Jane, I know you're quite busy, so
I did some work and simplified my ML model's
interface, and packaged it into a Python
wheel. Now it can be pip-installed
and only takes 3 lines of code to get
Jane (the DevOps person): That sounds great. Send me a link to the
repo and I'll take a look. If it looks easy
enough we may be able to add the package
to our next Docker build, scheduled in
the next sprint or two.
Frank: That'd be great! Looking forward to it
and let me know if I can help in any way.
This is an idealized scenario. More likely than not, there would be additional steps the devops team would need to get your model fully deployable, which we will explore further in future posts. Nevertheless, packaging your ML model is a prerequisite step anyway, not to mention the fun part of naming your own package!