Build and Deploy a Machine Learning Model with Azure ML Service

In this tutorial, we will build and deploy a machine learning model that predicts salary from the Stack Overflow dataset. By the end, you will be able to invoke a RESTful web service to get predictions.

Jan 18th, 2019 3:00am by Janakiram MSV


Feature image by Ricardo Gomez Angel on Unsplash.

This article is part of a series on bringing continuous integration and deployment (CI/CD) practices to machine learning. Check back to The New Stack for future installments. For background and context, we strongly recommend reading the previous article on the rise of ML PaaS, followed by the article on the overview of Azure ML service.

In this tutorial, we will build and deploy a machine learning model to predict salary from the Stack Overflow dataset. By the end, you will be able to invoke a RESTful web service to get the predictions.

Since the objective is to demonstrate the workflow, we will use a simple two-column dataset with years of experience and salary for the experiment. For details on the dataset, refer to my previous article on linear regression.

Prerequisites

Basic knowledge of Python and Scikit-learn
Active Microsoft Azure Subscription
Anaconda or Miniconda

Configuring the Development Environment
Configure a virtual environment with the Azure ML SDK. Run the commands below to install the Python SDK and launch a Jupyter Notebook, then start a new Python 3 kernel from Jupyter.

$ conda create -n aml -y python=3.6

$ conda activate aml

$ conda install nb_conda

$ pip install "azureml-sdk[notebooks]"

$ jupyter notebook

Initializing Azure ML Environment

Let’s start by importing all the required Python modules, which include standard Scikit-learn modules and the Azure ML modules.

import datetime
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

import azureml.core
from azureml.core import Workspace
from azureml.core.model import Model
from azureml.core import Experiment
from azureml.core.webservice import Webservice
from azureml.core.image import ContainerImage
from azureml.core.webservice import AciWebservice
from azureml.core.conda_dependencies import CondaDependencies

We need to create an Azure ML Workspace that acts as the logical boundary for our experiment. A Workspace creates a Storage Account for storing the dataset, a Key Vault for secrets, a Container Registry for maintaining the image repositories, and Application Insights for logging the metrics.

Don’t forget to replace the empty subscription_id value with your own subscription ID.

ws = Workspace.create(name='salary',
                      subscription_id='', 
                      resource_group='mi2',
                      create_resource_group=True,
                      location='southeastasia'
                     )

After a few minutes, we will see the resources created within the Workspace.

We can now create an Experiment to start logging the metrics. Since we don’t have many parameters to log, we are capturing the start time of the training process.

exp = Experiment(workspace=ws, name='salexp')
run = exp.start_logging()                   
run.log("Experiment start time", str(datetime.datetime.now()))

Training and Testing the Scikit-learn ML Model

We will now proceed to train and test the model through Scikit-learn.

sal = pd.read_csv('data/sal.csv',header=0, index_col=None)
X = sal[['x']]
y = sal['y']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=10)

lm = LinearRegression()
lm.fit(X_train,y_train)
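
The held-out split created above can also be used to check how well the fitted line generalizes. The following is a minimal sketch of that step, assuming synthetic data in place of sal.csv (the generated years and salaries, the noise level, and the variable names `r2` and `mae` are illustrative assumptions, not values from the original dataset).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for sal.csv: salary grows roughly linearly with experience.
rng = np.random.RandomState(10)
years = rng.uniform(0, 20, size=200).reshape(-1, 1)
salary = 30000 + 5000 * years.ravel() + rng.normal(0, 2000, size=200)

# Same split parameters as in the tutorial: 25% of the rows are held out.
X_train, X_test, y_train, y_test = train_test_split(
    years, salary, test_size=0.25, random_state=10)

lm = LinearRegression()
lm.fit(X_train, y_train)

# Score the model only on rows it has not seen during training.
y_pred = lm.predict(X_test)
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
print("R^2 on test split:", r2)
print("Mean absolute error on test split:", mae)
```

With the real dataset, the same two metrics could be logged to the experiment alongside the other values.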

The trained model will be serialized as a pickle file in the outputs directory. Azure ML automatically copies the content of the outputs directory to the cloud.

import os
os.makedirs('outputs', exist_ok=True)  # joblib.dump does not create missing directories
filename = 'outputs/sal_model.pkl'
joblib.dump(lm, filename)

Let’s complete the experiment by logging the slope, intercept, and the end time of the training job.

run.log('Intercept :', lm.intercept_)
run.log('Slope :', lm.coef_[0])

run.log("Experiment end time", str(datetime.datetime.now()))
run.complete()
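
For a single feature, the slope and intercept logged above are the two parameters of the least-squares line. As a rough illustration of where those numbers come from, here is a dependency-free sketch of the closed-form fit. The function name `fit_line` and the five data points are made up for this example and are not taken from sal.csv.

```python
# Closed-form ordinary least squares for one feature: y = slope * x + intercept.
# The points below are invented for illustration only.
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [40000.0, 45000.0, 50000.0, 55000.0, 60000.0]


def fit_line(xs, ys):
    """Return (slope, intercept) that minimize the squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, salary)
print(slope, intercept)
```

For these invented points the fit gives a slope of 5000 and an intercept of 35000, which play the same roles as lm.coef_[0] and lm.intercept_ for the fitted model.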

We can track the metrics and the execution time from the Azure Dashboard.

Registering and Serving the Trained Model

Each time we freeze the model, it can be registered with Azure ML with a unique version. This gives us the ability to easily switch between different models when serving.

Let’s register the salary model from the above training job by pointing the SDK to the location of the PKL file. We are also attaching metadata to the model in the form of tags.

model = Model.register(model_path = "outputs/sal_model.pkl",
                       model_name = "sal_model",
                       tags = {"key": "1"},
                       description = "Salary Prediction",
                       workspace = ws)

Check the Models section of the Workspace to ensure that our model is registered.
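
To make the idea of versioned registration concrete, here is a toy in-memory sketch of how registering the same model name repeatedly yields increasing version numbers, and how a specific version can be picked when serving. The class `ToyModelRegistry` and its methods are hypothetical and exist only for this illustration; they are not part of the Azure ML SDK and do not reflect how the service is implemented.

```python
class ToyModelRegistry:
    """Hypothetical in-memory registry; not part of the Azure ML SDK."""

    def __init__(self):
        # Maps a model name to the list of its registered versions, oldest first.
        self._versions = {}

    def register(self, name, path, tags=None):
        """Store a new entry and assign it the next version number for this name."""
        entries = self._versions.setdefault(name, [])
        entry = {
            "name": name,
            "version": len(entries) + 1,
            "path": path,
            "tags": dict(tags or {}),
        }
        entries.append(entry)
        return entry

    def get(self, name, version=None):
        """Return the requested version, or the latest one when version is None."""
        entries = self._versions.get(name, [])
        if not entries:
            raise KeyError("no model registered under %r" % name)
        if version is None:
            return entries[-1]
        if not 1 <= version <= len(entries):
            raise KeyError("model %r has no version %r" % (name, version))
        return entries[version - 1]


registry = ToyModelRegistry()
first = registry.register("sal_model", "outputs/sal_model.pkl", tags={"key": "1"})
second = registry.register("sal_model", "outputs/sal_model.pkl", tags={"key": "2"})

latest = registry.get("sal_model")             # newest registration
pinned = registry.get("sal_model", version=1)  # roll back to an earlier one
print(latest["version"], pinned["version"])
```

Pinning a version this way is what allows a deployment to switch between models without retraining.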

It’s time for us to package and deploy the model as a container image that will be exposed as a web service.

To create the container image, we need to tell Azure ML about the environment needed by the model. We will then pass a Python script that includes code to predict the values based on an inbound data point.

The Azure ML SDK provides handy methods for both. Let’s first create the environment file, salenv.yml, which tells the runtime to include Scikit-learn in the container image.

salenv = CondaDependencies()
salenv.add_conda_package("scikit-learn")

with open("salenv.yml","w") as f:
    f.write(salenv.serialize_to_string())
with open("salenv.yml","r") as f:
    print(f.read())

The snippet below, when executed from the Jupyter Notebook, creates a file called score.py that contains the inference logic for the model.

%%writefile score.py
import json
import numpy as np
import os
import pickle
import joblib
from sklearn.linear_model import LinearRegression

from azureml.core.model import Model

def init():
    global model
    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('sal_model')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_hat = model.predict(data)
    return json.dumps(y_hat.tolist())
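
Before building the image, it can help to exercise the init() and run() contract locally. The sketch below mirrors the shape of score.py without any Azure or Scikit-learn dependency: `StubModel` is a hypothetical stand-in for the fitted model, and its slope and intercept are invented values, so the predictions are illustrative only.

```python
import json


class StubModel:
    """Hypothetical stand-in for the fitted LinearRegression model."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, rows):
        # Each row holds a single feature: years of experience.
        return [self.slope * row[0] + self.intercept for row in rows]


def init():
    # In score.py this step loads the registered model from disk instead.
    global model
    model = StubModel(slope=9000.0, intercept=26000.0)


def run(raw_data):
    # The request body is a JSON document with a "data" key holding the rows.
    data = json.loads(raw_data)["data"]
    return json.dumps(model.predict(data))


init()
response = run(json.dumps({"data": [[4.5], [10]]}))
print(response)
```

The request body is a JSON object whose "data" field is a list of rows, one row per prediction, which is the same shape the deployed service expects.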

Now, let’s connect the dots by passing the inference file and environment configuration to the image.

%%time
image_config = ContainerImage.image_configuration(execution_script="score.py", 
                                                  runtime="python", 
                                                  conda_file="salenv.yml")

This eventually results in the creation of a container image which shows up in the Images section of the Workspace.

We are all set to create the deployment configuration that defines the target environment and launch the model as a web service hosted in Azure Container Instances as a single-VM container. We may also choose AKS or an IoT Edge environment as the deployment target.

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "Salary",  "method" : "sklearn"}, 
                                               description='Predict Stackoverflow Salary')

service = Webservice.deploy_from_model(workspace=ws,
                                       name='salary-svc',
                                       deployment_config=aciconfig,
                                       models=[model],
                                       image_config=image_config)

service.wait_for_deployment(show_output=True)

The Azure Resource Group now has an Azure Container Instance running the inference for the model.

We can get the URL of the inference service from its scoring_uri property:

print(service.scoring_uri)

Let’s go ahead and invoke the web service through cURL. We can do this from the same Jupyter Notebook.
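
The same call can also be made from Python. The following is a minimal sketch using only the standard library, assuming the service accepts a JSON body of the form {"data": [[years], ...]} as parsed by score.py. The helper names `build_scoring_request` and `score` and the placeholder URI are assumptions for this example; in the notebook the URI would come from service.scoring_uri.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, years_of_experience):
    """Build (but do not send) a POST request carrying one row per value."""
    payload = {"data": [[years] for years in years_of_experience]}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def score(request, timeout=30):
    """Send the request and return the raw response text. Needs a live service."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Placeholder URI for illustration; replace it with service.scoring_uri.
request = build_scoring_request("http://localhost/score", [4.5, 10])
print(request.get_method(), request.data.decode("utf-8"))

# With a deployed service, the predictions could then be fetched with:
# print(score(request))
```

Building the request separately from sending it makes it easy to inspect the payload before any network call is made.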

You can access the dataset and Jupyter Notebook from the GitHub repo.

What sets this approach apart is that we can perform all the tasks from a Python kernel running inside the Jupyter Notebook. Developers can do everything it takes to train and deploy ML models from code. This is the real value of using an ML PaaS like Azure ML Service.

Janakiram MSV (Jani) is a practicing architect, research analyst, and advisor to Silicon Valley startups. He focuses on the convergence of modern infrastructure powered by cloud-native technology and machine intelligence driven by generative AI. Before becoming an entrepreneur, he spent...