Training and deploying an Iris flower classification model using TrueFoundry

Truefoundry

11 Jul 2022 • 5 min read

If you are training machine learning models to solve a problem, TrueFoundry helps you track your experiments and makes it easy to deploy models with best practices, so that they are available for public use in a matter of minutes.

In this example, we train a model that classifies a flower of the iris genus into one of three species based on size measurements of its petals and sepals.

You can also follow this example on a Google Colaboratory notebook.

The Iris dataset contains three different species:

Iris Setosa
Iris Versicolor
Iris Virginica

We need to build a classifier that can identify the species of the flower given the following parameters:

Sepal Length
Sepal Width
Petal Length
Petal Width

About TrueFoundry

TrueFoundry provides two libraries for simplifying your ML workflows:

MLFoundry

The mlfoundry library is used to track ML training experiments.

Why do you need experiment tracking? When solving a problem, you will likely train many models with different frameworks, hyper-parameters and datasets. Tracking your experiments with a library like mlfoundry helps you keep them organised.

You can use MLFoundry to log hyper-parameters, metrics, datasets and models. You can then compare different experiments on the TrueFoundry dashboard and choose a model to deploy in production or decide to re-train the model.

MLFoundry offers five APIs that are relevant to this example (log_dataset is listed for completeness; the walkthrough below does not call it):

log_params - logs the hyper-parameters of the current experiment
log_dataset - logs the entire dataset
log_metrics - logs metrics such as accuracy and F1 scores
set_tags - adds tags to your experiment for easy filtering later on
log_model - saves a model, including the trained weights

ServiceFoundry

Using the servicefoundry library, you can package, containerise and deploy a model into a Kubernetes cluster easily.

Let's train a model and log it using MLFoundry

Open an IPython notebook. You can use either Jupyter running locally on your machine or a Google Colab notebook running in the cloud.

Install required libraries.

!pip install mlfoundry
!pip install pandas
!pip install scikit-learn

Log in to TrueFoundry. Create and copy an API key from the settings page. Use this API key to initialise the MLFoundry client and create a run. A run is an entity that represents a single experiment.

import mlfoundry as mlf

client = mlf.get_client(api_key='<TFY_API_KEY>')
run = client.create_run('iris-classifier')

Fetch the Iris dataset using the sklearn.datasets module. We then divide it into test and train datasets.

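import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

data = datasets.load_iris()

# Features as a DataFrame with named columns; targets as integer class labels
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# Hold out 20% for testing, keeping the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, stratify=y, random_state=42)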
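
To see what stratify=y does, the following sketch repeats the split on the raw arrays and counts the samples of each class in both parts. It is only a sanity check, assuming NumPy and scikit-learn are installed; it is not required for the rest of the tutorial.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

data = datasets.load_iris()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, stratify=y, random_state=42
)

# Count how many samples of each class (0, 1, 2) landed in each split
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)

print("train:", train_counts)
print("test: ", test_counts)
```

Because the dataset has 50 flowers per species, a stratified 80/20 split puts 40 of each species in the training set and 10 of each in the test set.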

Let's take a look at the target names. We will use them to map the integer output of the model to the actual names of the species.

print(data.target_names)

# Output
# ['setosa' 'versicolor' 'virginica']
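
As a small illustration of that mapping, the sketch below converts a few integer labels into species names. The predicted_labels list is made up for the example and stands in for real model output.

```python
from sklearn import datasets

data = datasets.load_iris()

# Hypothetical model output: one integer class label per flower
predicted_labels = [0, 2, 1]

# target_names is a NumPy array, so a list of integers indexes it directly
predicted_species = data.target_names[predicted_labels]

print(list(predicted_species))
```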

Initialise a model. Then use MLFoundry to log the parameters of the model and create some tags for this current experiment run.

from sklearn.svm import SVC

clf = SVC(gamma='scale', kernel='rbf', probability=True, C=1.2)

run.set_tags({
    'framework': 'sklearn',
    'task': 'classification'
})

run.log_params(clf.get_params())
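
If you are curious what is being logged, the sketch below inspects the dictionary returned by scikit-learn's get_params() for the same classifier. It runs without MLFoundry and only prints a few of the entries.

```python
from sklearn.svm import SVC

clf = SVC(gamma='scale', kernel='rbf', probability=True, C=1.2)

# get_params() returns a plain dict of constructor arguments,
# including defaults that were not set explicitly (for example 'degree')
params = clf.get_params()

for name in ('C', 'gamma', 'kernel', 'probability'):
    print(name, '=', params[name])
```

Since the dictionary also contains the default values, every run records its full configuration, not just the arguments you changed.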

Next, we train the model on our train dataset. Once the training is complete, we compute the various metrics and log them to MLFoundry using log_metrics.

from sklearn.metrics import accuracy_score, f1_score

clf.fit(X_train, y_train)

y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)

metrics = {
    'train/accuracy_score': accuracy_score(y_train, y_pred_train),
    'train/f1_weighted': f1_score(y_train, y_pred_train, average='weighted'),
    'train/f1_micro': f1_score(y_train, y_pred_train, average='micro'),
    'train/f1_macro': f1_score(y_train, y_pred_train, average='macro'),
    'test/accuracy_score': accuracy_score(y_test, y_pred_test),
    'test/f1_weighted': f1_score(y_test, y_pred_test, average='weighted'),
    'test/f1_micro': f1_score(y_test, y_pred_test, average='micro'),
    'test/f1_macro': f1_score(y_test, y_pred_test, average='macro'),
}

run.log_metrics(metrics)
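
The three F1 variants differ in how they combine the per-class scores: micro pools all predictions (for single-label problems like this one it equals accuracy), macro is the plain average of the per-class F1 scores, and weighted averages them by the number of true samples per class. The sketch below shows the difference on a small, hand-made set of labels that is not taken from the Iris data.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hand-made labels for illustration only: class 0 is common, class 2 is rare
y_true = [0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2]

accuracy = accuracy_score(y_true, y_pred)
f1_micro = f1_score(y_true, y_pred, average='micro')
f1_macro = f1_score(y_true, y_pred, average='macro')
f1_weighted = f1_score(y_true, y_pred, average='weighted')

print(f"accuracy    = {accuracy:.4f}")
print(f"f1 micro    = {f1_micro:.4f}")
print(f"f1 macro    = {f1_macro:.4f}")
print(f"f1 weighted = {f1_weighted:.4f}")
```

On a balanced dataset such as Iris the three values tend to be close to each other; they diverge more when some classes are much rarer than others.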

If we are happy with the accuracy scores and other metrics, we can choose to deploy the current model. For that, we need to save the model and copy the current run id.

run.log_model(clf, framework=mlf.ModelFramework.SKLEARN)
print(run.run_id)
run.end()

You can see all your runs and compare metrics through the TrueFoundry experiment tracking dashboard.

Deploying our model as an API service

To deploy the model using ServiceFoundry, we need to create a Python file containing the function that we want to expose as an endpoint.

Inside that Python file, we use mlfoundry to fetch the model we just trained and saved, identified by its run id. Note that the API key required by mlfoundry will be available as the environment variable TFY_API_KEY.
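
When testing such a file outside the deployed service, the variable may not be set. The following sketch shows one way to fail early with a readable message; require_env is a hypothetical helper written for this illustration and is not part of mlfoundry.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


try:
    api_key = require_env("TFY_API_KEY")
except RuntimeError as err:
    # Locally the key is usually missing; report it instead of crashing later
    api_key = None
    print(err)
```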

In your IPython notebook, create a block with the following contents and run it to create a Python file named predict.py. We use the Jupyter magic command %%writefile to create the file in the notebook environment.

%%writefile predict.py
import os
import json
import pandas as pd
import mlfoundry as mlf

client = mlf.get_client(api_key=os.environ.get('TFY_API_KEY'))
run = client.get_run('79e71482643f46dfa5bfef256dba5dc5') # replace with your run id
model = run.get_model()

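def species(features):
    # features arrives as a JSON string mapping feature names to values
    features = json.loads(features)
    # Build a single-row DataFrame so the columns match the training data
    df = pd.DataFrame([features])
    prediction = model.predict(df)[0]
    # Map the integer class label to the species name
    return ['setosa', 'versicolor', 'virginica'][prediction]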

Inside the species function, we load the features into a pandas DataFrame and make the prediction using the model. We translate from the integer class to species names using the target_names we printed during training.
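
Before deploying, it can help to try the same logic locally. The sketch below is a stand-alone approximation: it trains a classifier in place instead of calling run.get_model(), and it assumes the request body is a JSON object keyed by the training column names. The names local_model and species_local are made up for this illustration.

```python
import json

import pandas as pd
from sklearn import datasets
from sklearn.svm import SVC

data = datasets.load_iris()
X = pd.DataFrame(data.data, columns=data.feature_names)

# Stand-in for run.get_model(): a classifier trained locally on the full dataset
# (probability estimates are not needed for predict(), so they are left off)
local_model = SVC(gamma='scale', kernel='rbf', C=1.2).fit(X, data.target)


def species_local(features, model=local_model):
    """Same logic as species() in predict.py, with the model passed in."""
    row = json.loads(features)
    # Select the columns in training order so the input matches the model
    df = pd.DataFrame([row])[list(data.feature_names)]
    prediction = model.predict(df)[0]
    return str(data.target_names[prediction])


# Measurements of the first flower in the dataset, which is a setosa
payload = json.dumps({
    'sepal length (cm)': 5.1,
    'sepal width (cm)': 3.5,
    'petal length (cm)': 1.4,
    'petal width (cm)': 0.2,
})
result = species_local(payload)
print(result)
```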

That's pretty much all the work you'll need to do. Now let's deploy this model as an API service. First, install and import servicefoundry in your notebook, then log in.

!pip install servicefoundry
import servicefoundry.core as sfy
sfy.login()

Go to the TrueFoundry dashboard and create a workspace to deploy the service. Workspaces are a way to group together related projects inside TrueFoundry. Once the workspace is created, copy its FQN (fully qualified name) so we can tell servicefoundry where to deploy the model.

The servicefoundry library lets you gather all the dependencies of the file you just created using the gather_requirements function.

requirements = sfy.gather_requirements("predict.py")

Now create an sfy.Service object, provide the workspace FQN, and deploy it by calling .deploy().

auto_service = sfy.Service("predict.py", requirements, sfy.Parameters(
    name="iris-service",
    workspace="<workspace-fqn-you-copied>"
))

auto_service.deploy()

You can track the progress of this deployment on the dashboard. Once the deployment is complete, you can access the deployed service from there and try it out.

A screenshot of the service serving prediction

The TrueFoundry dashboard also links to metrics and logs that come out-of-the-box with TrueFoundry deployments in the form of Grafana dashboards. You can read more about them here.

More features

You can also deploy interactive UI applications and Gradio applications easily from an IPython notebook using servicefoundry. Read this guide to see how.

We are working on making the integration between experiment tracking and deployment even tighter and the experience more delightful. You can read about other things you can do with TrueFoundry in our docs.