TensorFlow Model Analysis for Validating Machine Learning Models
My first deployed Machine Learning model was a failure. Quite frankly, I was beyond excited when it went live: it was a simple diabetes diagnosis model for potential diabetes mellitus patients.
The excitement quickly dissipated once user feedback arrived. The users did not like the model.
Although that saddened me, I now realize they were right. The model may have performed well on top-level metrics, but from the consumer's perspective, a model that gives poor predictions makes for a poor experience.
The problem was that the model's performance broke down for specific features.
Before deploying a machine learning model, machine learning engineers must assess it, ensure it meets strict quality standards, and confirm that it behaves as expected for all relevant data slices.
What is TensorFlow Model Analysis?
Google developed TensorFlow Model Analysis (TFMA) to help Machine Learning engineers understand how their models perform. TFMA uses Apache Beam to distribute its computations over large amounts of data.
With TFMA, you can dig deep into a model's performance and understand how it varies across different slices of the data. TFMA can compute both the metrics used at training time (the built-in metrics) and metrics defined after the model has been saved, via TFMA configuration settings.
In this tutorial, you will analyze and evaluate the results of a previously trained machine learning model. You will use a model trained for the Chicago Taxi Example, which uses the Taxi Trips dataset made available by the City of Chicago. The entire dataset can be found here.
Once you have completed this tutorial, you will be able to run Apache Beam over the evaluation dataset. With a distributed processing backend, the same Beam pipeline scales to massive datasets.
Prerequisites
- A basic understanding of Apache Beam.
- A basic understanding of machine learning models.
- You will need to create a new Google Colab notebook in your Google Drive in order to run the Python code. Here is a tutorial that will help you set it up.
Step 1 — How to Install TensorFlow Model Analysis (TFMA)
First, pull in all the dependencies for your Google Colab notebook. It will take some time to complete this.
Create a new blank notebook, then rename the file from Untitled.ipynb to TFMA.ipynb.

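The install cell is not shown in this copy of the article; a minimal equivalent (in a Colab cell, prefix each line with `!`) is:

```shell
# Upgrade pip itself, then install TensorFlow Model Analysis.
pip install -U pip
pip install tensorflow-model-analysis
```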
The first line updates pip to the latest version. pip, short for "preferred installer program", is the package management system used to install and manage Python packages. The second line installs TensorFlow Model Analysis (TFMA).
After the installation finishes, restart the runtime, then run the cells below. The cells will not work until the runtime has been restarted.

This block of code imports the necessary libraries: sys, tensorflow, apache_beam, and tensorflow_model_analysis. The assert sys.version_info.major==3 statement verifies that the notebook is running Python 3.
Step 2 — How to Load the dataset
Next, download and extract the tar file.

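The download cell is missing here; the sketch below sets up the directory layout the rest of the tutorial expects. The tar name and URL follow the public TFMA example notebook and are assumptions that may change between releases (the download function is defined but not called, to keep the cell side-effect free):

```python
import os
import tarfile
import tempfile
import urllib.request

# Assumed artifact name/URL, taken from the TFMA example notebook.
TAR_NAME = 'saved_models-2.2'
DATA_URL = ('https://storage.googleapis.com/artifacts.tfx-oss-public.appspot.com/'
            'datasets/{}.tar'.format(TAR_NAME))

BASE_DIR = tempfile.mkdtemp()
DATA_DIR = os.path.join(BASE_DIR, TAR_NAME, 'data')        # train/eval CSVs
MODELS_DIR = os.path.join(BASE_DIR, TAR_NAME, 'models')    # saved models
SCHEMA = os.path.join(BASE_DIR, TAR_NAME, 'schema.pbtxt')  # data schema
OUTPUT_DIR = os.path.join(BASE_DIR, 'output')              # TFMA results

def download_and_extract():
    """Fetch the tar file and unpack datasets, schema, and saved models."""
    tar_path = os.path.join(BASE_DIR, TAR_NAME + '.tar')
    urllib.request.urlretrieve(DATA_URL, tar_path)
    with tarfile.open(tar_path) as tar:
        tar.extractall(BASE_DIR)
    os.remove(tar_path)
```

In the notebook you would then call download_and_extract() once before moving on.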
The dataset is distributed as a tar file. It contains the training and evaluation datasets, the data schema, and the training, serving, and eval saved models. This tutorial uses all of them.
Step 3 — How to Parse the Schema
The downloaded schema needs to be parsed before TFMA can use it.

You will use schema_pb2 from TensorFlow Metadata and the text_format method from the google.protobuf library to parse the text-format schema into a protobuf message.
Step 4 — How to Use the Schema to Create TFRecords
Next, give TFMA access to the dataset by creating a TFRecords file. The schema provides the correct type for each feature.

TFMA supports several model types, including TF Keras models, models based on generic TF2 signature APIs, and TF estimator-based models. In this tutorial, however, you will configure a Keras-based model.
Step 5 — How to Set Up and Run TFMA using Keras

This is the point where you’ll finally call and use the instance of tfma that you imported previously.

Additionally, the Keras model must be referenced by a tfma.EvalSharedModel.

The final step in this process is to run TFMA.

After running the evaluation, examine the visualizations that TFMA provides. The following are some example results from evaluating the Keras model.
To view metrics, use tfma.view.render_slicing_metrics. By default, the view displays the Overall slice. To view a particular slice, pass a slicing_column or a tfma.SlicingSpec.
Step 6 — How to Visualize the Metrics and Plots
Note that the dataset contains the following columns:
- pickup_community_area
- fare
- trip_start_month
- trip_start_hour
- trip_start_day
- trip_start_timestamp
- pickup_latitude
- pickup_longitude
- dropoff_latitude
- dropoff_longitude
- trip_miles
- pickup_census_tract
- dropoff_census_tract
- payment_type
- company
- trip_seconds
- dropoff_community_area
- tips
As a first step, set slicing_column to the trip_start_hour feature from the earlier slicing_specs. You can then visualize that column.

You will see the following interactions in this metrics visualization:
- Pan by clicking and dragging
- Zoom by scrolling
- Reset the view by right-clicking
- Hover over a data point to see more details about it
- Use the selections at the bottom to choose among four different types of views
You can visualize the slicing_specs created in your tfma.EvalConfig by updating the slice information passed to tfma.view.render_slicing_metrics. Here, choose the trip_start_day slice (days of the week).

TFMA also supports feature crosses, which analyze combinations of features. Test this by creating a cross between trip_start_hour and trip_start_day.

Crossing the two columns creates many combinations! Narrow the cross down to only trips starting at 1pm, then select binary_accuracy in the visualization below.

Step 7 — How to Track Your Model’s Performance Over Time
You train your model on your training dataset, which is hopefully representative of your test dataset and of the data that will be sent to your model in production.
While inference requests may initially resemble your training data, in many cases the data changes enough over time to affect your model's performance.
Therefore, you should monitor and measure the performance of your model continuously in order to be aware of changes and react accordingly.
TFMA can help in a number of ways.

You can validate and evaluate machine learning models across different slices of data by using TFMA.
The metrics displayed for each slice include auc (area under the curve), auc_precision_recall, binary_accuracy, binary_crossentropy, calibration, example_count, mean_label, mean_prediction, precision, and recall.
Conclusion
Another important feature of TFMA is the ability to evaluate multiple models simultaneously. This is often done to determine whether a new model performs better than a baseline (such as the currently deployed model) on metrics such as AUC.
When thresholds are configured, TFMA produces a tfma.ValidationResult indicating whether the performance meets expectations.
You may be wondering how evaluating machine learning models with TensorBoard differs from using TensorFlow Model Analysis (TFMA), and that is a valid question. Both tools provide the measurements and visualizations that Machine Learning workflows require.
The difference is that they are used at different stages of development: TensorBoard offers high-level analysis during training, whereas TFMA performs a deep analysis of the 'finished' trained model.