Run Machine Learning Inference on the NPU with PyTorch and ONNX#

Goals#

  • Introduce the Ryzen™ AI Software Platform

  • Show the ONNX model generation and inference flow on the NPU

  • Deploy a quantized ResNet-50 model onto the Ryzen AI NPU for inference

References#

Ryzen AI Software Platform

Vitis AI Execution Provider

Matplotlib Gallery

CIFAR10

Confusion Matrix


This is not currently supported on the Linux release of Riallto.

Ryzen AI Software Platform#

The AMD Ryzen™ AI Software Platform enables developers to take machine learning models trained in PyTorch or TensorFlow and run them on laptops powered by Ryzen AI. The Ryzen AI software platform intelligently optimizes tasks and workloads, freeing up CPU and GPU resources and ensuring optimal performance at lower power. The diagram below shows the flow from trained models to execution.

[Figure: Ryzen AI software platform, from trained model to deployment on the NPU]

Step 1: Import Packages#

Run the following cell to import all the packages needed to run inference on the Ryzen AI NPU.

import onnx
import onnxruntime as ort

import enum
import numpy as np
import cv2
import pickle
import os
import glob
import tarfile
import urllib.request
import matplotlib.pyplot as plt
from PIL import Image
from mpl_toolkits.axes_grid1 import ImageGrid

from sklearn.metrics import accuracy_score, confusion_matrix
import seaborn as sn
import pandas as pd
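
With the packages imported, you can optionally confirm that the installed onnxruntime build exposes the Vitis AI Execution Provider used later in this notebook (a minimal sanity check, not part of the original flow):

# Optional check: 'VitisAIExecutionProvider' should appear in this list on a
# machine with the Ryzen AI software installed.
print(ort.get_available_providers())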

Step 2: Prepare the Data#

We are going to use a pre-trained ResNet-50 model from PyTorch Hub for the CIFAR-10 dataset.

Download the CIFAR-10 dataset#

Execute the following cells to download the CIFAR-10 dataset. The dataset is stored in onnx/data/cifar-10-batches-py/.

models_dir = ".\\onnx"
data_dir = ".\\onnx\\data"
# License 1 (see end of notebook)

# Download data - One-time only

if not os.path.exists(data_dir):
    data_download_tar = "cifar-10-python.tar.gz"
    urllib.request.urlretrieve("https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz", data_download_tar)
    with tarfile.open(data_download_tar) as file:
        file.extractall(data_dir)

# Delete cifar-10-python.tar.gz source file after all images are extracted
data_images_path = os.path.join(os.getcwd(), "cifar-10-python.tar.gz")
files = glob.glob(data_images_path)
for f in files:
    os.remove(f)

The CIFAR-10 dataset contains 60,000 32x32-pixel color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images.
The dataset is divided into five training batches and one test batch, each holding 10,000 images. The test batch contains exactly 1,000 randomly selected images from each class.

For inference we use the 10,000 test images.
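
If you want to confirm that the extraction above produced the layout just described, a quick optional check is to list the batch files (five training batches, one test batch, and the metadata file):

# Optional: list the extracted batch files
print(sorted(os.listdir('./onnx/data/cifar-10-batches-py')))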

The CIFAR-10 classes are enumerated in the Cifar10Classes class below:

class Cifar10Classes(enum.Enum):
    airplane = 0
    automobile = 1
    bird = 2
    cat = 3
    deer = 4
    dog = 5
    frog = 6
    horse = 7
    ship = 8
    truck = 9
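
The enum maps the integer labels stored in the dataset files to human-readable class names, for example:

# Example usage: convert between integer labels and class names
print(Cifar10Classes(3).name)        # cat
print(Cifar10Classes['ship'].value)  # 8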

Run the following two cells to display a subset of the test images.

# License 2 (see end of notebook)

def unpickle(file):
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='latin1')
    return batch

datafile = r'./onnx/data/cifar-10-batches-py/test_batch'
metafile = r'./onnx/data/cifar-10-batches-py/batches.meta'

test_batch = unpickle(datafile) 
metadata = unpickle(metafile)

images = test_batch['data']
labels = test_batch['labels']
images = np.reshape(images,(10000, 3, 32, 32))

im = []

dirname = 'onnx/onnx_test_images'
if not os.path.exists(dirname):
    os.mkdir(dirname)
for i in range(20):
    im.append(cv2.cvtColor(images[i].transpose(1, 2, 0), cv2.COLOR_RGB2BGR))

fig = plt.figure(figsize=(10, 10))
grid = ImageGrid(fig, 111,  # similar to subplot(111)
                 nrows_ncols=(4, 5),  # creates 4x5 grid of axes
                 axes_pad=0.3,  # pad between axes in inch.
                 )

for ax, image, label in zip(grid, im, labels):
    ax.axis("off")
    ax.imshow(image)
    ax.set_title(f'Actual label: {Cifar10Classes(label).name}', fontdict={'fontsize':8})

plt.show()
[Output: 4x5 grid of CIFAR-10 test images with their actual labels]

Step 3: Deploy the Model on the NPU#

Run the next cell to set the XLNX_VART_FIRMWARE environment variable to point to the NPU binary. The NPU binary 1x4.xclbin is an AI design that provides up to 2 TOPS of performance. Up to four such AI streams can be run in parallel on the NPU without any visible loss of performance.

# 1x4 array 
os.environ['XLNX_VART_FIRMWARE'] = os.path.join("onnx", "xclbins","1x4.xclbin")
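
As an optional sanity check (not part of the original flow), you can confirm that the configured path actually resolves to the NPU binary before creating an inference session:

# Optional: confirm the NPU binary exists at the configured path
firmware = os.environ['XLNX_VART_FIRMWARE']
print(firmware, '->', 'found' if os.path.exists(firmware) else 'NOT FOUND')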

Load quantized ONNX model#

Run the following cell to load the provided ONNX quantized model.

We will use the following pre-trained quantized file:

  • The trained quantized ResNet-50 model on the CIFAR-10 dataset is saved at the following location: onnx/resnet.qdq.U8S8.onnx

If you would like to re-train and quantize your model, please review the PyTorch ONNX re-train notebook.

# License 2 (see end of notebook)

quantized_model_path = r'./onnx/resnet.qdq.U8S8.onnx'
model = onnx.load(quantized_model_path)
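
Optionally, the loaded graph can be validated with the ONNX checker before handing it to the runtime; this raises an exception if the model is structurally malformed:

# Optional: validate the model graph and report its opset version
onnx.checker.check_model(model)
print(f'IR version: {model.ir_version}, opset: {model.opset_import[0].version}')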

Deploy the quantized ONNX model on the Ryzen AI NPU#

For more information on provider options, see ONNX Runtime with Vitis AI Execution Provider.

The file onnx/xclbins/vaip_config.json is required when configuring the Vitis AI Execution Provider (VAI EP) in the ONNX Runtime code.

# License 2 (see end of notebook)

providers = ['VitisAIExecutionProvider']
cache_dir = os.path.join(os.getcwd(), "onnx")
provider_options = [{
            'config_file': 'onnx/xclbins/vaip_config.json',
            'cacheDir': str(cache_dir),
            'cacheKey': 'modelcachekey'
        }]

session = ort.InferenceSession(model.SerializeToString(), providers=providers,
                               provider_options=provider_options)
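
Once the session is created, you can inspect its input and output metadata. The input name 'input' and the 1x3x32x32 input shape used in the inference cells below come from this metadata:

# Inspect the tensors the session expects and produces
for inp in session.get_inputs():
    print('input:', inp.name, inp.shape, inp.type)
for out in session.get_outputs():
    print('output:', out.name, out.shape, out.type)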

Inference#

The first 20 images are extracted from the CIFAR-10 test dataset and converted to the .png format.

The .png images are read, classified and visualized by running the quantized ResNet-50 model on the NPU.

# License 2 (see end of notebook)

# Extract and dump first 20 images 
for i in range(20): 
    im = images[i]
    im  = im.transpose(1,2,0)
    im = cv2.cvtColor(im,cv2.COLOR_RGB2BGR)
    im_name = f'./{dirname}/image_{i}.png'
    cv2.imwrite(im_name, im)

viz_predicted_labels = []
misclassified_images = []
misclassified_labels = []
show_imlist = []

# Pick dumped images and predict
for i in range(20): 
    image_name = f'./{dirname}/image_{i}.png'
    image = Image.open(image_name).convert('RGB')
    # Resize the image to match the input size expected by the model
    image = image.resize((32, 32))  
    image_array = np.array(image).astype(np.float32)
    image_array = image_array/255

    # Reshape the array to match the input shape expected by the model
    image_array = np.transpose(image_array, (2, 0, 1))  

    # Add a batch dimension to the input image
    input_data = np.expand_dims(image_array, axis=0)

    # Run the model
    outputs = session.run(None, {'input': input_data})

    # Process the outputs
    predicted_class = np.argmax(outputs[0])
    predicted_label = metadata['label_names'][predicted_class]
    viz_predicted_labels.append(predicted_class)
    label = metadata['label_names'][labels[i]]
    # print(f'Image {i}: Actual Label {label}, Predicted Label {predicted_label}')
    if (label != predicted_label):
        misclassified_images.append(i)
        misclassified_labels.append(predicted_label)

    show_imlist.append(cv2.cvtColor(images[i].transpose(1,2,0), cv2.COLOR_RGB2BGR))


fig = plt.figure(figsize=(10, 10))
grid = ImageGrid(fig, 111,  # similar to subplot(111)
                 nrows_ncols=(4, 5),  # creates 4x5 grid of axes
                 axes_pad=0.3,  # pad between axes in inch.
                 )

for ax, image, label in zip(grid, show_imlist, viz_predicted_labels):
    ax.axis("off")
    ax.imshow(image)
    ax.set_title(f'Predicted label: {Cifar10Classes(label).name}', fontdict={'fontsize':8})

plt.show()
[Output: 4x5 grid of test images with their predicted labels]
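
The resize, scale, transpose, and batch steps above are repeated in later cells. As a minimal refactoring sketch (the helper name preprocess_image is hypothetical, not part of the original notebook), they could be collected into a single function:

def preprocess_image(image_path):
    """Load a .png image and return a 1x3x32x32 float32 array scaled to [0, 1]."""
    image = Image.open(image_path).convert('RGB')
    image = image.resize((32, 32))
    image_array = np.array(image).astype(np.float32) / 255
    image_array = np.transpose(image_array, (2, 0, 1))  # HWC -> CHW
    return np.expand_dims(image_array, axis=0)          # add batch dimension

# Usage: outputs = session.run(None, {'input': preprocess_image(image_name)})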

Run the next cell to display the misclassified images along with their (incorrect) predicted labels.

show_imlist_mis = []

for i in misclassified_images:
    show_imlist_mis.append(cv2.cvtColor(images[i].transpose(1,2,0), cv2.COLOR_RGB2BGR))

varpltsize = len(misclassified_images)

fig = plt.figure(figsize=((1 * 2 * varpltsize), 1 * 2 * varpltsize))
grid = ImageGrid(fig, 111,  # similar to subplot(111)
                 nrows_ncols=(1, len(misclassified_images)),  
                 axes_pad=0.3,  # pad between axes in inch.
                 )

for ax, image, label in zip(grid, show_imlist_mis, misclassified_labels):
    ax.axis("off")
    ax.imshow(image)
    ax.set_title(f'Predicted label: {label}', fontdict={'fontsize':8})

plt.show()
[Output: row of misclassified images with their predicted labels]

Inference for more test images#

Note: the cell below may extract up to 5,000 images. You can delete the extracted images by following the instructions in Delete all Extracted Images.

The first 5,000 images are extracted from the CIFAR-10 test dataset and converted to the .png format.
The .png images are read, classified and visualized by running the quantized ResNet-50 model on the NPU.

# License 2 (see end of notebook)

max_images = len(images)//2 # 5000 test images

# Extract and dump all images in the test set 
for i in range(max_images): 
    im = images[i]
    im  = im.transpose(1,2,0)
    im = cv2.cvtColor(im,cv2.COLOR_RGB2BGR)
    im_name = f'./{dirname}/image_{i}.png'
    cv2.imwrite(im_name, im)

cm_predicted_labels = []
cm_actual_labels = []

# Pick dumped images and predict
for i in range(max_images): 
    image_name = f'./{dirname}/image_{i}.png'
    try:
        image = Image.open(image_name).convert('RGB')
    except OSError:
        print(f"Warning: image {image_name} may be locked; skipping to the next image")
        continue
    # Resize the image to match the input size expected by the model
    image = image.resize((32, 32))  
    image_array = np.array(image).astype(np.float32)
    image_array = image_array/255

    # Reshape the array to match the input shape expected by the model
    image_array = np.transpose(image_array, (2, 0, 1))  

    # Add a batch dimension to the input image
    input_data = np.expand_dims(image_array, axis=0)

    # Run the model
    outputs = session.run(None, {'input': input_data})

    # Process the outputs
    predicted_class = np.argmax(outputs[0])
    predicted_label = metadata['label_names'][predicted_class]
    cm_predicted_labels.append(predicted_class)
    label = metadata['label_names'][labels[i]]
    cm_actual_labels.append(labels[i])
    if i%990 == 0:
        print(f'Status: Running Inference on image {i}... Actual Label: {label}, Predicted Label: {predicted_label}')
Status: Running Inference on image 0... Actual Label: cat, Predicted Label: cat
Status: Running Inference on image 990... Actual Label: automobile, Predicted Label: automobile
Status: Running Inference on image 1980... Actual Label: truck, Predicted Label: truck
Status: Running Inference on image 2970... Actual Label: dog, Predicted Label: dog
Status: Running Inference on image 3960... Actual Label: bird, Predicted Label: bird
Status: Running Inference on image 4950... Actual Label: bird, Predicted Label: bird

Confusion matrix#

The X-axis represents the predicted class and the Y-axis represents the actual class.

The diagonal cells are the true positives: the number of images of each class that the model classified correctly. The off-diagonal cells count the instances where the predicted class did not match the actual class.

cf_matrix = confusion_matrix(cm_actual_labels, cm_predicted_labels)
class_names = [Cifar10Classes(i).name for i in range(10)]
# Normalize each row by the number of actual instances of that class
df = pd.DataFrame(cf_matrix / np.sum(cf_matrix, axis=1, keepdims=True),
                  index=class_names, columns=class_names)
plt.figure(figsize=(10, 5));
sn.heatmap(df, annot=True, cmap="PiYG");
[Output: confusion matrix heatmap; rows are actual classes, columns are predicted classes]
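
Per-class accuracy can also be read directly off the diagonal of the unnormalized confusion matrix (an optional sketch, using the cf_matrix computed above):

# Per-class accuracy: diagonal (true positives) divided by row totals
per_class_acc = cf_matrix.diagonal() / cf_matrix.sum(axis=1)
for i, acc in enumerate(per_class_acc):
    print(f'{Cifar10Classes(i).name:>10}: {acc:.2%}')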

Accuracy of the quantized model for 5,000 test images#

The accuracy below is calculated over the 5,000 test images classified by the quantized model running on the NPU.

print(f" Accuracy of the quantized model for the test set is : {(accuracy_score(cm_actual_labels, cm_predicted_labels)*100):.2f} %")
 Accuracy of the quantized model for the test set is : 75.76 %

Step 4: Deploy the Model on CPU#

Run the following cells to deploy the quantized ONNX model on the CPU, using the default CPU execution provider.

providers = ['CPUExecutionProvider']
provider_options = [{}]

session = ort.InferenceSession(model.SerializeToString(), providers=providers,
                               provider_options=provider_options)
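
You can confirm which execution provider the session resolved to, which is useful when comparing the CPU and NPU runs:

# Confirm the providers this session is using
print(session.get_providers())  # ['CPUExecutionProvider']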
# License 2 (see end of notebook)

# Pick dumped images and predict
for i in range(10): 
    image_name = f'./{dirname}/image_{i}.png'
    image = Image.open(image_name).convert('RGB')
    # Resize the image to match the input size expected by the model
    image = image.resize((32, 32))  
    image_array = np.array(image).astype(np.float32)
    image_array = image_array/255

    # Reshape the array to match the input shape expected by the model
    image_array = np.transpose(image_array, (2, 0, 1))  

    # Add a batch dimension to the input image
    input_data = np.expand_dims(image_array, axis=0)


    # Run the model
    outputs = session.run(None, {'input': input_data})


    # Process the outputs
    predicted_class = np.argmax(outputs[0])
    predicted_label = metadata['label_names'][predicted_class]
    label = metadata['label_names'][labels[i]]
    print(f'Image {i}: Actual Label {label}, Predicted Label: {predicted_label}')
Image 0: Actual Label cat, Predicted Label: cat
Image 1: Actual Label ship, Predicted Label: ship
Image 2: Actual Label ship, Predicted Label: ship
Image 3: Actual Label airplane, Predicted Label: airplane
Image 4: Actual Label frog, Predicted Label: frog
Image 5: Actual Label frog, Predicted Label: frog
Image 6: Actual Label automobile, Predicted Label: automobile
Image 7: Actual Label frog, Predicted Label: frog
Image 8: Actual Label cat, Predicted Label: cat
Image 9: Actual Label automobile, Predicted Label: automobile

Delete all Extracted Images#

# Delete all extracted images to save disk space 
images_path = os.path.join(os.getcwd(), "onnx", "onnx_test_images", "*")
files = glob.glob(images_path)
for f in files:
    try:
        os.remove(f)
    except OSError:
        continue

Licenses#

License 1

# -------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------------------

License 2

#################################################################################  
# License
# Ryzen AI is licensed under `MIT License <https://github.com/amd/ryzen-ai-documentation/blob/main/License>`_ . Refer to the `LICENSE File <https://github.com/amd/ryzen-ai-documentation/blob/main/License>`_ for the full license text and copyright notice.

Copyright© 2023 AMD, Inc
SPDX-License-Identifier: MIT