This repository contains code to accompany a tutorial given at NCSA on inference optimization for deep learning models using NVIDIA TensorRT.
Many applications of deep learning models benefit from reduced latency (the time taken for a single inference). This tutorial introduces NVIDIA TensorRT, an SDK for high-performance deep learning inference, and walks through the steps needed to convert a trained deep learning model into an inference-optimized one.
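As one illustration of this workflow, a trained model is commonly exported to ONNX and then built into a TensorRT engine. A minimal sketch using the `trtexec` tool that ships with TensorRT (the file names `model.onnx` and `model.engine` are placeholders, not files from this repository):

```shell
# Build an inference-optimized TensorRT engine from an ONNX model.
# model.onnx / model.engine are placeholder names.
# --fp16 enables reduced-precision kernels, a common source of
# TensorRT speedups on GPUs that support it.
trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```

The saved engine can then be loaded by the TensorRT runtime for low-latency inference; the tutorial covers these steps in detail.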
Webinar date: April 13, 2022
Speakers: Nikil Ravi and Pranshu Chaturvedi, UIUC