TensorRT INT8 Quantization Code
Calibration file
1 Introduction
When using TensorRT's implicit quantization, we need to generate a calibration cache file, which is then used when building an engine from an ONNX model. If we use trtexec to build an implicitly quantized, GPU-specific engine from the ONNX file, the required command-line flags are described in the developer guide, sections A.2.1.2 (Serialized Engine Generation) and A.2.1.4 (Commonly Used Command-line Flags).
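As a minimal sketch of the flow above (the model, cache, and engine file names are placeholders, not from the original text), building an implicitly quantized INT8 engine from an ONNX model with a previously generated calibration cache might look like:

```shell
# Build an INT8 engine from an ONNX model; INT8 scales are read
# from an existing calibration cache produced by a calibrator run.
trtexec --onnx=model.onnx \
        --int8 \
        --calib=calib.cache \
        --saveEngine=model_int8.engine
```

If no calibration cache is supplied alongside `--int8`, trtexec falls back to dummy scales, which is only useful for performance measurement, not accuracy.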
Polygraphy
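Polygraphy can drive a similar ONNX-to-engine conversion from the command line. A hedged sketch, assuming placeholder file names; `polygraphy convert` with `--int8` and `--calibration-cache` is the assumed invocation:

```shell
# Convert an ONNX model to a TensorRT engine with INT8 enabled;
# the calibration cache is reused if present.
polygraphy convert model.onnx --int8 \
    --calibration-cache calib.cache \
    -o model_int8.engine
```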
QAT
Appendix:
For Q/DQ documentation and how TensorRT processes Q/DQ nodes, please refer to the developer guide: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#work-with-qat-networks
TensorRT also provides a tool for doing PTQ and QAT in PyTorch: https://github.com/NVIDIA/TensorRT/blob/release/8.5/tools/pytorch-quantization/examples/torchvision/classification_flow.py
In addition, our team developed a sample showing how to get the best performance on YOLOv7: https://github.com/NVIDIA-AI-IOT/yolo_deepstream/tree/main/yolov7_qat
The guide on optimal Q/DQ node placement is here: https://github.com/NVIDIA-AI-IOT/yolo_deepstream/blob/main/yolov7_qat/doc/Guidance_of_QAT_performance_optimization.md
