3 CMake usage rules

3.1 Defining global variables from the command line
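
As a minimal sketch of what this heading refers to, the script below writes a throwaway project and passes a value to it with CMake's -D option. The variable name GREETING and the project name define_demo are hypothetical and used only for illustration. The configure step runs only when both cmake and make are found on the PATH.

```shell
#!/usr/bin/env bash
# Sketch: pass a value into CMake from the command line with -D.
# GREETING and define_demo are hypothetical names used only for illustration.
demo_dir="$(mktemp -d)"

cat > "$demo_dir/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.13)
# LANGUAGES NONE: no compiler is needed just to inspect a variable.
project(define_demo LANGUAGES NONE)
message(STATUS "GREETING=${GREETING}")
EOF

# -D<name>=<value> creates a cache entry that CMakeLists.txt can read
# like any other variable.
if command -v cmake >/dev/null 2>&1 && command -v make >/dev/null 2>&1; then
  cmake -S "$demo_dir" -B "$demo_dir/build" -DGREETING=hello
fi
```

Because the value is stored in the CMake cache, it persists across later configure runs in the same build directory until it is overridden with another -D.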

4 Official CMake documentation

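The CMake tutorial provides a step-by-step guide that covers common build-system issues CMake helps to address. Seeing how the individual topics work together in an example project can be very helpful. The tutorial documentation and example source code are in the Help/guide/tutorial directory of the CMake source tree. Each step has its own subdirectory containing code that can be used as a starting point. The tutorial examples are progressive, so each step provides the complete solution for the previous step.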

1 Introduction to CMake

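CMake lets developers write a platform-independent CMakeLists.txt file that describes the entire build process, and then generates the native build files required on the target platform, such as Makefiles on Unix or Visual Studio projects on Windows. The goal is "Write once, run everywhere".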

2 Simple CMake examples

2.1 Compiling a single source file

Analyzing the trtexec log when converting an ONNX model with QDQ nodes to an engine

Posted on 2024-12-01 in TensorRT

1 Introduction

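There are currently two ways to quantize a model with TensorRT. The first treats TensorRT as a black box: you supply a calibration dataset and a calibration method, and TensorRT quantizes the model implicitly. The second modifies the model structure by inserting QDQ (quantize/dequantize) nodes, then uses a dataset or retrains the model to tune the QDQ node parameters and thereby compute the scales. The details are left for a later post.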

trtexec --onnx=apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit.onnx --saveEngine=apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit_dla.engine --int8 --useDLACore=0 --allowGPUFallback --verbose --dumpLayerInfo --dumpProfile --useSpinWait --separateProfileRun >apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit_dla.engine.log 2>&1
# Build for DLA while constraining a layer's precision, which keeps that layer from running on the DLA
trtexec --onnx=apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit.onnx --saveEngine=apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit_layerPercision_dla.engine --int8 --useDLACore=0 --allowGPUFallback --verbose --dumpLayerInfo --dumpProfile --useSpinWait --separateProfileRun --precisionConstraints=obey --layerPrecisions="/Concat":fp32 >apa_lidarfreespace_u2nettp_6xdownsample_v9_20240810_scale_simlified_edit_layerPercision_dla.engine.log 2>&1
trtexec --onnx=v9_nearest.onnx --saveEngine=v9_nearest_layerPercision_dla.engine --int8 --useDLACore=0 --allowGPUFallback --verbose --dumpLayerInfo --dumpProfile --useSpinWait --separateProfileRun --precisionConstraints=obey --layerPrecisions="/Concat":fp32 >v9_nearest_layerPercision_dla.engine.log 2>&1
# Build a DLA engine from the QDQ-free ONNX model and the calibration cache produced by qdq_translator
trtexec --onnx=v9_nearestqdq_noqdq.onnx --calib=v9_nearestqdq_precision_config_calib.cache --useDLACore=0 --int8 --fp16 --allowGPUFallback --saveEngine=v9_nearestqdq_noqdq.onnx_dlaFP16INT8.engine >v9_nearestqdq_noqdq.onnx_dlaFP16INT8.engine.log 2>&1
trtexec --onnx=v9_nearestqdq_noqdq.onnx --calib=v9_nearestqdq_precision_config_calib.cache --useDLACore=0 --int8 --allowGPUFallback --saveEngine=v9_nearestqdq_noqdq.onnx_dlaINT8.engine >v9_nearestqdq_noqdq.onnx_dlaINT8.engine.log 2>&1
trtexec --onnx=v9_nearestqdq_noqdq.onnx --calib=v9_nearestqdq_precision_config_calib.cache --useDLACore=0 --int8 --fp16 --allowGPUFallback --saveEngine=v9_nearestqdq_noqdq.onnx_dlaFP16INT8_layerPercision.engine --precisionConstraints=obey --layerPrecisions="/Concat":fp32 >v9_nearestqdq_noqdq.onnx_dlaFP16INT8_layerPercision.engine.log 2>&1
trtexec --onnx=v9_nearestqdq_noqdq.onnx --calib=v9_nearestqdq_precision_config_calib.cache --useDLACore=0 --fp16 --allowGPUFallback --saveEngine=v9_nearestqdq_noqdq.onnx_dlaINT8_layerPercision.engine --precisionConstraints=obey --layerPrecisions="/Concat":fp32 >v9_nearestqdq_noqdq.onnx_dlaINT8_layerPercision.log 2>&1
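
The build commands above repeat the same long file names and flags, so a small wrapper can cut down on copy-paste mistakes. The following is only a sketch: dla_build_cmd is a hypothetical helper, not part of trtexec, and it prints the command instead of running it so the result can be inspected first. Every trtexec flag it emits is taken from the commands above.

```shell
#!/usr/bin/env bash
# dla_build_cmd ONNX [TAG] [EXTRA_FLAGS...]
# Hypothetical helper: print (not run) a trtexec command that builds an INT8
# DLA engine. The engine and log names are derived from the ONNX file name.
dla_build_cmd() {
  local onnx="$1"          # path to the ONNX model
  local tag="${2:-dla}"    # suffix for the engine and log file names
  # Drop ONNX and TAG so that "$@" holds only the extra trtexec flags.
  if [ $# -gt 0 ]; then shift; fi
  if [ $# -gt 0 ]; then shift; fi

  local engine="${onnx%.onnx}_${tag}.engine"
  printf 'trtexec --onnx=%s --saveEngine=%s' "$onnx" "$engine"
  printf ' --int8 --useDLACore=0 --allowGPUFallback'
  printf ' --verbose --dumpLayerInfo --dumpProfile --useSpinWait --separateProfileRun'
  local flag
  for flag in "$@"; do
    printf ' %s' "$flag"
  done
  printf ' >%s.log 2>&1\n' "$engine"
}
```

For example, `dla_build_cmd v9_nearest.onnx layerPercision_dla --precisionConstraints=obey '--layerPrecisions="/Concat":fp32'` prints the same command line as the v9_nearest.onnx build above. Printing rather than executing also makes it easy to pipe the result into a build script or a log.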

# Accuracy comparison
# Lidar: run trtexec with a given input
trtexec --loadEngine=v9_nearest_layerPercision_dla.engine --loadInputs='input':lidar_preBin.bin --exportOutput=Lidar_output_dla.json --dumpOutput
CheckDLAOutputScripts$ python3 check_outputs_diff.py Lidar_output_FP32.json Lidar_output_dla.json 0
CheckDLAOutputScripts$ python3 check_cosine_sim.py Lidar_output_FP32.json Lidar_output_dla.json 0
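
The contents of check_cosine_sim.py are not shown here, so the following is not that script. It is a minimal sketch of the cosine-similarity metric itself, assuming each output tensor has already been flattened into a plain text file with one number per line. That file format and the function name cosine_sim are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# cosine_sim FILE_A FILE_B
# Print the cosine similarity of two vectors, each stored as one number per
# line (an assumed format, not the JSON that trtexec exports).
cosine_sim() {
  paste "$1" "$2" | awk '
    # Accumulate the dot product and both squared norms line by line.
    { dot += $1 * $2; na += $1 * $1; nb += $2 * $2 }
    END {
      # A zero vector has no direction, so the similarity is undefined.
      if (na == 0 || nb == 0) { print "nan"; exit }
      printf "%.6f\n", dot / (sqrt(na) * sqrt(nb))
    }'
}
```

A value close to 1 means the two outputs point in nearly the same direction, which is the usual sign that the DLA engine stays close to the FP32 reference. The metric ignores overall magnitude, which is why it is commonly paired with an element-wise difference check.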

Calculating engine GPU memory usage

Posted on 2024-12-01 in TensorRT

Notes

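This document collects miscellaneous notes gathered while working with TensorRT. Because they do not fall under a single category, they are all summarized here.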

TensorRT Quantization Hands-On Course, YOLOv7 Quantization: PyTorch

Posted on 2024-12-01 in TensorRT

奔跑的IC