人工智能

jf_49040007

1年用户 76经验值

擅长:嵌入式技术

私信关注

【爱芯派 Pro 开发板试用体验】爱芯元智AX650N部署yolov8s 自定义模型

爱芯元智AX650N部署yolov8s 自定义模型

本博客将向你展示零基础一步步的部署好自己的yolov8s模型（博主展示的是自己训练的手写数字识别模型），本博客教你从训练模型到转化成利于Pulsar2 工具量化部署到开发板上

训练自己的YOLOv8s模型

准备自定义数据集

数据集结构可以不像下面一样，这个只是记录当前测试适合的数据集目录结构，常见结构也有VOC结构，所以看个人喜好

└─yolov8s_datasets:		自定义数据集
├─test	
│  └─images 图片文件
│  └─label  标签文件
├─train		
│  └─images 图片文件
│  └─label  标签文件
├─valid	
│  └─images 图片文件
│  └─label  标签文件
├─data.yaml  路径和类别

本博客的data.yaml内容如下：

train: ../train/images
val: ../valid/images
test: ../test/images
nc: 10
names: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

YOLOv8训练环境搭建

配置环境
YOLOv8官方配置环境

# Clone the ultralytics repository

git clone https://github.com/ultralytics/ultralytics

# Navigate to the cloned directory

cd ultralytics

# Install the package in editable mode for development

pip install -e .

下载预训练权重

https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt

测试环境

model路径可以指定绝对路径，source也可以指定图片的绝对路径

yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'

训练自己的YOLOv8s模型

训练模型（官方有两种方式一种是使用CLI命令，另一种是使用PYTHON命令）

我比较喜欢训练用PYTHON命令，测试用CLI命令吗，看个人喜好

YOLOv8官方PYTHON的用法
 YOLOv8官方CLI的用法

cd ultralytics
touch my_train.py
将下面内容填写到py文件
from ultralytics import YOLO
model = YOLO('/root/ultralytics/yolov8s.pt')
results = model.train(data='/root/data1/wxw/yolov8s_datasets/data.yaml',epochs=80,amp=False,batch=16,val=True,device=0)

在此路径下执行python3 my_train.py

测试模型

yolo predict model=/root/ultralytics/runs/detect/train17/weights/best.pt source='/root/ultralytics/ultralytics/assets/www.png' imgsz=640

0e1e2a6add3d92cd839c5c6189aceca74776662719284af4469b7729b1fca14a.png

模型部署和实机测试

前期准备

导出适宜pular2的onnx模型

（1）导出onnx模型（记得加上opset=11）

yolo task=detect mode=export model=/root/ultralytics/runs/detect/train17/weights/best.pt format=onnx opset=11

（2）onnx模型onnxsim化

python3 -m onnxsim best.onnx yolov8s_number_sim.onnx

终端输出信息：

Simplifying...
Finish! Here is the difference:
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓
┃            ┃ Original Model ┃ Simplified Model ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩
│ Add        │ 9              │ 8                │
│ Concat     │ 24             │ 19               │
│ Constant   │ 153            │ 139              │
│ Conv       │ 64             │ 64               │
│ Div        │ 2              │ 1                │
│ Gather     │ 4              │ 0                │
│ MaxPool    │ 3              │ 3                │
│ Mul        │ 60             │ 58               │
│ Reshape    │ 5              │ 5                │
│ Resize     │ 2              │ 2                │
│ Shape      │ 4              │ 0                │
│ Sigmoid    │ 58             │ 58               │
│ Slice      │ 2              │ 2                │
│ Softmax    │ 1              │ 1                │
│ Split      │ 9              │ 9                │
│ Sub        │ 2              │ 2                │
│ Transpose  │ 2              │ 2                │
│ Unsqueeze  │ 7              │ 0                │
│ Model Size │ 42.6MiB        │ 42.6MiB          │
└────────────┴────────────────┴──────────────────┘

（3）获得onnxsim化模型的sub

touch zhuanhuan.py
把下面内容加入进去，记得路径替换为自己模型
import onnx

input_path = "/root/ultralytics/runs/detect/train17/weights/yolov8s_number_sim.onnx"
output_path = "yolov8s_number_sim_sub.onnx"
input_names = ["images"]
output_names = ["400","433"]

onnx.utils.extract_model(input_path, output_path, input_names, output_names)

得到模型如下图：

为模型量化部署的data
quick_start_example 文件夹

└─data:		
├─config	
│  └─yolov8s_config_b1.json
├─dataset		
│  └─calibration_data.tar 四张数据集照片
├─model	
│  └─yolov8s_number_sim_sub.onnx
├─pulsar2-run-helper

其中yolov8s_config_b1.json文件配置如下：

{
"model_type": "ONNX",
"npu_mode": "NPU1",
"quant": {
"input_configs": [
{
"tensor_name": "images",
"calibration_dataset": "./dataset/calibration_data.tar",
"calibration_size": 4,
"calibration_mean": [0, 0, 0],
"calibration_std": [255.0, 255.0, 255.0]
}
],
"calibration_method": "MinMax",
"precision_analysis": true,
"precision_analysis_method":"EndToEnd"
},
"input_processors": [
{
"tensor_name": "images",
"tensor_format": "BGR",
"src_format": "BGR",
"src_dtype": "U8",
"src_layout": "NHWC"
}
],
"output_processors": [
{
"tensor_name": "400",
"dst_perm": [0, 1, 3, 2]
},
{
"tensor_name": "433",
"dst_perm": [0, 2, 1]
}
],
"compiler": {
"check": 0
}
}

axmodel模型获取

进入docker环境（怎么搭建可以查看yolov5的自定义模型），将data文件拷贝到其中

执行下面命令：

cd data/
pulsar2 build --input model/yolov8s_number_sim_sub.onnx --output_dir output --config config/yolov8s_config_b1.json

终端输出信息：

root@1657ec5355e2:/data# pulsar2 build --input model/yolov8s_number_sim_sub.onnx --output_dir output --config config/yolov8s_config_b1.json
2023-11-24 17:00:31.661 | WARNING  | yamain.command.build:fill_default:320 - ignore images csc config because of src_format is AutoColorSpace or src_format and tensor_format are the same
Building onnx ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
2023-11-24 17:00:33.226 | INFO     | yamain.command.build:build:444 - save optimized onnx to [output/frontend/optimized.onnx]
2023-11-24 17:00:33.229 | INFO     | yamain.common.util:extract_archive:21 - extract [dataset/calibration_data.tar] to [output/quant/dataset/images]...
Quant Config Table
┏━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Input  ┃ Shape            ┃ Dataset Directory ┃ Data Format ┃ Tensor Format ┃ Mean            ┃ Std                   ┃
┡━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━┩
│ images │ [1, 3, 640, 640] │ images            │ Image       │ BGR           │ [0.0, 0.0, 0.0] │ [255.0, 255.0, 255.0] │
└────────┴──────────────────┴───────────────────┴─────────────┴───────────────┴─────────────────┴───────────────────────┘
Transformer optimize level: 0
4 File(s) Loaded.
[17:00:35] AX LSTM Operation Format Pass Running ...      Finished.
[17:00:35] AX Set MixPrecision Pass Running ...           Finished.
[17:00:35] AX Refine Operation Config Pass Running ...    Finished.
[17:00:35] AX Reset Mul Config Pass Running ...           Finished.
[17:00:35] AX Tanh Operation Format Pass Running ...      Finished.
[17:00:35] AX Confused Op Refine Pass Running ...         Finished.
[17:00:35] AX Quantization Fusion Pass Running ...        Finished.
[17:00:35] AX Quantization Simplify Pass Running ...      Finished.
[17:00:35] AX Parameter Quantization Pass Running ...     Finished.
Calibration Progress(Phase 1): 100%|██████████████████████████████████████████████████████████████████████| 4/4 [00:02<00:00,  1.54it/s]
Finished.
[17:00:38] AX Passive Parameter Quantization Running ...  Finished.
[17:00:38] AX Parameter Baking Pass Running ...           Finished.
[17:00:38] AX Refine Int Parameter Pass Running ...       Finished.
[17:00:39] AX Refine Weight Parameter Pass Running ...    Finished.
--------- Network Snapshot ---------
Num of Op:                    [166]
Num of Quantized Op:          [166]
Num of Variable:              [320]
Num of Quantized Var:         [320]
------- Quantization Snapshot ------
Num of Quant Config:          [521]
BAKED:                        [64]
OVERLAPPED:                   [230]
ACTIVATED:                    [147]
SOI:                          [17]
PASSIVE_BAKED:                [63]
Network Quantization Finished.
quant.axmodel export success: output/quant/quant_axmodel.onnx
===>export per layer debug_data(float data) to folder: output/quant/debug/float
Writing npy... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
===>export input/output data to folder: output/quant/debug/test_data_set_0
Building native ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00
2023-11-24 17:00:43.829 | WARNING  | yamain.command.load_model:pre_process:454 - preprocess tensor [images]
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - tensor: images, (1, 640, 640, 3), U8
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - op: op:pre_dequant_1, AxDequantizeLinear, {'const_inputs': {'x_zeropoint': array(0, dtype=int32), 'x_scale': array(1., dtype=float32)}, 'output_dtype': <class 'numpy.float32'>, 'quant_method': 0}
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - tensor: tensor:pre_norm_1, (1, 640, 640, 3), FP32
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - op: op:pre_norm_1, AxNormalize, {'dim': 3, 'mean': [0.0, 0.0, 0.0], 'std': [255.0, 255.0, 255.0]}
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - tensor: tensor:pre_transpose_1, (1, 640, 640, 3), FP32
2023-11-24 17:00:43.830 | INFO     | yamain.command.load_model:pre_process:456 - op: op:pre_transpose_1, AxTranspose, {'perm': [0, 3, 1, 2]}
2023-11-24 17:00:43.831 | WARNING  | yamain.command.load_model:post_process:475 - postprocess tensor [400]
2023-11-24 17:00:43.831 | INFO     | yamain.command.load_model:handle_postprocess:473 - op: op:post_transpose_1, AxTranspose
2023-11-24 17:00:43.831 | WARNING  | yamain.command.load_model:post_process:475 - postprocess tensor [433]
2023-11-24 17:00:43.831 | INFO     | yamain.command.load_model:handle_postprocess:473 - op: op:post_transpose_2, AxTranspose
tiling op...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 303/303 0:00:00
new_ddr_tensor = []
<frozen backend.ax650npu.oprimpl.normalize>:186: RuntimeWarning: divide by zero encountered in divide
<frozen backend.ax650npu.oprimpl.normalize>:187: RuntimeWarning: invalid value encountered in divide
build op...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1254/1254 0:00:06
add ddr swap...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2269/2269 0:00:00
calc input dependencies...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2659/2659 0:00:00
calc output dependencies...   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2659/2659 0:00:00
assign eu heuristic   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2659/2659 0:00:00
assign eu onepass   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2659/2659 0:00:00
assign eu greedy   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2659/2659 0:00:00
2023-11-24 17:00:52.838 | INFO     | yasched.test_onepass:results2model:2004 - max_cycle = 8,507,216
2023-11-24 17:00:53.860 | INFO     | yamain.command.build:compile_npu_subgraph:1076 - QuantAxModel macs: 14,226,048,000
2023-11-24 17:00:53.862 | INFO     | yamain.command.build:compile_npu_subgraph:1084 - use random data as gt input: images, uint8, (1, 640, 640, 3)
2023-11-24 17:00:58.726 | INFO     | yamain.command.build:compile_ptq_model:1003 - fuse 1 subgraph(s)
root@1657ec5355e2:/data# ls
config  dataset  model  output  pulsar2-run-helper
root@1657ec5355e2:/data# cp -r output /mnt/

axmodel转化成功后可以在后缀加上.onnx，如下：

部署到开发板

开发板镜像为1.27版本，采用本地编译

下载源码：

git clone https://github.com/AXERA-TECH/ax-samples.git

修改ax_yolov8s_steps.cc文件中：

修改classname标签和类别数量
const char* CLASS_NAMES[] = {
"0", "1", "2", "3", "4", "5", "6", "7", "8", "9"};

int NUM_CLASS = 10;

cd ax-samples
mkdir build && cd build
cmake -DBSP_MSP_DIR=/soc/ -DAXERA_TARGET_CHIP=ax650 ..
make -j6
make install

编译完成后，生成的可执行示例存放在 ax-samples/build/install/ax650/ 路径下：

ax-samples/build$ tree install
install
└── ax650
    ├── ax_classification
    ├── ax_detr
    ├── ax_dinov2
    ├── ax_glpdepth
    ├── ax_hrnet
    ├── ax_imgproc
    ├── ax_pfld
    ├── ax_pp_humanseg
    ├── ax_pp_liteseg_stdc2_cityscapes
    ├── ax_pp_ocr_rec
    ├── ax_pp_person_attribute
    ├── ax_pp_vehicle_attribute
    ├── ax_ppyoloe
    ├── ax_ppyoloe_obj365
    ├── ax_realesrgan
    ├── ax_rtmdet
    ├── ax_scrfd
    ├── ax_segformer
    ├── ax_simcc_pose
    ├── ax_yolo_nas
    ├── ax_yolov5_face
    ├── ax_yolov5s
    ├── ax_yolov5s_seg
    ├── ax_yolov6
    ├── ax_yolov7
    ├── ax_yolov7_tiny_face
    ├── ax_yolov8
    ├── ax_yolov8_pose
    └── ax_yolox

将axmodel模型放在可执行文件下和测试图片：

root@maixbox:/home/ax-samples/build/install/ax650# ./ax_yolov8 -m yolov8snumber.axmodel -i 1.jpg
--------------------------------------
model file : yolov8snumber.axmodel
image file : 1.jpg
img_h, img_w : 640 640
--------------------------------------
WARN,Func(__is_valid_file),NOT find file = '/etc/ax_syslog.conf'
ERROR,Func(__syslog_parma_cfg_get), NOT find = '/etc/ax_syslog.conf'
Engine creating handle is done.
Engine creating context is done.
Engine get io info is done.
Engine alloc io is done.
Engine push input is done.
--------------------------------------
post process cost time:0.49 ms
--------------------------------------
Repeat 1 times, avg time 10.92 ms, max_time 10.92 ms, min_time 10.92 ms
--------------------------------------
detection num: 4
 2:  94%, [ 275,   38,  362,  168], 2
 3:  94%, [  58,   47,  145,  175], 3
 1:  92%, [  75,  250,  140,  378], 1
 1:  90%, [ 288,  249,  336,  378], 1
--------------------------------------

更多回帖

jf_49040007

【爱芯派 Pro 开发板试用体验】爱芯元智AX650N部署yolov8s 自定义模型

爱芯元智AX650N部署yolov8s 自定义模型

训练自己的YOLOv8s模型

准备自定义数据集

YOLOv8训练环境搭建

训练自己的YOLOv8s模型

模型部署和实机测试

前期准备

axmodel模型获取

部署到开发板

相关帖子

【爱芯派 Pro 开发板试用体验】爱芯元智AX650N部署yolov5s 自定义模型

【爱芯派 Pro 开发板试用体验】部署爱芯派官方YOLOV5模型

【爱芯派 Pro 开发板试用体验】ax650使用ax-pipeline进行推理

【爱芯派 Pro 开发板试用体验】使用yolov5s模型（官方）

【爱芯派 Pro 开发板试用体验】yolov8模型转换

【爱芯派 Pro 开发板试用体验】AX650 移植大疆PSDK

【爱芯派 Pro 开发板试用体验】利用爱芯派 Pro部署USB摄像头

【爱芯派 Pro 开发板试用体验】在爱芯派 Pro上部署坐姿检测

【爱芯派 Pro 开发板试用体验】人体姿态估计模型部署后期尝试

【爱芯派 Pro 开发板试用体验】模型部署（以mobilenetV2为例）

20万+工程师都在用，免费PCB检查工具

jf_49040007

【爱芯派 Pro 开发板试用体验】爱芯元智AX650N部署yolov8s 自定义模型

爱芯元智AX650N部署yolov8s 自定义模型

训练自己的YOLOv8s模型

准备自定义数据集

YOLOv8训练环境搭建

训练自己的YOLOv8s模型

模型部署和实机测试

前期准备

axmodel模型获取

部署到开发板

相关帖子

【爱芯派 Pro 开发板试用体验】爱芯元智AX650N部署yolov5s 自定义模型

【爱芯派 Pro 开发板试用体验】部署爱芯派官方YOLOV5模型

【爱芯派 Pro 开发板试用体验】ax650使用ax-pipeline进行推理

【爱芯派 Pro 开发板试用体验】使用yolov5s模型（官方）

【爱芯派 Pro 开发板试用体验】yolov8模型转换

【爱芯派 Pro 开发板试用体验】AX650 移植 大疆PSDK

【爱芯派 Pro 开发板试用体验】利用爱芯派 Pro部署USB摄像头

【爱芯派 Pro 开发板试用体验】在爱芯派 Pro上部署坐姿检测

【爱芯派 Pro 开发板试用体验】人体姿态估计模型部署后期尝试

【爱芯派 Pro 开发板试用体验】模型部署（以mobilenetV2为例）

20万+工程师都在用，免费PCB检查工具

【爱芯派 Pro 开发板试用体验】AX650 移植大疆PSDK