Resolution | Type | NPU | Processes (threads) | Batch | AI-CPU usage (%) | ctrl-CPU usage (%) | NPU usage (%) | NPU memory (MB) | Bandwidth (%) | Time (ms) | Images processed per day | CPU usage (%) | Memory usage (MB) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 MP | JPEG | 1 | 1(8) | 16 | 30-45 | 38-50 | 50 | 68 | 40-60 | 50.61 | 30 | 26000 | |
JPEG | 1 | 1(7) | 16 | 59 | 45.17 | 22000 | |||||||
JPEG | 1 | 1(6) | 16 | 25-35 | 30-90 | 50 | 56 | 25-60 | 39.86 | ||||
JPEG | 1 | 1(5) | 16 | 32-36 | 25-35 | 50 | 56 | 30-50 | 29.56 | ||||
JPEG | 1 | 1(3) | 16 | 15-20 | 12-15 | 36 | 53 | 20-40 | 18.38 | ||||
H264 | 1(1) | 41.75 | without encoding 40.93 / after removing AiToMat 37.50 | ||||||||||
H264 | 1 | 1(3) | 16 | 15-40 | 3-40 | 20-36 | 56 | 5-54 | 47.47 | 150-250 | |||
H264 | 1 | 1(6) | 16 | 73.64 | |||||||||
H264 | 1 | 1(8) | 16 | 15-30 | 89-95 | 48-50 | 76 | 50-60 | 147.22 | without encoding 135.21 / after removing AiToMat 139.59 | 170-250 | 2500 |
Original preprocessing
DVPP preprocessing
Resolution | Type | NPU | Processes (threads) | Batch | ARM usage (%) | NPU usage (%) | NPU memory (MB) | Time (ms) | Images processed per day | CPU usage (%) | Memory usage (MB) |
---|---|---|---|---|---|---|---|---|---|---|---|
2 MP | H264 | 1 | 1(6) | 4 | 46.9279 | 11046733 | |||||
1(5) | 8 | 35.53565 | 12156806 | ||||||||
1(4) | 16 | 22.9951 | 15029288 | ||||||||
JPEG | 1(8) | 16 | 125 | 5529600 | |||||||
H264 | 1(8) | 16 | 175 | 3949714 | |||||||
20191226 | |||||||||||
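2 MP | JPEG | 4 | 1(9) | checkpoint images with few targets run 50% faster than video with many targets; 1.93 targets | 8.342 ms; allocating the host buffer with DMalloc appears to give a speedup | single core 9.162 | 37721972 | ||||
5 MP | 1(7) | 6.44 targets | 17.235 | 20051842 | |||||||
7 MP | 1(5) | 7.43 targets | 20.672 | 16718322 | |||||||
2 MP | H264 | 4 | 1(9) | including result return, encoding, tracking, etc. | 22.086 | 15648197 | |||||
1(9) | without result return, encoding, and tracking | 20% faster than the row above | 17.918 | 19287894 | |||||||
1(9) | few targets; 35% faster than video with many targets | 14.25 | 24251220 |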
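
The "images per day" figures in the 2 MP rows of the table above (46.9279 ms with 6 threads, 125 ms with 8 threads, and so on) appear to follow from the per-image time and the thread count: threads × 86,400,000 ms / time. The sketch below reproduces that arithmetic. The function name and the `workers` parameter are illustrative, and the formula is inferred from the numbers rather than stated in the notes. Results agree with the table only to within rounding of the displayed times.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms in one day


def images_per_day(time_ms: float, workers: int = 1) -> int:
    """Projected daily throughput, assuming `workers` run in parallel
    and each one needs `time_ms` milliseconds per image."""
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return round(workers * MS_PER_DAY / time_ms)


# JPEG row: 125 ms per image, 8 threads
print(images_per_day(125, workers=8))  # 5529600
```

The H264 row at 175 ms with 8 threads works out the same way, to roughly 3.95 million images per day.
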
Resolution | Type | NPU | Processes (threads) | Batch | ARM usage (%) | NPU usage (%) | NPU memory (MB) | Algorithm time (ms) | Decode time (ms) | Images processed per day | CPU usage (%) | Memory usage (MB) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2 MP | 1-3 vehicles | Ascend 310 | 1 (host: 18, device: 46) | 4 | 22 | 75 (±20) | 1656 | 20.388*2 | 3.649 | 4237787/2 | 250 | 1585 |
2 | 22*2 | 80 | 1761*2 | 23.026*2 | 4.384 | 7504560/2 | 225*2 | 1602+1341 | ||||
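crashed after just over 1000 images | 3 | 22*3 | 115 | 1545*3 | 29.831*2 | 6.336 | 8688947/2 | 220*3 | 1484*3 | |||
The statistics above are wrong: all times must be multiplied by 2 | ||||||||||||
20190816 | sendData in main changed to loop once per 200 images | 1 | 4 | 35-45 | 130 | 1000 (for reference; memory keeps rising) | 23.736 | 3.649 | 3640040 | 60 | 517 | |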
2 | 4 | 33*2 | 1000*2 | 35.256 | 5.512 | 4901293 | 522 | |||||
3 | 4 | 28*3 | 1000 | 52.755 | 6.411 | 4913278 | ||||||
20190817 | send 4 images, return 4 results | 8 | (5-10)*8 | 40-120 | 246*8 | 135.867 | 3.6 | 5087328 | (5-10)*8 | 513*8 | ||
20190820 | do not return decodeBuf | 8 | 103.608 | 6671299 | ||||||||
20190823 | 8k memory pool | 5 | 4 | (5-10)*8 | 639 | 64.446 | 6703286 | |||||
1 | 4 | 27.972 | 3088803 | |||||||||
8 | 4 | 246 | 93.261 | 7411458 | ||||||||
1 | 8 | 26.217 | 3295571 | |||||||||
4k memory pool, memory management | 7 | 8 | 433 | 78.433 | 7711040 | 531 | ||||||
fp16 | 5 | 16 | 44.115 | 9792657 | 519 | |||||||
int8 | 6 | 16 | 574 | 43.941 | 11797571 | 519 | ||||||
int8 | 6 | 16 | 559.8 | 42.667 | 12149816 | 527 | ||||||
5 MP | ||||||||||||
7 MP | 5 vehicles | int8 | 6 | 16 | 569 | 47.719 | 10863617 | 541 |
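
One way to read the 20190816 rows of the table above is as a scaling test: 3,640,040 images per day with 1 process, 4,901,293 with 2, and 4,913,278 with 3. The sketch below computes the parallel efficiency of each configuration relative to the single-process run. It assumes the leading 1, 2, 3 in those rows is the process count; the function name is illustrative and not part of the original notes.

```python
from typing import Dict


def scaling_efficiency(throughput: Dict[int, float]) -> Dict[int, float]:
    """Map process count -> measured throughput divided by ideal linear
    scaling (process count times the single-process throughput)."""
    base = throughput[1]
    return {n: t / (n * base) for n, t in sorted(throughput.items())}


# Images per day from the 20190816 rows (process count -> throughput)
measured = {1: 3640040, 2: 4901293, 3: 4913278}
for procs, eff in scaling_efficiency(measured).items():
    print(f"{procs} process(es): {eff:.1%} of linear scaling")
```

On these figures the second process adds about a third more throughput and the third adds almost nothing, which points to a shared bottleneck rather than per-process compute.
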
Module (single process) | Batch=4 time | Batch=8 time | Batch=16 time |
---|---|---|---|
DvppJpegDecode | 2.878*4 | 2.873*8 | 2.859 |
ObjectDetectStage1_v3_Input | 0.028 | 0.036 | 0.036 |
ObjectDetectStage1_v3_PreProcess | 6.018 | 11.945 | 11.985 |
ObjectDetectStage1_v3_Predict | 24.847 | 48.648 | 97.002 |
ObjectDetectStage1_v3_GetLayer | 0.743 | 0.77 | 0.771 |
ObjectDetectStage1_v3_PostProcess | 0.664*4 | 0.661*8 | 0.661 |
ObjectDetectStage1_v3_Output | 0.011*4 | 0.011*8 | 0.01 |
VehicleDetectStage2_Input | 0.028 | 0.034 | 0.034 |
VehicleDetectStage2_Index | 0.005 | 0.007 | 0.009 |
VehicleDetectStage2_PreProcess | 0.842 | 1.563 | 2.643 |
VehicleDetectStage2_Predict | 4.516 | 7.334 | 13.754 |
VehicleDetectStage2_GetLayer | 0.206 | 0.201 | 0.188 |
VehicleDetectStage2_PostProcess | 0.172*4 | 0.163*8 | 0.159 |
VehicleDetectStage2_Output | 0.003*4 | 0.003*8 | 0.021 |
VehicleDetectStage2_Segment_Index | 0.005 | 0.009 | 0.01 |
VehicleDetectStage2_Segment_PreProcess | 0.354 | 0.578 | 0.544 |
VehicleDetectStage2_Segment_Predict | 0.784 | 0.947 | 1.02 |
VehicleDetectStage2_Segment_Output | 0.002 | 0.002 | 0.002 |
Vehicle_Input | 0.023 | 0.027 | 0.027 |
Vehicle_Index | 0.002 | 0.005 | 0.006 |
Vehicle_PreProcess | 0.759 | 1.526 | 2.741 |
Vehicle_Predict | 1.555 | 2.247 | 3.327 |
Vehicle_Classification | 0.011*4 | 0.02*8 | 0.021 |
Vehicle_ExtractFeature | 0.008*4 | 0.016*8 | 0.018 |
Vehicle_FeatureCode | 0.487 | 0.996 | 1.084 |
VehicleDriver_Input | 0.025 | 0.031 | 0.028 |
VehicleDriver_Index | 0.006 | 0.01 | 0.012 |
VehicleDriver_PreProcess | 0.289 | 0.354 | 0.323 |
VehicleDriver_Predict | 4.38 | 6.736 | 11.514 |
VehicleDriver_ArgMax | 0.001*4 | 0.001*8 | 0.001 |
VehicleDriver_Output | 0.001*4 | 0.001*8 | 0.001 |
VehiclePlate_Input | 0.024 | 0.03 | 0.027 |
VehiclePlateAlign_Index | 0.063 | 0.066 | 0.07 |
VehiclePlateAlign_PreProcess | 0.644 | 0.646 | 0.934 |
VehiclePlateAlign_Predict | 1.614 | 2.774 | 6.691 |
VehiclePlateAlign_Output | 0.001*4 | 0.001*8 | 0.001 |
VehiclePlate_ProcessPlate | 9.818 | 16.827 | 17.364 |
VehiclePlate_Index | 5.442 | 10.142 | 10.578 |
VehiclePlate_CropToMat | 0.186 | 0.152 | 0.154 |
VehiclePlate_ImageRotate | 1.21 | 1.178 | 1.251 |
VehiclePlate_PreProcess | 0.05 | 0.098 | 0.163 |
VehiclePlate_Predict | 1.461 | 2.154 | 3.594 |
VehiclePlate_Output | 0.094 | 0.172 | 0.279 |
VehiclePlate_PostProcess | 0.054 | 0.098 | 0.1 |
VehicleSpecial_Input | 0.019 | 0.019 | 0.018 |
VehicleSpecial_Index | 0.018 | 0.026 | 0.027 |
VehicleSpecial_PreProcess | 1.127 | 2.012 | 2.71 |
VehicleSpecial_Predict | 2.053 | 3.078 | 5.366 |
VehicleSpecial_GetLayer | 0.022 | 0.02 | 0.019 |
VehicleSpecial_PostProcess | 0.039*4 | 0.035*8 | 0.035 |
VehicleSpecial_Output | 0.001*4 | 0.001*8 | 0.001 |
Other (framework overhead?) | 111.888-76.774=35.114 | ||
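
The last row of the table above appears to subtract the summed per-module times (76.774) from a measured end-to-end time (111.888) and to attribute the remainder to the framework. The sketch below shows that residual calculation. It assumes a cell written as `2.878*4` means a value multiplied by a count; the helper names are illustrative, and the notes do not say which batch column the two totals belong to.

```python
from typing import Iterable


def cell_ms(cell: str) -> float:
    """Evaluate a timing cell such as '0.743' or '2.878*4'."""
    value, _, count = cell.partition("*")
    return float(value) * (float(count) if count else 1.0)


def unaccounted_ms(total_ms: float, stage_cells: Iterable[str]) -> float:
    """End-to-end time minus the sum of the listed stage times."""
    return total_ms - sum(cell_ms(c) for c in stage_cells)


# The two totals quoted in the last table row
print(f"{unaccounted_ms(111.888, ['76.774']):.3f}")  # 35.114

# A multiplied cell from the DvppJpegDecode row
print(f"{cell_ms('2.878*4'):.3f}")  # 11.512
```
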
Network | Time (batch=1) (3.4GB) | Time (Batch=4) |
---|---|---|
ObjectDetectStageCentG320x384 | 18.99000 | |
ObjectDetectStageV160x128 | 4.001000 | |
VehiclePlateSegmentCH | 0.879000 | |
VehicleDriverGeneral | 3.446000 | |
VehiclePlateNo | 2.013000 | |
VehiclePlateAlignCH | 1.655000 | |
VehiclePlateNameCH | 1.576000 | |
VehiclePlateExceptionPlate | 1.012000 | |
VehiclePlateExceptionHead | 0.895000 | |
VehicleLabel | 2.778000 | |
VehicleColor | 1.679000 | |
VehicleType | 1.083000 | |
ObjectDetectStageT160x96 | 1.498000 | |
Overall Graph run time | 60ms |