橦言无忌

A woman programmer who has no wish to change the world

Hi3559A development notes

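Preface

These notes cover vision model development for hisi (HiSilicon) hardware. The task is object detection with three classes. The post walks through the basic project workflow: model training, model conversion, and an analysis of the call flow in the yolov3 sample source code that hisi provides officially.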

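1. Model training

1.1 Runtime environment

Everything was done inside the base caffe docker image, which makes it easier to convert models from other vision frameworks to caffe later on. The training framework itself is darknet.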

docker pull bvlc/caffe

1.2 Training with darknet

cd darknet
  • Build darknet
    See the darknet environment setup reference.
  • Use the k-means algorithm to generate anchors that match the dataset and the input size, then update the cfg with them
    ./darknet detector calc_anchors data/visdrone.data -num_of_clusters 6 -width 416 -height 416
  • Train yolo-fastest
    ./darknet detector train ./data/visdrone.data ./cfg/yolo-fastest.cfg yolo-fastest.conv.109
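To make the anchor step less of a black box, here is a minimal sketch of k-means clustering over box sizes. It is not darknet's calc_anchors implementation; it assumes the common 1 - IoU distance between (width, height) pairs, and the box list is made-up illustration data rather than real VisDrone statistics.

```python
import random

def iou_wh(box, centroid):
    """IoU of two boxes described only by (width, height), both placed at the origin."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs using distance 1 - IoU; returns k anchors sorted by area."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if not members:  # keep the old centroid if a cluster ends up empty
                new_centroids.append(centroids[i])
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_centroids.append((w, h))
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])

# Hypothetical box sizes in pixels at a 416x416 network input.
boxes = [(10, 14), (12, 16), (11, 15), (40, 60), (42, 58), (38, 62),
         (100, 120), (110, 115), (105, 125)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Like any k-means, the result depends on the initial centroids, so in practice the clustering is usually run a few times and the anchor set with the best average IoU is kept.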

1.3 Other training commands

./darknet detector train ./data/visdrone.data ./cfg/yolo-fastest-xl.cfg yolo-fastest-xl.conv.109
./darknet detector train ./data/visdrone.data ./cfg/yolov4-visdrone.cfg yolov4.conv.137
./darknet detector train ./data/visdrone.data ./cfg/MobileNetV2-yolov3-lite.cfg MobileNetV2--Lite.conv.57

2. Model conversion

2.1 Converting to a caffe model

yolov4 to caffe (link)
yolo-fastest to caffe (link)
Converting a self-trained yolov4 model from darknet to caffe

2.2 Simulation model conversion

  • Simulation model conversion parameters

    [prototxt_file] ./mark_prototxt/yolov4-visdrone-2class_mark_nnie_20210219113115.prototxt
    [caffemodel_file] D:\BaiduNetdiskDownload\yolov4-visdrone-2class-shortcut16\yolov4-visdrone-2class.caffemodel
    [batch_num] 1
    [net_type] 0
    [sparse_rate] 0
    [compile_mode] 0
    [is_simulation] 1
    [log_level] 2
    [instruction_name] ./../data/detection/yolov4/inst/yolov4_func
    [RGB_order] BGR
    [data_scale] 0.0039062
    [internal_stride] 16
    [image_list] ./../data/detection/yolov4/image_ref_list.txt
    [image_type] 1
    [mean_file] null
    [norm_type] 3
  • Simulation model conversion output

    Start [RuyiStudio Wk NNIE Mapper] [D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_func.cfg] sample_simulator (2021-02-19 11:32:18)
    Mapper Version 1.1.3.0_B010 (NNIE_1.1) 1905091707159355

    begin net parsing....

    end net parsing

    begin prev optimizing....

    end prev optimizing....

    begin net quantalizing(GPU)....

    end quantalizing

    begin POST optimizing....

    end POST optimizing

    begin NNIE[0] mem allocation....

    .end NNIE[0] memory allocating

    begin NNIE[0] instruction generating....

    ..............end NNIE[0] instruction generating

    begin lbs binary code generating....

    end lbs binary code generating


    ===============D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_func.cfg Successfully!===============

    End [RuyiStudio Wk NNIE Mapper] [D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_func.cfg] sample_simulator (2021-02-19 11:34:58)

2.3 Board model conversion

  • Board model conversion parameters

    [prototxt_file] ./mark_prototxt/yolov4-visdrone-2class_mark_nnie_20210219114447.prototxt
    [caffemodel_file] D:\BaiduNetdiskDownload\yolov4-visdrone-2class-shortcut16\yolov4-visdrone-2class.caffemodel
    [batch_num] 1
    [net_type] 0
    [sparse_rate] 0
    [compile_mode] 0
    [is_simulation] 0
    [log_level] 2
    [instruction_name] ./../data/detection/yolov4/inst/yolov4__inst
    [RGB_order] BGR
    [data_scale] 0.0039062
    [internal_stride] 16
    [image_list] ./../data/detection/yolov4/image_ref_list.txt
    [image_type] 1
    [mean_file] null
    [norm_type] 3
  • Board model conversion output

    Start [RuyiStudio Wk NNIE Mapper] [D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_inst.cfg] sample_simulator (2021-02-19 11:45:53)
    Mapper Version 1.1.3.0_B010 (NNIE_1.1) 1905091707159355

    begin net parsing....

    end net parsing

    begin prev optimizing....

    end prev optimizing....

    begin net quantalizing(GPU)....

    end quantalizing

    begin optimizing....

    end optimizing

    begin NNIE[0] mem allocation....

    end NNIE[0] memory allocating

    begin NNIE[0] instruction generating....

    ...............end NNIE[0] instruction generating

    begin parameter compressing....

    end parameter compressing

    begin compress index generating....

    end compress index generating

    begin binary code generating....

    .end binary code generating


    ===============D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_inst.cfg Successfully!===============

    End [RuyiStudio Wk NNIE Mapper] [D:\hi3559a\SVP_PC\HiSVP_PC_V1.1.3.0\software\data\detection\yolov4\yolov4_inst.cfg] sample_simulator (2021-02-19 11:48:43)
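The simulation and board parameter listings are nearly identical, so it can help to diff them mechanically. The sketch below parses the `[key] value` lines into dictionaries and prints the keys whose values differ. The helper name parse_mapper_cfg is my own, and only a few keys from the listings are reproduced here to keep it short.

```python
def parse_mapper_cfg(text):
    """Parse '[key] value' lines into a dict; any other line is ignored."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") and "]" in line:
            key, _, value = line[1:].partition("]")
            params[key.strip()] = value.strip()
    return params

# Excerpts copied from the two parameter listings.
SIM_CFG = """
[is_simulation] 1
[instruction_name] ./../data/detection/yolov4/inst/yolov4_func
[data_scale] 0.0039062
[norm_type] 3
"""
BOARD_CFG = """
[is_simulation] 0
[instruction_name] ./../data/detection/yolov4/inst/yolov4__inst
[data_scale] 0.0039062
[norm_type] 3
"""

sim = parse_mapper_cfg(SIM_CFG)
board = parse_mapper_cfg(BOARD_CFG)

# Keys whose values differ between the simulation and board configurations.
diff = {k: (sim[k], board.get(k)) for k in sim if sim[k] != board.get(k)}
print(diff)

# data_scale is 1/256 truncated to seven decimal places.
print(abs(float(sim["data_scale"]) - 1 / 256) < 1e-6)
```

In the full listings the only other difference is the timestamp in the prototxt file name. The data_scale value of 0.0039062 is approximately 1/256, which scales 8-bit pixel values to roughly the range [0, 1).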

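3. hisi source code call analysis (yolov3 as the example)

The entry-point source file is sample_nnie_main.c.

Selecting Yolov3 enters SAMPLE_SVP_NNIE_Yolov3 in sample_nnie.c, where the model file and the image file to load are configured: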

HI_CHAR *pcSrcFile = "./data/nnie_image/rgb_planar/dog_bike_car_416x416.bgr";
HI_CHAR *pcModelName = "./data/nnie_model/detection/inst_yolov3_cycle.wk";
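The sample image path lives under an rgb_planar directory and ends in .bgr. Assuming that means raw 8-bit planar BGR (all B values, then all G, then all R, with no header), a 416x416 input would be 416 * 416 * 3 = 519168 bytes. A rough sketch of turning interleaved BGR bytes (for example, the pixel order OpenCV uses in memory) into that layout, with a function name of my own choosing:

```python
def interleaved_to_planar(pixels, width, height):
    """Convert interleaved 8-bit BGRBGR... bytes into planar BBB...GGG...RRR... bytes."""
    if len(pixels) != width * height * 3:
        raise ValueError("expected width * height * 3 bytes")
    # Every third byte starting at offset 0, 1, 2 gives the B, G, R plane respectively.
    return pixels[0::3] + pixels[1::3] + pixels[2::3]

# A tiny 2x1 image: pixel 0 is (B=1, G=2, R=3), pixel 1 is (B=4, G=5, R=6).
packed = bytes([1, 2, 3, 4, 5, 6])
planar = interleaved_to_planar(packed, width=2, height=1)
print(list(planar))  # [1, 4, 2, 5, 3, 6]
```

Checking the file size against width * height * 3 is a quick sanity test before feeding a custom image to the sample.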

SAMPLE_SVP_NNIE_Yolov3 then calls the following, in order:

  • SAMPLE_COMM_SVP_CheckSysInit();
  • SAMPLE_COMM_SVP_NNIE_LoadModel();
  • SAMPLE_SVP_NNIE_Yolov3_ParamInit();
  • SAMPLE_SVP_NNIE_FillSrcData();
  • SAMPLE_SVP_NNIE_Forward();
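  • SAMPLE_SVP_NNIE_Yolov3_GetResult();
  • SAMPLE_SVP_NNIE_Detection_PrintResult()
    In order, these perform system initialization, NNIE model loading, parameter initialization, input data reading, forward inference, result retrieval, and result printing.
    That essentially completes the function call flow of the yolov3 sample.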

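4. Notes on yolov4

The original yolov4 model uses the mish activation function, which is not supported when converting to a simulation or board model. There are two ways to work around this: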

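  • Replace mish with leakyReLU (the approach used for the model delivered this time)
    Compared with mish, leakyReLU gives slightly lower accuracy but is more efficient, which makes it a better fit for running yolov4 on embedded hardware;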

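  • Rebuild mish from its mathematical formula
    caffe predefines 6 commonly used activation functions: ReLU, Sigmoid, TanH, AbsVal, Power, BNLL
    mish expression: Mish(x) = x * tanh(ln(1 + e^x))
    BNLL expression: f(x) = log(1 + exp(x))
    Tanh expression: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
    caffe's Eltwise layer has a PROD operation that computes an element-wise product
    Since BNLL(x) = ln(1 + e^x), mish can be expressed as BNLL, then TanH, then an Eltwise PROD of that result with the original input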
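As a quick numerical check of the second approach, the sketch below composes mish from the three pieces (BNLL, then TanH, then an element-wise product with the input) and compares it with the direct formula. It only checks the math in plain Python; whether the NNIE mapper accepts each of these layers is not tested here. The 0.1 slope for leakyReLU is the value darknet uses for its leaky activation, included only for a side-by-side look.

```python
import math

def bnll(x):
    """BNLL: log(1 + exp(x)), written in a numerically stable form."""
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))

def mish_reference(x):
    """Direct formula: x * tanh(ln(1 + e^x))."""
    return x * math.tanh(math.log(1.0 + math.exp(x)))

def mish_composed(x):
    softplus = bnll(x)          # BNLL layer
    gate = math.tanh(softplus)  # TanH layer
    return x * gate             # Eltwise layer with operation PROD

def leaky_relu(x, slope=0.1):
    return x if x > 0 else slope * x

for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(x, mish_composed(x), leaky_relu(x))
```

The composed version needs three layers per activation instead of one, which is part of why the leakyReLU swap is the cheaper option on an embedded target.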
