Articles published by ptz

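1. Basic control commands

Scrolling and cursor movement:

c-v, m-v, c-l: scroll forward one screen, scroll back one screen, recenter the current line (center, top or bottom)
c-p, c-n: previous line, next line
c-b, c-f: back one character, forward one character
m-b, m-f: back one word, forward one word
c-a, c-e: beginning of line, end of line
m-a, m-e: beginning of sentence, end of sentence
m-<, m->: beginning of buffer, end of buffer

c-u <number> <command> repeats the command that many times. If the repeated command is c-n, the cursor simply moves down that many lines; if it is c-v, the view scrolls by that many lines rather than by that many screens.
c-g stops and cancels the current command.
c-x k closes the current buffer.

c-x 1 keeps only one window.

Deletion comes in three pairs:

del, c-d: delete the previous character, delete the next character
m-del, m-d: delete the previous word, delete the next word
c-k, m-k: delete to the end of the line, delete to the end of the sentence

Pasting:

c-y, m-y: paste; after a paste, replace the pasted text with an earlier cut

c-@ sets the mark (starts a selection), c-w cuts, m-w copies, c-/ undoes.

c-x c-f: open a file
c-x c-s: save the current file
c-x s: save all modified files
c-x c-c: quit emacs
c-x c-b: list all current buffers
c-x b: type a buffer name and switch to that buffer
c-z: suspend emacs in a terminal so other commands can be run; %emacs resumes it
c-x u: undo
c-x, m-x: prefix for single-key commands, prefix for named commands

m-x runs extended commands. While typing the name, pressing space after a few letters completes the name up to the next '-' character, and pressing tab completes as much as possible. For example, for replace-string, type repl, then space, then s, then tab.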


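1. I need to swap the following keys on Ubuntu:

Physical key    Target function
caps            lctrl
lctrl           return
return          rctrl
rctrl           caps

Edit the evdev keycodes file and set the four entries to the values shown:

sudo vim /usr/share/X11/xkb/keycodes/evdev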

<RTRN> = 37;
<CAPS> = 105;
<LCTL> = 66;
<RCTL> = 36;

Reference: https://blog.csdn.net/Elliott_Yoho/article/details/78650838

2. I need the same key swap on Win10. This requires editing the registry: under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout, add a binary value named Scancode Map with the following data:

00,00,00,00,00,00,00,00,05,00,00,00,1d,00,3a,00,1c,00,1d,00,
1d,e0,1c,00,3a,00,1d,e0,00,00,00,00
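
To check that the bytes above encode the intended swap, they can be decoded. The following is a minimal sketch, assuming the usual Scancode Map layout (two zero DWORDs, an entry count that includes the terminator, then one DWORD per mapping whose low word is the scan code to produce and whose high word is the physical key). The helper name decode_scancode_map and the KEY_NAMES table are made up for illustration.

```python
import struct

# Value copied from the registry data above.
SCANCODE_MAP_HEX = (
    "00,00,00,00,00,00,00,00,05,00,00,00,1d,00,3a,00,1c,00,1d,00,"
    "1d,e0,1c,00,3a,00,1d,e0,00,00,00,00"
)

# Scan codes that appear in this map; the names are for display only.
KEY_NAMES = {0x003A: "caps", 0x001D: "lctrl", 0x001C: "return", 0xE01D: "rctrl"}


def decode_scancode_map(hex_text):
    """Return (physical_key, produced_key) pairs from a Scancode Map value."""
    data = bytes(int(b, 16) for b in hex_text.split(","))
    _version, _flags, count = struct.unpack_from("<III", data, 0)
    pairs = []
    # count includes the terminating all-zero entry, so read count - 1 mappings.
    for i in range(count - 1):
        produced, physical = struct.unpack_from("<HH", data, 12 + 4 * i)
        pairs.append((KEY_NAMES.get(physical, hex(physical)),
                      KEY_NAMES.get(produced, hex(produced))))
    return pairs


for physical, produced in decode_scancode_map(SCANCODE_MAP_HEX):
    print(physical, "->", produced)
```

The decoded pairs should match the swap table used for Ubuntu: caps to lctrl, lctrl to return, return to rctrl, rctrl to caps.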

The exported registry file is:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout]
"Scancode Map"=hex:00,00,00,00,00,00,00,00,05,00,00,00,1d,00,3a,00,1c,00,1d,00,\
  1d,e0,1c,00,3a,00,1d,e0,00,00,00,00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout\DosKeybCodes]
"00000402"="bg"
"00000404"="ch"
"00000405"="cz"
"00000406"="dk"
"00000407"="gr"
"00000408"="gk"
"00000409"="us"
"0000040a"="sp"
"0000040b"="su"
"0000040c"="fr"
"0000040e"="hu"
"0000040f"="is"
"00000410"="it"
"00000411"="jp"
"00000412"="ko"
"00000413"="nl"
"00000414"="no"
"00000415"="pl"
"00000416"="br"
"00000418"="ro"
"00000419"="ru"
"0000041a"="yu"
"0000041b"="sl"
"0000041C"="us"
"0000041d"="sv"
"0000041f"="tr"
"00000422"="us"
"00000423"="us"
"00000424"="yu"
"00000425"="et"
"00000426"="us"
"00000427"="us"
"00000442"="tk"
"00000452"="uk"
"0000046e"="sf"
"00000804"="ch"
"00000807"="sg"
"00000809"="uk"
"0000080a"="la"
"0000080c"="be"
"00000813"="be"
"00000816"="po"
"00000c04"="ch"
"00000c0c"="cf"
"00000c1a"="us"
"00001004"="ch"
"00001009"="us"
"0000100c"="sf"
"00001404"="ch"
"00001809"="us"
"00010402"="us"
"00010405"="cz"
"00010407"="gr"
"00010408"="gk"
"00010409"="dv"
"0001040a"="sp"
"0001040e"="hu"
"00010410"="it"
"00010415"="pl"
"00010418"="ro"
"00010419"="ru"
"0001041b"="sl"
"0001041f"="tr"
"00010426"="us"
"00010c0c"="cf"
"00010c1a"="us"
"00020402"="bg"
"00020408"="gk"
"00020409"="us"
"00020418"="ro"
"00020422"="us"
"00030402"="bg"
"00030409"="usl"
"00040402"="bg"
"00040409"="usr"
"00050408"="gk"

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout\DosKeybIDs]
"00000410"="141"
"0000041f"="179"
"00000442"="440"
"00010408"="220"
"00010410"="142"
"00010415"="214"
"0001041f"="440"
"00020408"="319"

References: https://liang.blog.csdn.net/article/details/84637767?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-1.control&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-1.control https://blog.csdn.net/lhdalhd1996/article/details/90741092

1. Download the gcc 4.9.3 source code.

http://ftp.gnu.org/gnu/gcc/gcc-4.9.3/ or the SJTU mirror in China: https://mirrors.sjtug.sjtu.edu.cn/gnu/gcc/gcc-4.9.3/

2. Unpack the gcc source, read the list of required packages from gcc-4.9.3/contrib/download_prerequisites, and download those packages.

# Necessary to build GCC.
MPFR=mpfr-2.4.2
GMP=gmp-4.3.2
MPC=mpc-0.8.1
  ISL=isl-0.12.2
  CLOOG=cloog-0.18.1

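https://mirrors.sjtug.sjtu.edu.cn/gnu/ has gcc, mpfr and gmp. https://sourceforge.net/projects/d2718c/ has several of the packages, but the download speed is a problem. https://src.fedoraproject.org/lookaside/extras/gcc/isl-0.12.2.tar.bz2 has isl at a reasonable speed.

3. Unpack the archives from step 2 into the gcc source directory, then run ln -sf xxx-xxx xxx to create a symlink for each package. If that is too much trouble and the network is good, simply running download_prerequisites in the previous step also works.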

4. Create a separate build directory next to the source tree:

cd ..
mkdir gcc-4.9.3-build-temp
cd gcc-4.9.3-build-temp

5. Run gcc -v on the existing machine to see the configure options it was built with, adjust them for the new build, and put them in a script to run.

6. configure reports an error:

GNAT is required to build ada. Since ada is not needed, just remove ada from the languages option. References: http://gcc.1065356.n8.nabble.com/GNAT-is-required-to-build-ada-td692409.html https://github.com/spack/spack/issues/15867 https://github.com/owent-utils/bash-shell/issues/2

6. After make -jX, an error is reported:

In file included from ../../gcc-4.9.3/gcc/cp/except.c:1013:
cfns.gperf:101:1: error: ‘const char* libc_name_p(const char*, unsigned int)’ redeclared inline with ‘gnu_inline’ attribute

Switching to version 4.9.4 fixes this. References: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=831142 https://github.com/jens-maus/RaspberryMatic/issues/28

7. Error:

/md-unwind-support.h: In function ‘aarch64_fallback_frame_state’:
error: field ‘uc’ has incomplete type
struct ucontext uc;

This is caused by a glibc update. Edit the header gcc-4.9.4/libgcc/config/XXX/linux-unwind.h and replace struct ucontext uc; with ucontext_t uc;, because that is the type /usr/include/sys/ucontext.h now defines. In other words, the source has to be adjusted to the newer libc. Applying a patch instead of editing by hand also works.
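
The manual edit can also be scripted. This is a minimal sketch, assuming the header text has already been read into a string. The function name patch_ucontext is made up for illustration, and the pattern targets only the exact declaration quoted in the error message.

```python
import re


def patch_ucontext(source_text):
    """Replace the old 'struct ucontext uc;' declaration with 'ucontext_t uc;'."""
    return re.sub(r"\bstruct\s+ucontext\s+uc\s*;", "ucontext_t uc;", source_text)


before = "    siginfo_t info;\n    struct ucontext uc;\n"
print(patch_ucontext(before))
```

Writing the result back to the header is left out on purpose, so the change can be reviewed first.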

References: https://stackoverflow.com/questions/46999900/how-to-compile-gcc-6-4-0-with-gcc-7-2-in-archlinux https://blog.csdn.net/XCCCCZ/article/details/80958414 https://patchwork.ozlabs.org/project/buildroot/patch/20170923212414.16744-10-romain.naour@gmail.com/ http://lists.busybox.net/pipermail/buildroot/2017-September/202526.html https://blog.csdn.net/wang805447391/article/details/83380302 https://unix.stackexchange.com/questions/566650/how-do-i-compile-gcc-5-from-source https://stackoverflow.com/questions/52498431/compile-gcc6-4-0-using-gcc8-1-1 https://gcc.gnu.org/git/?p=gcc.git;a=blobdiff;f=libgcc/config/aarch64/linux-unwind.h;h=d46d5f53be379ec2dbc9a5ba95d51e22c1d52c2f;hp=d5d6980442fd47b1f1e499e99cb25b5fffbdbeb3;hb=883312dc79806f513275b72502231c751c14ff72;hpb=601d22f69093aa98dcf9593bc138da7ba8281e05 https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=883312dc79806f513275b72502231c751c14ff72

8. Error:

/home/openailab/Downloads/test/gcc-4.9.4-build-temp/./gcc/xgcc: /home/openailab/Downloads/test/gcc-4.9.4-build-temp/aarch64-unknown-linux-gnu/libstdc++-v3/src/.libs/libstdc++.so.6: version `CXXABI_1.3.9' not found (required by /home/openailab/Downloads/test/gcc-4.9.4-build-temp/./gcc/xgcc)
make[3]: *** [Makefile:942: libgcc_s.so] Error 1

No fix for this error could be found online, so this attempt is on hold.

References:

https://www.cnblogs.com/succeed/p/6204438.html https://www.cnblogs.com/alianbog/p/12498915.html https://developer.aliyun.com/article/90390 https://blog.csdn.net/llh_1178/article/details/79329250 https://www.theobroma-systems.com/rk3399-q7-user-manual/04-software.html https://www.jianshu.com/p/0caef3ce8e06 https://gcc.gnu.org/install/ https://www.cnblogs.com/uestc-mm/p/7511063.html https://blog.csdn.net/xiexievv/article/details/50620170 https://zhuanlan.zhihu.com/p/107133028 https://www.jianshu.com/p/fc162672fae2 https://blog.csdn.net/u013946356/article/details/83106133

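1. createTrackbar is an OpenCV API that quickly creates a slider control in an image display window. It is used to adjust a threshold by hand and gives very direct visual feedback. It is declared as follows: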

    CV_EXPORTS int createTrackbar(const string& trackbarname, const string& winname,
                                  int* value, int count,
                                  TrackbarCallback onChange = 0,
                                  void* userdata = 0);

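Parameter 1, trackbarname: the name of the trackbar.

Parameter 2, winname: the name of the image window the trackbar is attached to.

Parameter 3, value: pointer to an int that holds the slider position; its value when the trackbar is created sets the initial position.

Parameter 4, count: the maximum slider position (the minimum is always 0).

Parameter 5, onChange: a callback of type TrackbarCallback, which is defined as: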

typedef void (CV_CDECL *TrackbarCallback)(int pos, void* userdata);

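References: https://blog.csdn.net/mysee1989/article/details/41379817 https://docs.opencv.org/3.4.1/dc/dfa/Morphology_1_8cpp-example.html#a20

2. Rect is the rectangle class. Its member variables x, y, width and height hold the coordinates of the top-left corner and the width and height of the rectangle. Commonly used member functions: size() returns a Size, area() returns the area of the rectangle, contains(Point) tests whether a point lies inside the rectangle, tl() returns the top-left corner and br() returns the bottom-right corner. The reverse test is written from the point side: Point::inside(Rect) checks whether the point lies inside the given rectangle.

Using the Rect class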

    rect = rect ± point (shifting a rectangle by a certain offset)
    rect = rect ± size (expanding or shrinking a rectangle by a certain amount)
    rect += point, rect -= point, rect += size, rect -= size (augmenting operations)
    rect = rect1 & rect2 (rectangle intersection)
    rect = rect1 | rect2 (minimum area rectangle containing rect1 and rect2 )
    rect &= rect1, rect |= rect1 (and the corresponding augmenting operations)
    rect == rect1, rect != rect1 (rectangle comparison)
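
The intersection, union and contains operations listed above can be illustrated without OpenCV. This is a plain-Python sketch of the semantics, not OpenCV's implementation. The Rect class below is a stand-in written for this note, and it does not treat empty rectangles specially in the union.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int

    def area(self):
        return self.width * self.height

    def contains(self, px, py):
        # Left and top edges are inclusive, right and bottom edges are exclusive.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

    def __and__(self, other):
        # Intersection: overlap of the two rectangles, or an empty Rect.
        x1, y1 = max(self.x, other.x), max(self.y, other.y)
        x2 = min(self.x + self.width, other.x + other.width)
        y2 = min(self.y + self.height, other.y + other.height)
        if x2 <= x1 or y2 <= y1:
            return Rect(0, 0, 0, 0)
        return Rect(x1, y1, x2 - x1, y2 - y1)

    def __or__(self, other):
        # Minimum-area rectangle that contains both inputs.
        x1, y1 = min(self.x, other.x), min(self.y, other.y)
        x2 = max(self.x + self.width, other.x + other.width)
        y2 = max(self.y + self.height, other.y + other.height)
        return Rect(x1, y1, x2 - x1, y2 - y1)


a, b = Rect(0, 0, 4, 4), Rect(2, 2, 4, 4)
print(a & b, a | b)
```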

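Reference: https://www.cnblogs.com/happyamyhope/p/9844629.html

3. copyTo makes a deep copy, but it reallocates the destination only when the destination's size or type does not match the source. clone ignores the size information and always allocates new storage for the deep copy.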

Reference: https://blog.csdn.net/u013806541/article/details/70154719

4. OpenCV provides string formatting as follows:

string formated_str = format("I have made %d dollars on this product.", 500);

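Reference: https://blog.csdn.net/yiyeshuanglinzui/article/details/108388683

5. Get the rows, columns and type of a matrix.

A matrix is created with Mat(int rows, int cols, int type); afterwards, just read the attributes directly.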

m.rows
m.cols

Reference: https://blog.csdn.net/renweiyi1487/article/details/101616758

6. Draw text with putText

void cv::putText    (   InputOutputArray    img,
        const String &      text,
        Point   org,
        int     fontFace,
        double      fontScale,
        Scalar      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        bool    bottomLeftOrigin = false 
    )   
    cv::putText(image, text, origin, font_face, font_scale, cv::Scalar(0, 255, 255), thickness, 8, 0);

References: https://docs.opencv.org/3.4/d6/d6e/group__imgproc__draw.html#ga5126f47f883d730f633d74f07456c576 https://blog.csdn.net/guduruyu/article/details/68491211

7. imutils has a partial C++ implementation: https://github.com/minooei/imutils

8. Get the smallest upright rectangle that encloses a contour with boundingRect

Rect boundRect = boundingRect( contours_poly[i] );

Reference: https://docs.opencv.org/3.4/da/d0c/tutorial_bounding_rects_circles.html

9. Draw a rectangle with rectangle()

void cv::rectangle  (   InputOutputArray    img,
        Point   pt1,
        Point   pt2,
        const Scalar &      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        int     shift = 0 
    )   
void cv::rectangle  (   Mat &   img,
        Rect    rec,
        const Scalar &      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        int     shift = 0 
    )   
rectangle(src, boundRect[i], Scalar(0, 255, 0));
rectangle(flipFrame, Point(roi_ws, roi_hs), Point(roi_we, roi_he), Scalar(0, 255, 0), 0);

References: https://docs.opencv.org/3.4/d6/d6e/group__imgproc__draw.html#ga07d2f74cadcf8e305e810ce8eed13bc9 https://kknews.cc/code/66yekj3.html

10. Get the area of a contour

◆ contourArea()
double cv::contourArea  (   InputArray      contour,
        bool    oriented = false 
    )   
fabs(contourArea(Mat(c)));

References: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga2c759ed9f497d4a618048a2f56dc97f1 https://docs.opencv.org/3.4.1/dd/d9d/segment_objects_8cpp-example.html#a1

11. Dilation

◆ dilate()
void cv::dilate     (   InputArray      src,
        OutputArray     dst,
        InputArray      kernel,
        Point   anchor = Point(-1,-1),
        int     iterations = 1,
        int     borderType = BORDER_CONSTANT,
        const Scalar &      borderValue = morphologyDefaultBorderValue() 
    )   
dilate(mid_filer,gray_dilate1,element);

References: https://docs.opencv.org/3.4/d4/d86/group__imgproc__filter.html#ga4ff0f3318642c4f469d0e11f242f3b6c https://www.itread01.com/articles/1478557515.html https://docs.opencv.org/3.4.1/d8/dc0/morphology2_8cpp-example.html#a10 https://zhuanlan.zhihu.com/p/40326127 https://www.jianshu.com/p/ee72f5215e07 https://www.cnblogs.com/ssyfj/p/9276999.html

12. Find contours

    findContours(image,contours,hierarchy,RETR_TREE,CHAIN_APPROX_SIMPLE,Point());

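For how the findContours parameters affect the result, see: https://blog.csdn.net/dcrmg/article/details/51987348

13. Creating, copying and releasing a Mat, its constructors, and so on

References: https://blog.csdn.net/wanggao_1990/article/details/53150926 https://blog.csdn.net/guyuealian/article/details/70159660

14. createTrackbar creates a slider control, which is handy for tuning results while debugging.

Reference: https://blog.csdn.net/u013270326/article/details/72821149

15. Absolute difference with absdiff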

absdiff(frameNow,framePre,frameDet);

Reference: https://blog.csdn.net/dcrmg/article/details/52234929

16. Flip an image

flip()
void cv::flip   (   InputArray      src,
        OutputArray     dst,
        int     flipCode 
    )   

Reference: https://docs.opencv.org/3.4/d2/de8/group__core__array.html#gaca7be533e3dac7feb70fc60635adf441

17. Extract part of a matrix

◆ Mat() [15/29]
cv::Mat::Mat    (   const Mat &     m,
        const Range &   rowRange,
        const Range &   colRange = Range::all() 
    )   
roiFrame = Mat(flipFrame, Range(roi_hs, roi_he), Range(roi_ws, roi_we));

Reference: https://docs.opencv.org/master/d3/d63/classcv_1_1Mat.html#a92a3e9e5911a2eb0cf0950a0a9670c76

18. Convert colors

◆ cvtColor()
void cv::cvtColor   (   InputArray      src,
        OutputArray     dst,
        int     code,
        int     dstCn = 0 
    )   
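cvtColor(roiFrame, hsvFrame, COLOR_BGR2HSV);

Reference: https://docs.opencv.org/3.4/d8/d01/group__imgproc__color__conversions.html#ga397ae87e1288a81d2363b61574eb8cab

19. Binarize an image with a lower and an upper threshold: pixels inside the range become the foreground and everything else becomes the background.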

inRange()
void cv::inRange    (   InputArray      src,
        InputArray      lowerb,
        InputArray      upperb,
        OutputArray     dst 
    )   
inRange(hsvFrame, Scalar(0, 20, 70), Scalar(20, 255, 255), maskFrame);

References: https://docs.opencv.org/master/d2/de8/group__core__array.html#ga48af0ab51e36436c5d04340e036ce981 https://docs.opencv.org/master/d6/d7f/samples_2cpp_2camshiftdemo_8cpp-example.html#a33

20. findContours returns the contours as std::vector<std::vector<cv::Point>>, while contourArea takes a single contour, a std::vector<cv::Point>, as its input.

References: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#gadf1ad6a0b82947fa1fe3c3d497f260e0 https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga2c759ed9f497d4a618048a2f56dc97f1

21. Draw contours

◆ drawContours()
void cv::drawContours   (   InputOutputArray    image,
        InputArrayOfArrays      contours,
        int     contourIdx,
        const Scalar &      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        InputArray      hierarchy = noArray(),
        int     maxLevel = INT_MAX,
        Point   offset = Point() 
    )   

Reference: https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#ga746c0625f1781f1ffc9056259103edbc

22. Compute a curve length or a contour perimeter.

◆ arcLength()
double cv::arcLength    (   InputArray      curve,
        bool    closed 
    )   

Reference: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga8d26483c636be6b35c3ec6335798a47c

23. Approximate a contour with a polygon

◆ approxPolyDP()
void cv::approxPolyDP   (   InputArray      curve,
        OutputArray     approxCurve,
        double      epsilon,
        bool    closed 
    )   

Reference: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga0012a5fdaea70b8a9970165d98722b4c

24. Compute the convex hull

◆ convexHull()
void cv::convexHull     (   InputArray      points,
        OutputArray     hull,
        bool    clockwise = false,
        bool    returnPoints = true 
    )   

Reference: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga014b28e56cb8854c0de4a211cb2be656

25. Compute convexity defects

◆ convexityDefects()
void cv::convexityDefects   (   InputArray      contour,
        InputArray      convexhull,
        OutputArray     convexityDefects 
    )   

Note: contour is a vector<Point>, convexhull is a vector<int> (indices of the hull points), and convexityDefects is a vector<Vec4i>.

If this error is reported:

OpenCV Error: Assertion failed (hpoints > 0) in convexityDefects, file /home/neha/opencv-3.4.0/modules/imgproc/src/convhull.cpp, line 284
terminate called after throwing an instance of 'cv::Exception'
what(): /home/neha/opencv-3.4.0/modules/imgproc/src/convhull.cpp:284: error: (-215) hpoints > 0 in function convexityDefects

then check that convexhull is a vector<int> of indices rather than a vector of points.

References: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#gada4437098113fd8683c932e0567f47ba https://github.com/wonderseen/Sparse-Points-Gen-Convex/issues/1

26. Vec4i is a fixed-size vector of four ints, that is, cv::Vec<int, 4>

Reference: https://www.coder.work/article/826469

27. Draw a circle.

◆ circle()
void cv::circle     (   InputOutputArray    img,
        Point   center,
        int     radius,
        const Scalar &      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        int     shift = 0 
    )   

References: https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#gaf10604b069374903dbd0f0488cb43670
https://blog.csdn.net/caomin1hao/article/details/81876836

28. Draw a line

◆ line()
void cv::line   (   InputOutputArray    img,
        Point   pt1,
        Point   pt2,
        const Scalar &      color,
        int     thickness = 1,
        int     lineType = LINE_8,
        int     shift = 0 
    )   

Reference: https://docs.opencv.org/master/d6/d6e/group__imgproc__draw.html#ga7078a9fae8c7e7d13d24dac2520ae4a2

29. Gaussian blur

◆ GaussianBlur()
void cv::GaussianBlur   (   InputArray      src,
        OutputArray     dst,
        Size    ksize,
        double      sigmaX,
        double      sigmaY = 0,
        int     borderType = BORDER_DEFAULT 
    )

For the effect of the blur parameters, see: https://www.cnblogs.com/sdu20112013/p/11600436.html

Reference: https://docs.opencv.org/master/d4/d86/group__imgproc__filter.html#gaabe8c836e97159a9193fb0b11ac52cf1

30. Gesture recognition, which uses the convex hull and convexity defects.

For details, see: http://www.zfhblog.com/index.php/archives/22/ https://www.cnblogs.com/Anita9002/p/5332122.html https://docs.opencv.org/master/d7/d1d/tutorial_hull.html https://blog.csdn.net/lichengyu/article/details/38392473

31. resize

    cv::Mat dst(300, 300, image.type());
    cv::resize(image, dst, dst.size(), 0, 0, cv::INTER_LINEAR);

References: https://blog.csdn.net/i_chaoren/article/details/54564663 https://blog.csdn.net/u012005313/article/details/51943442 https://www.jianshu.com/p/11879a49d1a0

32. Copy all of the image data

    cv::Mat image;
    image = cv::imread(imgPath);
    memcpy(data, image.data, image.dataend - image.datastart);

Reference: https://www.jianshu.com/p/cfc0c1f87bf8

33. Release the image data

Mat mat3 = Mat::zeros(1, 4, CV_32F);
mat3.release();

Reference: https://blog.csdn.net/wanggao_1990/article/details/53150926

1. cv2.GaussianBlur()

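def GaussianBlur(src, ksize, sigmaX, dst=None, sigmaY=None, borderType=None):
    """
    Blur an image with a Gaussian filter.
    Argument:
        src: source image
        dst: destination image
        ksize: Gaussian kernel size (width, height); both must be positive and odd; if set to 0 they are computed from sigma
        sigmaX: Gaussian kernel standard deviation in the X direction
        sigmaY: Gaussian kernel standard deviation in the Y direction
            if sigmaY is 0, it is set equal to sigmaX
            if both are 0, they are computed from ksize
        (specifying ksize, sigmaX and sigmaY is recommended)
        borderType: pixel extrapolation method
    """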

References: https://www.cnblogs.com/chenzhen0530/p/10742536.html https://blog.csdn.net/wuqindeyunque/article/details/103694900

2. cv2.imshow() displays an image in a window. The code is as follows:

img = cv2.imread('3.jpg',1)
cv2.imshow('imshow',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.namedWindow('image', cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

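cv2.namedWindow() specifies whether the window can be resized. The default flag is cv2.WINDOW_AUTOSIZE; with cv2.WINDOW_NORMAL the window can be resized. This makes work a little easier when the image is very large or when trackbars are added to the window.

cv2.waitKey(0): a keyboard-binding function that waits for a key press. Without it, the image window flashes and disappears immediately, so call it to keep the window on screen. Its argument is a time in milliseconds: the function waits that long for a keyboard event, and the program continues as soon as any key is pressed. An argument of 0 waits indefinitely. The return value can also be compared against a specific key.

cv2.destroyAllWindows(): destroys all the windows we created. To destroy one particular window, use cv2.destroyWindow() and pass the exact window name as the argument (the string that was used when the window was created).

Reference: https://blog.csdn.net/weixin_38383877/article/details/82659779

3. waitKey() waits for a period of time and accepts a key press. With an argument of 0, a loop body runs once and then blocks until the user presses a key. With a non-zero argument, each call waits the given number of ms, and if no key was pressed the loop continues.

ord() returns the ASCII code of a character, which is compared against the return value of waitKey().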

if cv2.waitKey(1) & 0xFF == ord("q"):
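
The & 0xFF in the line above keeps only the low byte of the return code before the comparison. A minimal sketch of that comparison without OpenCV follows. The helper name is_key is made up, and the sample code 0x100071 is an assumed example of a return value with extra high bits set, as some platforms are reported to produce.

```python
def is_key(code, char):
    """True if the low byte of a waitKey()-style return code matches char."""
    # -1 is what waitKey() returns when no key was pressed in time.
    return code != -1 and (code & 0xFF) == ord(char)


print(is_key(ord("q"), "q"))   # plain ASCII code
print(is_key(0x100071, "q"))   # same key with extra high bits set
print(is_key(-1, "q"))         # timeout, no key pressed
```

In Python, & binds more tightly than ==, so cv2.waitKey(1) & 0xFF == ord("q") is evaluated as (cv2.waitKey(1) & 0xFF) == ord("q").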

References: https://blog.csdn.net/qq_39377418/article/details/101393007 https://www.runoob.com/python/python-func-ord.html

4. release() releases the video capture; after that, call cv2.destroyAllWindows() to close all windows.

Reference: https://www.jianshu.com/p/949683764115

5. imutils.resize() changes the size of an image without changing its aspect ratio.

Reference: https://www.jianshu.com/p/bb34ddf2a947

6. cv2.cvtColor() converts the color space of an image.

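def cvtColor(src, code, dst=None, dstCn=None):
    """
    Convert the color space of an image.
    Argument:
        src: source image
        code: the color space conversion type
        dst: destination image, with the same size and depth as the source
        dstCn: number of channels in the destination image; with the default None it is derived from src and code
    """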

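Reference: https://www.cnblogs.com/chenzhen0530/p/10741264.html

7. cv2.absdiff() computes the per-pixel absolute difference of two images. It is mostly used on grayscale images, to compare two frames.

References: https://blog.csdn.net/u014737138/article/details/80388482 https://zhuanlan.zhihu.com/p/42940310

8. cv2.threshold(): the threshold function.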

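def threshold(src, thresh, maxval, type, dst=None):
    """
    Apply a fixed-level threshold to each element of a multi-channel array,
        for example to turn a grayscale image into a binary image, to remove noise at a given level, or to filter out pixels that are too small or too large
    Argument:
        src: source image
        dst: destination image
        thresh: threshold value
        type: threshold type (the specific types are listed in the references)
        maxval: the value to assign when type is THRESH_BINARY or THRESH_BINARY_INV
    """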

References: https://www.cnblogs.com/chenzhen0530/p/10742540.html https://www.cnblogs.com/yinliang-liang/p/9293310.html

9. cv2.dilate(): morphological dilation. It grows the white regions.

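dst = cv2.dilate(src,kernel,anchor,iterations,borderType,borderValue)
        src: input image matrix, a binarized image
        kernel: the structuring element used for dilation, which can be obtained from getStructuringElement()
        anchor: anchor point, (-1,-1) by default
        iterations: number of times dilation is applied, 1 by default
        borderType: border type
        borderValue: border value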

References: https://www.cnblogs.com/silence-cho/p/11069903.html https://zhuanlan.zhihu.com/p/110330329 https://www.aiuai.cn/aifarm350.html https://www.cnblogs.com/my-love-is-python/p/10394908.html

10. cv2.findContours() finds contours; it is best to pass it a copy of the original image.

References: https://www.cnblogs.com/wmy-ncut/p/9889294.html https://blog.csdn.net/gaoranfighting/article/details/34877549

11. imutils.grab_contours returns the contours and is meant to be used together with cv2.findContours(). cv2.findContours() returns two values in OpenCV 2.4 and 4.x but three values in 3.x; imutils.grab_contours picks the contours out of whichever tuple was returned.
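
The selection logic is small enough to sketch directly. The version below is a stand-in written for this note, not the imutils source, and it assumes the two return shapes are (contours, hierarchy) and (image, contours, hierarchy).

```python
def grab_contours(result):
    """Pick the contour list out of a findContours()-style return tuple."""
    if len(result) == 2:
        # (contours, hierarchy)
        return result[0]
    if len(result) == 3:
        # (image, contours, hierarchy)
        return result[1]
    raise ValueError("expected a tuple of length 2 or 3")


print(grab_contours((["c0", "c1"], "hierarchy")))
print(grab_contours(("image", ["c0", "c1"], "hierarchy")))
```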

Reference: https://blog.csdn.net/nima1994/article/details/90542992

13. cv2.contourArea() computes the area enclosed by a contour.

Reference: https://blog.csdn.net/greatwall_sdut/article/details/108862018

14. cv2.boundingRect() encloses a found shape in the smallest upright rectangle.

Reference: https://www.cnblogs.com/Anita9002/p/8033101.html

15. cv2.rectangle() draws a rectangle on an image.

Reference: https://blog.csdn.net/Gaowang_1/article/details/103087922

16. cv2.putText() draws text on an image.

Reference: https://blog.csdn.net/GAN_player/article/details/78155283

17. For what the cv2.FONT_HERSHEY_SIMPLEX font looks like, see:

https://blog.csdn.net/hgkdzbf6/article/details/102093323

18. Detailed use of morphology, including the effect of different kernels.

Reference: https://blog.csdn.net/sunny2038/article/details/9137759

19. cv2.flip flips an image.

flip(src, flipCode[, dst])

flipCode    Meaning
1           flip horizontally
0           flip vertically
-1          flip both horizontally and vertically
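
The three flipCode cases can be tried on a small nested list. This is a plain-Python sketch of the behavior, assuming rows are stored top to bottom; the function name flip_rows is made up and no OpenCV is involved.

```python
def flip_rows(img, flip_code):
    """Mimic the flipCode convention on a list of rows."""
    if flip_code == 0:
        # Vertical flip: reverse the order of the rows.
        return [list(row) for row in img[::-1]]
    if flip_code > 0:
        # Horizontal flip: reverse each row.
        return [list(row[::-1]) for row in img]
    # Negative code: flip both ways.
    return [list(row[::-1]) for row in img[::-1]]


img = [[1, 2],
       [3, 4]]
for code in (1, 0, -1):
    print(code, flip_rows(img, code))
```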

Reference: https://blog.csdn.net/JNingWei/article/details/78753607

20. Take a part out of a matrix.

 Mat() [15/29]
cv::Mat::Mat    (   const Mat &     m,
        const Range &   rowRange,
        const Range &   colRange = Range::all() 
    )   
roi = frame[0 : 300, 0: 300]

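Note that the first range is height / row and the second range is width / col, and that each range is half-open: the start is included and the end is excluded. Reference: https://docs.opencv.org/master/d3/d63/classcv_1_1Mat.html#a92a3e9e5911a2eb0cf0950a0a9670c76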
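
The row-first, half-open convention can be checked on a nested list whose cells record their own (row, col) position. This is a sketch with plain lists standing in for the image array; the helper name crop is made up.

```python
# A 4-row (height) by 5-column (width) grid; each cell stores its (row, col).
frame = [[(r, c) for c in range(5)] for r in range(4)]


def crop(img, row_start, row_end, col_start, col_end):
    """Rows first, then columns; both ranges exclude their end index."""
    return [row[col_start:col_end] for row in img[row_start:row_end]]


roi = crop(frame, 1, 3, 0, 2)
print(len(roi), len(roi[0]))    # number of rows, number of columns
print(roi[0][0], roi[-1][-1])   # first and last cell of the crop
```

Rows 1 and 2 and columns 0 and 1 are kept; row 3 and column 2 are excluded.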

21. Get the new size after resizing an image with its aspect ratio preserved.

            ret, frame = capture.read()
            width = 500
            frame = imutils.resize(frame, width=width)
            # shape is (rows, cols, channels), that is (height, width, channels)
            height = frame.shape[0]
            print("width %d, height %d\n" % (frame.shape[1], frame.shape[0]))

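Reference: https://vimsky.com/zh-tw/examples/detail/python-method-imutils.resize.html

22. Convert a color image to HSV. This is mainly a preprocessing step: a color range is then detected in the HSV image, typically to find objects after the background has been removed.

Hue (H): the name of the color, such as red or yellow. Saturation (S): the purity of the color; higher is purer and lower fades toward gray; the range is 0-100%. Value (V): brightness; the range is 0-100%. In OpenCV's 8-bit HSV images, H is stored as 0-179 and S and V as 0-255.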

hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

References: https://shengyu7697.github.io/blog/2020/03/22/Python-OpenCV-rgb-to-hsv/ https://blog.csdn.net/u012193416/article/details/79312798 https://docs.opencv.org/3.4/da/d97/tutorial_threshold_inRange.html

23. Create an array of type uint8

skin = np.array([0, 20, 70], dtype = np.uint8)

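Reference: https://numpy.org/doc/stable/reference/generated/numpy.array.html

24. Check whether values fall inside a range: elements inside the range are set to 255 and elements outside are set to 0. This is another way to binarize an image.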

◆ inRange()
void cv::inRange    (   InputArray      src,
        InputArray      lowerb,
        InputArray      upperb,
        OutputArray     dst 
    )       
Python:
    dst =   cv.inRange( src, lowerb, upperb[, dst]  )

#include <opencv2/core.hpp>

Checks if array elements lie between the elements of two other arrays.

The function checks the range as follows:

    For every element of a single-channel input array:

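    $\texttt{dst}(I) = \texttt{lowerb}(I)_0 \leq \texttt{src}(I)_0 \leq \texttt{upperb}(I)_0$
    For two-channel arrays:

    $\texttt{dst}(I) = \texttt{lowerb}(I)_0 \leq \texttt{src}(I)_0 \leq \texttt{upperb}(I)_0 \land \texttt{lowerb}(I)_1 \leq \texttt{src}(I)_1 \leq \texttt{upperb}(I)_1$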
    and so forth.

That is, dst (I) is set to 255 (all 1 -bits) if src (I) is within the specified 1D, 2D, 3D, ... box and 0 otherwise.

When the lower and/or upper boundary parameters are scalars, the indexes (I) at lowerb and upperb in the above formulas should be omitted. 
mask = cv2.inRange(hsv, skin_low, skin_up)
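
The per-pixel rule described above can be sketched without OpenCV. A pixel maps to 255 only if every channel lies between its lower and upper bound, inclusive on both ends. The function names below are made up, and the bounds are the skin-color values used in the C++ inRange call earlier in these notes.

```python
def in_range(pixel, lower, upper):
    """255 if every channel lies within [lower, upper], otherwise 0."""
    inside = all(lo <= v <= hi for v, lo, hi in zip(pixel, lower, upper))
    return 255 if inside else 0


def in_range_image(img, lower, upper):
    """Apply in_range to every pixel of a list-of-rows image."""
    return [[in_range(p, lower, upper) for p in row] for row in img]


skin_low, skin_up = (0, 20, 70), (20, 255, 255)
hsv = [[(10, 100, 100), (30, 100, 100)],
       [(20, 255, 255), (5, 10, 200)]]
print(in_range_image(hsv, skin_low, skin_up))
```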

25. Draw contours

            contours = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
            contours = imutils.grab_contours(contours)
            cv2.drawContours(frame, contours, -1, (0, 255, 0), 3)

Reference: https://stackoverflow.com/questions/48948769/how-to-draw-contours-using-opencv-in-python

26. Compute a contour perimeter or a curve length

◆ arcLength()
double cv::arcLength    (   InputArray      curve,
        bool    closed 
    )       
Python:
    retval  =   cv.arcLength(   curve, closed   )

#include <opencv2/imgproc.hpp>

Calculates a contour perimeter or a curve length.

The function computes a curve length or a closed contour perimeter.

Parameters
    curve   Input vector of 2D points, stored in std::vector or Mat.
    closed  Flag indicating whether the curve is closed or not. 
cv2.arcLength(cnt,True)
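
What the closed flag changes is easy to see in a plain-Python version: an open curve sums the segments between consecutive points, and a closed one adds the segment from the last point back to the first. This is a sketch of the idea only; the function name arc_length is made up and the points are plain (x, y) tuples rather than an OpenCV contour array.

```python
import math


def arc_length(points, closed):
    """Sum of segment lengths; adds the closing segment when closed is True."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        total += math.hypot(x2 - x1, y2 - y1)
    if closed and len(points) > 1:
        (x1, y1), (x2, y2) = points[-1], points[0]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


rectangle = [(0, 0), (3, 0), (3, 4), (0, 4)]
print(arc_length(rectangle, True), arc_length(rectangle, False))
```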

References: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga8d26483c636be6b35c3ec6335798a47c https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_contours/py_contour_features/py_contour_features.html

27. Approximate a given curve or polygon with a curve or polygon that has fewer vertices

◆ approxPolyDP()
void cv::approxPolyDP   (   InputArray      curve,
        OutputArray     approxCurve,
        double      epsilon,
        bool    closed 
    )       
Python:
    approxCurve =   cv.approxPolyDP(    curve, epsilon, closed[, approxCurve]   )

#include <opencv2/imgproc.hpp>

Approximates a polygonal curve(s) with the specified precision.

The function cv::approxPolyDP approximates a curve or a polygon with another curve/polygon with less vertices so that the distance between them is less or equal to the specified precision. It uses the Douglas-Peucker algorithm http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm

Parameters
    curve   Input vector of a 2D point stored in std::vector or Mat
    approxCurve Result of the approximation. The type should match the type of the input curve.
    epsilon Parameter specifying the approximation accuracy. This is the maximum distance between the original curve and its approximation.
    closed  If true, the approximated curve is closed (its first and last vertices are connected). Otherwise, it is not closed. 
            epsilon = 0.0005 * cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, epsilon, True)
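
The Douglas-Peucker idea named in the documentation above can be sketched in a few lines for an open curve. This is a textbook recursive version written for this note, not OpenCV's implementation; the function names are made up and closed curves are not handled.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, epsilon):
    """Keep the end points; recurse on the farthest point if it exceeds epsilon."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], epsilon)
    right = simplify(points[index:], epsilon)
    return left[:-1] + right


print(simplify([(0, 0), (1, 0.05), (2, 0), (3, 0.05), (4, 0)], 0.1))
print(simplify([(0, 0), (2, 2), (4, 0)], 0.5))
```

A larger epsilon removes more vertices, which is why the snippet above scales epsilon by the contour perimeter.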

28. Find the convex hull of a contour, which can be used for gesture recognition

hull = cv2.convexHull(cnt)

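References: https://www.cnblogs.com/jclian91/p/9728488.html https://kk665403.pixnet.net/blog/post/403518029-%5Bpython%5D-%E5%88%A9%E7%94%A8opencv%E7%B9%AA%E8%A3%BD%E5%87%B8%E5%8C%85(convexhull)-%E8%BC%AA%E5%BB%93(contour

29. Display the convex hull. convexHull returns a single contour, while drawContours expects a list of contours, so the hull has to be wrapped in a list before it can be drawn.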

            hull = cv2.convexHull(cnt)
            hull_list = []
            hull_list.append(hull)
            cv2.drawContours(frame, hull_list, -1, (0, 0, 255), 3)

References: https://stackoverflow.com/questions/36683556/drawing-convex-hull-of-the-biggest-contour-using-opencv-c https://docs.opencv.org/3.4/d7/d1d/tutorial_hull.html https://docs.opencv.org/master/d7/d1d/tutorial_hull.html

30. Create a two-dimensional array.

kernel = np.ones((3, 3), np.uint8)

The result is a two-dimensional array:

[[1 1 1]
 [1 1 1]
 [1 1 1]]

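Reference: https://numpy.org/doc/stable/reference/generated/numpy.ones.html

31. shape gives more than the height, width and channel count of an image: it is simply the number of rows and columns of the image matrix, plus the length of the array that represents each pixel.

The channel count is 1 for grayscale, 3 for RGB, 4 for RGBA, and 2 for RGB555 and RGB565. Seen this way, it is effectively the number of bytes used to represent one pixel. A single RGB pixel prints as: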

[22 22 15]

One row of an RGB image prints as:

[[22 22 15]
 [18 21 13]
 ...
 [127 126 127]]

A whole RGB image prints as:

[[[22 22 15]
  [18 21 13]
  ...
  [127 126 127]]
 ...
 [[10 15 16]
  [15 13 12]
  ...
  [5 5 5]]]

References: https://blog.csdn.net/qq_28618765/article/details/78618724 https://blog.csdn.net/mvtechnology/article/details/9008499

32. Print an exception

import traceback
try:
    2/0
except Exception as e:
    traceback.print_exc()

Reference: https://blog.csdn.net/feiyang5260/article/details/86661103

33. Use a lambda to pick the contour with the largest area:

cnt = max(contours, key = lambda x: cv2.contourArea(x))
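
The same max(..., key=...) pattern works with any function that scores a contour. The sketch below replaces cv2.contourArea with a shoelace-formula helper so it runs without OpenCV. The name polygon_area and the sample contours are made up, and the contours are plain lists of (x, y) tuples.

```python
def polygon_area(points):
    """Absolute area of a simple polygon by the shoelace formula."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


contours = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],   # area 4
    [(0, 0), (5, 0), (5, 3), (0, 3)],   # area 15
    [(0, 0), (1, 0), (0, 1)],           # area 0.5
]
largest = max(contours, key=polygon_area)
print(largest, polygon_area(largest))
```

Passing the function itself as key is equivalent to wrapping it in a lambda, so key=cv2.contourArea would behave the same as the line above.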

Reference: https://www.cnblogs.com/bjwu/articles/9028399.html

34. Convexity defects: the function is convexityDefects

convexityDefects()
void cv::convexityDefects   (   InputArray      contour,
        InputArray      convexhull,
        OutputArray     convexityDefects 
    )       
Python:
    convexityDefects    =   cv.convexityDefects(    contour, convexhull[, convexityDefects] )
    epsilon = 0.0005 * cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, epsilon, True)
    hull = cv2.convexHull(approx, returnPoints = False)
    defects = cv2.convexityDefects(approx, hull)

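As above: first simplify the contour to a simple polygon, then compute the convex hull, and finally compute the convexity defects. Printing hull gives: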

[[165]
 [163]
 [128]
 [124]
 ...
 [209]
 [208]
 [168]
 [166]]

Printing defects gives:

[[[166 168 167 114]]
 [[168 208 189 10549]]
 [[209 243 229 11994]]
 ...
 [[122 124 123 114]]
 [[124 128 125 181]]
 [[128 163 152 1784]]
 [[163 165 164 186]]]

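The four numbers in each defects entry are the start point, the end point, the farthest point, and the distance from the farthest point to the hull. The first three are indices into the contour that was passed to convexityDefects (approx here). The fourth is a fixed-point depth, so the distance in pixels is d / 256.0. Printing approx gives: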

[[[193 92]]
 [[192 92]]
 ...
 [[192 109]]
 [[193 109]]]
    for i in range(defects.shape[0]):
        s, e, f, d = defects[i, 0]
        start = tuple(approx[s][0])
        end = tuple(approx[e][0])
        far = tuple(approx[f][0])
        pt = (100, 180)

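As above, when i = 0, s, e, f, d are taken from row 0, column 0 of defects, which is [166 168 167 114], so s = 166, e = 168, f = 167, d = 114. Then start = tuple(approx[s][0]) takes the array at row 166, column 0 of approx and converts it to a tuple. That array holds two values, the x and y coordinates of the contour point.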
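
A common next step in gesture recognition is to look at the angle at the farthest point of each defect. The sketch below unpacks one defect, converts the fixed-point depth to pixels, and computes that angle with the law of cosines. It uses a plain list of (x, y) tuples instead of the (N, 1, 2) array returned by approxPolyDP. The function name describe_defect and the sample data are made up, and the three points are assumed to be distinct.

```python
import math


def describe_defect(defect, contour):
    """Return start, end, far points, depth in pixels and the angle at far."""
    s, e, f, d = defect
    start, end, far = contour[s], contour[e], contour[f]
    depth = d / 256.0  # fixed-point value with 8 fractional bits
    a = math.hypot(end[0] - start[0], end[1] - start[1])
    b = math.hypot(far[0] - start[0], far[1] - start[1])
    c = math.hypot(end[0] - far[0], end[1] - far[1])
    # Law of cosines, clamped to guard against rounding just outside [-1, 1].
    cos_angle = (b * b + c * c - a * a) / (2 * b * c)
    angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return start, end, far, depth, angle


contour = [(0, 4), (0, 0), (3, 0)]
start, end, far, depth, angle = describe_defect((0, 2, 1, 512), contour)
print(start, end, far, depth, angle)
```

Gesture counters typically keep only defects with a sharp enough angle and a large enough depth; the thresholds for that are application choices and are not given in these notes.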

References: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#gada4437098113fd8683c932e0567f47ba https://docs.opencv.org/master/d7/d1d/tutorial_hull.html https://zhuanlan.zhihu.com/p/56360621 https://vimsky.com/zh-tw/examples/detail/python-method-cv2.convexityDefects.html https://zhuanlan.zhihu.com/p/140384182

35. Gesture recognition needs the convex hull and convexity defects.

See: https://blog.csdn.net/qq_41562704/article/details/88975569 https://zhuanlan.zhihu.com/p/140384182 https://www.pythonf.cn/read/29390 https://blog.csdn.net/weixin_44885615/article/details/97811684

36. putText with Chinese text, using PIL

import cv2
import numpy
from PIL import Image, ImageDraw, ImageFont

class EspVisionUtil:
    @staticmethod
    def cv2ImgAddText(img, txt, left, top, color = (0, 255, 0), size = 20):
        # If this is an OpenCV image, convert BGR to RGB, because PIL and OpenCV use different channel orders
        if (isinstance(img, numpy.ndarray)):
            img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        # Create the drawing object
        draw = ImageDraw.Draw(img)
        # Font file and size
        fontStyle = ImageFont.truetype("SourceHanSansCN-Light.otf", size,
                                       encoding = "utf-8")
        # Draw the text
        draw.text((left, top), txt, color, font = fontStyle)
        # Convert back to OpenCV format
        return cv2.cvtColor(numpy.asarray(img), cv2.COLOR_RGB2BGR)

References: https://blog.csdn.net/ctwy291314/article/details/91492048 https://www.cnblogs.com/vipstone/p/8998249.html https://blog.csdn.net/javastart/article/details/88796482 https://blog.csdn.net/zizi7/article/details/70145150 https://www.cnblogs.com/arkenstone/p/6961453.html https://www.cnblogs.com/YouXiangLiThon/p/7815124.html https://blog.csdn.net/qq_41895190/article/details/90513453 https://zhuanlan.zhihu.com/p/161385206

1. When decoding mp3 and inspecting it with ffprobe, the result differs between ffmpeg versions.

In early versions, the sample format is shown as s16.
In 3.X versions, it is shown as S16P.
In 4.X versions, it is shown as FLTP.

Testing shows that this is just the format in which the decoded PCM is written, and that it depends on how ffmpeg itself was built. The formats a decoder supports can be printed in a loop in code:

printf("support formats: %d \n", *(codec->sample_fmts + i));  

Here codec is an AVCodec, and its sample_fmts member is const enum AVSampleFormat *sample_fmts; AVSampleFormat is defined in libavutil/samplefmt.h.

enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,          ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,         ///< signed 16 bits
    AV_SAMPLE_FMT_S32,         ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,         ///< float
    AV_SAMPLE_FMT_DBL,         ///< double

    AV_SAMPLE_FMT_U8P,         ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P,        ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P,        ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP,        ///< float, planar
    AV_SAMPLE_FMT_DBLP,        ///< double, planar
    AV_SAMPLE_FMT_S64,         ///< signed 64 bits
    AV_SAMPLE_FMT_S64P,        ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB           ///< Number of sample formats. DO NOT USE if linking dynamically
};

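AVCodec is defined in libavcodec/avcodec.h. codecpar is defined in AVStream as AVCodecParameters *codecpar; AVStream is defined in libavformat/avformat.h. The 3.X build turned out to support 1 and 6, which correspond to S16 and S16P, while the 4.X build supports 3 and 8, which correspond to FLT and FLTP.

So the command to play PCM decoded with the 3.X version is: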

play -t raw -r 48k -e signed-integer -b 16 -c 2 test.pcm

PCM decoded with the 4.X build is played with:

play -t raw -r 48k -e floating-point -b 32 -c 2 ./data_decode/out.pcm
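The mapping from the printed enum values to the two play commands can be written down as a small lookup. This is only a sketch: the table covers just the four values reported above (1, 6, 3, 8), and the helper name play_command is made up for illustration. It assumes the PCM file was written interleaved, so the planar and packed variants play the same way.

```python
# Enum values follow the AVSampleFormat order quoted above (U8 = 0, S16 = 1, ...).
# Only the four formats seen in the 3.X and 4.X builds are listed.
SAMPLE_FORMATS = {
    1: ("S16", "signed-integer", 16),
    6: ("S16P", "signed-integer", 16),
    3: ("FLT", "floating-point", 32),
    8: ("FLTP", "floating-point", 32),
}


def play_command(sample_fmt, rate, channels, path):
    """Build the sox `play` argument list for raw interleaved PCM.

    sample_fmt is the integer printed from codec->sample_fmts.
    """
    _name, encoding, bits = SAMPLE_FORMATS[sample_fmt]
    return ["play", "-t", "raw", "-r", str(rate), "-e", encoding,
            "-b", str(bits), "-c", str(channels), path]
```

The returned list can be passed to subprocess.run as-is; an unknown format value raises KeyError instead of guessing.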

References: https://trac.ffmpeg.org/ticket/7321 https://stackoverflow.com/questions/35226255/audio-sample-format-s16p-ffmpeg-or-audio-codec-bug https://bbs.csdn.net/topics/391984409

2. For ffprobe usage, see:

https://my.oschina.net/u/4324861/blog/4325767 https://blog.csdn.net/byc6352/article/details/96729348 https://www.cnblogs.com/renhui/p/9209664.html

3. For the MP3 file format and encoding, see:

https://www.cnblogs.com/ranson7zop/p/7655474.html https://blog.csdn.net/xiahouzuoxin/article/details/7849249

4. API change log

https://blog.csdn.net/leixiaohua1020/article/details/41013567

5. For encoding MP3 with lame, see:

https://www.jianshu.com/p/dce4e2e9ed75 https://blog.csdn.net/bjrxyz/article/details/73435407 https://blog.csdn.net/rrrfff/article/details/18701885?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param https://blog.csdn.net/gonner_2011/article/details/77183947?utm_medium=distribute.pc_relevant.none-task-blog-title-2&spm=1001.2101.3001.4242 https://blog.csdn.net/jody1989/article/details/75642579 https://blog.csdn.net/ssllkkyyaa/article/details/90400302

6. Besides lame, libshine is another option for encoding MP3; see:

http://zhgeaits.me/android/2016/06/17/android-ffmpeg.html

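7. The old avcodec_decode_audio4 is deprecated; the send/receive API (avcodec_send_packet / avcodec_receive_frame) must be used instead.

The old style: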

        const char * infile = IN_FILE;
        const char * outfile = OUT_FILE;

        // Register all container formats and codecs
        av_register_all();
        //printf("the video file is %s\n",argv[1]);
        AVFormatContext * fmt_ctx = avformat_alloc_context();

        // Open the input file
        if (avformat_open_input(&fmt_ctx, infile , NULL, NULL) < 0) {
                printf("open file error");
                return -1;
        }

        // Read the stream information
        if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {
                printf("find stream info error");
                return -1;
        }
        // Print the parsed media information
        av_dump_format(fmt_ctx, 0, infile, 0);

        // Find the index of the audio stream
        int audio_stream_index = -1;
        for (int i = 0; i < fmt_ctx->nb_streams; i++) {
                if (fmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
                        audio_stream_index = i;
                        printf("find audio stream index\n");
                        break;
                }
        }
        if (audio_stream_index == -1) {
                printf("did not find a audio stream\n");
                return -1;
        }
        // Get the decoder
        AVCodecContext *codec_ctx = avcodec_alloc_context3(NULL);
        avcodec_parameters_to_context(codec_ctx, fmt_ctx->streams[audio_stream_index]->codecpar);
        AVCodec *codec = avcodec_find_decoder(codec_ctx->codec_id);
        if (codec == NULL) {
                printf("unsupported codec !\n");
                return -1;
        }

        // Open the decoder
        if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
                printf("could not open codec");
                return -1;
        }

        // Allocate the AVPacket and AVFrame that hold the compressed and the decoded audio data
        AVPacket *packet = av_packet_alloc();
        AVFrame *frame = av_frame_alloc();
        // Set to non-zero when a frame has been decoded
        int got_frame;
        int index = 0;

        // PCM output file
        FILE *out_file = fopen(outfile, "wb");
        // Read the audio data into packet
        while (av_read_frame(fmt_ctx, packet) == 0) {
                // Only handle packets that belong to the audio stream
                if (packet->stream_index == audio_stream_index) {
                        // Decode the packet into an AVFrame
                        if (avcodec_decode_audio4(codec_ctx, frame, &got_frame, packet) < 0) {
                                printf("decode error:%d", index);
                                break;
                        }
                        if (got_frame > 0) {
                                //printf("decode frame:%d", index++);
                                // Write the first data plane to the file (a single channel for planar formats)
                                fwrite(frame->data[0], 1, static_cast<size_t>(frame->linesize[0]), out_file);
                        }
                }
                // Release the packet's buffer before reading the next one
                av_packet_unref(packet);
        }
        printf("decode finish...");

        // Release resources (the packet came from av_packet_alloc, so free it rather than only unref it)
        av_packet_free(&packet);
        av_frame_free(&frame);
        avcodec_close(codec_ctx);
        avformat_close_input(&fmt_ctx);
        fclose(out_file);
}

With the new API, decoding looks like this:

void decode(const char * infile, const char * outfile)
{
    AVFormatContext * fmt_ctx = 0;  // container (demuxer) context
    AVCodecContext * codec_ctx = 0; // decoder context
    const AVCodec * codec = 0;      // decoder
    AVPacket * packet = 0;      // one compressed packet
    AVFrame * frame = 0;        // one decoded frame

    FILE * out_file = NULL;     // PCM output file
    int audio_stream_index = -1;    // index of the audio stream

    // Register all container formats and codecs
    av_register_all();
    fmt_ctx = avformat_alloc_context();
    if (fmt_ctx == NULL) {
        printf("failed to alloc av format context\n");
        goto END;
    }

    // Open the input file
    if (avformat_open_input(&fmt_ctx, infile , NULL, NULL) < 0) {
        printf("open file error");
        goto END;
    }

    // Read the stream information
    if (avformat_find_stream_info(fmt_ctx, NULL) < 0) {
        printf("find stream info error");
        goto END;
    }

    // Print the parsed media information
    av_dump_format(fmt_ctx, 0, infile, 0);

    // Find the index of the audio stream
    for (int i = 0; i < fmt_ctx->nb_streams; i++) {
        if (fmt_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
            audio_stream_index = i;
            printf("find audio stream index\n");
            break;
        }
    }
    if (audio_stream_index == -1) {
        printf("did not find a audio stream\n");
        goto END;
    }

    // Get the decoder
    codec_ctx = avcodec_alloc_context3(NULL);
    if (codec_ctx == NULL) {
        printf("failed to alloc codec context\n");
        goto END;
    }
    avcodec_parameters_to_context(codec_ctx, fmt_ctx->streams[audio_stream_index]->codecpar);
    codec = avcodec_find_decoder(codec_ctx->codec_id);
    if (codec == NULL) {
        printf("unsupported codec !\n");
        goto END;
    }

    // Open the decoder
    if (avcodec_open2(codec_ctx, codec, NULL) < 0) {
        printf("could not open codec");
        goto END;
    }

    printf("codec name: %s, channels: %d, sample rate: %d, sample format %d\n", codec->name, codec_ctx->channels, codec_ctx->sample_rate, codec_ctx->sample_fmt);

    // Allocate the AVPacket and AVFrame that hold the compressed and the decoded audio data
    packet = av_packet_alloc();
    frame = av_frame_alloc();
    if (!packet || !frame) {
        printf("failed to alloc packet or frame\n");
        goto END;
    }

    // PCM output file
    out_file = fopen(outfile, "wb");
    if (out_file == NULL) {
        printf("failed to open %s\n", outfile);
        goto END;
    }
    // Read the audio data into packet
    while (av_read_frame(fmt_ctx, packet) == 0) {
        // Only handle packets that belong to the audio stream
        if (packet->stream_index == audio_stream_index) {
            int ret = 0;
            // Send the packet to the decoder
            if ((ret = avcodec_send_packet(codec_ctx, packet))) {
                printf("failed to avcodec_send_packet, ret = %d\n", ret);
                break;
            }
            // Pull decoded frames from the decoder until it needs more input
            while (!avcodec_receive_frame(codec_ctx, frame)) {
                // Bytes per sample per channel, e.g. 2 for S16P
                int bytes_num = av_get_bytes_per_sample(codec_ctx->sample_fmt);
                // This loop assumes a planar format (S16P, FLTP), where every channel has its own data[] plane
                for (int index = 0; index < frame->nb_samples; index++) {
                    // Write the samples interleaved
                    for (int channel = 0; channel < codec_ctx->channels; channel++) {
                        fwrite((char *)frame->data[channel] + bytes_num * index, 1, bytes_num, out_file);
                    }
                }
            }
        }
        // Release the packet's buffer for every packet, audio or not, before reading the next one
        av_packet_unref(packet);
    }
    printf("decode finish...\n");

    // Release resources
END:
    // out_file is still NULL if an earlier step failed
    if (out_file) {
        fclose(out_file);
    }
    if (frame) {
        av_frame_free(&frame);
        printf("free frame.\n");
    }
    if (packet) {
        av_packet_free(&packet);
        printf("free packet.\n");
    }
    if (codec_ctx) {
        avcodec_free_context(&codec_ctx);
        printf("free codec context.\n");
    }
    if (fmt_ctx) {
        avformat_close_input(&fmt_ctx);
        printf("free format context.\n");
    }
}
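The inner write loop of decode() turns planar data (one buffer per channel) into interleaved PCM. The same reordering can be sketched on plain byte strings. The function name interleave_planar is invented for this illustration, and it assumes every plane holds the same number of samples.

```python
def interleave_planar(planes, bytes_per_sample):
    """Interleave per-channel (planar) PCM buffers into one packed buffer.

    planes: list of bytes objects, one per channel, all the same length.
    bytes_per_sample: 2 for S16P, 4 for FLTP.
    """
    if not planes:
        return b""
    length = len(planes[0])
    if any(len(plane) != length for plane in planes):
        raise ValueError("all planes must have the same length")
    if length % bytes_per_sample:
        raise ValueError("plane length is not a whole number of samples")
    out = bytearray()
    # For each sample position, emit that sample from every channel in turn.
    for offset in range(0, length, bytes_per_sample):
        for plane in planes:
            out += plane[offset:offset + bytes_per_sample]
    return bytes(out)
```

With two 16-bit planes b"L0L1" and b"R0R1" the output is b"L0R0L1R1", which is the layout the play command with -c 2 expects.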

Encoding (with lame) looks like this:

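int encode(const char * infile, const char *outfile, uint32_t sample_rate, uint8_t channel_num)
{
    // Input and output file pointers
    FILE * in_file = NULL;
    FILE * out_file = NULL;

    // Open the input file
    if ((in_file = fopen(infile, "rb")) == NULL) {
        printf("failed to open %s \n", infile);
        return -1;
    }

    // Open the output file ("wb+" because lame_mp3_tags_fid reads and rewrites it)
    if ((out_file = fopen(outfile, "wb+")) == NULL) {
        printf("failed to open %s \n", outfile);
        fclose(in_file);
        return -1;
    }

    // Create the encoder
    lame_t lame = lame_init();
    if (lame == NULL) {
        printf("lame_init failed\n");
        fclose(in_file);
        fclose(out_file);
        return -1;
    }
    // Set the encoding parameters
    lame_set_in_samplerate(lame, sample_rate);
    lame_set_VBR(lame, vbr_default);
    lame_set_num_channels(lame, channel_num);
    // Initialize the encoder with these parameters
    lame_init_params(lame);

    // Read a whole number of PCM frames each time. One frame is one 16-bit sample
    // per channel, so 2 * 2 = 4 bytes for stereo; a read size that is not a multiple
    // of the frame size drops bytes and misaligns the channels.
    // 8192 frames per read is an arbitrary choice.
    const size_t frames_per_read = 8192;
    const size_t pcm_size = frames_per_read * sizeof(int16_t) * channel_num;
    // Recommended worst-case MP3 buffer size: 1.25 * samples per channel + 7200
    const int mp3_size = static_cast<int>(1.25 * frames_per_read) + 7200;

    // Allocate the buffers; on failure release everything acquired so far
    int16_t * pcm_buffer = NULL;
    uint8_t * mp3_buffer = NULL;
    pcm_buffer = (int16_t *)malloc(pcm_size);
    mp3_buffer = (uint8_t *)malloc(mp3_size);
    if (!pcm_buffer || !mp3_buffer) {
        if (pcm_buffer)
            free(pcm_buffer);
        if (mp3_buffer)
            free(mp3_buffer);
        lame_close(lame);
        fclose(in_file);
        fclose(out_file);

        printf("buffer malloc error!\n");
        return -1;
    }

    printf("encode start...\n");

    // Number of PCM bytes read
    size_t read_num = 0;

    do {
        // Read PCM data
        read_num = fread(pcm_buffer, 1, pcm_size, in_file);
        // Encode to MP3. A read of 0 bytes means the input is finished, so flush
        // whatever lame still holds into the MP3 buffer.
        int write_num = 0;
        if (read_num == 0) {
            write_num = lame_encode_flush(lame, mp3_buffer, mp3_size);
        } else {
            write_num = lame_encode_buffer_interleaved(lame, pcm_buffer, static_cast<int>(read_num / sizeof(int16_t) / channel_num), mp3_buffer, mp3_size);
        }
        // A negative value is an error code, not a byte count
        if (write_num < 0) {
            printf("lame encode error: %d\n", write_num);
            break;
        }
        // Write the encoded data to the file
        fwrite(mp3_buffer, 1, write_num, out_file);
    } while (read_num > 0);

    printf("encode finish\n");

    // Write the MP3 tag information to the file
    lame_mp3_tags_fid(lame, out_file);
    // Release resources
    free(pcm_buffer);
    free(mp3_buffer);
    lame_close(lame);
    fclose(in_file);
    fclose(out_file);

    return 0;
}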
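The buffer-size rule used by encode() (1.25 times the sample count plus 7200) can be checked with a short calculation. The sketch below assumes the rule is stated in samples per channel, which is how lame.h words its worst-case estimate as far as I know. The helper name lame_buffer_sizes is invented for illustration.

```python
def lame_buffer_sizes(frames_per_read, channels, bytes_per_sample=2):
    """Return (pcm_bytes, mp3_bytes) for one read of interleaved PCM.

    frames_per_read: samples per channel read in one go.
    pcm_bytes is always a whole number of frames, so no read can split
    a frame and shift the channels.
    mp3_bytes is the worst-case estimate 1.25 * frames_per_read + 7200.
    """
    pcm_bytes = frames_per_read * channels * bytes_per_sample
    mp3_bytes = int(1.25 * frames_per_read) + 7200
    return pcm_bytes, mp3_bytes
```

For 8192 stereo 16-bit frames this gives 32768 PCM bytes and 17440 MP3 bytes. The PCM size is a multiple of the 4-byte frame, which is the property that matters when the same byte count is later divided back into frames.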

References: https://blog.csdn.net/weixin_41353840/article/details/108000466

8. For common pitfalls when decoding to PCM and the corresponding sample formats, see:

https://blog.csdn.net/qq21497936/article/details/108799279 https://blog.csdn.net/leixiaohua1020/article/details/50534316

9. Articles on encoding and decoding by Lei Xiaohua (leixiaohua1020):

https://blog.csdn.net/leixiaohua1020/article/details/50534316 https://blog.csdn.net/leixiaohua1020/article/details/42181571 https://blog.csdn.net/leixiaohua1020/article/details/8652605 https://blog.csdn.net/leixiaohua1020/article/details/15811977 https://blog.csdn.net/leixiaohua1020/article/list/3

10. For reading bitrate and frame information with ffmpeg, see:

https://blog.51cto.com/ticktick/1869849 https://blog.51cto.com/ticktick/1872008 https://blog.51cto.com/ticktick/1867059

11. For ffmpeg example code, see:

https://stackoverflow.com/questions/2641460/ffmpeg-c-api-documentation-tutorial

12. av_register_all() is deprecated (since FFmpeg 4.0) and no longer needs to be called.

References: https://github.com/leandromoreira/ffmpeg-libav-tutorial/issues/29

13. MP3 files for testing can be downloaded from:

http://www.goodkejian.com/erge.htm

14. Using ffmpeg from the command line

References: https://segmentfault.com/a/1190000016652277 https://cloud.tencent.com/developer/article/1566587

15. For decoding examples, see:

https://github.com/iamyours/FFmpegAudioPlayer https://gitee.com/zouwm1995/ffmpeg-demo/blob/master/demo/2.%E8%A7%A3%E7%A0%81/decode.c https://www.jianshu.com/p/8ff162ac55bd https://blog.csdn.net/weixin_44721044/article/details/104736782 https://bbs.csdn.net/topics/390401066

16. Building ffmpeg

https://www.cnblogs.com/CoderTian/p/6655568.html

17. S16 changed to S16P; see:

https://blog.csdn.net/chinabinlang/article/details/47616257 https://blog.csdn.net/disadministrator/article/details/43734335

18. To check which codecs the installed ffmpeg supports, use something like:

ffmpeg -codecs | grep mp3

19. For a cmake setup for ffmpeg, see:

https://www.cnblogs.com/liuxia19872003/archive/2012/11/09/2763173.html

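1. Where do library path variables in a cmake file, such as ${CURL_INCLUDE_DIR}, come from? They are set by the FindXXX.cmake modules under /usr/share/cmake-3.10/Modules/, which locate the actual directories. The commands below show which libraries the installed cmake knows how to find.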

ll -th /usr/share/cmake-3.10/Modules/ | grep Find
cmake --help-module FindCURL

The second command shows exactly what cmake's FindCURL module can locate, which is useful when writing CMakeLists.txt.

References: https://blog.csdn.net/haluoluo211/article/details/80559341
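The same listing can be produced from a script, which is handy when the Modules directory differs between machines. This is a rough sketch: the function name list_find_modules is invented here, and the directory passed in (for example /usr/share/cmake-3.10/Modules/) depends on the installed cmake version.

```python
import glob
import os


def list_find_modules(modules_dir):
    """Return the package names that have a Find<Name>.cmake file in modules_dir."""
    names = []
    for path in glob.glob(os.path.join(modules_dir, "Find*.cmake")):
        base = os.path.basename(path)
        # Strip the "Find" prefix and the ".cmake" suffix: FindCURL.cmake -> CURL
        names.append(base[len("Find"):-len(".cmake")])
    return sorted(names)
```

Each returned name is what goes into find_package(<Name>), and cmake --help-module Find<Name> then documents the variables that module sets.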

2. Installing ninja

sudo apt install ninja-build

3. A simple cmake example

CMakeLists.txt:

cmake_minimum_required(VERSION 3.5)
project(hello-world C)

add_executable(hello-world main.c)

main.c:

#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello world\n");

    return 0;
}

Build and run:

mkdir build && cd build && cmake .. && make && cd .. && build/hello-world

Common commands:

include_directories(dir): add dir to the header search path

add_subdirectory(subdir): include the subdirectory subdir, which must contain its own CMakeLists.txt

aux_source_directory(dir var): store all source files found in dir in the variable var

add_library(target src): build the library target from the source files listed in the variable src

target_link_libraries(target lib): link lib into target

${PROJECT_NAME} is the name given in the most recent project(name) call.

References: https://zhuanlan.zhihu.com/p/87283287

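4. Setting an environment variable for the current project in cmake

# Set Environment Variable
# This environment variable only affects the current cmake run; it is not visible outside it.
set(ENV{<variable>} [<value>])

# Collect all source files directly inside dir (subdirectories are not searched recursively); source files are files such as .c that a makefile would compile to .o
aux_source_directory(<dir> <variable>)

References: https://zhuanlan.zhihu.com/p/93895403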

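5. ${PROJECT_NAME} and ${CMAKE_PROJECT_NAME} must be written in upper case. Variable names are case-sensitive: a lower-case spelling is treated as a separate user-defined variable, and only the upper-case names are the ones set by project().

6. Libraries named in target_link_libraries must first have their location specified with link_directories. Header locations are specified with include_directories.

7. A simple CMake file that links against external libraries.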

#Note: if the project depends on libraries, ADD_EXECUTABLE must come after LINK_DIRECTORIES,
#       otherwise linking fails: Linking C executable main
#                   /usr/bin/ld: cannot find -lhello
#                   collect2: ld returned 1 exit status

#1) Set the minimum cmake version
cmake_minimum_required(VERSION 3.10)

#2) Set the project name
project(ffmpeg_test)

#3) Set the list of source files
set(SRC_LIST main.cpp)

#4) Add header search paths, to fix headers not being found at compile time
#COMMAND: INCLUDE_DIRECTORIES([AFTER|BEFORE] [SYSTEM] dire1 dire2 ...)
#Definition: adds one or more header search paths to the project, separated by spaces;
#       a path that contains spaces can be wrapped in double quotes.
#       By default the paths are appended to the current search path; there are 2 ways to control this:
#       1) set CMAKE_INCLUDE_DIRECTORIES_BEFORE to on with SET, to prepend by default
#       2) pass AFTER or BEFORE to choose appending or prepending
include_directories("/usr/include/x86_64-linux-gnu")

#5) Add libraries: fixes undefined references to external functions at link time
#main.cpp:(.text+0x5): undefined reference to `HelloFunc()'
#collect2: error: ld returned 1 exit status

#6) Add library search paths: fixes libraries not being found at link time
#COMMAND: LINK_DIRECTORIES(dir1 dir2 ...)
#Definition: adds non-standard search paths for shared libraries
#/usr/bin/ld: cannot find -lhello
#collect2: error: ld returned 1 exit status
#With a relative path the library did not seem to be found
link_directories("/usr/lib/x86_64-linux-gnu")

#7) Build the executable
add_executable(ffmpeg_test ${SRC_LIST})

#8) Link the libraries
#COMMAND: TARGET_LINK_LIBRARIES(target  library1
#                                <debug | optimized> library2
#                                ...)
#Definition: adds the shared libraries that target needs to link against
#TARGET_LINK_LIBRARIES(${PROJECT_NAME} hello) #link a shared library
#TARGET_LINK_LIBRARIES(${PROJECT_NAME} libhello.a)  #link a static library
target_link_libraries(${PROJECT_NAME} PRIVATE avutil avcodec avformat swscale)

References: https://www.cnblogs.com/jacklikedogs/p/3780064.html https://blog.csdn.net/yangbing1113/article/details/8036707 https://cloud.tencent.com/developer/ask/115652 https://blog.csdn.net/laibowon/article/details/103746594 https://www.cnblogs.com/xianghang123/p/3556425.html https://blog.csdn.net/turbock/article/details/90034787 https://blog.csdn.net/arackethis/article/details/43488177

8. A shared library file must end in .so to be picked up by cmake when linking by name. A file with only a versioned suffix such as .so.57 is not found.

9. cmake variables reference: https://cmake.org/cmake/help/latest/manual/cmake-variables.7.html

cmake commands reference: https://cmake.org/cmake/help/latest/manual/cmake-commands.7.html cmake tutorial: https://cmake.org/cmake/help/v3.16/guide/tutorial/index.html

10. A multi-level cmake project is laid out like this:

.
├── CMakeLists.txt
├── CMakeLists.txt.bak
├── common.c
├── common.h
├── compile.sh
├── json
│   ├── cJSON.c
│   ├── cJSON.h
│   ├── CJsonObject.cpp
│   ├── CJsonObject.hpp
│   └── CMakeLists.txt
├── main.cpp
├── main.h
├── run.sh
├── sa_in.txt
├── token.c
└── token.h

Top-level CMakeLists.txt:

cmake_minimum_required(VERSION 3.10)
project(sentiment_analysis)

find_package(CURL REQUIRED)
include_directories(${CURL_INCLUDE_DIR})

add_subdirectory(json)

add_executable(${PROJECT_NAME} main.cpp common.c token.c)
target_link_libraries(${PROJECT_NAME} ${CURL_LIBRARY} json_binary)

Subdirectory (json/) CMakeLists.txt:

project(json_binary)

add_library(${PROJECT_NAME} cJSON.c CJsonObject.cpp)

target_include_directories(${PROJECT_NAME} PUBLIC ${PROJECT_SOURCE_DIR})

References: https://www.cnblogs.com/svenzhang9527/p/10704777.html

11. Setting the C++ standard to C++11

set(CMAKE_CXX_STANDARD 11)

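References: https://www.cnblogs.com/svenzhang9527/p/10704718.html https://blog.csdn.net/justidle/article/details/105240822

12. project()

PROJECT(projectname [CXX] [C] [JAVA])

Sets the project name. The parts in [] are optional; if no language is given, C and CXX are enabled by default. This command also implicitly defines two more variables, <projectName>_BINARY_DIR and <projectName>_SOURCE_DIR. With the project named hello, these are ${hello_BINARY_DIR} and ${hello_SOURCE_DIR}. The message command can be used to print their values.

${hello_BINARY_DIR}: the directory in which cmake builds the project (main.c). Here that is build.

${hello_SOURCE_DIR}: the directory that contains the project's source code. Here that is the project root.

With an in-source build, i.e. running cmake directly in the project root, the two variables have the same value. They are rarely used in practice; most people prefer ${CMAKE_SOURCE_DIR} and ${CMAKE_BINARY_DIR}.

References: https://blog.csdn.net/Tommy_wxie/article/details/77675895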

13. Build error (at link time):

interpreter_builder.cc:(.text+0x394): undefined reference to `dlopen'
interpreter_builder.cc:(.text+0x3a4): undefined reference to `dlsym'

Adding ${CMAKE_DL_LIBS} to the linked libraries fixes it:

 target_link_libraries(${PROJECT_NAME} PRIVATE tensorflow-lite ${CMAKE_DL_LIBS})

References: https://bl.ocks.org/kwk/3595733

14. Build error (at link time):

thread_pool.cc:(.text+0x228): undefined reference to `pthread_create'
thread_pool.cc:(.text+0x238): undefined reference to `pthread_create'

Adding pthread to the linked libraries fixes it:

target_link_libraries(${PROJECT_NAME} PRIVATE tensorflow-lite pthread ${CMAKE_DL_LIBS})

References: https://stackoverflow.com/questions/956640/linux-c-error-undefined-reference-to-dlopen

15. opencv

# OpenCV 4+ requires a compiler that supports C++11 or later
set(CMAKE_CXX_FLAGS "-std=c++11")
# Find the OpenCV package
find_package(OpenCV REQUIRED)
include_directories(${OpenCV_INCLUDE_DIRS})
# use_opencv.cpp is the name of my source file; change it to yours
add_executable(use_opencv use_opencv.cpp)
# Link the OpenCV libraries
target_link_libraries(use_opencv ${OpenCV_LIBS})

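References: https://blog.csdn.net/varyshare/article/details/94162064

16. When a subdirectory needs a third-party static library, there are three approaches:

1. Build the static library into the subdirectory's library, so the top-level cmake does not need to know about the third-party library.

#link_directories("./lib")
add_library(${PROJECT_NAME} ${ALL_SOURCE_LIST} ./lib/libtensorflow-lite.a)
target_link_libraries(${PROJECT_NAME} PUBLIC ${CMAKE_DL_LIBS})

With this approach no link_directories call is needed for the third-party library; it is enough to name the library file when the subdirectory's library is created. The third-party library then becomes part of the subdirectory library's binary, so as long as the upper level does not call the third-party API directly, it needs neither the library directory nor the header directory of the third-party library.

2. Do not build the static library into the subdirectory's library; the top-level cmake then has to know about the third-party library.

add_library(${PROJECT_NAME} ${ALL_SOURCE_LIST})
target_link_libraries(${PROJECT_NAME} PRIVATE tensorflow-lite ${CMAKE_DL_LIBS})
link_directories("./sub/lib")

The subdirectory does not build the third-party library in; it only names the library to link. The top-level cmake must specify the directory that contains the library, otherwise the final link step cannot find it.

3. Similar to the first approach

link_directories("./lib")
add_library(${PROJECT_NAME} ${ALL_SOURCE_LIST})
link_libraries(${PROJECT_NAME} PRIVATE tensorflow-lite ${CMAKE_DL_LIBS})

Here the third-party library is linked right after the subdirectory is compiled and the subdirectory's binary is produced, so the top-level cmake again does not need to know about the third-party library. Note that, according to the link_libraries documentation listed below, link_libraries applies to targets created later in the directory and takes no target name or PRIVATE keyword, so this snippet may not behave exactly as described; target_link_libraries is the per-target command.

References: https://segmentfault.com/a/1190000022075547 https://zhuanlan.zhihu.com/p/149191302 https://cmake.org/cmake/help/v3.5/command/link_libraries.html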

17. add_definitions

add_definitions does the job of a #define in the code. It is written like add_definitions(-DTEST_IT_CMAKE).

option(TEST_IT_CMAKE "test" ON)
message(${TEST_IT_CMAKE})
if(TEST_IT_CMAKE)
    message("itis" ${TEST_IT_CMAKE})
    add_definitions(-DTEST_IT_CMAKE)
endif()

References: https://blog.csdn.net/qq_35699473/article/details/115837708
https://cmake.org/cmake/help/latest/command/add_definitions.html

18. Conditional compilation in cmake

The code looks like this:

option (USE_MYMATH "Use provided math implementation" ON)
if (USE_MYMATH)
    message(STATUS "USE_MYMATH")
elseif (USE_XXX)
    message(STATUS "USE_XXX")
endif()

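A string comparison such as if(${address} STREQUAL "ON") had no effect when tested, possibly because of the cmake version or for some other reason. Quoting the expansion, as in if("${address}" STREQUAL "ON"), is the commonly recommended form and may be worth trying. For details, see: https://www.cnblogs.com/lidabo/p/13846640.html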
https://www.jianshu.com/p/f0f71d36411a
https://blog.csdn.net/maizousidemao/article/details/104103279
https://www.cnblogs.com/rickyk/p/3872568.html
https://blog.csdn.net/Calvin_zhou/article/details/104025714
https://blog.csdn.net/hp_cpp/article/details/110373926
https://blog.csdn.net/lyq308152569/article/details/109388437

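19. cmake: debug information, threads, and warnings

add_definitions("-Wall -lpthread -g -rdynamic")

Note: add_definitions is intended for -D preprocessor definitions. add_compile_options is the command meant for compiler flags such as -Wall and -g, and -lpthread and -rdynamic are link-time options.

References: https://www.cnblogs.com/lidabo/p/7359422.html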
https://blog.csdn.net/liu0808/article/details/79046022/

20. Overview of gcc warning options

https://blog.csdn.net/XiaoH0_0/article/details/107513936

21. sanitize

target_link_libraries(${PROJECT_NAME} PRIVATE libtengine-lite.so libface.so ${OpenCV_LIBS} -fsanitize=address -fsanitize=leak )

cmake references:

https://www.cnblogs.com/binbinjx/p/5626916.html https://www.cnblogs.com/is-smiling/p/3269059.html https://www.cnblogs.com/coderfenghc/archive/2012/06/23/2559603.html https://blog.csdn.net/bigdog_1027/article/details/79113342

ninja references:

https://ninja-build.org/manual.html#_the_literal_phony_literal_rule https://blog.csdn.net/qiuguolu1108/article/details/103842556 https://www.jianshu.com/p/c4d3ba6c6470 https://ninja-build.org/manual.html https://github.com/ninja-build/ninja https://blog.csdn.net/u010164190/article/details/104932437 https://www.cnblogs.com/fuland/p/3641311.html https://www.cnblogs.com/fuland/p/3641314.html https://www.cnblogs.com/fuland/p/3641317.html https://www.cnblogs.com/sandeepin/p/ninja.html

1. To play a recorded pcm file directly, use:

#!/bin/bash
play -t raw -r 44.1k -e signed-integer -b 16 -c 2 loved.pcm
play -t raw -r 48k -e floating-point -b 32 -c 2 ./data_decode/out.pcm

References: https://blog.csdn.net/lc999102/article/details/80579866

2. Missing header files: json.h, curl.h

sudo apt-get install libjsoncpp-dev 
sudo ln -s /usr/include/jsoncpp/json/ /usr/include/json

sudo apt install libcurl4-openssl-dev
sudo ln -s /usr/include/x86_64-linux-gnu/curl /usr/include/curl

sudo apt-get install libopencv-dev

References: https://blog.csdn.net/zhangpeterx/article/details/92175479

3. Where QCoreApplication is defined.

QCoreApplication is declared in 5.14.2/Src/qtbase/src/corelib/kernel/qcoreapplication.h as class Q_CORE_EXPORT QCoreApplication. References: https://www.cnblogs.com/lyggqm/p/6281581.html

4. cmake configuration for ffmpeg

cmake_minimum_required(VERSION 3.10)

project(ffmpeg_test)

set(SRC_LIST main.cpp)
include_directories("/usr/include/x86_64-linux-gnu")
link_directories("/usr/lib/x86_64-linux-gnu")

add_executable(ffmpeg_test ${SRC_LIST})

#target_link_libraries(${PROJECT_NAME} libavutil.so libavcodec.so libavformat.so libavdevice.so.57 libavfilter.so libswscale.so libpostproc.so)

#target_link_libraries(${PROJECT_NAME} libavutil.so libavcodec.so libavformat.so libswscale.so)

target_link_libraries(${PROJECT_NAME} avutil avcodec avformat swscale)

References: https://blog.csdn.net/wangchao1412/article/details/103454371 https://www.jianshu.com/p/72cdcb8d06a7 https://blog.csdn.net/BigDream123/article/details/89741253

5. Using Baidu's tts requires installing the Baidu aip sdk

pip3 install baidu-aip --user

References: https://blog.csdn.net/m0_37886429/article/details/85222593

6. Converting between pcm and wav

References: https://blog.csdn.net/sinat_37816910/article/details/105054372 https://blog.csdn.net/huplion/article/details/81260874

7. The positional parameter order for opening a PCM device in alsaaudio does not match the order given in the API documentation, so do not pass the arguments positionally. Passing everything as keyword arguments works without problems.

        try:
            self.__alsaDev = alsaaudio.PCM(type = alsaaudio.PCM_PLAYBACK, mode = alsaaudio.PCM_NORMAL, rate = 16000, channels = 8, format = alsaaudio.PCM_FORMAT_S16_LE, periodsize = 160, device = "plughw:" + self.__devName)
        except Exception as e:
            print("alsaaudio open pcm exception: ", e)

8. To convert other formats to wav, use AudioSegment from pydub

    @staticmethod
    def extractToWave(srcPath, destDir = None, destPrefix = None):
        # Convert srcPath to a .wav file and return the new path.
        # Returns srcPath unchanged for .wav input and None for unsupported formats.
        (srcDir, fileName) = os.path.split(srcPath)
        (fileNoExt, ext) = os.path.splitext(fileName)
        if ext == ".wav":
            return srcPath
        if ext != ".mp3":
            return None

        if destDir is None:
            destDir = srcDir
        if destPrefix is None:
            destPrefix = ""
        # destDir is "" when srcPath has no directory part; os.makedirs("") would fail
        if destDir and not os.path.exists(destDir):
            os.makedirs(destDir)
        # os.path.join avoids producing "/name.wav" when destDir is empty
        destPath = os.path.join(destDir, destPrefix + fileNoExt + ".wav")
        data = AudioSegment.from_mp3(srcPath)
        data.export(destPath, format = "wav")
        return destPath

References: https://www.cnblogs.com/xingshansi/p/6799994.html https://ithelp.ithome.com.tw/articles/10252078 https://blog.csdn.net/baidu_29198395/article/details/86694365

9. Playing wav with alsaaudio

    def playThreadWav(self, path, index):
        if self.__alsaDev:
            self.__alsaDev.close()
            # Reset so that a failed open below is caught by the None check
            self.__alsaDev = None
        print("alsa playback wav thread run: %d" % index)

        with wave.open(path, 'rb') as f:
            self.__rate = f.getframerate()
            self.__channels = f.getnchannels()
            self.__depthBits = f.getsampwidth() * 8
            self.__format = self.bitsToFormat(self.__depthBits)
            self.__periodSize = int(self.__rate / 100)
            try:
                self.__alsaDev = alsaaudio.PCM(type = alsaaudio.PCM_PLAYBACK,
                                               mode = alsaaudio.PCM_NORMAL,
                                               rate = self.__rate,
                                               channels = self.__channels,
                                               format = self.__format,
                                               periodsize = self.__periodSize,
                                               device = "plughw:" + self.__devName)
            except Exception as e:
                print("alsaaudio open exception: ", e)

            if self.__alsaDev is None:
                print("open alsa audio device failed")
                self.clearThreadParam(index)
                return "finished"

            data = f.readframes(self.__periodSize)
            while data and not self.__eStop:
                try:
                    self.__alsaDev.write(data)
                except alsaaudio.ALSAAudioError as e:
                    print("alsa audio play except: ", e)
                    break
                data = f.readframes(self.__periodSize)

        self.afterThreadComplete(index)
        return "finished"

    def clearThreadParam(self, index):
        del self.__poolDict[index]
        self.__rate = 0
        self.__channels = 0
        self.__depthBits = 0
        self.__format = 0
        self.__periodSize = 0

    def afterThreadComplete(self, index):
        self.__alsaDev.close()
        self.__alsaDev = None
        if index in self.__hookDict:
            if self.__eStop != True:
                self.__hookDict[index]()
            del self.__hookDict[index]
        self.clearThreadParam(index)

References: https://www.programcreek.com/python/example/91453/alsaaudio.PCM

10. For websocket, install websocket-client rather than websocket: pip3 install --user websocket-client

11. Converting pcm to wav

        (file, ext) = os.path.splitext(path)
        wavPath = file + ".wav"
        EspAudioUtil.pcmToWave(path, wavPath, rate, channels, bits)
        os.remove(path)

References: https://stackoverflow.com/questions/16111038/how-to-convert-pcm-files-to-wav-files-scripting

12. Extracting a single channel from multi-channel audio

    @staticmethod
    def pcmExtractOneChannal(multiChannArray, channels, index):
        # Interleaved samples -> one row per frame, one column per channel.
        # reshape returns a new view, so the caller's array keeps its shape.
        return multiChannArray.reshape(-1, channels)[:, index]

    @staticmethod
    def pcmExtractOneChannalFile(multiPath, channels, index, dataBits, onePath):
        if dataBits != 16:
            raise ValueError("only 16-bit PCM is supported")
        audioData = np.fromfile(multiPath, dtype = np.int16)
        oneData = __class__.pcmExtractOneChannal(audioData, channels, index)
        oneData.tofile(onePath)

    @staticmethod
    def pcmExtractOneChannalBinary(multiBinary, channels, index, dataBits):
        if dataBits != 16:
            raise ValueError("only 16-bit PCM is supported")
        # np.fromstring is deprecated for binary data; np.frombuffer replaces it
        audioData = np.frombuffer(multiBinary, dtype = np.int16)
        oneData = __class__.pcmExtractOneChannal(audioData, channels, index)
        return oneData.tobytes()

References: https://www.pythonf.cn/read/128012
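When numpy is not available, the same channel extraction can be sketched with the standard library only. This is an illustration, not the code used above: the function name extract_channel is invented, and it assumes 16-bit interleaved samples.

```python
from array import array


def extract_channel(pcm_bytes, channels, index):
    """Return the bytes of one channel from 16-bit interleaved PCM."""
    samples = array("h")          # "h" = signed 16-bit sample
    samples.frombytes(pcm_bytes)  # raises ValueError on an odd byte count
    if len(samples) % channels:
        raise ValueError("data does not contain a whole number of frames")
    # Interleaved layout is c0 c1 ... c0 c1 ..., so take every channels-th sample.
    return samples[index::channels].tobytes()
```

The slice index::channels plays the role of the reshape-and-select step in the numpy version, at the cost of copying the samples once.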

13. Converting between pcm and wave

    @staticmethod
    def pcmToWave(pcmPath, wavPath, rate, channels, depthBits):
        with open(pcmPath, "rb") as pcmFile:
            print("pcm open")
            pcmData = pcmFile.read()
        with wave.open(wavPath, "wb") as wavFile:
            print(channels, int(depthBits / 8), rate)
            print(len(pcmData))
            wavFile.setparams((channels, int(depthBits / 8), rate, 0, 'NONE', 'NONE'))
            wavFile.writeframes(pcmData)

    @staticmethod
    def waveToPCM(wavPath, pcmPath, dataBits = 16):
        # Let the wave module parse the header instead of skipping a fixed
        # 44 bytes, since a WAV file may carry extra chunks before the data.
        # dataBits is kept for compatibility; the sample width comes from the file.
        with wave.open(wavPath, 'rb') as f:
            params = f.getparams()
            data = f.readframes(f.getnframes())
        with open(pcmPath, 'wb') as pcmFile:
            pcmFile.write(data)
        return params

References: https://blog.csdn.net/sinat_37816910/article/details/105054372 https://docs.python.org/3/library/wave.html
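The conversion in both directions can be checked without touching the disk by letting the wave module write into a memory buffer. The two helper names below are invented for this sketch; only the standard wave and io modules are used.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, rate, channels, depth_bits):
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(depth_bits // 8)
        w.setframerate(rate)
        w.writeframes(pcm)
    return buf.getvalue()


def wav_bytes_to_pcm(wav):
    """Return (pcm, rate, channels, depth_bits) read from WAV file contents."""
    with wave.open(io.BytesIO(wav), "rb") as r:
        pcm = r.readframes(r.getnframes())
        return pcm, r.getframerate(), r.getnchannels(), r.getsampwidth() * 8
```

Reading through wave.open rather than skipping a fixed-size header keeps the round trip correct even for files whose header is not the minimal one.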

14. To stop the baidu websocket, a cancel message has to be sent. Whether self.ws.keep_running = False is also needed is unclear.

References: https://www.coder.work/article/1269314

15. Playing audio and then stopping it immediately

cmd = "AUDIODEV=hw:realtekrt5651co play ~/esp_run/speech/test.wav"
sub = subprocess.Popen(cmd, shell = True)
print(sub.poll())
time.sleep(5)
print("kill")
print(time.time())
sub.kill()  #sub.send_signal(signal.SIGKILL)
# sub.wait()
print(time.time())
print(sub.poll())
sub = subprocess.Popen("stty echo", shell = True)
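With shell = True, sub.kill() signals the shell process, and whether the player started by that shell also stops depends on the shell. One possible variant, sketched here for POSIX systems only, starts the command in its own session and signals the whole process group. The function name run_then_stop is invented for this example.

```python
import os
import signal
import subprocess
import time


def run_then_stop(cmd, seconds):
    """Start a shell command, let it run for `seconds`, then stop it and its children."""
    # start_new_session=True puts the shell and everything it starts into a
    # new process group, so one signal reaches the player as well as the shell.
    proc = subprocess.Popen(cmd, shell=True, start_new_session=True)
    time.sleep(seconds)
    if proc.poll() is None:
        try:
            os.killpg(proc.pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # the process exited between poll() and killpg()
    proc.wait()
    return proc.returncode
```

A call such as run_then_stop("AUDIODEV=hw:realtekrt5651co play ~/esp_run/speech/test.wav", 5) mirrors the snippet above. SIGTERM is used instead of SIGKILL here so the player gets a chance to clean up, which is a design choice of this sketch rather than something the original code does.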

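16. When alsaaudio plays other audio while recording, the recording may overrun. The main cause is the set functions called for playback, but other time-consuming work, such as mp3 decoding, can cause an overrun as well.

17. If wait() reports EOFError after a subprocess has been killed, the terminal can be restored with:

stty sane

18. Searching for a directory or a file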

def searchDir(path, dirName):
    for root, dirs, files in os.walk(path):
        if dirName in dirs:
            return os.path.join(root, dirName)
    return None

def searchFile(path, fileName):
    for root, dirs, files in os.walk(path):
        if fileName in files:
            return os.path.join(root, fileName)

19. Collecting the files a project depends on

def assembleDepends(path):
    cmd = 'grep -R "import" ' + path
    f = os.popen(cmd)
    data = f.readlines()
    f.close()
    #print(data)
    dependDict = {}
    if data != None:
        for line in data:
            lineData = line[line.find(":") + 1 : ]
            if lineData.startswith("#"):
                continue
            print(lineData)
            dependList = []
            if lineData.find("from") == -1:
                lineData = lineData.replace(" ", "").replace("\n", "")
                dependList = lineData[lineData.find("import") + len("import") : ].split(",")
                #print(dependList)
                for depend in dependList:
                    if depend.startswith("esp_"):
                        dependDict[depend] = 1
            else:
                start = lineData.find("from") + len("from") + 1
                end = lineData.find("import")
                lineData = lineData[start : end].strip()
                #print(lineData)
                if lineData.startswith("esp_"):
                    dependDict[lineData] = 1
        #print(dependDict)
        return dependDict
    return None
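An alternative to grepping for "import" is to parse the source with the standard ast module, which skips comments and strings automatically. This is a different technique from the function above, shown only as a sketch. The name collect_imports is invented, and the esp_ prefix is taken from the code above.

```python
import ast


def collect_imports(source, prefix="esp_"):
    """Return the sorted module names imported by `source` that start with prefix."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import a, b as c  -> one alias per module
            for alias in node.names:
                found.add(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # from a import b   -> node.module is "a" (None for "from . import b")
            found.add(node.module)
    return sorted(name for name in found if name.startswith(prefix))
```

Calling it on the text of each .py file under a directory and merging the results gives the same kind of dependency set as assembleDepends, without starting a grep process.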

20. For noise suppression, WebRTC gives good results at a low processing cost. From python it can be used through the code at https://github.com/xiongyihui/python-webrtc-audio-processing.