Posts tagged python

Direct installation on x86

sudo apt install python3-pyqt5

or

python3 -m pip install --upgrade pip
pip3 install pyqt5==5.14.2 --user

The PyQt5 version should ideally match the installed Qt5 version. Reference: https://stackoverflow.com/questions/59711301/install-pyqt5-5-14-1-on-linux

Building from source on ARM

The installation packages are in the attachment.

Install the required development packages

sudo dnf install python3-devel
sudo dnf install qt5-devel

Install sip

tar zxvf sip-4.19.8.tar.gz
cd sip-4.19.8
python3 configure.py
make
sudo make install

Install PyQt5

tar zxvf PyQt5_gpl-5.10.1.tar.gz
cd PyQt5_gpl-5.10.1
python3 configure.py --qmake /usr/bin/qmake-qt5
make
sudo make install

To make later installations easier, the compiled trees can be packed into archives and installed with a small script.

#!/usr/bin/bash

SIP=sip-4.19.8
PYQT=PyQt5_gpl-5.10.1

rm -rf ~/${SIP}
rm -rf ~/${PYQT}

tar xvfz ./${SIP}_build.tar.gz -C ~/
tar xvfz ./${PYQT}_build.tar.gz -C ~/

cd ~/${SIP}
sudo make install

cd ~/${PYQT}
sudo make install

rm -rf ~/${SIP}
rm -rf ~/${PYQT}

echo "install done."

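Attachment

The archives inside the attachment contain the binaries built for the rk3399; the files already extracted are the binaries built for the rk3228. Attachment: pyqt5_install.tar.gz

I tried to install PySide2 on the rk3399 and ran into many problems.

  1. It needs Qt 5.12
  2. It needs clang

The full list of requirements is:

General Requirements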

    Python: 3.5+ and 2.7

    Qt: 5.12+ is recommended

    libclang: The libclang library, recommended: version 10 for PySide2 5.15. Prebuilt versions of it can be downloaded here.

    CMake: 3.1+ is needed.

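Trying to build PySide2 with Qt 5.10

It turned out that the PySide2 repository has no 5.10 branch, and that shiboken has to be built first, before PySide2 itself. I gave up. https://github.com/pyside/pyside2-setup https://wiki.qt.io/Qt_for_Python/GettingStarted

Qt 5.12

I built it by following this page: https://blog.csdn.net/qqwangfan/article/details/84964856

clang, prebuilt binaries

I downloaded the aarch64 build from https://releases.llvm.org/download.html#10.0.1 but at run time it could not find the tinfo library, so I gave up. Reportedly the ncurses library can be symlinked in its place, but I did not try that.

clang, built from source

Follow the steps at http://clang.llvm.org/get_started.html but note that the default is a debug build, whose output did not fit on the disk. With the build type set to release, the output is only about 1.7 GB. At the time I checked out the last commit of the 10.x branch from GitHub. For the configuration options see: https://llvm.org/docs/CMake.html Reference: https://blog.csdn.net/petersmart123/article/details/78418765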

1. cv2.GaussianBlur()

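def GaussianBlur(src, ksize, sigmaX, dst=None, sigmaY=None, borderType=None):
"""
Blur an image with a Gaussian filter
Argument:
    src: source image
    dst: destination image
    ksize: Gaussian kernel size, (width, height); both must be positive and odd; if set to 0, they are computed from sigma;
    sigmaX: Gaussian kernel standard deviation in the X direction;
    sigmaY: Gaussian kernel standard deviation in the Y direction;
        if sigmaY is 0, it is set equal to sigmaX;
        if both are 0, they are computed from ksize;
    (specifying ksize, sigmaX and sigmaY explicitly is recommended)
    borderType: pixel extrapolation method
"""

Reference: https://www.cnblogs.com/chenzhen0530/p/10742536.html https://blog.csdn.net/wuqindeyunque/article/details/103694900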

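2. cv2.imshow() displays an image in a window. For example:

img = cv2.imread('3.jpg',1)
cv2.imshow('imshow',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.namedWindow('image', cv2.WINDOW_NORMAL)
cv2.imshow('image',img)
cv2.waitKey(0)
cv2.destroyAllWindows()

cv2.namedWindow() specifies whether the window can be resized. The default flag is cv2.WINDOW_AUTOSIZE; with cv2.WINDOW_NORMAL the window can be resized. This is handy when the image is too large or when trackbars are added to the window.

cv2.waitKey(0) is a keyboard-binding function that waits for a key press. Without it the image window would flash and disappear immediately, so call it to keep the window on screen. Its argument is a time in milliseconds: the function waits that long for any keyboard event, and 0 means wait indefinitely. If a key is pressed within that time, the program continues. The return value can also be compared against a specific key.

cv2.destroyAllWindows() destroys all the windows we created. To destroy one specific window, use cv2.destroyWindow() and pass the exact window name, that is, the string used when the window was created.

Reference: https://blog.csdn.net/weixin_38383877/article/details/82659779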

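3. waitKey() waits for a period of time and returns the key the user pressed. With an argument of 0 it blocks until a key is pressed, so a loop body runs once and then waits for the user. With a non-zero argument it waits at most that many milliseconds, and if no key was pressed the loop simply continues.

ord() returns the ASCII code of a character, which is compared against the return value of waitKey().

if cv2.waitKey(1) & 0xFF == ord("q"):

Reference: https://blog.csdn.net/qq_39377418/article/details/101393007 https://www.runoob.com/python/python-func-ord.html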
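The `& 0xFF` in the check above keeps only the low byte of the value returned by waitKey(), which matters when the return value carries extra high bits. A minimal sketch of that comparison in plain Python, without OpenCV; the helper name `is_key` is made up for illustration:

```python
def is_key(code, char):
    """Return True if a waitKey()-style return code corresponds to char."""
    if code < 0:
        # waitKey() returns -1 when no key was pressed before the timeout
        return False
    # Keep only the low 8 bits before comparing with the character code
    return (code & 0xFF) == ord(char)
```

In a capture loop this would be used as `if is_key(cv2.waitKey(1), "q"): break`.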

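4. release() releases the video capture; afterwards call cv2.destroyAllWindows() to close all windows.

Reference: https://www.jianshu.com/p/949683764115

5. imutils.resize() resizes an image while preserving its aspect ratio.

Reference: https://www.jianshu.com/p/bb34ddf2a947

6. cv2.cvtColor() converts an image between color spaces.

def cvtColor(src, code, dst=None, dstCn=None):
"""
Convert the color space of an image
Argument:
    src: source image;
    code: color space conversion code;
    dst: destination image; same size and depth as the source;
    dstCn: number of channels in the destination image; with the default None it is derived from src and code;
"""

Reference: https://www.cnblogs.com/chenzhen0530/p/10741264.html

7. cv2.absdiff() computes the per-pixel absolute difference of two images. It is mostly used on grayscale images, to see where two frames differ.

Reference: https://blog.csdn.net/u014737138/article/details/80388482 https://zhuanlan.zhihu.com/p/42940310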
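To make the idea concrete, here is a small sketch of what an absolute difference means per pixel, written in plain Python on nested lists rather than with OpenCV. The function name `absdiff_gray` and the sample frames are invented for this illustration:

```python
def absdiff_gray(a, b):
    """Per-pixel absolute difference of two equally sized grayscale images
    given as lists of rows."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same size")
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

prev_frame = [[10, 10, 10], [10, 200, 10]]
next_frame = [[10, 12, 10], [10, 50, 10]]
diff = absdiff_gray(prev_frame, next_frame)
```

Unchanged pixels come out as 0, so a threshold on `diff` isolates the regions that moved between the two frames.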

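8. cv2.threshold() applies a threshold.

def threshold(src, thresh, maxval, type, dst=None):
"""
Apply a fixed-level threshold to a multi-channel array
    e.g. to turn a grayscale image into a binary image, to remove noise of a given level, or to filter out pixels whose values are too small or too large;
Argument:
    src: source image
    dst: destination image
    thresh: threshold value
    type: thresholding type; the specific types are listed in the references;
    maxval: the value to assign when type is THRESH_BINARY or THRESH_BINARY_INV;
"""

Reference: https://www.cnblogs.com/chenzhen0530/p/10742540.html https://www.cnblogs.com/yinliang-liang/p/9293310.html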

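9. cv2.dilate() performs morphological dilation, which grows the white (foreground) regions.

dst = cv2.dilate(src, kernel, anchor=anchor, iterations=iterations, borderType=borderType, borderValue=borderValue)
        src: input image matrix, a binarized image
        kernel: structuring element used for the dilation; can be obtained from getStructuringElement()
        anchor: anchor point, default (-1,-1)
        iterations: number of times dilation is applied, default 1
        borderType: border type
        borderValue: border value

Reference: https://www.cnblogs.com/silence-cho/p/11069903.html https://zhuanlan.zhihu.com/p/110330329 https://www.aiuai.cn/aifarm350.html https://www.cnblogs.com/my-love-is-python/p/10394908.html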

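10. cv2.findContours() finds contours; it is best to pass it a copy of the original image.

Reference: https://www.cnblogs.com/wmy-ncut/p/9889294.html https://blog.csdn.net/gaoranfighting/article/details/34877549

11. imutils.grab_contours() returns the contours and is meant to be used together with cv2.findContours(). cv2.findContours() returns two values in OpenCV 2.4 and 4.x but three values in OpenCV 3.x; imutils.grab_contours() picks the contour list out of whichever tuple is returned.

Reference: https://blog.csdn.net/nima1994/article/details/90542992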
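The version handling can be sketched in a few lines. This is only an illustration of the idea, not the imutils source: it assumes the two-value form is (contours, hierarchy) and the three-value form is (image, contours, hierarchy).

```python
def grab_contours(result):
    """Pick the contour list out of a cv2.findContours() return value."""
    if len(result) == 2:
        # Two values: (contours, hierarchy)
        return result[0]
    if len(result) == 3:
        # Three values: (image, contours, hierarchy)
        return result[1]
    raise ValueError("unexpected findContours() result of length %d" % len(result))
```

Calling it on the raw return value keeps the rest of the code independent of the installed OpenCV version.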

13. cv2.contourArea() computes the area enclosed by a contour.

Reference: https://blog.csdn.net/greatwall_sdut/article/details/108862018

14. cv2.boundingRect() returns the smallest upright rectangle that encloses the detected shape.

Reference: https://www.cnblogs.com/Anita9002/p/8033101.html

15. cv2.rectangle() draws a rectangle on an image.

Reference: https://blog.csdn.net/Gaowang_1/article/details/103087922

16. cv2.putText() draws text on an image.

Reference: https://blog.csdn.net/GAN_player/article/details/78155283

17. For what cv2.FONT_HERSHEY_SIMPLEX and the other fonts look like, see:

https://blog.csdn.net/hgkdzbf6/article/details/102093323

18. Detailed use of morphological operations, including the effect of different kernels.

Reference: https://blog.csdn.net/sunny2038/article/details/9137759

19. cv2.flip() flips an image.

flip(src, flipCode[, dst])

flipCode    Meaning
1           horizontal flip
0           vertical flip
-1          horizontal and vertical flip

Reference: https://blog.csdn.net/JNingWei/article/details/78753607
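The three flipCode cases can be illustrated on a tiny image stored as nested lists. This is a plain-Python sketch of the semantics in the table, not OpenCV's implementation, and the function name `flip` merely mirrors the OpenCV one:

```python
def flip(rows, flip_code):
    """Flip an image given as a list of rows, following the flipCode table."""
    if flip_code > 0:
        # Horizontal flip: reverse the pixels inside every row
        return [row[::-1] for row in rows]
    if flip_code == 0:
        # Vertical flip: reverse the order of the rows
        return [row[:] for row in rows[::-1]]
    # Negative code: both at once
    return [row[::-1] for row in rows[::-1]]

img = [[1, 2, 3],
       [4, 5, 6]]
```

A horizontal flip is what a webcam preview usually needs so that the picture behaves like a mirror.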

20. Taking a sub-region out of a matrix.

cv::Mat::Mat    (   const Mat &     m,
        const Range &   rowRange,
        const Range &   colRange = Range::all() 
    )   

roi = frame[0 : 300, 0: 300]

Note that the first range is height / rows and the second is width / cols, and that each range is half-open (start inclusive, end exclusive). Reference: https://docs.opencv.org/master/d3/d63/classcv_1_1Mat.html#a92a3e9e5911a2eb0cf0950a0a9670c76

21. Getting the new size after an aspect-preserving resize.

            ret, frame = capture.read()
            width = 500
            frame = imutils.resize(frame, width)
            # shape is (rows, cols, channels), i.e. (height, width, channels)
            height, width = frame.shape[:2]
            print("width %d, height %d\n" % (width, height))

Reference: https://vimsky.com/zh-tw/examples/detail/python-method-imutils.resize.html
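The resulting size can also be predicted without touching the image. The sketch below assumes that an aspect-preserving resize scales the height by the same ratio as the width and truncates it to an integer, which is how imutils.resize is understood to behave; the helper name `resized_shape` is hypothetical.

```python
def resized_shape(height, width, new_width):
    """Return (height, width) after scaling to new_width with the
    aspect ratio preserved."""
    ratio = new_width / float(width)
    return int(height * ratio), new_width
```

For a 640x480 camera frame resized to a width of 500, `resized_shape(480, 640, 500)` gives a height of 375.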

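22. Converting a color image to HSV. This is mainly a preprocessing step: a color range is then thresholded to detect objects, typically after background removal.

Hue (H): the name of the color, such as red or yellow. Saturation (S): the purity of the color; the higher the value the purer the color, and lower values fade toward gray; range 0-100%. Value (V): brightness; range 0-100%. For 8-bit images OpenCV stores H as 0-179 and S and V as 0-255.

hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

Reference: https://shengyu7697.github.io/blog/2020/03/22/Python-OpenCV-rgb-to-hsv/ https://blog.csdn.net/u012193416/article/details/79312798 https://docs.opencv.org/3.4/da/d97/tutorial_threshold_inRange.html

23. Creating an array of type uint8

skin = np.array([0, 20, 70], dtype = np.uint8)

Reference: https://numpy.org/doc/stable/reference/generated/numpy.array.html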

24. Check whether each element lies within a range: elements inside the range are set to 255 and elements outside it to 0. This is another way to binarize an image.

inRange()
void cv::inRange    (   InputArray      src,
        InputArray      lowerb,
        InputArray      upperb,
        OutputArray     dst 
    )       
Python:
    dst =   cv.inRange( src, lowerb, upperb[, dst]  )

#include <opencv2/core.hpp>

Checks if array elements lie between the elements of two other arrays.

The function checks the range as follows:

    For every element of a single-channel input array:

    dst(I) = lowerb(I)_0 ≤ src(I)_0 ≤ upperb(I)_0

    For two-channel arrays:

    dst(I) = lowerb(I)_0 ≤ src(I)_0 ≤ upperb(I)_0 ∧ lowerb(I)_1 ≤ src(I)_1 ≤ upperb(I)_1

    and so forth.

That is, dst (I) is set to 255 (all 1 -bits) if src (I) is within the specified 1D, 2D, 3D, ... box and 0 otherwise.

When the lower and/or upper boundary parameters are scalars, the indexes (I) at lowerb and upperb in the above formulas should be omitted.

mask = cv2.inRange(hsv, skin_low, skin_up)
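The per-channel check can be written out in plain Python to show what the mask contains. This is a sketch on nested lists, not OpenCV's implementation; the lower bound reuses the [0, 20, 70] values from the uint8 example, while the upper bound (20, 255, 255) and the sample pixels are assumptions made for the illustration.

```python
def in_range(pixels, lower, upper):
    """Return a mask with 255 where every channel of a pixel lies within
    [lower, upper] (both bounds inclusive) and 0 elsewhere."""
    return [
        [255 if all(lo <= v <= hi for v, lo, hi in zip(px, lower, upper)) else 0
         for px in row]
        for row in pixels
    ]

hsv = [[(5, 100, 200), (90, 100, 200)],
       [(10, 10, 200), (20, 255, 255)]]
mask = in_range(hsv, (0, 20, 70), (20, 255, 255))
```

A pixel passes only if all of its channels are inside the box, so a hue of 90 or a saturation of 10 is enough to reject it.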

25. Drawing contours

            contours = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
            contours = imutils.grab_contours(contours)
            cv2.drawContours(frame, contours, -1, (0, 255, 0), 3)

Reference: https://stackoverflow.com/questions/48948769/how-to-draw-contours-using-opencv-in-python

26. Computing a contour perimeter or a curve length

arcLength()
double cv::arcLength    (   InputArray      curve,
        bool    closed 
    )       
Python:
    retval  =   cv.arcLength(   curve, closed   )

#include <opencv2/imgproc.hpp>

Calculates a contour perimeter or a curve length.

The function computes a curve length or a closed contour perimeter.

Parameters
    curve   Input vector of 2D points, stored in std::vector or Mat.
    closed  Flag indicating whether the curve is closed or not. 

cv2.arcLength(cnt,True)

Reference: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga8d26483c636be6b35c3ec6335798a47c https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_contours/py_contour_features/py_contour_features.html
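What the `closed` flag changes is easy to see in a plain-Python version of the computation: the length is the sum of the segment lengths, plus the segment from the last point back to the first when the curve is closed. The function name `arc_length` and the sample square are made up for this sketch.

```python
import math

def arc_length(points, closed):
    """Sum of the segment lengths of a polyline given as (x, y) tuples."""
    pairs = list(zip(points, points[1:]))
    if closed and len(points) > 1:
        # A closed curve also includes the segment back to the start
        pairs.append((points[-1], points[0]))
    return sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in pairs)

square = [(0, 0), (4, 0), (4, 3), (0, 3)]
```

With the 4 by 3 rectangle above, the closed perimeter is 14 while the open polyline is only 11 long.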

27. Approximating a curve or polygon with another curve or polygon that has fewer vertices

approxPolyDP()
void cv::approxPolyDP   (   InputArray      curve,
        OutputArray     approxCurve,
        double      epsilon,
        bool    closed 
    )       
Python:
    approxCurve =   cv.approxPolyDP(    curve, epsilon, closed[, approxCurve]   )

#include <opencv2/imgproc.hpp>

Approximates a polygonal curve(s) with the specified precision.

The function cv::approxPolyDP approximates a curve or a polygon with another curve/polygon with less vertices so that the distance between them is less or equal to the specified precision. It uses the Douglas-Peucker algorithm http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm

Parameters
    curve   Input vector of a 2D point stored in std::vector or Mat
    approxCurve Result of the approximation. The type should match the type of the input curve.
    epsilon Parameter specifying the approximation accuracy. This is the maximum distance between the original curve and its approximation.
    closed  If true, the approximated curve is closed (its first and last vertices are connected). Otherwise, it is not closed. 

            epsilon = 0.0005 * cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, epsilon, True)

28. Finding the convex hull of a contour, which can be used for gesture recognition

hull = cv2.convexHull(cnt)

Reference: https://www.cnblogs.com/jclian91/p/9728488.html https://kk665403.pixnet.net/blog/post/403518029-%5Bpython%5D-%E5%88%A9%E7%94%A8opencv%E7%B9%AA%E8%A3%BD%E5%87%B8%E5%8C%85(convexhull)-%E8%BC%AA%E5%BB%93(contour

29. Displaying the convex hull. cv2.convexHull() returns a single array of points, while drawContours expects a list of contours, so the hull has to be wrapped in a list (one more dimension) before it can be drawn.

            hull = cv2.convexHull(cnt)
            hull_list = []
            hull_list.append(hull)
            cv2.drawContours(frame, hull_list, -1, (0, 0, 255), 3)

Reference: https://stackoverflow.com/questions/36683556/drawing-convex-hull-of-the-biggest-contour-using-opencv-c https://docs.opencv.org/3.4/d7/d1d/tutorial_hull.html https://docs.opencv.org/master/d7/d1d/tutorial_hull.html

30. Creating a two-dimensional array.

kernel = np.ones((3, 3), np.uint8)

The result is a two-dimensional array:

[[1 1 1]
 [1 1 1]
 [1 1 1]]

Reference: https://numpy.org/doc/stable/reference/generated/numpy.ones.html

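31. shape does not only describe an image's height, width and number of channels; it is simply the shape of the underlying matrix: the number of rows, the number of columns, and the length of the array that represents each pixel.

The channel count is 1 for grayscale, 3 for RGB, 4 for RGBA, and 2 for RGB555 and RGB565. Seen this way it is effectively the number of bytes used to represent one pixel. A single RGB pixel prints as:

[22 22 15]

One row of an RGB image prints as:

[[22 22 15]
 [18 21 13]
 ...
 [127 126 127]]

A whole RGB image prints as:

[[[22 22 15]
  [18 21 13]
  ...
  [127 126 127]]
 ...
 [[10 15 16]
  [15 13 12]
  ...
  [5 5 5]]]

Reference: https://blog.csdn.net/qq_28618765/article/details/78618724 https://blog.csdn.net/mvtechnology/article/details/9008499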

32. Printing exceptions

import traceback
try:
    2/0
except Exception as e:
    traceback.print_exc()

Reference: https://blog.csdn.net/feiyang5260/article/details/86661103
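When the traceback should go to a log or a GUI instead of stderr, traceback.format_exc() returns the same text as a string. A small sketch; the wrapper name `describe_failure` is invented for the example:

```python
import traceback

def describe_failure(func):
    """Call func() and return the formatted traceback as a string,
    or None if no exception was raised."""
    try:
        func()
    except Exception:
        return traceback.format_exc()
    return None

report = describe_failure(lambda: 2 / 0)
```

The returned string contains the exception type and message together with the stack, so it can be written to a file or shown in a dialog.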

33. Using a lambda to pick the contour with the largest area:

cnt = max(contours, key = lambda x: cv2.contourArea(x))

Reference: https://www.cnblogs.com/bjwu/articles/9028399.html
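The same max(..., key=...) pattern works with any function of one argument, so the lambda wrapper is optional and `key=cv2.contourArea` would do as well. The sketch below shows the pattern without OpenCV, using the shoelace formula as a stand-in area function; `polygon_area` and the sample polygons are assumptions made for the illustration.

```python
def polygon_area(points):
    """Area of a simple polygon given as a list of (x, y) tuples,
    computed with the shoelace formula."""
    total = 0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

contours = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],   # 2 x 2 square
    [(0, 0), (5, 0), (5, 3), (0, 3)],   # 5 x 3 rectangle
    [(0, 0), (3, 0), (0, 3)],           # right triangle
]
# Pass the function itself as the key; no lambda is needed
largest = max(contours, key=polygon_area)
```

max() calls the key once per element and returns the element itself, not the area.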

34. Convexity defects: the function is convexityDefects

convexityDefects()
void cv::convexityDefects   (   InputArray      contour,
        InputArray      convexhull,
        OutputArray     convexityDefects 
    )       
Python:
    convexityDefects    =   cv.convexityDefects(    contour, convexhull[, convexityDefects] )

    epsilon = 0.0005 * cv2.arcLength(cnt, True)
    approx = cv2.approxPolyDP(cnt, epsilon, True)
    hull = cv2.convexHull(approx, returnPoints = False)
    defects = cv2.convexityDefects(approx, hull)

As shown above: first simplify the contour into a simpler polygon, then compute the convex hull, and finally compute the convexity defects. Printing hull gives:

[[165]
 [163]
 [128]
 [124]
 ...
 [209]
 [208]
 [168]
 [166]]

Printing defects gives:

[[[166 168 167 114]]
 [[168 208 189 10549]]
 [[209 243 229 11994]]
 ...
 [[122 124 123 114]]
 [[124 128 125 181]]
 [[128 163 152 1784]]
 [[163 165 164 186]]]

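Each row of defects holds four numbers: the start point, the end point, the farthest point, and the distance from the farthest point to the hull. The first three are indices into the contour that was passed in (approx here); the fourth is a fixed-point distance, and dividing it by 256.0 gives the distance in pixels. Printing approx gives:

[[[193 92]]
 [[192 92]]
 ...
 [[192 109]]
 [[193 109]]]

    for i in range(defects.shape[0]):
        s, e, f, d = defects[i, 0]
        start = tuple(approx[s][0])
        end = tuple(approx[e][0])
        far = tuple(approx[f][0])
        pt = (100, 180)

As shown above, when i = 0, s, e, f and d are unpacked from the array at row 0, column 0 of defects, [166 168 167 114], so s = 166, e = 168, f = 167, d = 114. Then start = tuple(approx[s][0]) takes the array at row 166, column 0 of approx and converts it to a tuple. That array holds two values, the x and y coordinates of the pixel.

Reference: https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#gada4437098113fd8683c932e0567f47ba https://docs.opencv.org/master/d7/d1d/tutorial_hull.html https://zhuanlan.zhihu.com/p/56360621 https://vimsky.com/zh-tw/examples/detail/python-method-cv2.convexityDefects.html https://zhuanlan.zhihu.com/p/140384182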
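The index lookups can be wrapped in a small helper that turns every defect row into coordinates and a depth in pixels. This is a sketch on nested lists with the same nesting as the printed arrays; the helper name `decode_defects` and the toy contour are made up, and the depth is assumed to be fixed-point with 8 fractional bits, hence the division by 256.0.

```python
def decode_defects(defects, contour):
    """Turn rows of (start_idx, end_idx, far_idx, fixed_point_depth)
    into point coordinates and a depth in pixels."""
    decoded = []
    for row in defects:
        s, e, f, d = row[0]
        decoded.append({
            "start": tuple(contour[s][0]),
            "end": tuple(contour[e][0]),
            "far": tuple(contour[f][0]),
            "depth": d / 256.0,
        })
    return decoded

# Toy data: the hull edge runs from (4, 0) to (4, 4) and the point (2, 1)
# lies 2 pixels inside it, stored as 2 * 256 = 512.
contour = [[[0, 0]], [[4, 0]], [[2, 1]], [[4, 4]], [[0, 4]]]
defects = [[[1, 3, 2, 512]]]
result = decode_defects(defects, contour)
```

Filtering on the decoded depth is a common way to keep only the deep gaps, such as the valleys between fingers.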

35. Gesture recognition needs the convex hull and the convexity defects.

See: https://blog.csdn.net/qq_41562704/article/details/88975569 https://zhuanlan.zhihu.com/p/140384182 https://www.pythonf.cn/read/29390 https://blog.csdn.net/weixin_44885615/article/details/97811684

36. Drawing Chinese text in place of putText, using PIL

import cv2
import numpy
from PIL import Image, ImageDraw, ImageFont

class EspVisionUtil:
    @staticmethod
    def cv2ImgAddText(img, txt, left, top, color = (0, 255, 0), size = 20):
        # If this is an OpenCV image, convert BGR to RGB, because PIL and OpenCV use different channel orders
        if (isinstance(img, numpy.ndarray)):
            img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        # Create the drawing object
        draw = ImageDraw.Draw(img)
        # Font settings
        fontStyle = ImageFont.truetype("SourceHanSansCN-Light.otf", size,
                                       encoding = "utf-8")
        # Draw the text
        draw.text((left, top), txt, color, font = fontStyle)
        # Convert back to the OpenCV format
        return cv2.cvtColor(numpy.asarray(img), cv2.COLOR_RGB2BGR)

Reference: https://blog.csdn.net/ctwy291314/article/details/91492048 https://www.cnblogs.com/vipstone/p/8998249.html https://blog.csdn.net/javastart/article/details/88796482 https://blog.csdn.net/zizi7/article/details/70145150 https://www.cnblogs.com/arkenstone/p/6961453.html https://www.cnblogs.com/YouXiangLiThon/p/7815124.html https://blog.csdn.net/qq_41895190/article/details/90513453 https://zhuanlan.zhihu.com/p/161385206

1. To play a recorded pcm file directly, use:

#!/bin/bash
play -t raw -r 44.1k -e signed-integer -b 16 -c 2 loved.pcm
play -t raw -r 48k -e floating-point -b 32 -c 2 ./data_decode/out.pcm

Reference: https://blog.csdn.net/lc999102/article/details/80579866

2. Missing header files: json.h and curl.h cannot be found.

sudo apt-get install libjsoncpp-dev 
sudo ln -s /usr/include/jsoncpp/json/ /usr/include/json

sudo apt install libcurl4-openssl-dev
sudo ln -s /usr/include/x86_64-linux-gnu/curl /usr/include/curl

sudo apt-get install libopencv-dev

Reference: https://blog.csdn.net/zhangpeterx/article/details/92175479

3. Cannot find where QCoreApplication is defined.

QCoreApplication is in 5.14.2/Src/qtbase/src/corelib/kernel/qcoreapplication.h, declared as class Q_CORE_EXPORT QCoreApplication. Reference: https://www.cnblogs.com/lyggqm/p/6281581.html

4. CMake configuration for ffmpeg

cmake_minimum_required(VERSION 3.10)

project(ffmpeg_test)

set(SRC_LIST main.cpp)
include_directories("/usr/include/x86_64-linux-gnu")
link_directories("/usr/lib/x86_64-linux-gnu")

add_executable(ffmpeg_test ${SRC_LIST})

#target_link_libraries(${PROJECT_NAME} libavutil.so libavcodec.so libavformat.so libavdevice.so.57 libavfilter.so libswscale.so libpostproc.so)

#target_link_libraries(${PROJECT_NAME} libavutil.so libavcodec.so libavformat.so libswscale.so)

target_link_libraries(${PROJECT_NAME} avutil avcodec avformat swscale)

Reference: https://blog.csdn.net/wangchao1412/article/details/103454371 https://www.jianshu.com/p/72cdcb8d06a7 https://blog.csdn.net/BigDream123/article/details/89741253

5. Using Baidu TTS requires installing the Baidu AIP SDK

pip3 install baidu-aip --user

Reference: https://blog.csdn.net/m0_37886429/article/details/85222593

6. Converting between pcm and wav

Reference: https://blog.csdn.net/sinat_37816910/article/details/105054372 https://blog.csdn.net/huplion/article/details/81260874

7. When opening a PCM device with alsaaudio, the positional parameter order is unreliable, so do not write the arguments in the order shown in the API docs. Passing everything as keyword arguments works without problems.

        try:
            self.__alsaDev = alsaaudio.PCM(type = alsaaudio.PCM_PLAYBACK, mode = alsaaudio.PCM_NORMAL, rate = 16000, channels = 8, format = alsaaudio.PCM_FORMAT_S16_LE, periodsize = 160, device = "plughw:" + self.__devName)
        except Exception as e:
            print("alsaaudio open pcm exception: ", e)

8. Converting other formats to wav with AudioSegment from pydub

    @staticmethod
    def extractToWave(srcPath, destDir = None, destPrefix = None):
        # Requires: import os; from pydub import AudioSegment
        (srcDir, fileName) = os.path.split(srcPath)
        (fileNoExt, ext) = os.path.splitext(fileName)
        if ext == ".wav":
            return srcPath

        if destDir is None:
            # srcDir is "" for a bare file name; fall back to the current directory
            destDir = srcDir or "."
        if destPrefix is None:
            destPrefix = ""
        # exist_ok avoids a race between checking for the directory and creating it
        os.makedirs(destDir, exist_ok = True)
        destName = destPrefix + fileNoExt + ".wav"
        destPath = os.path.join(destDir, destName)
        if ext == ".mp3":
            data = AudioSegment.from_mp3(srcPath)
        else:
            # Only mp3 input is handled here
            return None
        data.export(destPath, format = "wav")
        return destPath

Reference: https://www.cnblogs.com/xingshansi/p/6799994.html https://ithelp.ithome.com.tw/articles/10252078 https://blog.csdn.net/baidu_29198395/article/details/86694365

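9. Playing wav files with alsaaudio

    def playThreadWav(self, path, index):
        if self.__alsaDev:
            self.__alsaDev.close()
            # Reset so that the "is None" check below catches a failed reopen
            self.__alsaDev = None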
        print("alsa playback wav thread run: %d" % index)

        with wave.open(path, 'rb') as f:
            self.__rate = f.getframerate()
            self.__channels = f.getnchannels()
            self.__depthBits = f.getsampwidth() * 8
            self.__format = self.bitsToFormat(self.__depthBits)
            self.__periodSize = int(self.__rate / 100)
            try:
                self.__alsaDev = alsaaudio.PCM(type = alsaaudio.PCM_PLAYBACK,
                                               mode = alsaaudio.PCM_NORMAL,
                                               rate = self.__rate,
                                               channels = self.__channels,
                                               format = self.__format,
                                               periodsize = self.__periodSize,
                                               device = "plughw:" + self.__devName)
            except Exception as e:
                print("alsaaudio open exception: ", e)

            if self.__alsaDev is None:
                print("open alsa audio device failed")
                self.clearThreadParam(index)
                return "finished"

            data = f.readframes(self.__periodSize)
            while data and self.__eStop == False:
                try:
                    self.__alsaDev.write(data)
                except ALSAAudioError as e:
                    print("alsa audio play except: ", e)
                    break
                data = f.readframes(self.__periodSize)

        self.afterThreadComplete(index)
        return "finished"

    def clearThreadParam(self, index):
        del self.__poolDict[index]
        self.__rate = 0
        self.__channels = 0
        self.__depthBits = 0
        self.__format = 0
        self.__periodSize = 0

    def afterThreadComplete(self, index):
        self.__alsaDev.close()
        self.__alsaDev = None
        if index in self.__hookDict:
            if self.__eStop != True:
                self.__hookDict[index]()
            del self.__hookDict[index]
        self.clearThreadParam(index)

Reference: https://www.programcreek.com/python/example/91453/alsaaudio.PCM

10. For websocket support, install websocket-client (pip3 install --user websocket-client), not websocket

11. Converting pcm to wav

        (file, ext) = os.path.splitext(path)
        wavPath = file + ".wav"
        EspAudioUtil.pcmToWave(path, wavPath, rate, channels, bits)
        os.remove(path)

Reference: https://stackoverflow.com/questions/16111038/how-to-convert-pcm-files-to-wav-files-scripting

12. Extracting one channel from multi-channel audio

    @staticmethod
    def pcmExtractOneChannal(multiChannArray, channels, index):
        # Samples are interleaved, so view the data as (frames, channels)
        # and take one column; reshape leaves the caller's array untouched.
        return multiChannArray.reshape(-1, channels)[:, index]

    @staticmethod
    def pcmExtractOneChannalFile(multiPath, channels, index, dataBits, onePath):
        if dataBits != 16:
            raise ValueError("only 16-bit samples are supported")
        with open(multiPath, 'rb') as f:
            audioData = np.fromfile(f, dtype = np.uint16)
        oneData = __class__.pcmExtractOneChannal(audioData, channels, index)
        oneData.tofile(onePath)

    @staticmethod
    def pcmExtractOneChannalBinary(multiBinary, channels, index, dataBits):
        if dataBits != 16:
            raise ValueError("only 16-bit samples are supported")
        # np.fromstring is deprecated for binary data; np.frombuffer replaces it
        audioData = np.frombuffer(multiBinary, dtype = np.uint16)
        oneData = __class__.pcmExtractOneChannal(audioData, channels, index)
        return oneData.tobytes()

Reference: https://www.pythonf.cn/read/128012
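The same de-interleaving can be done with the standard library alone, which is handy on a board where NumPy is not installed. A minimal sketch assuming 16-bit samples in the machine's native byte order; the function name `extract_channel` is hypothetical:

```python
import array

def extract_channel(pcm_bytes, channels, index):
    """Return the bytes of one channel from interleaved 16-bit PCM data."""
    samples = array.array("h")      # "h" = signed 16-bit integers
    samples.frombytes(pcm_bytes)
    # Interleaved layout: ch0, ch1, ..., ch0, ch1, ... so take every
    # channels-th sample starting at the wanted channel
    return samples[index::channels].tobytes()

stereo = array.array("h", [1, -1, 2, -2, 3, -3]).tobytes()
left = extract_channel(stereo, 2, 0)
right = extract_channel(stereo, 2, 1)
```

The extended slice `samples[index::channels]` plays the role of the reshape-and-pick-a-column step in the NumPy version.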

13. Converting between pcm and wave

    @staticmethod
    def pcmToWave(pcmPath, wavPath, rate, channels, depthBits):
        with open(pcmPath, "rb") as pcmFile:
            print("pcm open")
            pcmData = pcmFile.read()
        with wave.open(wavPath, "wb") as wavFile:
            print(channels, int(depthBits / 8), rate)
            print(len(pcmData))
            wavFile.setparams((channels, int(depthBits / 8), rate, 0, 'NONE', 'NONE'))
            wavFile.writeframes(pcmData)

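    @staticmethod
    def waveToPCM(wavPath, pcmPath, dataBits = 16):
        # dataBits is kept for compatibility with existing callers; the
        # sample width is read from the wav header instead.
        # Let the wave module parse the header rather than assuming that it
        # is exactly 44 bytes long (extra chunks make it longer).
        with wave.open(wavPath, 'rb') as f:
            params = f.getparams()
            frames = f.readframes(params.nframes)
        with open(pcmPath, 'wb') as pcmFile:
            pcmFile.write(frames)
        return params

Reference: https://blog.csdn.net/sinat_37816910/article/details/105054372 https://docs.python.org/3/library/wave.html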
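A quick way to check such a conversion is a round trip in memory. The sketch below uses only the standard wave and io modules; the function names `pcm_to_wav_bytes` and `wav_bytes_to_pcm` and the sample data are invented for the illustration.

```python
import io
import wave

def pcm_to_wav_bytes(pcm, rate, channels, depth_bits):
    """Wrap raw PCM bytes in a wav container and return the wav bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(depth_bits // 8)
        wav_file.setframerate(rate)
        wav_file.writeframes(pcm)
    return buf.getvalue()

def wav_bytes_to_pcm(wav_bytes):
    """Return (raw PCM bytes, wav parameters) from wav bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        params = wav_file.getparams()
        return wav_file.readframes(params.nframes), params

pcm = bytes(range(16)) * 10
wav_bytes = pcm_to_wav_bytes(pcm, 16000, 1, 16)
restored, params = wav_bytes_to_pcm(wav_bytes)
```

If the restored bytes equal the input and the parameters match, the header was written and parsed consistently.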

14. To stop the Baidu websocket, a cancel message has to be sent; whether self.ws.keep_running = False is also needed is unclear.

Reference: https://www.coder.work/article/1269314

15. Playing audio and stopping it right away

cmd = "AUDIODEV=hw:realtekrt5651co play ~/esp_run/speech/test.wav"
sub = subprocess.Popen(cmd, shell = True)
print(sub.poll())
time.sleep(5)
print("kill")
print(time.time())
sub.kill()  #sub.send_signal(signal.SIGKILL)
# sub.wait()
print(time.time())
print(sub.poll())
sub = subprocess.Popen("stty echo", shell = True)
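With shell = True the process that Popen tracks is the shell, so kill() may reach the shell rather than the player it started. One way around this is to pass an argument list so that no shell is involved; an environment variable such as AUDIODEV can then be supplied through Popen's env argument. The sketch below uses a sleeping Python child as a stand-in for the player, and the helper name `stop_process` is hypothetical.

```python
import subprocess
import sys

def stop_process(proc, timeout=5.0):
    """Ask proc to terminate, force-kill it if it does not exit in time,
    and return its exit code."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
    return proc.returncode

# Stand-in for the audio player: a child process that would run for 30 seconds
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
exit_code = stop_process(child)
```

Calling wait() after the signal reaps the child, so poll() afterwards reports a real exit code instead of None.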

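16. When alsaaudio is used to play other audio while recording, the recording may hit an overrun. The main cause is the set functions called to configure playback; other time-consuming work, such as mp3 decoding, can also cause overruns.

17. If an EOFError shows up when calling wait() after killing a subprocess, the terminal can be restored with:

stty sane

18. Searching for a directory or a file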

def searchDir(path, dirName):
    for root, dirs, files in os.walk(path):
        if dirName in dirs:
            return os.path.join(root, dirName)
    return None

def searchFile(path, fileName):
    for root, dirs, files in os.walk(path):
        if fileName in files:
            return os.path.join(root, fileName)
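The two helpers above differ only in whether they look at directories or files, so they can be folded into one function. A sketch using pathlib; the name `search_path` and its `want_dir` flag are made up, and since rglob() treats the name as a glob pattern, names containing characters such as * or [ would need escaping.

```python
from pathlib import Path

def search_path(root, name, want_dir=False):
    """Return the first directory (want_dir=True) or non-directory entry
    called name below root, or None if there is none."""
    for candidate in Path(root).rglob(name):
        if candidate.is_dir() == want_dir:
            return str(candidate)
    return None
```

Like the os.walk versions, it returns the first match it encounters, so the result is not defined when several entries share the same name.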

19. Collecting the modules that a project depends on

def assembleDepends(path):
    cmd = 'grep -R "import" ' + path
    f = os.popen(cmd)
    data = f.readlines()
    f.close()
    #print(data)
    dependDict = {}
    if data != None:
        for line in data:
            lineData = line[line.find(":") + 1 : ]
            if lineData.startswith("#"):
                continue
            print(lineData)
            dependList = []
            if lineData.find("from") == -1:
                lineData = lineData.replace(" ", "").replace("\n", "")
                dependList = lineData[lineData.find("import") + len("import") : ].split(",")
                #print(dependList)
                for depend in dependList:
                    if depend.startswith("esp_"):
                        dependDict[depend] = 1
            else:
                start = lineData.find("from") + len("from") + 1
                end = lineData.find("import")
                lineData = lineData[start : end].strip()
                #print(lineData)
                if lineData.startswith("esp_"):
                    dependDict[lineData] = 1
        #print(dependDict)
        return dependDict
    return None
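Parsing grep output by hand is fragile: a commented-out import or the word "import" inside a string can slip through. As an alternative sketch, the standard ast module can extract imports from source text directly; the function name `collect_imports` is invented here, and the "esp_" prefix is taken from the code above.

```python
import ast

def collect_imports(source, prefix="esp_"):
    """Return the set of imported module names in source that start with prefix."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import a, b as c
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # from a import b (relative imports are skipped)
            names = [node.module]
        else:
            continue
        found.update(n for n in names if n.startswith(prefix))
    return found

sample = '''
import os, esp_audio
from esp_vision import EspVisionUtil
# import esp_commented_out
text = "import esp_in_string"
'''
deps = collect_imports(sample)
```

To cover a whole tree, the function could be applied to the text of every .py file found with os.walk and the resulting sets merged.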

20. For noise suppression, WebRTC gives good results at a low processing cost; from Python, the code at https://github.com/xiongyihui/python-webrtc-audio-processing can be used.