Audio to Text

Anonymous / 2024-03-03 / Original article

Example 1

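GitHub repository
What it does: open an audio file, run the tool, and it transcribes the speech to text.
Powerful, but not recommended: it is very demanding on performance and system resources.
Detailed steps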


# Clone the repository locally
git clone https://github.com/YaoFANGUK/video-subtitle-generator.git

# Open a terminal and go to the project root
C:\Users\ychen\Downloads\video-subtitle-generator (main -> origin)
λ pwd
C:\Users\ychen\Downloads\video-subtitle-generator

# Create a virtual environment
C:\Users\ychen\Downloads\video-subtitle-generator (main -> origin)
λ conda create -n vsgEnv python=3.8
Fetching package metadata .................
Solving package specifications: .

Package plan for installation in environment C:\ProgramData\Anaconda3\envs\vsgEnv:

The following NEW packages will be INSTALLED:

    ca-certificates: 2023.12.12-haa95532_0  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    libffi:          3.4.4-hd77b12b_0       https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    openssl:         1.1.1w-h2bbff1b_0      https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    pip:             23.3.1-py38haa95532_0  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    python:          3.8.18-h6244533_0      https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    setuptools:      68.2.2-py38haa95532_0  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    sqlite:          3.41.2-h2bbff1b_0      https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    vc:              14.2-h21ff451_1        https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    vs2015_runtime:  14.27.29016-h5e58377_2 https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
    wheel:           0.41.2-py38haa95532_0  https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main

Proceed ([y]/n)? y

ca-certificate 100% |#######################################################################| Time: 0:00:00 848.38 kB/s
libffi-3.4.4-h 100% |#######################################################################| Time: 0:00:00   4.78 MB/s
openssl-1.1.1w 100% |#######################################################################| Time: 0:00:00   9.53 MB/s
python-3.8.18- 100% |#######################################################################| Time: 0:00:01  11.69 MB/s
setuptools-68. 100% |#######################################################################| Time: 0:00:00  11.87 MB/s
wheel-0.41.2-p 100% |#######################################################################| Time: 0:00:00  14.51 MB/s
pip-23.3.1-py3 100% |#######################################################################| Time: 0:00:00  11.78 MB/s
#
# To activate this environment, use:
# > activate vsgEnv
#
# To deactivate an active environment, use:
# > deactivate
#
# * for power-users using bash, you must source
#

# Activate the virtual environment
C:\Users\ychen\Downloads\video-subtitle-generator (main -> origin)
λ activate vsgEnv

# Install the dependencies
C:\Users\ychen\Downloads\video-subtitle-generator (main -> origin)
(vsgEnv) λ pip install -r requirements.txt
Looking in indexes: https://mirrors.aliyun.com/pypi/simple/
Collecting appdirs==1.4.4 (from -r requirements.txt (line 1))
  Downloading https://mirrors.aliyun.com/pypi/packages/3b/00/2344469e2084fb287c2e0b57b72910309874c3245463acd6cf5e3db69324/appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
Collecting audioread==2.1.9 (from -r requirements.txt (line 2))
  Downloading https://mirrors.aliyun.com/pypi/packages/b3/d1/e324634c5867a668774d6fe233a83228da4ba16521e19059c15df899737d/audioread-2.1.9.tar.gz (377 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 377.5/377.5 kB 1.6 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
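
The install step above pins exact versions (`appdirs==1.4.4`, `audioread==2.1.9`, and so on). As a rough sketch, the pins can be compared against what actually ended up in the active environment. The helper names `parse_pins` and `find_mismatches` are hypothetical and not part of the project; the sketch assumes simple `name==version` lines and relies only on the standard-library `importlib.metadata` module (Python 3.8+).

```python
from importlib import metadata  # standard library since Python 3.8


def parse_pins(lines):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for raw in lines:
        # Drop trailing comments, environment markers, and whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (wanted, installed_or_None)} where the pin is not met."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# The two pins visible in the pip output above, used as sample input.
sample = ["appdirs==1.4.4", "audioread==2.1.9"]
pins = parse_pins(sample)
print(pins)  # {'appdirs': '1.4.4', 'audioread': '2.1.9'}
print(find_mismatches(pins))  # depends on the active environment
```

To check the real file, pass an open file object instead of the sample list, for example `parse_pins(open("requirements.txt", encoding="utf-8"))`, while the `vsgEnv` environment is active.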

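# Run the recognition (the output below reports the chosen language, 中文(简体) = Simplified Chinese, and the mode, 标准 = Standard)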
C:\Users\ychen\Downloads\video-subtitle-generator (main -> origin)
(vsgEnv) λ python gui.py
选择识别的语言: 中文(简体)
选择识别模式: 标准
1536 864
Process SpawnPoolWorker-2:
Process SpawnPoolWorker-4:
Process SpawnPoolWorker-1:
Process SpawnPoolWorker-3:
Process SpawnPoolWorker-5:
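
The `SpawnPoolWorker-N` names in the last lines follow the pattern Python's `multiprocessing` uses for pool workers under the `spawn` start method, which is the default on Windows. This suggests the tool spreads its work over several worker processes, which would fit the heavy resource use noted earlier. The transcript ends at this point, so what followed each header is not shown. The sketch below reproduces only the naming. It assumes CPython's current behaviour, an implementation detail rather than a documented guarantee, and `pool_worker_name` is a hypothetical helper.

```python
import multiprocessing as mp


def pool_worker_name(start_method="spawn"):
    """Return the kind of name a Pool worker gets under this start method."""
    ctx = mp.get_context(start_method)
    proc = ctx.Process(target=print)  # created but never started
    # CPython's multiprocessing.pool renames its workers with this same
    # substitution (assumption: implementation detail, may change).
    return proc.name.replace("Process", "PoolWorker")


print(pool_worker_name())  # a name of the form SpawnPoolWorker-<n>
```

No process is started here, so the sketch is cheap to run and only illustrates where the names in the log come from.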