Commands for running Python on Linux

zhima771 / 2024-07-14 / original post

Restore the default conda channels (remove custom mirrors)

conda config --remove-key channels

Run a program in the background on a server

# Save the log to test.log
nohup python -u test.py > test.log 2>&1 &

# Run code that takes a config file
nohup python -u run_train.py --yaml config/train_nisqa_cnn_sa_ap.yaml > test.log 2>&1 &

# Follow the log in real time, starting from its last 100 lines
tail -n 100 -f test.log
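The same pattern can be driven from Python when a launcher script is more convenient than typing nohup by hand. The sketch below is a rough equivalent, not a replacement for nohup: `run_in_background` and `last_lines` are hypothetical helper names, and detaching with `start_new_session=True` only applies on POSIX systems.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_background(cmd, log_path):
    """Start cmd detached from the terminal, sending stdout and stderr to log_path."""
    log_file = open(log_path, "w")
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=log_file,
        stderr=subprocess.STDOUT,  # same effect as 2>&1
        start_new_session=True,    # own session, so closing the terminal does not signal it (POSIX)
    )
    log_file.close()  # the child keeps its own copy of the file descriptor
    return proc


def last_lines(log_path, n=100):
    """Return the last n lines of a log file, like tail -n without -f."""
    with open(log_path, "r", errors="replace") as f:
        return f.read().splitlines()[-n:]


if __name__ == "__main__":
    log = Path(tempfile.gettempdir()) / "background_demo.log"
    # -u keeps the child's output unbuffered, as in the nohup commands above
    proc = run_in_background([sys.executable, "-u", "-c", "print('training started')"], log)
    proc.wait()
    print(last_lines(log))
```

In real use you would not call `proc.wait()`; the launcher can exit and the child keeps writing to the log.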

Create and delete virtual environments

# Create (name: the name of the new environment; choose the Python version yourself)
conda create -n name python=3.7
# Delete
conda remove -n name --all

List the packages in the current conda environment

conda list

List the existing virtual environments

conda env list

Switch to the virtual environment you want

conda activate my_env

Export the environment's package list

pip freeze > requirements.txt
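To see roughly what ends up in requirements.txt, the installed distributions can be listed from Python itself. This is only an approximation of `pip freeze` (editable and VCS installs are reported differently by pip), and `frozen_requirements` is a hypothetical helper name.

```python
from importlib import metadata


def frozen_requirements():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip broken installs with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(frozen_requirements()))
```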

Check GPU memory usage

# Method 1
nvidia-smi

# Method 2
gpustat -i
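For scripts that need the numbers rather than a table, nvidia-smi can emit CSV. The sketch below assumes every field is a plain integer in MiB (some devices report `[N/A]`, which this does not handle); `parse_gpu_memory` and `query_gpu_memory` are hypothetical helper names.

```python
import shutil
import subprocess

# Ask nvidia-smi for bare CSV rows: index, used MiB, total MiB
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Turn 'index, used, total' CSV rows into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field) for field in line.split(","))
        rows.append({"index": index, "used_mib": used, "total_mib": total})
    return rows


def query_gpu_memory():
    """Return per-GPU memory usage, or an empty list when nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_memory(out)


if __name__ == "__main__":
    for gpu in query_gpu_memory():
        print(f"GPU {gpu['index']}: {gpu['used_mib']} / {gpu['total_mib']} MiB")
```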

GPU-related

# Restrict which GPUs the code can see (set this before CUDA is initialized)
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "2,4"

# Change which GPU is used
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"

# Run on specific GPUs (from the shell)
CUDA_VISIBLE_DEVICES=0,1 python train.py

# Use GPU 0
import torch
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

import torch
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

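# Choose the device
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = Ser_Model()  # the model class, defined elsewhere
if torch.cuda.device_count() > 1:  # check whether the machine has more than one GPU
    print(f"Let's use {torch.cuda.device_count()} GPUs!")
    # Wrap the model for multi-GPU data parallelism across all visible GPUs
    model = nn.DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
model.to(device)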
 
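# The code that follows is typically
# net.to(device)
# which moves the model (and likewise the data) onto this device. Because only CUDA device 1 was made visible, training uses GPU 1 only.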
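A point that is easy to miss in the snippets above: once CUDA_VISIBLE_DEVICES is set, the visible GPUs are renumbered from 0 inside the program, so after setting it to '1' the device 'cuda:0' refers to physical GPU 1. The sketch below illustrates that mapping with plain Python; `visible_device_map` is a hypothetical helper, it only handles integer ids (not GPU UUIDs), and the physical ids line up with nvidia-smi when CUDA_DEVICE_ORDER is PCI_BUS_ID.

```python
def visible_device_map(cuda_visible_devices):
    """Map the logical index a program sees to the physical GPU id listed in the variable."""
    physical_ids = [int(part) for part in cuda_visible_devices.split(",") if part.strip()]
    return dict(enumerate(physical_ids))


if __name__ == "__main__":
    print(visible_device_map("2,4"))  # {0: 2, 1: 4}
    print(visible_device_map("1"))    # {0: 1}
```

This is also why `device_ids` passed to `nn.DataParallel` should use the logical indices (0, 1, ...) rather than the physical ones.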

Grant permissions as a non-root user (absolute path)

chmod -R 777 local/aishell_data_prep.sh

Installing with pip

# torch wheels
https://download.pytorch.org/whl/torch_stable.html

pip install -i https://pypi.mirrors.ustc.edu.cn/simple/ torchaudio==0.5.0

pip install <package-name> -i http://pypi.douban.com/simple/ --trusted-host pypi.douban.com

https://pypi.tuna.tsinghua.edu.cn/simple


# When setting up a new environment, install the packages listed in requirements.txt
pip install -r requirements.txt

# Tsinghua mirror
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

# Upgrade a package
pip install --upgrade <package-name>

pip mirror configuration file: ~/.pip/pip.conf (the pip.conf file inside the .pip folder in your home directory)

[global]
index-url = https://pypi.douban.com/simple
[install]
use-mirrors = true
mirrors = https://pypi.douban.com/simple/
trusted-host = pypi.douban.com
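pip.conf is an INI file, so a quick way to check what a given file actually sets is to parse it with the standard library. The sketch below parses an inline copy of the two settings that matter for switching mirrors; `read_pip_conf` is a hypothetical helper and the embedded text stands in for the real file.

```python
import configparser

# Inline stand-in for the contents of pip.conf
PIP_CONF = """\
[global]
index-url = https://pypi.douban.com/simple
[install]
trusted-host = pypi.douban.com
"""


def read_pip_conf(text):
    """Parse pip.conf text into {section: {option: value}}."""
    parser = configparser.RawConfigParser()  # raw: no % interpolation inside URLs
    parser.read_string(text)
    return {section: dict(parser.items(section)) for section in parser.sections()}


if __name__ == "__main__":
    settings = read_pip_conf(PIP_CONF)
    print(settings["global"]["index-url"])
    print(settings["install"]["trusted-host"])
```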

Sites for downloading packages

scp: copying files between servers

# 10.100.100.100 is the address of the destination server
scp -r /home/jxx/anaconda3/envs/py3.8torch1.7.1/nltk_data wxc@10.100.100.100:/home/wxc/anaconda3/envs/torch1.7.1/

Symbolic links

ln -s <existing path> <new link (the shortcut)>
ln -s /home/jxx/code/asr_blockformer-main/wenet /home/jxx/code/asr_blockformer-main/examples/aishell/s1/wenet
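The same link can be created from Python, which is handy inside setup scripts. This is a minimal sketch using temporary directories in place of the real paths; `make_symlink` is a hypothetical helper. Unlike `ln -s`, it raises FileExistsError when the link path already exists instead of placing the link inside an existing directory.

```python
import tempfile
from pathlib import Path


def make_symlink(existing, new_link):
    """Create new_link pointing at existing, like `ln -s existing new_link`."""
    existing = Path(existing)
    new_link = Path(new_link)
    new_link.symlink_to(existing, target_is_directory=existing.is_dir())
    return new_link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "wenet"
        target.mkdir()
        link = make_symlink(target, Path(tmp) / "wenet_link")
        print(link.is_symlink(), link.resolve() == target.resolve())
```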

Viewing a CNN's structure and each layer's output shape in PyTorch

http://www.ichenhua.cn/read/238

https://zhuanlan.zhihu.com/p/52013707/

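import torch.nn as nn

class PrintLayer(nn.Module):
    """Pass-through layer that prints the shape of whatever flows through it."""

    def __init__(self):
        super().__init__()

    def forward(self, x):
        print(x.shape)  # shape of this layer's input, i.e. the previous layer's output
        return x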
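A short usage sketch shows where such a layer goes: drop it between the stages of an `nn.Sequential` and run one dummy batch through. The small network and the 28x28 single-channel input are assumptions made up for illustration, not a model from the linked articles.

```python
import torch
import torch.nn as nn


class PrintLayer(nn.Module):
    """Pass-through layer that prints the shape of whatever flows through it."""

    def forward(self, x):
        print(tuple(x.shape))
        return x


# Hypothetical toy CNN with a PrintLayer after every stage
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    PrintLayer(),  # (4, 8, 28, 28)
    nn.MaxPool2d(2),
    PrintLayer(),  # (4, 8, 14, 14)
    nn.Flatten(),
    PrintLayer(),  # (4, 1568)
    nn.Linear(8 * 14 * 14, 10),
    PrintLayer(),  # (4, 10)
)

x = torch.randn(4, 1, 28, 28)  # dummy batch of four 28x28 grayscale images
out = model(x)
```

Remove the PrintLayer entries once the shapes are confirmed; they add no parameters but do clutter the log.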

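Creating a user account and password on a server

# Create the account
sudo useradd -d "/home/cybing" -m -s "/bin/bash" cybing
# Set the password (passwd reads the new password twice, so both lines must match)
echo -e "cyb\ncyb" | sudo passwd cybing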

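Saving and loading models

torch.save: saves a serialized object to disk using Python's pickle; it works for models, tensors, and dictionaries of any kind of object
torch.load: uses pickle's unpickling to deserialize a pickled object file back into memory
torch.nn.Module.load_state_dict: loads a model's parameter dictionary from a deserialized state_dict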

# Save
torch.save(model.state_dict(), PATH)
# Load
model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()  # switch dropout and batch-norm layers to inference mode
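To make the round trip concrete, the sketch below saves a tiny model's state_dict to a temporary file and loads it into a fresh instance. `TinyNet` is a hypothetical stand-in for TheModelClass, and `map_location="cpu"` is an added assumption so that a checkpoint written on a GPU machine still loads on a CPU-only one.

```python
import tempfile
from pathlib import Path

import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for TheModelClass."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


def save_and_reload(model, path):
    """Save model's parameters to path and load them into a new TinyNet."""
    torch.save(model.state_dict(), path)
    restored = TinyNet()
    restored.load_state_dict(torch.load(path, map_location="cpu"))
    restored.eval()
    return restored


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "tiny_net.pt"
    original = TinyNet()
    restored = save_and_reload(original, path)
    x = torch.randn(3, 4)
    print(torch.equal(original(x), restored(x)))
```

Saving only the state_dict keeps the file independent of how the class is pickled, at the cost of having to construct the model yourself before loading.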

https://blog.csdn.net/qq_39852676/article/details/100120803

# Downloading Hugging Face datasets/models

1. Switch to the virtual environment you want to use
2. pip install -U huggingface_hub
3. export HF_ENDPOINT=https://hf-mirror.com (Linux)
   set HF_ENDPOINT=https://hf-mirror.com (Windows)
4. huggingface-cli download --resume-download --repo-type dataset --local-dir-use-symlinks False XXX --local-dir YYY

[--repo-type {model,dataset,space}]
# Example
huggingface-cli download --resume-download --repo-type dataset --local-dir-use-symlinks False ai-habitat/hab3_bench_assets --local-dir D:\workplace\hab3_bench_assets
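If the download needs to be scripted, steps 3 and 4 can be wrapped in Python so that the mirror endpoint is set only for the child process. This is a sketch that reuses the exact flags from the command above (newer huggingface_hub releases may warn about some of them); `build_download_command` and `download` are hypothetical helper names.

```python
import os
import subprocess

MIRROR = "https://hf-mirror.com"  # endpoint from step 3
REPO_TYPES = {"model", "dataset", "space"}


def build_download_command(repo_id, local_dir, repo_type="dataset"):
    """Return the huggingface-cli argument list used in step 4."""
    if repo_type not in REPO_TYPES:
        raise ValueError(f"repo_type must be one of {sorted(REPO_TYPES)}")
    return [
        "huggingface-cli", "download",
        "--resume-download",
        "--repo-type", repo_type,
        "--local-dir-use-symlinks", "False",
        repo_id,
        "--local-dir", str(local_dir),
    ]


def download(repo_id, local_dir, repo_type="dataset"):
    """Run the download with HF_ENDPOINT set for the child process only."""
    env = dict(os.environ, HF_ENDPOINT=MIRROR)
    cmd = build_download_command(repo_id, local_dir, repo_type)
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Print the command instead of running it; call download(...) to fetch for real
    print(" ".join(build_download_command("ai-habitat/hab3_bench_assets", "hab3_bench_assets")))
```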