MLE / AIE Notes in Python

ForHHeart / 2024-03-01 / Original post

🥥 Table of Contents

I. Deep Learning Frameworks

PyTorch
HuggingFace
Langchain
TensorFlow
Keras

🥑 Get Started!

PyTorch

PyTorch Official Documentation

Tensors
Datasets & DataLoaders
Transforms
Build Model
Autograd
Optimization
Save & Load Model
Implementing Deep Learning Models with PyTorch

Tensors

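1. Creating tensors from data
The number of opening brackets on the far left equals the number of dimensions.

# 1D tensor (vector)
X1 = torch.tensor([0.0, 1.0, 2.0])
# 2D tensor (matrix)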
X2 = torch.tensor([[0., 1., 2.],
                   [3., 4., 5.],
                   [6., 7., 8.]])
# 3D tensor
X3 = torch.tensor([[[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]],

                   [[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]],

                   [[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]]])
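The bracket rule can be checked by inspecting `ndim` and `shape`. A minimal sketch; the variable names `v`, `m`, and `t` are illustrative and not from the original notes.

```python
import torch

# Count the opening brackets on the left: 1, 2, and 3 respectively
v = torch.tensor([0.0, 1.0, 2.0])
m = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
t = torch.tensor([[[0.0, 1.0], [2.0, 3.0]]])

for name, x in [("v", v), ("m", m), ("t", t)]:
    # ndim is the number of dimensions; shape lists the size of each one
    print(name, x.ndim, tuple(x.shape))
```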

2. Creating all-zeros and all-ones tensors

# size is a sequence of ints giving the shape, e.g. (2, 3)
size = (2, 3)
X0 = torch.zeros(size)
X1 = torch.ones(size)

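3. Creating random tensors
The number of integers in the size equals the number of dimensions. To add a dimension, prepend another integer.
height is the number of rows, width the number of columns, channel the number of channels, and batch the batch size.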

X1 = torch.rand([4])        # torch.rand(width)
X2 = torch.rand([2,4])      # torch.rand(height, width)
X3 = torch.rand([3,2,4])    # torch.rand(channel, height, width)
X4 = torch.rand([2,3,2,4])  # torch.rand(batch, channel, height, width)
X5 = torch.rand([5,2,3,2,4]) # torch.rand(extra_dim, batch, channel, height, width)

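4. torch.rand(), torch.randn(), torch.randint()

torch.rand(size) # Returns a tensor of random numbers drawn from the uniform distribution on [0, 1); size sets the shape.
torch.randn(size) # Returns a tensor of random numbers drawn from the standard normal distribution (mean 0, variance 1); size sets the shape.
torch.randint(low, high, size) # Returns a tensor of random integers drawn uniformly from [low, high); size sets the shape.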
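The ranges of the three samplers can be checked empirically. A minimal sketch; the seed, the sample count of 1000, and the bounds 0 and 10 are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable (the value is arbitrary)

u = torch.rand(1000)               # uniform on [0, 1)
g = torch.randn(1000)              # standard normal
k = torch.randint(0, 10, (1000,))  # integers in [0, 10): 10 itself is never drawn

print(u.min().item() >= 0.0, u.max().item() < 1.0)
print(g.mean().item(), g.std().item())  # close to 0 and 1
print(k.min().item(), k.max().item(), k.dtype)
```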

Datasets & DataLoaders

import torch
from torch.utils import data # data-handling utilities from torch.utils
from d2l import torch as d2l
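As a rough illustration of how these imports are typically used, the sketch below wraps two toy tensors in a `TensorDataset` and iterates over them with a `DataLoader`. The toy data and the batch size of 4 are assumptions made for the example; `d2l` is not needed here.

```python
import torch
from torch.utils import data

# Toy data: 10 samples with 2 features each, plus one label per sample
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10, dtype=torch.float32).reshape(10, 1)

dataset = data.TensorDataset(features, labels)  # pairs features[i] with labels[i]
loader = data.DataLoader(dataset, batch_size=4, shuffle=True)

for X, y in loader:
    # The last batch is smaller because 10 = 4 + 4 + 2
    print(X.shape, y.shape)
```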

07 - Tensor Operations

<1> Tensor Manipulations

Traditional Matrix Multiplication

| Operands | Shapes | PyTorch |
|---|---|---|
| 1D tensors (two vectors) | [n] @ [n] = scalar | `A @ B` or `torch.dot(A, B)` |
| 2D tensors (two matrices) | [m,n] @ [n,p] = [m,p] | `A @ B` or `torch.mm(A, B)` |
| 3D tensors (two batched matrices) | [b,m,n] @ [b,n,p] = [b,m,p] | `A @ B` or `torch.bmm(A, B)` |
| Higher-dimensional tensors, or tensors with different numbers of dimensions | [b,m,n] @ [n,p] = [b,m,p] | `A @ B` or `torch.matmul(A, B)` |

Element-wise Multiplication (Hadamard product)

| Operands | Shapes | PyTorch |
|---|---|---|
| Tensors of the same shape (or shapes that broadcast to it) | [n] * [n] = [n]; [m,n] * [m,n] = [m,n]; [b,m,n] * [b,m,n] = [b,m,n] | `A * B` or `torch.mul(A, B)` |

import torch

# 1D
# Define two 1D tensors (vectors)
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# Compute the dot product: 1*4 + 2*5 + 3*6 = 32
result = torch.dot(a, b)

print(result)  # tensor(32)


# 2D
A = torch.rand(3, 4)  # Shape [3, 4]
B = torch.rand(4, 5)  # Shape [4, 5]
C = A @ B  # Resulting shape [3, 5]
C = torch.mm(A, B)  # Same result, shape [3, 5]

# 3D @ 2D (broadcasting)
# A is a 3D tensor of shape [10, 3, 4]
# B is a 2D tensor of shape [4, 5]
# B is broadcast across the batch dimension: each of the 10 [3, 4] matrices in A is multiplied by B.
A = torch.rand(10, 3, 4)
B = torch.rand(4, 5)
C = torch.matmul(A, B)  # Resulting shape [10, 3, 5]
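The batched case (`torch.bmm`) and the element-wise product are listed above but not exercised. A minimal sketch with made-up shapes and values:

```python
import torch

# Batched matrix multiplication: both operands carry the same batch size
A = torch.rand(10, 3, 4)
B = torch.rand(10, 4, 5)
C = torch.bmm(A, B)              # shape [10, 3, 5]
assert torch.allclose(C, A @ B)  # @ computes the same batched product

# Element-wise (Hadamard) product: the output keeps the operands' shape
P = torch.tensor([[1., 2.], [3., 4.]])
Q = torch.tensor([[10., 20.], [30., 40.]])
H = P * Q                        # same as torch.mul(P, Q)
print(C.shape)
print(H)                         # entries are 10, 40, 90, 160
```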

<2> Reshape Operations

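x = torch.arange(12)  # 12 elements, shape [12]
x.reshape(-1)         # flatten to 1D: shape [12]
x.reshape(-1, 6)      # -1 is inferred from the element count: shape [2, 6]
x.reshape(3, 4)       # shape [3, 4]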

# Remove dimensions of size one
x = torch.zeros(1, 2, 1, 3, 1)  # A tensor with several singleton dimensions
y = x.squeeze()                 # Removes all singleton dimensions

print(x.shape)  # Before squeezing
print(y.shape)  # After squeezing

# torch.Size([1, 2, 1, 3, 1])
# torch.Size([2, 3])

x = torch.zeros(2, 1, 3)  # A tensor with a singleton dimension at position 1
y = x.squeeze(1)          # Squeeze dimension 1

print(x.shape)  # Before squeezing
print(y.shape)  # After squeezing
# torch.Size([2, 1, 3])
# torch.Size([2, 3])

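# Add a dimension of size one at a given position
x = torch.zeros(2, 3)
x.unsqueeze(0)   # add a dimension at the beginning: shape [1, 2, 3]
x.unsqueeze(1)   # add a dimension at the second position: shape [2, 1, 3]
x.unsqueeze(-1)  # add a dimension at the end: shape [2, 3, 1]

x.transpose(0, 1)  # swap dimensions 0 and 1: shape [3, 2]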
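`reshape` and `transpose` can produce the same shape with different contents, which is easy to mix up. The sketch below contrasts them on a small tensor; the values are made up for illustration.

```python
import torch

x = torch.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

r = x.reshape(3, 2)                 # keeps element order: [[0, 1], [2, 3], [4, 5]]
t = x.transpose(0, 1)               # swaps axes:          [[0, 3], [1, 4], [2, 5]]

# unsqueeze followed by squeeze on the same position is a round trip
z = x.unsqueeze(0).squeeze(0)

print(r)
print(t)
print(torch.equal(z, x))
```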

Build Model

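nn.Linear is a PyTorch module that implements a linear (fully connected) layer. Its input and output are both tensors: it multiplies the input by a weight matrix and adds a bias vector, computing $y = xW^T + b$. It applies no activation function; add one as a separate layer if needed.

nn.Linear(in_features, out_features, bias=True)

Parameters:

in_features: the number of input features.

out_features: the number of output features.

bias: whether to add a bias. If True (the default), the layer learns a bias vector; if False, no bias is added.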
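To make the parameter shapes concrete, the sketch below builds a layer and reproduces its output by hand. The sizes (3 inputs, 2 outputs, batch of 5) are arbitrary choices for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(in_features=3, out_features=2)
x = torch.rand(5, 3)  # batch of 5 samples, 3 features each

print(layer.weight.shape)  # [out_features, in_features]
print(layer.bias.shape)    # [out_features]

# The layer computes x @ W^T + b, with no activation applied
manual = x @ layer.weight.T + layer.bias
assert torch.allclose(layer(x), manual)
```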

# Usage example
import torch
from torch import nn

# Create a linear layer with 2 input features and 2 output features
linear = nn.Linear(2, 2)

# Set the bias b to [-1, 1]; its length must equal out_features
linear.bias = nn.Parameter(torch.tensor([-1.0, 1.0]))

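What nn.Linear computes, written from scratch (not the actual nn.Linear source)

def linear(X, w, b):
    """Linear regression model."""
    # Here w has shape [in_features, out_features]; nn.Linear stores its weight transposed.
    return torch.matmul(X, w) + b

Citation: "The nn.Linear function" / "How to change the bias b of nn.Linear"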

Optimization

1. Define the loss function

Mean squared error (MSE)
$\mathcal{L}(\textbf{w},b)=\dfrac{1}{n}\sum\limits_{i=1}^{n}\dfrac{1}{2}(\hat y^{(i)}-y^{(i)})^2=\dfrac{1}{2n}\sum\limits_{i=1}^{n}(\textbf{w}^T\textbf{x}^{(i)}+b-y^{(i)})^2$

loss = nn.MSELoss()
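Note that `nn.MSELoss` with its default `reduction='mean'` averages the squared errors and does not include the factor of 1/2 that appears in the formula above. A minimal sketch with made-up predictions and targets:

```python
import torch
from torch import nn

y_hat = torch.tensor([2.5, 0.0, 2.0])  # predictions (illustrative values)
y = torch.tensor([3.0, -0.5, 2.0])     # targets (illustrative values)

loss = nn.MSELoss()                    # default reduction='mean'
builtin = loss(y_hat, y)
manual = ((y_hat - y) ** 2).mean()     # (0.25 + 0.25 + 0) / 3

# The formula with the 1/2 factor is half of what nn.MSELoss returns
print(builtin.item(), manual.item(), (0.5 * manual).item())
```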

Cross-entropy loss (CEL), shown here for binary labels
$\mathcal{L}(\textbf{w},b)=-\sum\limits_{i=1}^{n}[y^{(i)}\log\hat y^{(i)}+(1-y^{(i)})\log(1-\hat y^{(i)})]$

loss = nn.CrossEntropyLoss()
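`nn.CrossEntropyLoss` takes raw logits and integer class indices, applies log-softmax internally, and averages the negative log-probability of the correct class. The sketch below checks this against a manual computation; the logits and targets are made up.

```python
import torch
from torch import nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])  # raw scores, shape [batch, classes]
target = torch.tensor([0, 2])             # class indices, shape [batch]

loss = nn.CrossEntropyLoss()              # default reduction='mean'
builtin = loss(logits, target)

# Manual version: pick the log-probability of the true class in each row
log_probs = F.log_softmax(logits, dim=-1)
manual = -log_probs[torch.arange(2), target].mean()

print(builtin.item(), manual.item())
```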

2. Define the optimization algorithm

Stochastic gradient descent (SGD)

trainer = torch.optim.SGD(net.parameters(), lr=0.03)
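A single SGD step applies the update w ← w − lr · ∇w. The sketch below runs one step on a toy objective, the sum of squares of `w`, whose gradient is `2 * w`; the starting values and learning rate are arbitrary.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # gradient with respect to w is 2 * w
opt.zero_grad()        # clear any stale gradients
loss.backward()        # populate w.grad
opt.step()             # w <- w - lr * w.grad = 0.8 * w

print(w.detach())      # approximately [0.8, -1.6]
```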

Implementing Deep Learning Models with PyTorch

Concise Implementation of Linear Regression

(1) Generate the dataset

# Import library
import numpy as np
import torch
from torch.utils import data # data-handling utilities from torch.utils
from d2l import torch as d2l

# Generate a synthetic dataset from the true parameters w and b
true_w = torch.tensor([2, -3.4]) # w = [2, -3.4]^T
true_b = 4.2 # b = 4.2
features, labels = d2l.synthetic_data(true_w, true_b, 1000)

# Check the artificial dataset, which consists of features and labels
print('feature size:', features.shape, '\nlabel size:', labels.shape)
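If `d2l` is not installed, the dataset can be generated by hand. The function below is a hypothetical stand-in written for these notes, not the `d2l` source: it assumes standard-normal features and Gaussian label noise with standard deviation 0.01.

```python
import torch

def synthetic_data(w, b, num_examples):
    """Generate y = Xw + b + noise (illustrative stand-in, assumptions noted above)."""
    X = torch.normal(0, 1, (num_examples, len(w)))  # features drawn from N(0, 1)
    y = torch.matmul(X, w) + b                      # noiseless labels
    y += torch.normal(0, 0.01, y.shape)             # assumed noise level: std 0.01
    return X, y.reshape((-1, 1))

true_w = torch.tensor([2, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)
print(features.shape, labels.shape)
```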

(2) Read the dataset

# Define the function
def load_array(data_arrays, batch_size, is_train=True): #@save
    """Construct a PyTorch data iterator."""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)

# Shuffle the samples and split them into mini-batches of size batch_size
batch_size = 10
data_iter = load_array((features, labels), batch_size)

next(iter(data_iter)) # Fetch the first batch

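(3) Define the model

from torch import nn

net = nn.Sequential(nn.Linear(2, 1)) # list of layers

(4) Initialize the model parameters

net[0].weight.data.normal_(0, 0.01) # trailing _ means in-place: overwrite w with samples from a normal distribution with mean 0 and standard deviation 0.01
net[0].bias.data.fill_(0) # set the bias to 0

(5) Define the loss function

loss = nn.MSELoss() # mean squared error loss

(6) Define the optimization algorithm

trainer = torch.optim.SGD(net.parameters(), lr=0.03) # stochastic gradient descent

(7) Train

num_epochs = 3 # number of training epochs
for epoch in range(num_epochs):
    for X, y in data_iter:
        l = loss(net(X), y)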
        trainer.zero_grad()
        l.backward()
        trainer.step()
    l = loss(net(features), labels)
    print(f'epoch {epoch + 1}, loss {l:f}')

# Evaluate the model
w = net[0].weight.data
print('estimation error of w:', true_w - w.reshape(true_w.shape))
b = net[0].bias.data
print('estimation error of b:', true_b - b)

🤗 Hugging Face

Hugging Face Official Documentation

Tokenizer
- AutoTokenizer
- BertTokenizer

Tokenizer

Encoding: words -> tokens -> ids
Decoding: ids -> tokens -> words

tokens = tokenizer.tokenize(text) # words -> tokens (text is a single string)
ids = tokenizer.convert_tokens_to_ids(tokens) # tokens -> ids

ids = tokenizer.encode(text) # words -> tokens -> ids (adds special tokens such as [CLS] and [SEP])
tokenizer.decode(ids) # ids -> tokens -> words

# Encode a batch of texts
batch_input = tokenizer(texts, truncation=True, padding=True, return_tensors='pt')

# Encode a sentence pair
batch_input = tokenizer.encode_plus(texts[0], texts[1], truncation=True, padding=True, return_tensors='pt')

tokenizer.vocab # word2idx
tokenizer.vocab_size # 30522
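The encode/decode pipeline can be illustrated without downloading a model. The sketch below is a toy whitespace tokenizer with a hypothetical seven-entry vocabulary; it is not how the Hugging Face tokenizers are implemented (BERT uses a learned WordPiece subword vocabulary), but the function names mirror the calls above.

```python
# Hypothetical toy vocabulary, invented for this example
vocab = {"[CLS]": 0, "[SEP]": 1, "[UNK]": 2, "today": 3, "is": 4, "so": 5, "bad": 6}
id_to_token = {i: t for t, i in vocab.items()}

def tokenize(text):
    """words -> tokens (lowercase, split on whitespace)."""
    return text.lower().split()

def convert_tokens_to_ids(tokens):
    """tokens -> ids; unknown tokens map to [UNK]."""
    return [vocab.get(t, vocab["[UNK]"]) for t in tokens]

def encode(text):
    """words -> tokens -> ids, wrapped in [CLS] ... [SEP]."""
    return [vocab["[CLS]"]] + convert_tokens_to_ids(tokenize(text)) + [vocab["[SEP]"]]

def decode(ids):
    """ids -> tokens -> words."""
    return " ".join(id_to_token[i] for i in ids)

ids = encode("today is so bad")
print(ids)          # [0, 3, 4, 5, 6, 1]
print(decode(ids))  # [CLS] today is so bad [SEP]
```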

# Load Data and Configuration
texts = ['today is not that bad',
         'today is so bad',
         'so good tonight']
model_name = 'bert-base-uncased'


from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Tokenizer
batch_input = tokenizer(texts, truncation=True, padding=True, return_tensors='pt')

# Model
import torch
import torch.nn.functional as F

with torch.no_grad():
    # outputs is a SequenceClassifierOutput, e.g. logits=tensor([[ 0.2347, -0.1015], [ 0.1364, -0.3081], [ 0.0071, -0.4359]]);
    # the values vary between runs because bert-base-uncased ships without a trained classification head
    outputs = model(**batch_input)
    logits = outputs.logits
    scores = F.softmax(logits, dim=-1)
    labels_ids = torch.argmax(scores, dim=-1)
    labels = [model.config.id2label[i] for i in labels_ids.tolist()] # one label per text; default names are 'LABEL_0' / 'LABEL_1' (a fine-tuned sentiment model may use 'POSITIVE' / 'NEGATIVE')
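The post-processing steps (softmax, argmax, label lookup) can be tried without loading a model. The sketch below reuses the example logits quoted above; the `id2label` dictionary is an assumed stand-in for `model.config.id2label`.

```python
import torch
import torch.nn.functional as F

# Example logits for three texts and two classes
logits = torch.tensor([[0.2347, -0.1015],
                       [0.1364, -0.3081],
                       [0.0071, -0.4359]])
id2label = {0: "LABEL_0", 1: "LABEL_1"}  # assumed mapping for illustration

scores = F.softmax(logits, dim=-1)        # each row sums to 1
label_ids = torch.argmax(scores, dim=-1)  # softmax is monotonic, so this equals argmax over logits
labels = [id2label[i] for i in label_ids.tolist()]

print(labels)  # ['LABEL_0', 'LABEL_0', 'LABEL_0']
```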

Tokenization

ids = tokenizer.encode(texts[0], texts[1]) # sentence pair: [101, 2651, 2003, 2025, 2008, 2919, 102, 2651, 2003, 2061, 2919, 102]

🦜🔗 Langchain

TensorFlow

Keras