MLE / AIE Notes in Python
🥥 Table of Contents
I. Deep Learning Frameworks
- PyTorch
- HuggingFace
- Langchain
- TensorFlow
- Keras
🥑 Get Started!
PyTorch
PyTorch Official Documentation
- Tensors
- Datasets & DataLoaders
- Transforms
- Build Model
- Autograd
- Optimization
- Save & Load Model
- Implementing Deep Learning Models in PyTorch
Tensors
1. Creating tensors from data
The number of opening brackets at the far left equals the number of dimensions of the tensor.
# 1D tensor (vector)
X1 = torch.tensor([0.0, 1.0, 2.0])
# 2D tensor (matrix)
X2 = torch.tensor([[0., 1., 2.],
                   [3., 4., 5.],
                   [6., 7., 8.]])
# 3D tensor
X3 = torch.tensor([[[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]],
                   [[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]],
                   [[0., 1., 2.],
                    [3., 4., 5.],
                    [6., 7., 8.]]])
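The bracket-counting rule can be checked directly. The following is a minimal sketch (the variable names are illustrative, not from the notes) that prints the number of dimensions and the shape of tensors with one, two, and three levels of nesting.

```python
import torch

# One, two and three levels of nested brackets
x1 = torch.tensor([0.0, 1.0, 2.0])
x2 = torch.tensor([[0.0, 1.0, 2.0],
                   [3.0, 4.0, 5.0]])
x3 = torch.tensor([[[0.0, 1.0], [2.0, 3.0]],
                   [[4.0, 5.0], [6.0, 7.0]]])

for name, x in [("x1", x1), ("x2", x2), ("x3", x3)]:
    # ndim is the number of dimensions; shape lists the size of each one
    print(name, x.ndim, tuple(x.shape))
```

The nesting depth always matches `ndim`, and the length of each nesting level gives the corresponding entry of `shape`.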
2. Creating all-zeros and all-ones tensors
# size is the desired shape, given as a tuple or list of ints, e.g. (2, 3)
X0 = torch.zeros(size)
X1 = torch.ones(size)
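As a small illustration, assuming a hypothetical shape of 2 rows by 3 columns, the sketch below builds both tensors and also shows `torch.zeros_like`, which copies the shape and dtype of an existing tensor.

```python
import torch

size = (2, 3)               # hypothetical shape: 2 rows, 3 columns
x0 = torch.zeros(size)      # every entry is 0.0
x1 = torch.ones(size)       # every entry is 1.0
x2 = torch.zeros_like(x1)   # zeros with the same shape and dtype as x1

print(x0.shape, x0.dtype)
print(x1.sum().item())      # 6.0, one per entry
```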
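3. Creating random tensors
The number of entries in the shape list is the number of dimensions. To add a dimension, prepend another number.
height is the number of rows, width the number of columns, channel the number of channels, and batch the batch size.
X1 = torch.rand([4])              # shape (width,)
X2 = torch.rand([2, 4])           # shape (height, width)
X3 = torch.rand([3, 2, 4])        # shape (channel, height, width)
X4 = torch.rand([2, 3, 2, 4])     # shape (batch, channel, height, width)
X5 = torch.rand([5, 2, 3, 2, 4])  # shape (outer batch, batch, channel, height, width), e.g. a group of batches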
4. torch.rand(), torch.randn(), torch.randint()
torch.rand(size)  # tensor of shape size, sampled from the uniform distribution on [0, 1)
torch.randn(size)  # tensor of shape size, sampled from the standard normal distribution (mean 0, variance 1)
torch.randint(low, high, size)  # tensor of shape size, random integers from [low, high): low is included, high is excluded
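The ranges of the three samplers are easy to confirm empirically. This is a minimal sketch with an arbitrary seed and sample count, both chosen here for illustration only.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs draw the same numbers

u = torch.rand(1000)              # uniform on [0, 1)
n = torch.randn(1000)             # standard normal
k = torch.randint(0, 5, (1000,))  # integers 0..4; the upper bound 5 is excluded

print(u.min().item() >= 0.0, u.max().item() < 1.0)
print(n.mean().item(), n.std().item())   # close to 0 and 1
print(k.min().item(), k.max().item(), k.dtype)
```

Note that `torch.randint` takes the shape as a tuple and returns an integer tensor, while the other two return floating-point tensors.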
Datasets & DataLoaders
import torch
from torch.utils import data # utilities for datasets and data loading
from d2l import torch as d2l
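A minimal sketch of how these imports are typically used, with hypothetical toy data and without the `d2l` dependency: `TensorDataset` pairs each feature row with its label, and `DataLoader` serves them in mini-batches.

```python
import torch
from torch.utils import data

# Hypothetical toy data: 10 samples with 2 features each, plus one label per sample
features = torch.arange(20, dtype=torch.float32).reshape(10, 2)
labels = torch.arange(10, dtype=torch.float32).reshape(10, 1)

dataset = data.TensorDataset(features, labels)  # pairs features[i] with labels[i]
loader = data.DataLoader(dataset, batch_size=4, shuffle=True)

for X, y in loader:
    print(X.shape, y.shape)  # the last batch is smaller: 10 = 4 + 4 + 2
```

With `shuffle=True` the sample order changes every epoch, but each sample still appears exactly once per pass.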
07 - Tensor Operations
<1> Tensor Manipulations
| Traditional Matrix Multiplication | Shapes | Call |
|---|---|---|
| 1D tensors (two vectors) | [n] @ [n] = scalar | A @ B or torch.dot(A, B) |
| 2D tensors (two matrices) | [m,n] @ [n,p] = [m,p] | A @ B or torch.mm(A, B) |
| 3D tensors (two batched matrices) | [b,m,n] @ [b,n,p] = [b,m,p] | A @ B or torch.bmm(A, B) |
| Higher-dimensional tensors or tensors with different numbers of dimensions | [b,m,n] @ [n,p] = [b,m,p] | A @ B or torch.matmul(A, B) |

| Element-wise Multiplication (Hadamard product) | Shapes | Call |
|---|---|---|
| Tensors (same shape, or broadcastable shapes) | [n] * [n] = [n]; [m,n] * [m,n] = [m,n]; [b,m,n] * [b,m,n] = [b,m,n] | A * B or torch.mul(A, B) |
# 1D
# Define two 1D tensors (vectors)
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
# Compute the dot product
result = torch.dot(a, b)
print(result)
A = torch.rand(3, 4) # Shape [3, 4]
B = torch.rand(4, 5) # Shape [4, 5]
C = A @ B # Resulting shape [3, 5]
C = torch.mm(A,B) # Resulting shape [3, 5]
# A is a 3D tensor of shape [10, 3, 4]
# B is a 2D tensor of shape [4, 5]
# The last two dimensions of A and the dimensions of B are suitable for matrix multiplication.
A = torch.rand(10, 3, 4)
B = torch.rand(4, 5)
C = torch.matmul(A, B) # Resulting shape [10, 3, 5]
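The examples above do not cover `torch.bmm` or the element-wise product, so here is a short sketch of both, with shapes and values chosen only for illustration. It also contrasts `*` with `@` on the same pair of matrices.

```python
import torch

# torch.bmm: batched matrix product; both inputs must be 3D with the same batch size
A = torch.rand(10, 3, 4)
B = torch.rand(10, 4, 5)
C = torch.bmm(A, B)
print(C.shape)  # torch.Size([10, 3, 5])

# torch.matmul / @ broadcasts a 2D matrix across the batch dimension
W = torch.rand(4, 5)
D = A @ W
print(D.shape)  # torch.Size([10, 3, 5])
print(torch.allclose(D[0], A[0] @ W, atol=1e-6))

# Element-wise (Hadamard) product uses *, not @
P = torch.tensor([[1., 2.], [3., 4.]])
Q = torch.tensor([[10., 20.], [30., 40.]])
print(P * Q)  # [[10, 40], [90, 160]]
print(P @ Q)  # [[70, 100], [150, 220]]
```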
<2> Reshape Operations
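x.reshape(-1)     # flatten to 1D; -1 means "infer this size"
x.reshape(-1, 6)  # 6 columns; the number of rows is inferred
x.reshape(3, 4)   # explicit shape; the total number of elements must stay the same (here 12)
# Remove dimensions of size one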
x = torch.zeros(1, 2, 1, 3, 1) # A tensor with several singleton dimensions
y = x.squeeze() # Removes all singleton dimensions
print(x.shape) # Before squeezing
print(y.shape) # After squeezing
# torch.Size([1, 2, 1, 3, 1])
# torch.Size([2, 3])
x = torch.zeros(2, 1, 3) # A tensor with a singleton dimension at position 1
y = x.squeeze(1) # Squeeze dimension 1
print(x.shape) # Before squeezing
print(y.shape) # After squeezing
# torch.Size([2, 1, 3])
# torch.Size([2, 3])
# Add a dimension of size one at a given position
x.unsqueeze(0)   # new first dimension
x.unsqueeze(1)   # new second dimension
x.unsqueeze(-1)  # new last dimension
# Swap two dimensions
x.transpose(0, 1)  # swap dimensions 0 and 1
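To see the resulting shapes of the reshape, unsqueeze, and transpose calls, here is a minimal sketch on a hypothetical 12-element tensor.

```python
import torch

x = torch.arange(12)  # shape [12], values 0..11

print(x.reshape(3, 4).shape)   # torch.Size([3, 4])
print(x.reshape(-1, 6).shape)  # torch.Size([2, 6]); -1 is inferred as 12 / 6
print(x.reshape(3, 4).reshape(-1).shape)  # torch.Size([12]); flattened again

m = x.reshape(3, 4)
print(m.unsqueeze(0).shape)    # torch.Size([1, 3, 4])
print(m.unsqueeze(1).shape)    # torch.Size([3, 1, 4])
print(m.unsqueeze(-1).shape)   # torch.Size([3, 4, 1])

t = m.transpose(0, 1)          # swap dimensions 0 and 1
print(t.shape)                 # torch.Size([4, 3])
print(t[1, 2].item() == m[2, 1].item())  # True
```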
Build Model
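nn.Linear is a PyTorch module that applies a linear (affine) transformation; used on its own it is a linear regression model. Its input and output are both tensors: it multiplies the input by the transposed weight matrix and adds the bias vector, y = xW^T + b. It does not apply an activation function; add one (for example nn.ReLU) as a separate layer if needed.
nn.Linear(in_features, out_features, bias=True)
Parameters:
in_features: number of input features.
out_features: number of output features.
bias: whether to add a bias. If True (the default), the layer learns a bias vector; if False, no bias is added.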
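# Usage example
import torch
from torch import nn
# Create a linear layer with input dimension 2 and output dimension 2
linear = nn.Linear(2, 2)
net = nn.Sequential(linear)
# Set the bias b to [-1, 1]; the bias needs one entry per output feature
linear.bias = nn.Parameter(torch.tensor([-1.0, 1.0]))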
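What nn.Linear computes (from-scratch equivalent, not the library source)
def linear(X, w, b):
    """Linear regression model; w has shape [in_features, out_features]."""
    return torch.matmul(X, w) + b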
Citation: "nn.Linear函数" (The nn.Linear function) / "nn.Linear函数怎么改变偏差b的值" (How to change the bias b of nn.Linear)
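The computation performed by nn.Linear can be verified against a hand-written version. This is a small sketch with arbitrary layer sizes; it relies on the fact that the layer stores its weight with shape [out_features, in_features], so the manual version multiplies by the transpose.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(in_features=2, out_features=3)

X = torch.rand(5, 2)                      # batch of 5 samples, 2 features each
manual = X @ layer.weight.T + layer.bias  # affine map only, no activation

print(layer.weight.shape, layer.bias.shape)  # torch.Size([3, 2]) torch.Size([3])
print(torch.allclose(layer(X), manual, atol=1e-6))
```

The two results agree, which confirms that the layer itself contains no activation function.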
Optimization
1. Define the loss function
- Mean squared error (MSE)
\(\mathcal{L}(\textbf{w},b)=\dfrac{1}{n}\sum\limits_{i=1}^{n}\dfrac{1}{2}(\hat y^{(i)}-y^{(i)})^2=\dfrac{1}{2n}\sum\limits_{i=1}^{n}(\textbf{w}^T\textbf{x}^{(i)}+b-y^{(i)})^2\)
loss = nn.MSELoss() # averages the squared errors; it omits the 1/2 factor used in the formula above
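- Cross-entropy loss (CEL), shown here in its binary (two-class) form
\(\mathcal{L}(\textbf{w},b)=-\sum\limits_{i=1}^{n}[y^{(i)}\log\hat y^{(i)}+(1-y^{(i)})\log(1-\hat y^{(i)})]\)
loss = nn.CrossEntropyLoss() # multi-class generalization; expects raw logits and class indices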
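Both losses can be reproduced by hand, which makes the conventions explicit. The sketch below uses made-up predictions and targets. It assumes the default `reduction='mean'`, under which nn.MSELoss averages squared errors without a 1/2 factor, and nn.CrossEntropyLoss applies log-softmax to raw logits and averages the negative log-probability of the target class.

```python
import torch
from torch import nn

# --- Mean squared error ---
y_hat = torch.tensor([2.5, 0.0, 2.0])
y = torch.tensor([3.0, -0.5, 2.0])
mse_manual = ((y_hat - y) ** 2).mean()  # (0.25 + 0.25 + 0) / 3
mse_torch = nn.MSELoss()(y_hat, y)
half_mse = 0.5 * mse_torch              # matches the 1/(2n) form of the formula
print(mse_manual.item(), mse_torch.item(), half_mse.item())

# --- Cross entropy (multi-class, from raw logits) ---
logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 3.0]])
target = torch.tensor([0, 2])           # class indices, one per sample
log_probs = torch.log_softmax(logits, dim=-1)
ce_manual = -log_probs[torch.arange(2), target].mean()
ce_torch = nn.CrossEntropyLoss()(logits, target)
print(ce_manual.item(), ce_torch.item())
```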
2. Define the optimization algorithm
- Stochastic gradient descent (SGD)
trainer = torch.optim.SGD(net.parameters(), lr=0.03)
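A single SGD update is just w <- w - lr * grad. The following sketch uses a toy parameter and loss (both hypothetical) to show the zero_grad / backward / step sequence on one step.

```python
import torch

w = torch.tensor([1.0, -2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # gradient is 2 * w = [2, -4]
opt.zero_grad()        # clear gradients left over from earlier steps
loss.backward()        # fill w.grad
opt.step()             # w <- w - lr * grad

print(w.grad)          # gradient used for the update
print(w.data)          # approximately [0.8, -1.6]
```

Calling `zero_grad()` before `backward()` matters because PyTorch accumulates gradients across backward passes by default.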
Implementing Deep Learning Models in PyTorch
Concise implementation of linear regression
(1) Generate the dataset
# Import libraries
import numpy as np
import torch
from torch.utils import data # utilities for datasets and data loading
from d2l import torch as d2l
# Given the true w and b, generate the dataset
true_w = torch.tensor([2, -3.4]) # w = [2, -3.4]^T
true_b = 4.2 # b = 4.2
features, labels = d2l.synthetic_data(true_w, true_b, 1000)
# Check the synthetic dataset, which consists of features and labels
print('feature size:', features.shape, '\nlabel size:', labels.shape)
(2) Read the dataset
# Define the data-loading function
def load_array(data_arrays, batch_size, is_train=True): #@save
    """Construct a PyTorch data iterator."""
    dataset = data.TensorDataset(*data_arrays)
    return data.DataLoader(dataset, batch_size, shuffle=is_train)
# Shuffle the samples and split them into mini-batches of size batch_size
batch_size = 10
data_iter = load_array((features, labels), batch_size)
next(iter(data_iter)) # first mini-batch
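(3) Define the model
from torch import nn
net = nn.Sequential(nn.Linear(2, 1)) # Sequential holds an ordered list of layers
(4) Initialize the model parameters
net[0].weight.data.normal_(0, 0.01) # trailing _ means in-place: fill w with samples from a normal distribution with mean 0 and standard deviation 0.01
net[0].bias.data.fill_(0) # set the bias to 0
(5) Define the loss function
loss = nn.MSELoss() # mean squared error loss
(6) Define the optimization algorithm
trainer = torch.optim.SGD(net.parameters(), lr=0.03) # stochastic gradient descent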
(7) Train
num_epochs = 3 # number of training epochs
for epoch in range(num_epochs):
    for X, y in data_iter:
        l = loss(net(X), y)
        trainer.zero_grad()
        l.backward()
        trainer.step()
    l = loss(net(features), labels)
    print(f'epoch {epoch + 1}, loss {l:f}')
# Evaluate the model
w = net[0].weight.data
print('estimation error of w:', true_w - w.reshape(true_w.shape))
b = net[0].bias.data
print('estimation error of b:', true_b - b)
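For readers without the `d2l` package installed, the seven steps can be run end to end with plain PyTorch. This is a sketch under one assumption: `make_synthetic_data` is a hypothetical stand-in for `d2l.synthetic_data`, generating y = Xw + b plus Gaussian noise with standard deviation 0.01. The seed is arbitrary.

```python
import torch
from torch import nn
from torch.utils import data

def make_synthetic_data(w, b, num_examples):
    """Hypothetical stand-in for d2l.synthetic_data: y = Xw + b + small noise."""
    X = torch.normal(0, 1, (num_examples, len(w)))
    y = X @ w + b
    y += torch.normal(0, 0.01, y.shape)
    return X, y.reshape(-1, 1)

torch.manual_seed(0)
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
features, labels = make_synthetic_data(true_w, true_b, 1000)

loader = data.DataLoader(data.TensorDataset(features, labels),
                         batch_size=10, shuffle=True)
net = nn.Sequential(nn.Linear(2, 1))
loss = nn.MSELoss()
trainer = torch.optim.SGD(net.parameters(), lr=0.03)

for epoch in range(3):
    for X, y in loader:
        l = loss(net(X), y)
        trainer.zero_grad()
        l.backward()
        trainer.step()
    with torch.no_grad():  # evaluation only, no gradients needed
        epoch_loss = loss(net(features), labels).item()
    print(f"epoch {epoch + 1}, loss {epoch_loss:f}")

w_err = true_w - net[0].weight.data.reshape(true_w.shape)
b_err = true_b - net[0].bias.data
print("w error:", w_err, "b error:", b_err)
```

After three epochs the estimated parameters should sit close to the true values, with the remaining error on the order of the injected noise.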
🤗 Hugging Face
HuggingFace Official Documentation
- Tokenizer
- AutoTokenizer
- BertTokenizer
Tokenizer
Encode: text -> tokens -> ids
Decode: ids -> tokens -> text
tokenizer.tokenize(text) # text -> tokens
tokenizer.convert_tokens_to_ids(tokens) # tokens -> ids
tokenizer.encode(text) # text -> tokens -> ids (adds special tokens such as [CLS] and [SEP])
tokenizer.decode(ids) # ids -> tokens -> text
batch_input = tokenizer(texts, truncation=True, padding=True, return_tensors='pt') # a batch of sentences
batch_input = tokenizer.encode_plus(texts[0], texts[1], truncation=True, padding=True, return_tensors='pt') # one sentence pair
tokenizer.vocab # token -> id mapping
tokenizer.vocab_size # 30522 for bert-base-uncased
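The encode and decode pipeline can be illustrated without downloading a model. The class below is a hypothetical toy, not the Hugging Face implementation: it splits on whitespace, whereas real BERT tokenizers use WordPiece subwords, and its ids are unrelated to the BERT vocabulary. Only the method names mirror the calls listed above.

```python
class ToyTokenizer:
    """Hypothetical whitespace tokenizer, for illustration only."""

    def __init__(self, words):
        specials = ["[PAD]", "[UNK]", "[CLS]", "[SEP]"]
        tokens = specials + sorted(set(words))
        self.vocab = {tok: i for i, tok in enumerate(tokens)}  # token -> id
        self.inv_vocab = {i: tok for tok, i in self.vocab.items()}

    def tokenize(self, text):
        return text.lower().split()                 # text -> tokens

    def convert_tokens_to_ids(self, tokens):
        unk = self.vocab["[UNK]"]
        return [self.vocab.get(tok, unk) for tok in tokens]

    def encode(self, text):
        # text -> tokens -> ids, wrapped in [CLS] ... [SEP]
        tokens = ["[CLS]"] + self.tokenize(text) + ["[SEP]"]
        return self.convert_tokens_to_ids(tokens)

    def decode(self, ids, skip_special_tokens=True):
        tokens = [self.inv_vocab[i] for i in ids]   # ids -> tokens
        if skip_special_tokens:
            tokens = [t for t in tokens
                      if not (t.startswith("[") and t.endswith("]"))]
        return " ".join(tokens)                     # tokens -> text


tok = ToyTokenizer("today is not that bad so good tonight".split())
ids = tok.encode("today is so bad")
print(ids)              # [2, 10, 6, 8, 4, 3]
print(tok.decode(ids))  # today is so bad
```

Words missing from the vocabulary map to the `[UNK]` id, which is the same fallback idea a real tokenizer applies after subword splitting fails.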
# Load Data and Configuration
texts = ['today is not that bad',
'today is so bad',
'so good tonight']
model_name = 'bert-base-uncased'
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# Tokenizer
batch_input = tokenizer(texts, truncation=True, padding=True, return_tensors='pt')
# Model
import torch
import torch.nn.functional as F
with torch.no_grad():
    outputs = model(**batch_input)
    # e.g. SequenceClassifierOutput(loss=None, logits=tensor([[ 0.2347, -0.1015], [ 0.1364, -0.3081], [ 0.0071, -0.4359]]), hidden_states=None, attentions=None)
    # The classification head of bert-base-uncased is newly initialized, so the logits differ between runs
    logits = outputs.logits
    scores = F.softmax(logits, dim=-1)
    labels_ids = torch.argmax(scores, dim=-1)
labels = [model.config.id2label[i] for i in labels_ids.tolist()] # one label per text; without fine-tuning these are the default names such as 'LABEL_0'
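The post-processing step can be tried offline. In this sketch the logits are hard-coded to the sample values shown in the example output above, so no model download is needed; the `id2label` dictionary is an assumed default mapping, and a fine-tuned model would supply its own.

```python
import torch
import torch.nn.functional as F

# Sample logits for three sentences and two classes
logits = torch.tensor([[0.2347, -0.1015],
                       [0.1364, -0.3081],
                       [0.0071, -0.4359]])

scores = F.softmax(logits, dim=-1)        # each row sums to 1
label_ids = torch.argmax(scores, dim=-1)  # same result as argmax over the raw logits

id2label = {0: "LABEL_0", 1: "LABEL_1"}   # assumed mapping, for illustration
labels = [id2label[i] for i in label_ids.tolist()]
print(scores.sum(dim=-1))
print(labels)  # ['LABEL_0', 'LABEL_0', 'LABEL_0']
```

Because softmax preserves the ordering within each row, the argmax can be taken on the logits directly when only the predicted class is needed.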
Tokenization
ids = tokenizer.encode(texts[0], texts[1]) # sentence pair, [CLS] A [SEP] B [SEP]: [101, 2651, 2003, 2025, 2008, 2919, 102, 2651, 2003, 2061, 2919, 102]
🦜🔗 Langchain