AI Study Notes

2023-05-05


Introduction to Machine Learning

Different types of Functions

Regression: the function outputs a scalar.

  • Predicting PM2.5

Classification: given options (classes), the function outputs the correct one.

  • Spam filtering

Structured Learning: create something with structure (an image, a document).

Example: YouTube Channel

1. Function with Unknown Parameters

\[y=b+wx_1 \]

2. Define Loss from Training Data

  • Loss is a function of parameters

\[L(b,w) \]

  • Loss: how good a set of values is.
  • When L is the mean absolute error (MAE):

\[e=\left | y-\hat{y} \right | \]

  • When L is the mean squared error (MSE):

\[e=(y-\hat{y})^2 \]

\[L=\frac{1}{N} \sum_{n}e_n \]
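The model and both losses fit in a few lines of plain Python (the toy data below is made up for illustration):

```python
def model(x, w, b):
    # the one-feature linear model above: y = b + w * x1
    return b + w * x

def mae_loss(xs, ys, w, b):
    # L = (1/N) * sum of e_n, with e = |y - y_hat|
    return sum(abs(model(x, w, b) - y) for x, y in zip(xs, ys)) / len(xs)

def mse_loss(xs, ys, w, b):
    # L = (1/N) * sum of e_n, with e = (y - y_hat)^2
    return sum((model(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# toy data: labels generated from y = 2x, so w=2, b=0 gives zero loss
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(mae_loss(xs, ys, 2.0, 0.0))  # → 0.0
print(mse_loss(xs, ys, 1.0, 0.0))
```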

3. Optimization

\[w^*,b^*=\arg\min_{w,b} L \]

Gradient Descent

  • (Randomly) Pick an initial value :

\[w^0 \]

  • Compute :

\[\left.\frac {\partial L} {\partial w}\right |_{w=w^0} \]

Negative slope: increase w

Positive slope: decrease w

\[w^1=w^0-\eta\left.\frac {\partial L} {\partial w}\right |_{w=w^0} \]

η: learning rate (a hyperparameter)

  • Update w iteratively
    • Caveat: gradient descent can stop at a local minimum
    • instead of the global minimum

The same procedure, shown here for one parameter, generalizes to multiple parameters.
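The full loop can be sketched for the two-parameter model y = b + w·x1, with the MSE gradients written out by hand (the toy data, learning rate, and step count are all illustrative):

```python
def gradient_descent(xs, ys, eta=0.01, steps=1000):
    # minimize the MSE loss L(b, w) for the model y = b + w * x
    w, b = 0.0, 0.0  # (randomly) picked initial values w^0, b^0
    n = len(xs)
    for _ in range(steps):
        # dL/dw and dL/db for L = (1/N) * sum (y_hat - y)^2
        dw = sum(2 * (b + w * x - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (b + w * x - y) for x, y in zip(xs, ys)) / n
        # update step: w^(t+1) = w^t - eta * dL/dw
        # (negative slope -> w increases, positive slope -> w decreases)
        w -= eta * dw
        b -= eta * db
    return w, b

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # labels generated from y = 2x + 1
w, b = gradient_descent(xs, ys, eta=0.05, steps=5000)
print(round(w, 3), round(b, 3))
```

With enough steps this converges to w ≈ 2, b ≈ 1, recovering the line the labels were generated from.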

Linear Models

Linear models have a severe limitation, called model bias.

We need a more flexible model!

curve = constant + sum of a set of hard sigmoid functions, each approximated by a smooth sigmoid:

\[y=c\,\frac {1} {1+\exp(-(b+wx_1))} =c\operatorname{sigmoid}(b+wx_1) \]

\[y=b+\sum_{i}c_i\operatorname{sigmoid}(b_i+w_ix_1) \]

\[y=b+\sum_{i}c_i\operatorname{sigmoid}\Bigl(b_i+\sum_{j}w_{ij}x_j\Bigr) \]

From a linear-algebra viewpoint:

\[r=b+Wx \]

\[a=\sigma(r) \]

\[y=b+c^Ta \]
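The forward pass r = b + Wx, a = σ(r), y = b + cᵀa is a few lines of plain Python. Note that b is overloaded in the equations above: a vector for the hidden units and a scalar for the output. All weight values below are hand-picked for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W, b_vec, c, b_out):
    # r = b + W x  (one r_i per sigmoid unit)
    r = [b_i + sum(w_ij * x_j for w_ij, x_j in zip(row, x))
         for row, b_i in zip(W, b_vec)]
    # a = sigma(r), applied elementwise
    a = [sigmoid(r_i) for r_i in r]
    # y = b + c^T a
    return b_out + sum(c_i * a_i for c_i, a_i in zip(c, a))

# 3 sigmoid units over a 2-feature input (illustrative values)
W = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
b_vec = [0.0, 0.1, -0.1]
c = [1.0, 2.0, -1.0]
y = forward([0.3, 0.7], W, b_vec, c, b_out=0.5)
print(y)
```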

Loss

  • Loss is a function of parameters L(θ)
  • Loss means how good a set of values is.

Optimization of New Model

\[\theta= \begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \vdots \end{bmatrix} \]

\[\theta^*=\arg\min_\theta L \]

  • (Randomly) Pick initial values θ^0

1 epoch = see all the batches once

Update: θ is updated once for each batch
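The batch/epoch bookkeeping can be sketched as follows; the gradient line is a stand-in placeholder, since the point here is only how updates relate to batches and epochs:

```python
import random

def make_batches(data, batch_size, seed=0):
    # shuffle, then split into consecutive batches
    data = list(data)
    random.Random(seed).shuffle(data)
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

# 10 examples, batch size 3 -> 4 batches (the last one is smaller)
data = list(range(10))
batches = make_batches(data, batch_size=3)
print(len(batches))  # → 4: number of updates in one epoch

# 1 epoch = one pass over all batches; theta is updated once per batch
theta = 0.0
for batch in batches:
    grad = sum(batch) / len(batch)  # stand-in for dL/dtheta on this batch
    theta -= 0.1 * grad
```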

Sigmoid -> ReLU (Rectified Linear Unit)

Sigmoid and ReLU are collectively called activation functions.
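Both activations are one-liners; as a bonus, a hard-sigmoid ramp can be composed from two ReLUs, which is one way to see why ReLU can stand in for sigmoid:

```python
def relu(z):
    # ReLU: max(0, z)
    return max(0.0, z)

def hard_sigmoid(z):
    # one hard-sigmoid ramp built from two ReLUs: clips z to [0, 1]
    return relu(z) - relu(z - 1.0)

print(relu(-2.0), relu(3.0))  # → 0.0 3.0
print(hard_sigmoid(-1.0), hard_sigmoid(0.5), hard_sigmoid(2.0))  # → 0.0 0.5 1.0
```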

Neural Network

PyTorch