1. Simple Linear Regression¶
- A structure where a single input goes in and a single output comes out
In [59]:
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
In [60]:
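# fix the random seed so the randomly initialized weights below are reproducible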
torch.manual_seed(2024)
Out[60]:
<torch._C.Generator at 0x7807083d2270>
In [61]:
x_train = torch.FloatTensor([[1], [2], [3]])
y_train = torch.FloatTensor([[2], [4], [6]])
print(x_train, x_train.shape)
print(y_train, y_train.shape)
tensor([[1.],
        [2.],
        [3.]]) torch.Size([3, 1])
tensor([[2.],
        [4.],
        [6.]]) torch.Size([3, 1])
In [62]:
plt.figure(figsize=(6, 4))
plt.scatter(x_train, y_train)
Out[62]:
<matplotlib.collections.PathCollection at 0x78063337cee0>
In [63]:
# y = wx + b
model = nn.Linear(1, 1)
print(model)
Linear(in_features=1, out_features=1, bias=True)
In [64]:
y_pred = model(x_train)
print(y_pred)
tensor([[0.7260],
        [0.7894],
        [0.8528]], grad_fn=<AddmmBackward0>)
In [65]:
list(model.parameters()) # W: 0.0634, b: 0.6625
# y = 0.0634x + 0.6625
# x=1, 0.0634 + 0.6625 = 0.7259
# x=2, (0.0634*2) + 0.6625 = 0.7893
Out[65]:
[Parameter containing:
 tensor([[0.0634]], requires_grad=True),
 Parameter containing:
 tensor([0.6625], requires_grad=True)]
In [66]:
# MSE: the mean of (predicted - actual)**2
((y_pred - y_train)**2).mean()
Out[66]:
tensor(12.8082, grad_fn=<MeanBackward0>)
In [67]:
loss = nn.MSELoss()(y_pred, y_train)
loss
Out[67]:
tensor(12.8082, grad_fn=<MseLossBackward0>)
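Both computations give the same value. A quick way to confirm the hand-written MSE matches nn.MSELoss (a minimal sketch, continuing from the cells above):

# the two MSE computations should agree
manual_mse = ((y_pred - y_train) ** 2).mean()
builtin_mse = nn.MSELoss()(y_pred, y_train)
print(torch.allclose(manual_mse, builtin_mse))  # True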
2. Gradient Descent¶
- An algorithm that finds the W and b minimizing the cost function is called an "optimizer algorithm"
- Gradient descent is the most basic of the optimizer algorithms
![Gradient descent](https://i.imgur.com/0fW4LTG.png)
- The process of finding W and b through an optimizer algorithm is called "training"
- Learning rate: the distance W moves in a single update (the increment step); see the sketch after this list
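The update itself is W ← W − lr · ∂loss/∂W (and likewise for b). Below is a minimal hand-rolled sketch of a single gradient-descent step on the toy data above; the names w, b, and lr are hypothetical, not from the notebook:

import torch

x = torch.FloatTensor([[1], [2], [3]])
y = torch.FloatTensor([[2], [4], [6]])
w = torch.zeros(1, requires_grad=True)  # weight, initialized to 0
b = torch.zeros(1, requires_grad=True)  # bias, initialized to 0
lr = 0.01                               # learning rate

loss = torch.mean((x * w + b - y) ** 2)  # MSE cost
loss.backward()                          # compute dloss/dw and dloss/db
with torch.no_grad():                    # update outside the autograd graph
    w -= lr * w.grad
    b -= lr * b.grad
print(w, b)  # one small step toward W=2, b=0

optim.SGD with its default arguments applies this same update to every registered parameter on each step().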
In [68]:
# SGD (Stochastic Gradient Descent)
# builds the loss from one randomly drawn sample at a time
# draws a sample, puts it back, and repeats
# settles on a descent direction quickly
optimizer = optim.SGD(model.parameters(), lr=0.01)
In [69]:
loss = nn.MSELoss()(y_pred, y_train)
In [70]:
# reset the gradients to zero
optimizer.zero_grad()
# backpropagation: differentiate the cost function to compute the gradients
loss.backward()
# update W and b
optimizer.step()
print(list(model.parameters())) # W: 0.2177, b: 0.7267
[Parameter containing:
tensor([[0.2177]], requires_grad=True), Parameter containing:
tensor([0.7267], requires_grad=True)]
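optimizer.zero_grad() is needed because PyTorch accumulates gradients: every backward() call adds to .grad instead of overwriting it. A minimal illustration (a sketch; t is a hypothetical tensor):

t = torch.ones(1, requires_grad=True)
(t * 2).backward()
print(t.grad)  # tensor([2.])
(t * 2).backward()
print(t.grad)  # tensor([4.]) -- the second gradient was added to the first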
In [71]:
# repeated training keeps adjusting W and b, gradually reducing the error
# epochs: the number of training iterations
epochs = 1000
for epoch in range(epochs + 1):
    y_pred = model(x_train)
    loss = nn.MSELoss()(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f'Epoch: {epoch}/{epochs}, loss: {loss:.6f}')
Epoch: 0/1000, loss: 10.171454
Epoch: 100/1000, loss: 0.142044
Epoch: 200/1000, loss: 0.087774
Epoch: 300/1000, loss: 0.054239
Epoch: 400/1000, loss: 0.033517
Epoch: 500/1000, loss: 0.020711
Epoch: 600/1000, loss: 0.012798
Epoch: 700/1000, loss: 0.007909
Epoch: 800/1000, loss: 0.004887
Epoch: 900/1000, loss: 0.003020
Epoch: 1000/1000, loss: 0.001866
In [72]:
print(list(model.parameters())) # W: 1.9499, b: 0.1138
[Parameter containing:
tensor([[1.9499]], requires_grad=True), Parameter containing:
tensor([0.1138], requires_grad=True)]
In [73]:
x_test = torch.FloatTensor([[5]])
y_pred = model(x_test)
print(y_pred)
tensor([[9.8635]], grad_fn=<AddmmBackward0>)
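This is just the learned affine map evaluated at x=5: 1.9499 · 5 + 0.1138 ≈ 9.8633, matching the output up to the rounding of the printed parameters. The same check in code (a minimal sketch, continuing from the cells above):

W, b = model.parameters()
print(W * 5 + b)  # ≈ tensor([[9.8635]]), the same result nn.Linear produces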
3. Multiple Linear Regression¶
- A structure where several inputs go in and a single output comes out
In [74]:
X_train = torch.FloatTensor([[73, 80, 75],
                             [93, 88, 93],
                             [89, 91, 90],
                             [96, 98, 100],
                             [73, 66, 70]])
y_train = torch.FloatTensor([[150], [190], [180], [200], [130]])
print(X_train, X_train.shape)
print(y_train, y_train.shape)
tensor([[ 73.,  80.,  75.],
        [ 93.,  88.,  93.],
        [ 89.,  91.,  90.],
        [ 96.,  98., 100.],
        [ 73.,  66.,  70.]]) torch.Size([5, 3])
tensor([[150.],
        [190.],
        [180.],
        [200.],
        [130.]]) torch.Size([5, 1])
In [75]:
# y = W1x1 + W2x2 + W3x3 + ... + b
model = nn.Linear(3, 1)
print(model)
Linear(in_features=3, out_features=1, bias=True)
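nn.Linear(3, 1) holds a (1, 3) weight matrix and a single bias, and computes y = XWᵀ + b. A minimal equivalence check against a plain matrix product (a sketch, continuing from the cells above):

W, b = model.parameters()
print(torch.allclose(model(X_train), X_train @ W.T + b))  # True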
In [76]:
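# the inputs are in the 70-100 range, so gradients are large; a much smaller learning rate than before keeps the updates from diverging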
optimizer = optim.SGD(model.parameters(), lr=0.00001)
In [77]:
epochs = 10000
for epoch in range(epochs + 1):
    y_pred = model(X_train)
    loss = nn.MSELoss()(y_pred, y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f'Epoch: {epoch}/{epochs}, loss: {loss:.6f}')
Epoch: 0/10000, loss: 38561.125000
Epoch: 100/10000, loss: 43.880661
Epoch: 200/10000, loss: 43.343727
Epoch: 300/10000, loss: 42.829872
Epoch: 400/10000, loss: 42.337685
Epoch: 500/10000, loss: 41.866142
Epoch: 600/10000, loss: 41.414219
Epoch: 700/10000, loss: 40.980984
Epoch: 800/10000, loss: 40.565197
Epoch: 900/10000, loss: 40.166359
Epoch: 1000/10000, loss: 39.783215
Epoch: 1100/10000, loss: 39.415241
Epoch: 1200/10000, loss: 39.061520
Epoch: 1300/10000, loss: 38.721352
Epoch: 1400/10000, loss: 38.394039
Epoch: 1500/10000, loss: 38.079037
Epoch: 1600/10000, loss: 37.775555
Epoch: 1700/10000, loss: 37.483067
Epoch: 1800/10000, loss: 37.201080
Epoch: 1900/10000, loss: 36.929047
Epoch: 2000/10000, loss: 36.666267
Epoch: 2100/10000, loss: 36.412544
Epoch: 2200/10000, loss: 36.167397
Epoch: 2300/10000, loss: 35.930252
Epoch: 2400/10000, loss: 35.700775
Epoch: 2500/10000, loss: 35.478638
Epoch: 2600/10000, loss: 35.263390
Epoch: 2700/10000, loss: 35.054775
Epoch: 2800/10000, loss: 34.852394
Epoch: 2900/10000, loss: 34.656013
Epoch: 3000/10000, loss: 34.465225
Epoch: 3100/10000, loss: 34.279915
Epoch: 3200/10000, loss: 34.099678
Epoch: 3300/10000, loss: 33.924290
Epoch: 3400/10000, loss: 33.753643
Epoch: 3500/10000, loss: 33.587376
Epoch: 3600/10000, loss: 33.425301
Epoch: 3700/10000, loss: 33.267204
Epoch: 3800/10000, loss: 33.112991
Epoch: 3900/10000, loss: 32.962395
Epoch: 4000/10000, loss: 32.815300
Epoch: 4100/10000, loss: 32.671444
Epoch: 4200/10000, loss: 32.530815
Epoch: 4300/10000, loss: 32.393181
Epoch: 4400/10000, loss: 32.258480
Epoch: 4500/10000, loss: 32.126400
Epoch: 4600/10000, loss: 31.997028
Epoch: 4700/10000, loss: 31.870102
Epoch: 4800/10000, loss: 31.745564
Epoch: 4900/10000, loss: 31.623352
Epoch: 5000/10000, loss: 31.503361
Epoch: 5100/10000, loss: 31.385387
Epoch: 5200/10000, loss: 31.269455
Epoch: 5300/10000, loss: 31.155386
Epoch: 5400/10000, loss: 31.043224
Epoch: 5500/10000, loss: 30.932774
Epoch: 5600/10000, loss: 30.824055
Epoch: 5700/10000, loss: 30.716915
Epoch: 5800/10000, loss: 30.611334
Epoch: 5900/10000, loss: 30.507303
Epoch: 6000/10000, loss: 30.404682
Epoch: 6100/10000, loss: 30.303339
Epoch: 6200/10000, loss: 30.203369
Epoch: 6300/10000, loss: 30.104742
Epoch: 6400/10000, loss: 30.007236
Epoch: 6500/10000, loss: 29.910976
Epoch: 6600/10000, loss: 29.815811
Epoch: 6700/10000, loss: 29.721771
Epoch: 6800/10000, loss: 29.628819
Epoch: 6900/10000, loss: 29.536825
Epoch: 7000/10000, loss: 29.445801
Epoch: 7100/10000, loss: 29.355770
Epoch: 7200/10000, loss: 29.266705
Epoch: 7300/10000, loss: 29.178417
Epoch: 7400/10000, loss: 29.091028
Epoch: 7500/10000, loss: 29.004566
Epoch: 7600/10000, loss: 28.918858
Epoch: 7700/10000, loss: 28.833883
Epoch: 7800/10000, loss: 28.749701
Epoch: 7900/10000, loss: 28.666290
Epoch: 8000/10000, loss: 28.583557
Epoch: 8100/10000, loss: 28.501522
Epoch: 8200/10000, loss: 28.420135
Epoch: 8300/10000, loss: 28.339466
Epoch: 8400/10000, loss: 28.259365
Epoch: 8500/10000, loss: 28.179977
Epoch: 8600/10000, loss: 28.101202
Epoch: 8700/10000, loss: 28.023005
Epoch: 8800/10000, loss: 27.945362
Epoch: 8900/10000, loss: 27.868311
Epoch: 9000/10000, loss: 27.791782
Epoch: 9100/10000, loss: 27.715847
Epoch: 9200/10000, loss: 27.640408
Epoch: 9300/10000, loss: 27.565521
Epoch: 9400/10000, loss: 27.491138
Epoch: 9500/10000, loss: 27.417210
Epoch: 9600/10000, loss: 27.343838
Epoch: 9700/10000, loss: 27.270786
Epoch: 9800/10000, loss: 27.198359
Epoch: 9900/10000, loss: 27.126339
Epoch: 10000/10000, loss: 27.054819
In [78]:
print(list(model.parameters()))
[Parameter containing:
tensor([[0.3478, 0.6414, 1.0172]], requires_grad=True), Parameter containing:
tensor([-0.2856], requires_grad=True)]
In [79]:
# predict for a new sample with all three scores equal to 93
x_test = torch.FloatTensor([[93, 93, 93]])
y_pred = model(x_test)
print(y_pred)
tensor([[186.3026]], grad_fn=<AddmmBackward0>)
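As before, this is the learned affine map: (0.3478 + 0.6414 + 1.0172) · 93 − 0.2856 ≈ 186.31, which agrees with the prediction up to the rounding of the printed parameters.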
4. Predicting ground temperature from air temperature with the temps.csv data¶
In [80]:
import pandas as pd
In [81]:
df = pd.read_csv('/content/drive/MyDrive/KDT/6. 머신러닝과 딥러닝/Data/temps.csv', encoding='euc-kr')
df
Out[81]:
|  | 지점 | 지점명 | 일시 | 기온(°C) | 지면온도(°C) |
| --- | --- | --- | --- | --- | --- |
| 0 | 232 | 천안 | 2020-01-01 01:00 | -8.7 | -2.9 |
| 1 | 232 | 천안 | 2020-01-01 02:00 | -7.3 | -2.4 |
| 2 | 232 | 천안 | 2020-01-01 03:00 | -6.7 | -2.2 |
| 3 | 232 | 천안 | 2020-01-01 04:00 | -6.2 | -2.0 |
| 4 | 232 | 천안 | 2020-01-01 05:00 | -5.9 | -1.9 |
| ... | ... | ... | ... | ... | ... |
| 8777 | 232 | 천안 | 2020-12-31 19:00 | -6.6 | -0.6 |
| 8778 | 232 | 천안 | 2020-12-31 20:00 | -6.4 | -0.7 |
| 8779 | 232 | 천안 | 2020-12-31 21:00 | -7.3 | -1.2 |
| 8780 | 232 | 천안 | 2020-12-31 22:00 | -9.0 | -1.5 |
| 8781 | 232 | 천안 | 2020-12-31 23:00 | -9.2 | -1.2 |

8782 rows × 5 columns
In [82]:
df.isnull().mean()
Out[82]:
지점          0.000000
지점명         0.000000
일시          0.000000
기온(°C)      0.000342
지면온도(°C)    0.000000
dtype: float64
In [83]:
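# drop the rows with missing values -- 기온(°C) is missing in about 0.03% of rows, i.e. 3 of the 8,782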
df = df.dropna()
In [84]:
df.isnull().mean()
Out[84]:
지점          0.0
지점명         0.0
일시          0.0
기온(°C)      0.0
지면온도(°C)    0.0
dtype: float64
In [85]:
x_data = df[['기온(°C)']]
y_data = df[['지면온도(°C)']]
In [86]:
x_data = torch.FloatTensor(x_data.values)
y_data = torch.FloatTensor(y_data.values)
print(x_data.shape)
print(y_data.shape)
torch.Size([8779, 1]) torch.Size([8779, 1])
In [87]:
x_data
Out[87]:
tensor([[-8.7000],
        [-7.3000],
        [-6.7000],
        ...,
        [-7.3000],
        [-9.0000],
        [-9.2000]])
In [88]:
y_data
Out[88]:
tensor([[-2.9000],
        [-2.4000],
        [-2.2000],
        ...,
        [-1.2000],
        [-1.5000],
        [-1.2000]])
In [89]:
plt.figure(figsize=(8, 6))
plt.scatter(x_data, y_data)
Out[89]:
<matplotlib.collections.PathCollection at 0x78063290c160>
In [90]:
model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
print(list(model.parameters()))
[Parameter containing:
tensor([[-0.5700]], requires_grad=True), Parameter containing:
tensor([0.2403], requires_grad=True)]
In [91]:
epochs = 10000
for epoch in range(epochs + 1):
    y_pred = model(x_data)
    loss = nn.MSELoss()(y_pred, y_data)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f'Epoch: {epoch}/{epochs}, loss: {loss:.6f}')
Epoch: 0/10000, loss: 727.285767
Epoch: 100/10000, loss: 13.042105
Epoch: 200/10000, loss: 13.029263
Epoch: 300/10000, loss: 13.018347
Epoch: 400/10000, loss: 13.009065
Epoch: 500/10000, loss: 13.001172
Epoch: 600/10000, loss: 12.994462
Epoch: 700/10000, loss: 12.988756
Epoch: 800/10000, loss: 12.983907
Epoch: 900/10000, loss: 12.979782
Epoch: 1000/10000, loss: 12.976276
Epoch: 1100/10000, loss: 12.973297
Epoch: 1200/10000, loss: 12.970764
Epoch: 1300/10000, loss: 12.968608
Epoch: 1400/10000, loss: 12.966776
Epoch: 1500/10000, loss: 12.965218
Epoch: 1600/10000, loss: 12.963895
Epoch: 1700/10000, loss: 12.962769
Epoch: 1800/10000, loss: 12.961811
Epoch: 1900/10000, loss: 12.960998
Epoch: 2000/10000, loss: 12.960305
Epoch: 2100/10000, loss: 12.959718
Epoch: 2200/10000, loss: 12.959216
Epoch: 2300/10000, loss: 12.958792
Epoch: 2400/10000, loss: 12.958429
Epoch: 2500/10000, loss: 12.958122
Epoch: 2600/10000, loss: 12.957862
Epoch: 2700/10000, loss: 12.957641
Epoch: 2800/10000, loss: 12.957451
Epoch: 2900/10000, loss: 12.957289
Epoch: 3000/10000, loss: 12.957153
Epoch: 3100/10000, loss: 12.957036
Epoch: 3200/10000, loss: 12.956938
Epoch: 3300/10000, loss: 12.956856
Epoch: 3400/10000, loss: 12.956782
Epoch: 3500/10000, loss: 12.956722
Epoch: 3600/10000, loss: 12.956671
Epoch: 3700/10000, loss: 12.956627
Epoch: 3800/10000, loss: 12.956591
Epoch: 3900/10000, loss: 12.956558
Epoch: 4000/10000, loss: 12.956532
Epoch: 4100/10000, loss: 12.956508
Epoch: 4200/10000, loss: 12.956489
Epoch: 4300/10000, loss: 12.956473
Epoch: 4400/10000, loss: 12.956459
Epoch: 4500/10000, loss: 12.956448
Epoch: 4600/10000, loss: 12.956436
Epoch: 4700/10000, loss: 12.956429
Epoch: 4800/10000, loss: 12.956419
Epoch: 4900/10000, loss: 12.956414
Epoch: 5000/10000, loss: 12.956409
Epoch: 5100/10000, loss: 12.956404
Epoch: 5200/10000, loss: 12.956400
Epoch: 5300/10000, loss: 12.956397
Epoch: 5400/10000, loss: 12.956393
Epoch: 5500/10000, loss: 12.956392
Epoch: 5600/10000, loss: 12.956391
Epoch: 5700/10000, loss: 12.956388
Epoch: 5800/10000, loss: 12.956388
Epoch: 5900/10000, loss: 12.956385
Epoch: 6000/10000, loss: 12.956385
Epoch: 6100/10000, loss: 12.956384
Epoch: 6200/10000, loss: 12.956383
Epoch: 6300/10000, loss: 12.956382
Epoch: 6400/10000, loss: 12.956382
Epoch: 6500/10000, loss: 12.956381
Epoch: 6600/10000, loss: 12.956381
Epoch: 6700/10000, loss: 12.956380
Epoch: 6800/10000, loss: 12.956379
Epoch: 6900/10000, loss: 12.956380
Epoch: 7000/10000, loss: 12.956379
Epoch: 7100/10000, loss: 12.956380
Epoch: 7200/10000, loss: 12.956379
Epoch: 7300/10000, loss: 12.956380
Epoch: 7400/10000, loss: 12.956379
Epoch: 7500/10000, loss: 12.956378
Epoch: 7600/10000, loss: 12.956379
Epoch: 7700/10000, loss: 12.956380
Epoch: 7800/10000, loss: 12.956379
Epoch: 7900/10000, loss: 12.956379
Epoch: 8000/10000, loss: 12.956379
Epoch: 8100/10000, loss: 12.956379
Epoch: 8200/10000, loss: 12.956379
Epoch: 8300/10000, loss: 12.956379
Epoch: 8400/10000, loss: 12.956377
Epoch: 8500/10000, loss: 12.956380
Epoch: 8600/10000, loss: 12.956379
Epoch: 8700/10000, loss: 12.956379
Epoch: 8800/10000, loss: 12.956379
Epoch: 8900/10000, loss: 12.956379
Epoch: 9000/10000, loss: 12.956379
Epoch: 9100/10000, loss: 12.956379
Epoch: 9200/10000, loss: 12.956379
Epoch: 9300/10000, loss: 12.956378
Epoch: 9400/10000, loss: 12.956378
Epoch: 9500/10000, loss: 12.956379
Epoch: 9600/10000, loss: 12.956379
Epoch: 9700/10000, loss: 12.956379
Epoch: 9800/10000, loss: 12.956379
Epoch: 9900/10000, loss: 12.956379
Epoch: 10000/10000, loss: 12.956379
In [92]:
list(model.parameters()) # W: 1.0854, b: 0.8198
Out[92]:
[Parameter containing:
 tensor([[1.0854]], requires_grad=True),
 Parameter containing:
 tensor([0.8198], requires_grad=True)]
In [94]:
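# detach() cuts the result out of the autograd graph so it can be converted to NumPy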
y_pred = model(x_data).detach().numpy()
y_pred
Out[94]:
array([[-8.623071],
       [-7.103529],
       [-6.452296],
       ...,
       [-7.103529],
       [-8.948688],
       [-9.165765]], dtype=float32)
In [95]:
plt.figure(figsize=(8, 6))
plt.scatter(x_data, y_data)
plt.scatter(x_data, y_pred)
Out[95]:
<matplotlib.collections.PathCollection at 0x7806328b1810>
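The second scatter call traces the fitted line y ≈ 1.0854x + 0.8198. Drawing it explicitly as a line gives the same picture (a minimal sketch; xs is a hypothetical name):

xs = torch.linspace(float(x_data.min()), float(x_data.max()), steps=100).reshape(-1, 1)
plt.figure(figsize=(8, 6))
plt.scatter(x_data, y_data)                            # observations
plt.plot(xs, model(xs).detach().numpy(), color='red')  # fitted regression line
plt.show()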
In [96]:
result = model(torch.FloatTensor([[26]]))
print(result)
tensor([[29.0399]], grad_fn=<AddmmBackward0>)