Linear Regression Using Gradient Descent In PyTorch
We’ll apply linear regression using gradient descent in PyTorch on the Boston Housing Dataset.
- Download Data
- Read CSV
- Create TensorDataset
- Create DataLoader
- Create a Model
- Prepare a Loss Function
- Prepare a Stochastic Gradient Descent (SGD) Optimizer
- Write the Epochs Loop
- Put It All Together
Download Data
Download Boston.csv from here into a data folder. You can read about the dataset here.
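If you'd rather script the download, here is a minimal sketch using the standard library; the URL below is a placeholder you'd replace with the actual link above, and the data folder is created if it doesn't already exist.
import os
import urllib.request
# Placeholder URL: substitute the actual Boston.csv link referenced above
url = "https://example.com/Boston.csv"
os.makedirs("data", exist_ok=True)
urllib.request.urlretrieve(url, "data/Boston.csv")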
Read CSV
Let’s use pandas to read in the CSV.
import pandas as pd
df = pd.read_csv("data/Boston.csv", sep=",", header=0, index_col=0)
x = df.iloc[:, :-1]
# The target variable is the last column
# Note that using df.iloc[:, -1:] rather than df.iloc[:, -1] ensures
# we'll have a dataframe rather than a series, which maps to a 2-dimensional array
y = df.iloc[:, -1:]
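To confirm the split, we can inspect the shapes. Assuming the standard version of the dataset (506 rows, 13 feature columns plus the MEDV target), we'd expect something like the following.
# x is a DataFrame of features, y a single-column DataFrame of the target
print(x.shape)  # (506, 13)
print(y.shape)  # (506, 1) -- a DataFrame, not a Series, thanks to the -1: slice
print(type(y))  # <class 'pandas.core.frame.DataFrame'>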
Create TensorDataset
Convert the read-in dataframes into tensors that are passed in to create a TensorDataset.
from torch.utils.data import TensorDataset
import torch
x = torch.from_numpy(x.to_numpy()).float()
y = torch.from_numpy(y.to_numpy()).float()
ds_train = TensorDataset(x, y)
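Indexing into the TensorDataset returns a (features, target) tuple per row, which is exactly what the DataLoader will batch in the next step. A quick check, assuming the shapes above:
# A TensorDataset simply pairs up corresponding rows of x and y
features, target = ds_train[0]
print(features.shape)  # torch.Size([13])
print(target.shape)    # torch.Size([1])
print(len(ds_train))   # 506 rows in the standard dataset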
Create DataLoader
Pass in the TensorDataset to create a DataLoader, which shuffles the rows and places them into batches that we can iterate over.
from torch.utils.data import DataLoader
batch_size = 32
dl_train = DataLoader(ds_train, batch_size, shuffle=True)
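As a sanity check, we can pull a single batch from the DataLoader; with a batch size of 32 and the shapes assumed above, a batch looks like this.
# Grab one shuffled batch of 32 rows
xb, yb = next(iter(dl_train))
print(xb.shape)  # torch.Size([32, 13])
print(yb.shape)  # torch.Size([32, 1])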
Create a Model
We’ll create a model consisting of one linear layer that has as many inputs as the number of features and 1 output (corresponding to 1 target variable).
This linear layer effectively encapsulates the random initialization of the weights (one per feature) and the bias (one for the single output), and applies the formula x @ weights.t() + bias when x is passed in.
from torch.nn import Linear
model = Linear(x.shape[1], 1)
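We can verify that formula directly against the layer's parameters; this is just a sanity check, not part of the training code.
# The layer stores a (1, num_features) weight matrix and a (1,) bias
print(model.weight.shape)  # torch.Size([1, num_features])
print(model.bias.shape)    # torch.Size([1])
# model(x) is equivalent to x @ weights.t() + bias
manual = x @ model.weight.t() + model.bias
print(torch.allclose(model(x), manual))  # True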
Prepare a Loss Function
Since it’s a linear regression model, we’ll use Mean Squared Error (MSE) as the loss function.
from torch.nn.functional import mse_loss
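mse_loss averages the squared differences between predictions and targets; a small sketch with the (still untrained) model makes that concrete.
# MSE is the mean of squared errors; large before training
preds = model(x)
print(mse_loss(preds, y).item())
print(((preds - y) ** 2).mean().item())  # same value, computed by hand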
Prepare a Stochastic Gradient Descent (SGD) Optimizer
We need to prepare a Stochastic Gradient Descent (SGD) optimizer that encapsulates updating the model parameters (weights and bias) without affecting their gradients, as well as zeroing the gradients. We start with a learning rate of 1e-5, but we may adjust this if the loss doesn’t converge.
from torch.optim import SGD
lr = 1e-5
opt = SGD(model.parameters(), lr)
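Conceptually, opt.step() applies one gradient-descent update to each parameter and opt.zero_grad() clears the gradients; roughly, it corresponds to doing this by hand under torch.no_grad() so the update itself isn't tracked.
# Roughly what opt.step() and opt.zero_grad() do for plain SGD
with torch.no_grad():
    for p in model.parameters():
        if p.grad is not None:
            p -= lr * p.grad   # update without affecting gradient tracking
            p.grad.zero_()     # reset the gradient for the next batch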
Write the Epochs Loop
We can iterate on 100 epochs where each epoch consists of iterating through all the batches. With each batch, we compute the predictions, the loss, the gradients, and then we update the parameters and zero the gradients.
for epoch in range(1, 101):
    for xb, yb in dl_train:
        # Make predictions
        preds = model(xb)
        # Compute loss
        loss = mse_loss(preds, yb)
        # Compute gradients
        loss.backward()
        # Update parameters
        opt.step()
        # Zero gradients
        opt.zero_grad()
    if epoch % 10 == 0:
        print("Epoch {}: Loss: {:.2f}".format(epoch, loss.item()))
Put It All Together
We notice that the loss appears as nan or inf, suggesting the learning rate may be too large. Thus, we try a smaller learning rate, such as lr = 1e-7, set a manual seed for reproducibility, and put all the code together. The result is a model with a loss of about 89, which corresponds to being off by about 9 on average.
import pandas as pd
import torch
from torch.utils.data import TensorDataset, DataLoader
from torch.nn import Linear
from torch.nn.functional import mse_loss
from torch.optim import SGD
torch.manual_seed(1337)
df = pd.read_csv("data/Boston.csv", sep=",", header=0, index_col=0)
x = df.iloc[:, :-1]
y = df.iloc[:, -1:]
x = torch.from_numpy(x.to_numpy()).float()
y = torch.from_numpy(y.to_numpy()).float()
ds_train = TensorDataset(x, y)
batch_size = 32
dl_train = DataLoader(ds_train, batch_size, shuffle=True)
model = Linear(x.shape[1], 1)
lr = 1e-7
opt = SGD(model.parameters(), lr)
for epoch in range(1, 101):
    for xb, yb in dl_train:
        # Make predictions
        preds = model(xb)
        # Compute loss
        loss = mse_loss(preds, yb)
        # Compute gradients
        loss.backward()
        # Update parameters
        opt.step()
        # Zero gradients
        opt.zero_grad()
    if epoch % 10 == 0:
        print("Epoch {}: Loss: {:.2f}".format(epoch, loss.item()))