Neural networks usually rely on back propagation to obtain the gradients used to update the network parameters, and computing those gradients by hand is complex and error-prone. A deep learning framework can carry out the gradient computation for us automatically.
In PyTorch, gradient computation is usually triggered by calling the backward method on a tensor; the resulting gradient is then stored in the grad attribute of the corresponding independent-variable tensor. Alternatively, the torch.autograd.grad function can be called to compute gradients.
This is PyTorch's automatic differentiation mechanism.
The backward method is usually called on a scalar tensor, and the gradient it produces is stored in the grad attribute of the corresponding independent-variable tensor.
If the tensor on which backward is called is not a scalar, a gradient parameter tensor with the same shape as the calling tensor must be passed in.
This is equivalent to taking the dot product of the gradient parameter tensor with the calling tensor to obtain a scalar, and then back-propagating that scalar.
One, Use the backward method to find the derivative
1, Back propagation of a scalar
import numpy as np
import torch
# Derivative of f(x) = a*x**2 + b*x + c
x = torch.tensor(0.0, requires_grad=True)  # x requires a gradient
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)
y = a*torch.pow(x,2) + b*x + c
y.backward()
dy_dx = x.grad
print(dy_dx)
tensor(-2.)
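Note that PyTorch accumulates gradients: each call to backward adds the new gradient into the grad attribute instead of overwriting it, so the attribute is usually zeroed between iterations. A minimal sketch of this behaviour, continuing with the tensors defined above:
# build the same graph again and back-propagate a second time
y2 = a*torch.pow(x, 2) + b*x + c
y2.backward()
print(x.grad)     # tensor(-4.) : the new -2 is added to the previous -2
x.grad.zero_()    # reset the accumulated gradient in place
print(x.grad)     # tensor(0.)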
2, Back propagation of a non-scalar tensor
import numpy as np
import torch
# f(x) = a*x**2 + b*x + c
x = torch.tensor([[0.0, 0.0], [1.0, 2.0]], requires_grad=True)  # x requires a gradient
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)
y = a*torch.pow(x,2) + b*x + c
gradient = torch.tensor([[1.0,1.0],[1.0,1.0]])
print("x:\n",x)
print("y:\n",y)
y.backward(gradient=gradient)
x_grad = x.grad
print("x_grad:\n",x_grad)
Two, Use the autograd.grad method to find the derivative
import numpy as np
import torch
# Derivative of f(x) = a*x**2 + b*x + c
x = torch.tensor(0.0, requires_grad=True)  # x requires a gradient
a = torch.tensor(1.0)
b = torch.tensor(-2.0)
c = torch.tensor(1.0)
y = a*torch.pow(x,2) + b*x + c
# Setting create_graph=True allows higher-order derivatives to be computed
dy_dx = torch.autograd.grad(y,x,create_graph=True)[0]
print(dy_dx.data)
# Take the second derivative
dy2_dx2 = torch.autograd.grad(dy_dx,x)[0]
print(dy2_dx2.data)
tensor(-2.)
tensor(2.)
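torch.autograd.grad returns a tuple containing one gradient per input (hence the [0] above), so it can also compute derivatives with respect to several independent variables in a single call. A minimal sketch, using a hypothetical function of two variables that is not part of the original example:
x1 = torch.tensor(1.0, requires_grad=True)
x2 = torch.tensor(2.0, requires_grad=True)
y = x1*x2 + torch.pow(x1, 2)

# one gradient is returned for each input, in order
dy_dx1, dy_dx2 = torch.autograd.grad(y, [x1, x2])
print(dy_dx1)   # tensor(4.) : dy/dx1 = x2 + 2*x1
print(dy_dx2)   # tensor(1.) : dy/dx2 = x1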