SGD

torch.optim.SGD
class SGD(params: Iterable[Tensor[_]], lr: Float) extends Optimizer

Implements stochastic gradient descent (optionally with momentum).
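A minimal construction sketch against the signature shown above (the `modelParameters` collection is a placeholder for whatever trainable tensors a model exposes; it is not defined on this page):

```scala
import torch.Tensor
import torch.optim.SGD

// Build an SGD optimizer over a model's trainable tensors.
// `modelParameters` is a caller-supplied placeholder for an Iterable[Tensor[?]].
def makeOptimizer(modelParameters: Iterable[Tensor[?]]): SGD =
  SGD(modelParameters, lr = 0.01f)
```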

$$
\begin{aligned}
&\rule{110mm}{0.4pt} \\
&\textbf{input} : \gamma \text{ (lr)}, \: \theta_0 \text{ (params)}, \: f(\theta) \text{ (objective)}, \: \lambda \text{ (weight decay)}, \\
&\hspace{13mm} \: \mu \text{ (momentum)}, \: \tau \text{ (dampening)}, \: \textit{nesterov}, \: \textit{maximize} \\[-1.ex]
&\rule{110mm}{0.4pt} \\
&\textbf{for} \: t = 1 \: \textbf{to} \: \ldots \: \textbf{do} \\
&\hspace{5mm} g_t \leftarrow \nabla_{\theta} f_t(\theta_{t-1}) \\
&\hspace{5mm} \textbf{if} \: \lambda \neq 0 \\
&\hspace{10mm} g_t \leftarrow g_t + \lambda \theta_{t-1} \\
&\hspace{5mm} \textbf{if} \: \mu \neq 0 \\
&\hspace{10mm} \textbf{if} \: t > 1 \\
&\hspace{15mm} \textbf{b}_t \leftarrow \mu \textbf{b}_{t-1} + (1 - \tau) g_t \\
&\hspace{10mm} \textbf{else} \\
&\hspace{15mm} \textbf{b}_t \leftarrow g_t \\
&\hspace{10mm} \textbf{if} \: \textit{nesterov} \\
&\hspace{15mm} g_t \leftarrow g_{t-1} + \mu \textbf{b}_t \\
&\hspace{10mm} \textbf{else} \\[-1.ex]
&\hspace{15mm} g_t \leftarrow \textbf{b}_t \\
&\hspace{5mm} \textbf{if} \: \textit{maximize} \\
&\hspace{10mm} \theta_t \leftarrow \theta_{t-1} + \gamma g_t \\[-1.ex]
&\hspace{5mm} \textbf{else} \\[-1.ex]
&\hspace{10mm} \theta_t \leftarrow \theta_{t-1} - \gamma g_t \\[-1.ex]
&\rule{110mm}{0.4pt} \\[-1.ex]
&\textbf{return} : \theta_t \\[-1.ex]
&\rule{110mm}{0.4pt} \\[-1.ex]
\end{aligned}
$$

Nesterov momentum is based on the formula from "On the importance of initialization and momentum in deep learning" (Sutskever et al., 2013).
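For readers who prefer code to pseudocode, the update above can be sketched in plain Scala over Float arrays. The names mirror the symbols in the algorithm (γ = lr, λ = weight decay, μ = momentum, τ = dampening); this is an illustration, not the library's implementation:

```scala
// Illustrative sketch of one SGD update over plain Float arrays.
// Mirrors the pseudocode above; not the library's actual implementation.
def sgdStep(
    theta: Array[Float],        // parameters θ_{t-1}, updated in place
    grad: Array[Float],         // gradient g_t of the objective w.r.t. θ_{t-1}
    buf: Option[Array[Float]],  // momentum buffer b_{t-1}, None before the first step
    lr: Float,
    weightDecay: Float = 0.0f,
    momentum: Float = 0.0f,
    dampening: Float = 0.0f,
    nesterov: Boolean = false,
    maximize: Boolean = false
): Option[Array[Float]] =       // returns the updated momentum buffer b_t
  val g = grad.clone()
  if weightDecay != 0.0f then   // g_t ← g_t + λ θ_{t-1}
    for i <- theta.indices do g(i) += weightDecay * theta(i)
  val newBuf =
    if momentum != 0.0f then
      val b = buf match
        case Some(prev) =>      // b_t ← μ b_{t-1} + (1 - τ) g_t
          for i <- prev.indices do prev(i) = momentum * prev(i) + (1.0f - dampening) * g(i)
          prev
        case None => g.clone()  // first step: b_t ← g_t
      if nesterov then          // Nesterov: add the momentum look-ahead to the gradient
        for i <- g.indices do g(i) = g(i) + momentum * b(i)
      else
        for i <- g.indices do g(i) = b(i)
      Some(b)
    else buf
  val sign = if maximize then 1.0f else -1.0f  // ascend when maximizing
  for i <- theta.indices do theta(i) += sign * lr * g(i)
  newBuf
```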

Attributes

Source
SGD.scala
Supertypes
class Optimizer
class Object
trait Matchable
class Any

Members list

Value members

Inherited methods

def step(): Unit

Performs a single optimization step (parameter update).

Attributes

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.

Inherited from:
Optimizer
Source
Optimizer.scala
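For context, a sketch of the usual per-iteration call order. `computeLoss` is a placeholder for a forward pass that is assumed to return a scalar loss tensor whose backward() populates the .grad fields that step() then reads:

```scala
import torch.Tensor
import torch.optim.SGD

// Typical order of operations in one training iteration.
// `computeLoss` is a placeholder for the forward pass.
def trainStep(optimizer: SGD, computeLoss: () => Tensor[?]): Unit =
  optimizer.zeroGrad()      // clear gradients left over from the previous iteration
  val loss = computeLoss()  // forward pass (placeholder)
  loss.backward()           // compute fresh gradients into .grad
  optimizer.step()          // apply the SGD update using the current .grad values
```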

def zeroGrad(): Unit

Sets the gradients of all optimized Tensors to zero.

Attributes

Inherited from:
Optimizer
Source
Optimizer.scala
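zeroGrad() matters because, as in the underlying PyTorch autograd, backward() accumulates into .grad rather than overwriting it. A rough sketch with placeholder loss tensors:

```scala
import torch.Tensor
import torch.optim.SGD

// Illustration of gradient accumulation; `loss1` and `loss2` are placeholder
// scalar loss tensors computed from the optimized parameters.
def accumulationExample(optimizer: SGD, loss1: Tensor[?], loss2: Tensor[?]): Unit =
  loss1.backward()       // .grad now holds the gradient of loss1
  loss2.backward()       // without an intervening zeroGrad(), .grad now holds the sum
  optimizer.zeroGrad()   // reset every optimized tensor's gradient to zero
```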