LayerNorm

torch.nn.modules.normalization.LayerNorm

Applies Layer Normalization over a mini-batch of inputs as described in the paper Layer Normalization: https://arxiv.org/abs/1607.06450

$$ y = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta $$

The mean and standard deviation are calculated over the last D dimensions, where D is the number of dimensions in normalized_shape. For example, if normalized_shape is (3, 5) (a 2-dimensional shape), the mean and standard deviation are computed over the last 2 dimensions of the input (i.e. input.mean((-2, -1))). γ and β are learnable affine transform parameters of shape normalized_shape if elementwise_affine is true. The variance is calculated via the biased estimator, equivalent to torch.var(input, unbiased=False).
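The computation described above can be sketched in plain Scala for a single one-dimensional slice. This is an illustration only: `layerNorm1d` is a hypothetical helper, not part of the library, and it works on `Seq[Double]` rather than tensors.

```scala
// Hypothetical helper for illustration; not part of the library API.
def layerNorm1d(
    x: Seq[Double],
    gamma: Seq[Double],
    beta: Seq[Double],
    eps: Double = 1e-5
): Seq[Double] =
  require(x.nonEmpty && x.length == gamma.length && x.length == beta.length)
  val mean = x.sum / x.length
  // Biased estimator: divide by n rather than n - 1.
  val variance = x.map(v => (v - mean) * (v - mean)).sum / x.length
  val denom = math.sqrt(variance + eps)
  // Normalize, then apply the per-element scale (gamma) and shift (beta).
  x.indices.map(i => (x(i) - mean) / denom * gamma(i) + beta(i))

val x = Seq(1.0, 2.0, 3.0, 4.0)
// gamma = 1 and beta = 0 correspond to the documented initial values.
val y = layerNorm1d(x, Seq.fill(4)(1.0), Seq.fill(4)(0.0))
```

With the initial affine parameters, the output has a mean of approximately zero and a variance slightly below one, because eps is added to the denominator.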

Value parameters

normalized_shape

– input shape from an expected input of size [∗ × normalized_shape[0] × normalized_shape[1] × … × normalized_shape[−1]]. If a single integer is used, it is treated as a singleton list, and this module will normalize over the last dimension, which is expected to be of that specific size.
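To illustrate the single-integer case, the sketch below normalizes each row of a small two-dimensional input independently, which is what normalizing over only the last dimension means. `normalizeLastDim` is a hypothetical helper written in plain Scala, and the affine parameters are left out for brevity.

```scala
// Hypothetical helper for illustration; not part of the library API.
// Each inner Seq plays the role of the last dimension of the input.
def normalizeLastDim(
    input: Seq[Seq[Double]],
    eps: Double = 1e-5
): Seq[Seq[Double]] =
  input.map { row =>
    val mean = row.sum / row.length
    // Biased variance of this row only; other rows do not contribute.
    val variance = row.map(v => (v - mean) * (v - mean)).sum / row.length
    row.map(v => (v - mean) / math.sqrt(variance + eps))
  }

// Two rows on very different scales.
val rows = Seq(Seq(1.0, 2.0, 3.0), Seq(10.0, 20.0, 30.0))
val normalizedRows = normalizeLastDim(rows)
```

Because statistics are computed per row, both rows end up with nearly identical normalized values despite their different scales.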

elementwise_affine

– a boolean value; when set to true, this module has learnable per-element affine parameters initialized to ones (for weights) and zeros (for biases). Default: true.

eps

– a value added to the denominator for numerical stability. Default: 1e-5

Attributes

Note

Unlike Batch Normalization and Instance Normalization, which apply a scalar scale and bias to each entire channel/plane with the affine option, Layer Normalization applies a per-element scale and bias with elementwise_affine.
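The difference shows up in the number of learnable values. The following plain-Scala arithmetic is a rough sketch, assuming an input with C = 5 channels and 10 × 10 spatial dimensions normalized over (C, H, W); the variable names are illustrative only.

```scala
// Assumed shapes for illustration: C channels, H x W spatial dimensions.
val (c, h, w) = (5, 10, 10)
val normalizedShape = Seq(c, h, w)

// Per-element affine: one scale and one bias for every normalized element.
val layerNormParams = 2 * normalizedShape.product

// Per-channel affine: one scale and one bias for each channel.
val perChannelParams = 2 * c
```

Under these assumptions the per-element affine holds 1000 values, compared with 10 for a per-channel affine.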

Example

 // NLP Example
 val Seq(batch, sentence_length, embedding_dim) = Seq(20, 5, 10)
 val embedding = torch.randn(batch, sentence_length, embedding_dim)
 val layer_norm = nn.LayerNorm(embedding_dim)
 // Activate module
 val out = layer_norm(embedding)
 // Image Example
 val Seq(n, c, h, w) = Seq(20, 5, 10, 10)
 val input = torch.randn(n, c, h, w)
 // Normalize over the last three dimensions (i.e. the channel and spatial dimensions)
 val image_layer_norm = nn.LayerNorm(Seq(c, h, w))
 val output = image_layer_norm(input)
Source
LayerNorm.scala
Supertypes
trait TensorModule[ParamType]
trait Tensor[ParamType] => Tensor[ParamType]
trait HasWeight[ParamType]
class Module
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def apply(t: Tensor[ParamType]): Tensor[ParamType]

Attributes

Source
LayerNorm.scala
override def hasBias(): Boolean

Attributes

Definition Classes
Source
LayerNorm.scala

Inherited methods

def andThen[A](g: Tensor[ParamType] => A): T1 => A

Attributes

Inherited from:
Function1
def apply(fn: Module => Unit): Module.this.type

Attributes

Inherited from:
Module
Source
Module.scala
def compose[A](g: A => Tensor[ParamType]): A => R

Attributes

Inherited from:
Function1
def eval(): Unit

Attributes

Inherited from:
Module
Source
Module.scala
def load(inputArchive: InputArchive): Unit

Attributes

Inherited from:
Module
Source
Module.scala
def parameters: Seq[Tensor[_]]

Attributes

Inherited from:
Module
Source
Module.scala
def register[M <: Module](child: M, n: String)(using name: Name): M

Attributes

Inherited from:
Module
Source
Module.scala

Adds a buffer to the module.

Attributes

Inherited from:
Module
Source
Module.scala
def registerBuffer[D <: DType](t: Tensor[D], n: String)(using name: Name): Tensor[D]

Attributes

Inherited from:
Module
Source
Module.scala
def registerModule[M <: Module](child: M, n: String)(using name: Name): M

Attributes

Inherited from:
Module
Source
Module.scala
def registerParameter[D <: DType](t: Tensor[D], requiresGrad: Boolean, n: String)(using name: Name): Tensor[D]

Attributes

Inherited from:
Module
Source
Module.scala
def save(outputArchive: OutputArchive): Unit

Attributes

Inherited from:
Module
Source
Module.scala
def to(device: Device): Module.this.type

Attributes

Inherited from:
Module
Source
Module.scala
override def toString(): String

Returns a string representation of the object.

The default representation is platform dependent.

Attributes

Returns

a string representation of the object.

Definition Classes
Inherited from:
TensorModule
Source
Module.scala
def train(on: Boolean): Unit

Attributes

Inherited from:
Module
Source
Module.scala

Concrete fields

val bias: Tensor[ParamType]

Attributes

Source
LayerNorm.scala
val weight: Tensor[ParamType]

Attributes

Source
LayerNorm.scala