torch.nn.functional

package torch.nn.functional

Grouped members

Loss functions

Function that measures Binary Cross Entropy between target and input logits.

TODO support weight, reduction, pos_weight
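The computation can be illustrated without the library. The following is a minimal sketch in plain Scala, not the storch implementation: it assumes a mean reduction and flat sequences of logits and targets, and the names `BceWithLogitsSketch` and `bceWithLogits` are hypothetical. It uses the numerically stable form of the loss so that large logits do not overflow.

```scala
object BceWithLogitsSketch {
  /** Mean binary cross entropy between logits and targets in [0, 1] (assumed reduction: mean). */
  def bceWithLogits(logits: Seq[Double], targets: Seq[Double]): Double = {
    require(logits.nonEmpty && logits.length == targets.length, "inputs must be non-empty and equally long")
    val losses = logits.zip(targets).map { case (x, y) =>
      // Stable rewrite of -(y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x)))
      math.max(x, 0.0) - x * y + math.log1p(math.exp(-math.abs(x)))
    }
    losses.sum / losses.length
  }
}
```

A logit of 0 corresponds to a predicted probability of 0.5, so the loss for either target is ln 2.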

Attributes

Inherited from: Loss (hidden)
Source: Loss.scala

Linear functions

Applies a bilinear transformation to the incoming data: $y = x_1^T A x_2 + b$

Shape:

input1: $(N, *, H_{in1})$ where $H_{in1}=\text{in1_features}$ and $*$ means any number of additional dimensions. All but the last dimension of the inputs should be the same.
input2: $(N, *, H_{in2})$ where $H_{in2}=\text{in2_features}$
weight: $(\text{out_features}, \text{in1_features}, \text{in2_features})$
bias: $(\text{out_features})$
output: $(N, *, H_{out})$ where $H_{out}=\text{out_features}$ and all but the last dimension are the same shape as the input.
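For a single sample, the formula expands to $y_k = \sum_i \sum_j x_{1,i} A_{k,i,j} x_{2,j} + b_k$. The sketch below spells this out in plain Scala with nested arrays standing in for tensors; `BilinearSketch` and `bilinear` are hypothetical names, not part of the library.

```scala
object BilinearSketch {
  /** Computes y(k) = sum_i sum_j x1(i) * weight(k)(i)(j) * x2(j) + bias(k) for one sample. */
  def bilinear(
      x1: Array[Double],                   // (in1_features)
      x2: Array[Double],                   // (in2_features)
      weight: Array[Array[Array[Double]]], // (out_features, in1_features, in2_features)
      bias: Array[Double]                  // (out_features)
  ): Array[Double] =
    weight.zip(bias).map { case (a, b) =>
      val terms = for {
        i <- x1.indices
        j <- x2.indices
      } yield x1(i) * a(i)(j) * x2(j)
      terms.sum + b
    }
}
```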

Attributes

Inherited from: Linear (hidden)
Source: Linear.scala

Applies a linear transformation to the incoming data: $y = xA^T + b$.

This operation supports a 2-D weight with sparse layout.

Warning

Sparse support is a beta feature and some layout(s)/dtype/device combinations may not be supported, or may not have autograd support. If you notice missing functionality please open a feature request.

This operator supports TensorFloat32.

Shape:

Input: $(*, \text{in_features})$ where $*$ means any number of additional dimensions, including none
Weight: $(\text{out_features}, \text{in_features})$ or $(\text{in_features})$
Bias: $(\text{out_features})$ or $()$
Output: $(*, \text{out_features})$ or $(*)$, based on the shape of the weight
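As an illustration of $y = xA^T + b$ for one input vector and a 2-D weight, here is a minimal plain-Scala sketch. Arrays stand in for tensors, and `LinearSketch` and `linear` are hypothetical names rather than the library's API.

```scala
object LinearSketch {
  /** Each output feature is the dot product of one weight row with x, plus that row's bias. */
  def linear(
      x: Array[Double],             // (in_features)
      weight: Array[Array[Double]], // (out_features, in_features)
      bias: Array[Double]           // (out_features)
  ): Array[Double] =
    weight.zip(bias).map { case (row, b) =>
      row.zip(x).map { case (w, xi) => w * xi }.sum + b
    }
}
```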

Attributes

Inherited from: Linear (hidden)
Source: Linear.scala

Sparse functions

Takes a LongTensor with index values of shape (*) and returns a tensor of shape (*, numClasses) that has zeros everywhere except where the index of the last dimension matches the corresponding value of the input tensor, in which case it is 1.
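For a 1-D input, the encoding can be sketched in plain Scala as follows. `OneHotSketch` and `oneHot` are hypothetical names, and the range check is an assumption about how invalid indices should be handled.

```scala
object OneHotSketch {
  /** Maps each index i to a row of length numClasses with a single 1 at position i. */
  def oneHot(indices: Seq[Int], numClasses: Int): Seq[Seq[Long]] = {
    require(indices.forall(i => i >= 0 && i < numClasses), "index out of range")
    indices.map(i => Seq.tabulate(numClasses)(c => if (c == i) 1L else 0L))
  }
}
```

Indices `Seq(0, 2, 1)` with three classes produce the rows `(1, 0, 0)`, `(0, 0, 1)` and `(0, 1, 0)`.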

Attributes

Inherited from: Sparse (hidden)
Source: Sparse.scala

Pooling functions

Applies a 1D average pooling over an input signal composed of several input planes.

Value parameters

ceilMode: If true, will use ceil instead of floor to compute the output shape. This ensures that every element in the input tensor is covered by a sliding window.
countIncludePad: when true, will include the zero-padding in the averaging calculation.
input: input tensor of shape $(\text{minibatch} , \text{in_channels} , iW)$
kernelSize: the size of the window.
padding: implicit zero paddings on both sides of the input. Can be a single number or a tuple (padW,).
stride: the stride of the window. Default: kernelSize
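The effect of countIncludePad is easiest to see on a small example. The sketch below is plain Scala for a single channel and is not the library implementation: it covers only the floor-mode output length (ceilMode is omitted), and `AvgPool1dSketch` and `avgPool1d` are hypothetical names.

```scala
object AvgPool1dSketch {
  def avgPool1d(
      input: Array[Double],
      kernelSize: Int,
      stride: Int,
      padding: Int = 0,
      countIncludePad: Boolean = true
  ): Array[Double] = {
    require(kernelSize > 0 && stride > 0 && padding >= 0 && padding <= kernelSize / 2)
    require(input.length + 2 * padding >= kernelSize, "window larger than padded input")
    // Floor-mode output length
    val lOut = (input.length + 2 * padding - kernelSize) / stride + 1
    Array.tabulate(lOut) { o =>
      val start = o * stride - padding
      // Positions that fall inside the real input; padded positions contribute zero to the sum
      val valid = (start until start + kernelSize).filter(p => p >= 0 && p < input.length)
      val divisor = if (countIncludePad) kernelSize else valid.length
      valid.map(i => input(i)).sum / divisor
    }
  }
}
```

With input `(1, 2, 3, 4)`, kernelSize 2, stride 2 and padding 1, the edge windows each contain one padded zero, so they are divided by 2 when countIncludePad is true and by 1 otherwise.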

Attributes

See also: torch.nn.AdaptiveAvgPool1d for details and output shape.
Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 2D max pooling over an input signal composed of several input planes.

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 3D max pooling over an input signal composed of several input planes.

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 1D max pooling over an input signal composed of several input planes.

Value parameters

ceilMode: If true, will use ceil instead of floor to compute the output shape. This ensures that every element in the input tensor is covered by a sliding window.
dilation: The stride between elements within a sliding window, must be > 0.
input: input tensor of shape $(\text{minibatch} , \text{in_channels} , iW)$, minibatch dim optional.
kernelSize: the size of the window.
padding: Implicit negative infinity padding to be added on both sides, must be >= 0 and <= kernel_size / 2.
stride: the stride of the window.
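How kernelSize, stride, padding and dilation interact can be sketched for a single channel in plain Scala. This is an illustration under stated assumptions, not the library code: only the floor-mode output length $\lfloor (L_{in} + 2 \cdot padding - dilation \cdot (kernelSize - 1) - 1) / stride \rfloor + 1$ is handled (ceilMode is omitted), and `MaxPool1dSketch` and `maxPool1d` are hypothetical names.

```scala
object MaxPool1dSketch {
  def maxPool1d(
      input: Array[Double],
      kernelSize: Int,
      stride: Int,
      padding: Int = 0,
      dilation: Int = 1
  ): Array[Double] = {
    require(kernelSize > 0 && stride > 0 && dilation > 0)
    require(padding >= 0 && padding <= kernelSize / 2)
    val span = input.length + 2 * padding - dilation * (kernelSize - 1) - 1
    require(span >= 0, "dilated window larger than padded input")
    val lOut = span / stride + 1
    Array.tabulate(lOut) { o =>
      val start = o * stride - padding
      (0 until kernelSize)
        .map(k => start + k * dilation)
        // Out-of-range positions behave like negative infinity padding
        .map(p => if (p >= 0 && p < input.length) input(p) else Double.NegativeInfinity)
        .max
    }
  }
}
```

Because padded positions count as negative infinity, they never win the maximum; they only change how many windows fit.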

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 1D max pooling over an input signal composed of several input planes.

Value parameters

ceilMode: If true, will use ceil instead of floor to compute the output shape. This ensures that every element in the input tensor is covered by a sliding window.
dilation: The stride between elements within a sliding window, must be > 0.
input: input tensor of shape $(\text{minibatch} , \text{in_channels} , iW)$, minibatch dim optional.
kernelSize: the size of the window.
padding: Implicit negative infinity padding to be added on both sides, must be >= 0 and <= kernel_size / 2.
stride: the stride of the window.

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 2D max pooling over an input signal composed of several input planes.

Value parameters

ceilMode: If true, will use ceil instead of floor to compute the output shape. This ensures that every element in the input tensor is covered by a sliding window.
dilation: The stride between elements within a sliding window, must be > 0.
input: input tensor $(\text{minibatch} , \text{in_channels} , iH , iW)$, minibatch dim optional.
kernelSize: size of the pooling region. Can be a single number or a tuple (kH, kW)
padding: Implicit negative infinity padding to be added on both sides, must be >= 0 and <= kernel_size / 2.
stride: stride of the pooling operation. Can be a single number or a tuple (sH, sW)

Attributes

See also: torch.nn.MaxPool2d for details.
Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 2D max pooling over an input signal composed of several input planes.

Value parameters

ceilMode: If true, will use ceil instead of floor to compute the output shape. This ensures that every element in the input tensor is covered by a sliding window.
dilation: The stride between elements within a sliding window, must be > 0.
input: input tensor $(\text{minibatch} , \text{in_channels} , iH , iW)$, minibatch dim optional.
kernelSize: size of the pooling region. Can be a single number or a tuple (kH, kW)
padding: Implicit negative infinity padding to be added on both sides, must be >= 0 and <= kernel_size / 2.
stride: stride of the pooling operation. Can be a single number or a tuple (sH, sW)

Attributes

See also: torch.nn.MaxPool2d for details.
Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 3D max pooling over an input signal composed of several input planes.

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Applies a 3D max pooling over an input signal composed of several input planes.

Attributes

Inherited from: Pooling (hidden)
Source: Pooling.scala

Convolution functions

Applies a 1D convolution over an input signal composed of several input planes.
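For intuition, the sketch below computes a 1D convolution for one input channel and one output channel in plain Scala, with no padding or dilation. As is usual in deep learning libraries, the kernel is slid over the input without being flipped (cross-correlation). `Conv1dSketch` and `conv1d` are hypothetical names, not the library's API.

```scala
object Conv1dSketch {
  def conv1d(
      input: Array[Double],
      kernel: Array[Double],
      bias: Double = 0.0,
      stride: Int = 1
  ): Array[Double] = {
    require(stride > 0 && kernel.nonEmpty && input.length >= kernel.length)
    val lOut = (input.length - kernel.length) / stride + 1
    Array.tabulate(lOut) { o =>
      // Dot product of the kernel with the window starting at o * stride
      kernel.indices.map(k => input(o * stride + k) * kernel(k)).sum + bias
    }
  }
}
```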

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Applies a 2D convolution over an input signal composed of several input planes.

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Applies a 3D convolution over an input image composed of several input planes.

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Applies a 1D transposed convolution operator over an input signal composed of several input planes, sometimes also called “deconvolution”.

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Applies a 2D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Applies a 3D transposed convolution operator over an input image composed of several input planes, sometimes also called “deconvolution”.

Attributes

Inherited from: Convolution (hidden)
Source: Convolution.scala

Dropout functions

During training, randomly zeroes some of the elements of the input tensor with probability p using samples from a Bernoulli distribution.
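A minimal plain-Scala sketch of this behaviour follows. It assumes the usual inverted-dropout convention, in which surviving elements are scaled by $1 / (1 - p)$ so the expected value is unchanged. `DropoutSketch`, `dropout` and the `rng` parameter are hypothetical and not part of the library.

```scala
import scala.util.Random

object DropoutSketch {
  def dropout(
      input: Seq[Double],
      p: Double,
      training: Boolean = true,
      rng: Random = new Random()
  ): Seq[Double] = {
    require(p >= 0.0 && p <= 1.0, "p must be in [0, 1]")
    if (!training || p == 0.0) input      // identity outside training
    else if (p == 1.0) input.map(_ => 0.0) // everything is dropped
    else {
      val scale = 1.0 / (1.0 - p)
      // One Bernoulli draw per element: zero with probability p, otherwise rescale
      input.map(x => if (rng.nextDouble() < p) 0.0 else x * scale)
    }
  }
}
```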

Attributes

See also: torch.nn.Dropout for details.
Inherited from: Dropout (hidden)
Source: Dropout.scala

Non-linear activation functions

Applies a softmax followed by a logarithm.

While mathematically equivalent to log(softmax(x)), doing these two operations separately is slower and numerically unstable. This function uses an alternative formulation to compute the output and gradient correctly.

See torch.nn.LogSoftmax for more details.
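The instability is easy to reproduce. The plain-Scala sketch below contrasts the naive two-step computation with one common stable formulation that subtracts the maximum before exponentiating; whether the underlying library uses exactly this formulation is an assumption, and `LogSoftmaxSketch` is a hypothetical name.

```scala
object LogSoftmaxSketch {
  /** log(softmax(x)) computed in two steps; overflows for large inputs. */
  def naive(x: Array[Double]): Array[Double] = {
    val exps = x.map(v => math.exp(v))
    val total = exps.sum
    exps.map(e => math.log(e / total))
  }

  /** x - logSumExp(x), with the maximum factored out so exp never overflows. */
  def logSoftmax(x: Array[Double]): Array[Double] = {
    val m = x.max
    val logSumExp = m + math.log(x.map(v => math.exp(v - m)).sum)
    x.map(_ - logSumExp)
  }
}
```

For the input `(1000, 1000)` the naive version evaluates `exp(1000)` to infinity and returns NaN, while the stable version returns approximately `-ln 2` for both entries.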

Attributes

Inherited from: Activations (hidden)
Source: Activations.scala

Applies the rectified linear unit function element-wise.

See torch.nn.ReLU for more details.

Attributes

Inherited from: Activations (hidden)
Source: Activations.scala

Applies the element-wise function $\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}$.

See torch.nn.Sigmoid for more details.

Attributes

Inherited from: Activations (hidden)
Source: Activations.scala

Applies the Sigmoid Linear Unit (SiLU) function, element-wise. The SiLU function is also known as the swish function.
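SiLU is defined as $\text{SiLU}(x) = x \cdot \text{Sigmoid}(x)$. A scalar sketch in plain Scala, with the hypothetical names `ActivationSketch`, `sigmoid` and `silu`:

```scala
object ActivationSketch {
  /** Logistic sigmoid: 1 / (1 + exp(-x)). */
  def sigmoid(x: Double): Double = 1.0 / (1.0 + math.exp(-x))

  /** SiLU (swish): the input gated by its own sigmoid. */
  def silu(x: Double): Double = x * sigmoid(x)
}
```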

Attributes

Inherited from: Activations (hidden)
Source: Activations.scala

Applies a softmax function.

Attributes

Inherited from: Activations (hidden)
Source: Activations.scala

Value members

Inherited methods

This criterion computes the cross entropy loss between input logits and target. See torch.nn.loss.CrossEntropyLoss for details.

Shape:

Input: Shape $(C)$, $(N,C)$ or $(N,C,d_1,d_2,...,d_K)$ with $K≥1$ in the case of K-dimensional loss.
Target: If containing class indices, shape $()$, $(N)$ or $(N,d_1,d_2,...,d_K)$ with $K≥1$ in the case of K-dimensional loss, where each value should be in $[0,C)$. If containing class probabilities, same shape as the input, and each value should be in $[0,1]$.

where:

C = number of classes
N = batch size

Value parameters

ignore_index: Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is true, the loss is averaged over non-ignored targets. Note that ignore_index is only applicable when the target contains class indices. Default: -100
input: Predicted unnormalized logits; see Shape section above for supported shapes.
label_smoothing: A float in [0.0, 1.0]. Specifies the amount of smoothing when computing the loss, where 0.0 means no smoothing. The targets become a mixture of the original ground truth and a uniform distribution as described in Rethinking the Inception Architecture for Computer Vision. Default: 0.0
reduce: Deprecated (see reduction). By default, the losses are averaged or summed over observations for each mini-batch depending on size_average. When reduce is false, returns a loss per batch element instead and ignores size_average. Default: true
reduction: Specifies the reduction to apply to the output: 'none' | 'mean' | 'sum'. 'none': no reduction will be applied, 'mean': the sum of the output will be divided by the number of elements in the output, 'sum': the output will be summed. Note: size_average and reduce are in the process of being deprecated, and in the meantime, specifying either of those two args will override reduction. Default: 'mean'
size_average: Deprecated (see reduction). By default, the losses are averaged over each loss element in the batch. Note that for some losses, there are multiple elements per sample. If the field size_average is set to false, the losses are instead summed for each mini-batch. Ignored when reduce is false. Default: true
target: Ground truth class indices or class probabilities; see Shape section above for supported shapes.
weight: A manual rescaling weight given to each class. If given, it has to be a Tensor of size C.
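To make the roles of weight and ignore_index concrete, here is a minimal plain-Scala sketch for an $(N, C)$ input with class-index targets. It is not the library implementation: it assumes the 'mean' reduction normalised by the summed weights of the non-ignored targets, leaves out label smoothing and probability targets, and uses the hypothetical names `CrossEntropySketch` and `crossEntropy`.

```scala
object CrossEntropySketch {
  private def logSoftmax(row: Array[Double]): Array[Double] = {
    val m = row.max
    val logSumExp = m + math.log(row.map(v => math.exp(v - m)).sum)
    row.map(_ - logSumExp)
  }

  def crossEntropy(
      input: Array[Array[Double]],          // (N, C) unnormalized logits
      target: Array[Int],                   // (N) class indices
      weight: Option[Array[Double]] = None, // (C) per-class weights
      ignoreIndex: Int = -100
  ): Double = {
    require(input.length == target.length, "one target per input row")
    // Rows whose target equals ignoreIndex contribute nothing
    val kept = input.zip(target).filter { case (_, t) => t != ignoreIndex }
    val terms = kept.map { case (row, t) =>
      val w = weight.map(ws => ws(t)).getOrElse(1.0)
      (-w * logSoftmax(row)(t), w)
    }
    // Weighted mean over the non-ignored targets (NaN if every target is ignored)
    terms.map(_._1).sum / terms.map(_._2).sum
  }
}
```

With all-zero logits over C classes, every class has probability $1/C$, so the loss is $\ln C$ regardless of the target.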

Attributes

Returns

torch.Tensor

See also

See torch.nn.functional.cross_entropy

See the equivalent torch.nn.CrossEntropyLoss class

See PyTorch C++ documentation

See Bytedeco PyTorch preset

Example

 // Assumes `import torch.*` and `import torch.nn.functional as F`.
 // Argument forms (Seq sizes, camelCase names) are assumed, not verified against a specific release.
 // Example of target with class indices
 val input = torch.randn(Seq(3, 5), requiresGrad = true)
 val target = torch.randint(0, 5, Seq(3), dtype = torch.int64)
 val loss = F.crossEntropy(input, target)
 loss.backward()
 // Example of target with class probabilities
 val input2 = torch.randn(Seq(3, 5), requiresGrad = true)
 val target2 = torch.randn(Seq(3, 5)).softmax(dim = 1)
 val loss2 = F.crossEntropy(input2, target2)
 loss2.backward()

Inherited from:

Loss (hidden)

Source

Loss.scala
