Reference

Contents

  • Core Layers
  • Nonlinearities
  • Loss Functions
  • Convolutional Layers
  • Recurrent Layers
  • Special Layers
  • Function Index

Core Layers

Multiply(input=inputDimension, output=outputDimension, winit=xavier, atype=KnetLayers.arrtype)

Creates a matrix multiplication layer based on inputDimension and outputDimension. The layer is callable: (m::Multiply)(x) = m.w * x

By default, parameters are initialized with xavier; you may change this with the winit argument.

Keywords

  • input=inputDimension: input dimension
  • output=outputDimension: output dimension
  • winit=xavier: weight initialization distribution
  • atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if you have a GPU device, otherwise Array{Float32}.
source
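
A minimal usage sketch (sizes are illustrative; it assumes a CPU setup where KnetLayers.arrtype is Array{Float32}):

using KnetLayers

m = Multiply(input=100, output=50)   # 50×100 weight matrix, xavier-initialized
x = randn(Float32, 100, 16)          # a minibatch of 16 column vectors
y = m(x)                             # size(y) == (50, 16)
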
Embed(input=inputSize, output=embedSize, winit=xavier, atype=KnetLayers.arrtype)

Creates an embedding layer according to the given inputSize and embedSize, where inputSize is the number of unique items you want to embed and embedSize is the size of the output vectors. By default, parameters are initialized with xavier; you may change this with the winit argument.

(m::Embed)(x::Array{T}) where T<:Integer
(m::Embed)(x; keepsize=true)

Embed objects are callable with an input which is either an integer array (one-hot encoding indices) or an N-dimensional matrix. For an N-dimensional matrix, size(x,1)==inputSize must hold.

Keywords

  • input=inputDimension: input dimension
  • output=embeddingDimension: output dimension
  • winit=xavier: weight initialization distribution
  • atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if you have a GPU device, otherwise Array{Float32}.
source
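
A brief sketch of the two call forms; the vocabulary and embedding sizes are illustrative assumptions:

using KnetLayers

E = Embed(input=10000, output=300)   # 300×10000 embedding matrix
ids = [3, 17, 42]                    # integer indices, one embedding per index
v = E(ids)                           # size(v) == (300, 3)

x = zeros(Float32, 10000, 3)         # N-dimensional input with size(x,1) == inputSize
x[3,1] = x[17,2] = x[42,3] = 1f0     # one-hot columns
v2 = E(x)                            # size(v2) == (300, 3)
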
Linear(input=inputSize, output=outputSize, winit=xavier, binit=zeros, atype=KnetLayers.arrtype)

Creates a linear layer according to the given inputSize and outputSize.

Keywords

  • input=inputSize: input dimension
  • output=outputSize: output dimension
  • winit=xavier: weight initialization distribution
  • binit=zeros: bias initialization distribution
  • atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if you have a GPU device, otherwise Array{Float32}.
source
Dense(input=inputSize, output=outputSize, activation=ReLU(), winit=xavier, binit=zeros, atype=KnetLayers.arrtype)

Creates a dense layer according to the given input=inputSize and output=outputSize. If activation is nothing, it acts like a Linear layer.

Keywords

  • input=inputSize: input dimension
  • output=outputSize: output dimension
  • winit=xavier: weight initialization distribution
  • binit=zeros: bias initialization distribution
  • activation=ReLU(): activation function (it does not broadcast) or an activation layer
  • atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if you have a GPU device, otherwise Array{Float32}.
source
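
A hedged sketch of Dense (Linear is the same call without an activation); sizes are illustrative:

using KnetLayers

d = Dense(input=100, output=20)      # ReLU() activation by default
x = randn(Float32, 100, 32)          # 32 instances as columns
h = d(x)                             # size(h) == (20, 32)

l = Linear(input=100, output=20)     # no activation, otherwise identical
z = l(x)                             # size(z) == (20, 32)
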
BatchNorm(channels::Int; options...)
(m::BatchNorm)(x;training=false) #forward run

Options

  • momentum=0.1: A real number between 0 and 1 used as the scale of the last mean and variance. The existing running mean or variance is multiplied by (1-momentum).
  • mean=nothing: The running mean.
  • var=nothing: The running variance.
  • meaninit=zeros: The function used to initialize the running mean. Should either be nothing or of the form (eltype, dims...)->data. zeros is a good option.
  • varinit=ones: The function used to initialize the running variance.
  • elementtype=eltype(KnetLayers.arrtype): element type ∈ {Float32,Float64} for parameters. Default value is eltype(KnetLayers.arrtype).

Keywords

  • training=nothing: When training is true, the mean and variance of x are used, and the moments argument is modified if it is provided. When training is false, the mean and variance stored in the moments argument are used. Default value is true when at least one of x and params is an AutoGrad.Value, false otherwise.

source
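A short sketch; the spatial layout and sizes are illustrative assumptions:

using KnetLayers

bn = BatchNorm(16)                   # 16 channels
x = randn(Float32, 8, 8, 16, 4)      # 4D input, channel dimension next to last
y  = bn(x; training=true)            # uses batch statistics, updates running moments
yt = bn(x; training=false)           # uses the stored running mean/variance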

Nonlinearities

KnetLayers.ReLU (Type)
ReLU()
(l::ReLU)(x) = max(0,x)

Rectified Linear Unit function.

source
KnetLayers.Sigm (Type)
Sigm()
(l::Sigm)(x) = sigm(x)

Sigmoid function

source
KnetLayers.Tanh (Type)
Tanh()
(l::Tanh)(x) = tanh(x)

Tangent hyperbolic function

source
KnetLayers.ELU (Type)
ELU()
(l::ELU)(x) = elu(x) -> Computes x < 0 ? exp(x) - 1 : x

Exponential Linear Unit nonlinearity.

source
LeakyReLU(α=0.2)
(l::LeakyReLU)(x) -> Computes x < 0 ? α*x : x
source
Dropout(p=0)

Dropout layer. p is the dropout probability.

source
SoftMax(dims=:)
(l::SoftMax)(x)

Treat entries in x as unnormalized scores and return softmax probabilities.

dims is an optional argument, if not specified the normalization is over the whole x, otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.

source
LogSoftMax(dims=:)
(l::LogSoftMax)(x)

Treat entries in x as unnormalized log probabilities and return normalized log probabilities.

dims is an optional argument, if not specified the normalization is over the whole x, otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.

source
LogSumExp(dims=:)
(l::LogSumExp)(x)

Compute log(sum(exp(x);dims)) in a numerically stable manner.

dims is an optional argument, if not specified the summation is over the whole x, otherwise the summation is performed over the given dimensions. In particular if x is a matrix, dims=1 sums columns of x and dims=2 sums rows of x.

source
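A small sketch showing the dims behaviour for a matrix input, assuming dims is accepted as a keyword as written in the signatures above:

using KnetLayers

x  = randn(Float32, 5, 3)            # 5 classes, 3 instances
p  = SoftMax(dims=1)(x)              # each column of p sums to 1
lp = LogSoftMax(dims=1)(x)           # exp.(lp) ≈ p
z  = LogSumExp(dims=1)(x)            # lp ≈ x .- z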

Loss Functions

CrossEntropyLoss(dims=1)
(l::CrossEntropyLoss)(scores, answers)

Calculates negative log likelihood error on your predicted scores. answers should be integers corresponding to the correct class indices. If an answer is 0, the loss from that answer is not included; this is a useful feature when you are working with unequal-length sequences.

if dims==1

  • size(scores) = C,[B,T1,T2,...]
  • size(answers)= [B,T1,T2,...]

elseif dims==2

  • size(scores) = [B,T1,T2,...],C
  • size(answers)= [B,T1,T2,...]
source
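
A hedged sketch with dims=1 (class scores along the first dimension); the 0 answer masks an instance out of the loss, as described above:

using KnetLayers

loss    = CrossEntropyLoss(dims=1)
scores  = randn(Float32, 10, 4)      # C=10 classes, B=4 instances
answers = [3, 1, 10, 0]              # the 0 excludes the 4th instance from the loss
nll = loss(scores, answers)
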
BCELoss(average=true)
(l::BCELoss)(scores, answers)
Computes binary cross entropy given scores (predicted values) and answer labels. The answer values should be in {0,1}; it returns the negative of mean|sum(answers .* log.(p) .+ (1 .- answers) .* log.(1 .- p)), where p = 1 ./ (1 .+ exp.(-scores)), i.e. the sigmoid of the scores. See also LogisticLoss.
source
LogisticLoss(average=true)
(l::LogisticLoss)(scores, answers)
Computes logistic loss given scores (predicted values) and answer labels. The answer values should be in {-1,1}; it returns mean|sum(log.(1 .+ exp.(-answers .* scores))). See also BCELoss.
source
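A short sketch contrasting the two label conventions; the data is random and illustrative:

using KnetLayers

scores = randn(Float32, 8)           # raw scores (logits)
y01 = rand(0:1, 8)                   # {0,1} labels for BCELoss
ypm = 2 .* y01 .- 1                  # the same labels mapped to {-1,1} for LogisticLoss

bce      = BCELoss()(scores, y01)
logistic = LogisticLoss()(scores, ypm)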

Convolutional Layers

KnetLayers.Conv (Function)
Conv(height=filterHeight, width=filterWidth, channels=1, filter=1, kwargs...)

Creates a convolutional layer according to the given filter dimensions.

(m::GenericConv)(x) #forward run

If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where

Yi=1+floor((Xi+2*padding[i]-Wi)/stride[i])

Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions) or as a tuple with entries for each spatial dimension.

Keywords

  • activation=identity: nonlinear function applied after convolution
  • pool=nothing: Pooling layer or window size of pooling
  • winit=xavier: weight initialization distribution
  • bias=zeros: bias initialization distribution
  • padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
  • stride=1: the number of elements to slide to reach the next filtering window.
  • upscale=1: upscale factor for each dimension.
  • mode=0: 0 for convolution and 1 for cross-correlation.
  • alpha=1: can be used to scale the result.
  • handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
source
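
A sketch following the keyword names in the signature above; with a 3×3 filter, padding=1 and stride=1 the spatial size is preserved (Yi = 1 + floor((32 + 2*1 - 3)/1) = 32). Sizes are illustrative:

using KnetLayers

c = Conv(height=3, width=3, channels=3, filter=16, padding=1, activation=ReLU())
x = randn(Float32, 32, 32, 3, 8)     # (X1,X2,I,N): 32×32 inputs, 3 channels, batch of 8
y = c(x)                             # size(y) == (32, 32, 16, 8)
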
KnetLayers.DeConv (Function)
DeConv(height::Int, width=1, channels=1, filter=1, kwargs...)

Creates a deconvolutional layer according to the given filter dimensions.

(m::GenericConv)(x) #forward run

If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where

Yi = Wi + stride[i]*(Xi-1) - 2*padding[i]

Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions) or as a tuple with entries for each spatial dimension.

Keywords

  • activation=identity: nonlinear function applied after convolution
  • unpool=nothing: Unpooling layer or window size of unpooling
  • winit=xavier: weight initialization distribution
  • bias=zeros: bias initialization distribution
  • padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
  • stride=1: the number of elements to slide to reach the next filtering window.
  • upscale=1: upscale factor for each dimension.
  • mode=0: 0 for convolution and 1 for cross-correlation.
  • alpha=1: can be used to scale the result.
  • handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
source
KnetLayers.Pool (Function)
Pool(kwargs...)
(::GenericPool)(x)

Compute pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.

Currently 4 or 5 dimensional KnetArrays with Float32 or Float64 entries are supported. If x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,I,N) where

Yi=1+floor((Xi+2*padding[i]-window[i])/stride[i])

Here I is the number of input channels, N is the number of instances, and Xi,Yi are spatial dimensions. window, padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions), or an array/tuple with entries for each spatial dimension.

Keywords:

  • window=2: the pooling window size for each dimension.
  • padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
  • stride=window: the number of elements to slide to reach the next pooling window.
  • mode=0: 0 for max, 1 for average including padded values, 2 for average excluding padded values.
  • maxpoolingNanOpt=0: NaN numbers are not propagated if 0, they are propagated if 1.
  • alpha=1: can be used to scale the result.
source
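
A short sketch; with the defaults window=2 and stride=window, the spatial dimensions are halved (Yi = 1 + floor((32 - 2)/2) = 16). Per the note above, on GPU the input would be a KnetArray; the plain array here is only for illustration:

using KnetLayers

p = Pool()                           # max pooling, window=2, stride=2
x = randn(Float32, 32, 32, 16, 8)
y = p(x)                             # size(y) == (16, 16, 16, 8)
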
KnetLayers.UnPool (Function)
UnPool(kwargs...)
(::GenericPool)(x)

Reverse of pooling. It has the same keyword arguments as Pool.

x == pool(unpool(x;o...); o...)
source

Recurrent Layers

SRNN(;input=inputSize, hidden=hiddenSize, activation=:relu, options...)
LSTM(;input=inputSize, hidden=hiddenSize, options...)
GRU(;input=inputSize, hidden=hiddenSize, options...)

(1) (l::T)(x; kwargs...) where T<:AbstractRNN
(2) (l::T)(x::Array{Int}; batchSizes=nothing, kwargs...) where T<:AbstractRNN
(3) (l::T)(x::Vector{Vector{Int}}; sorted=false, kwargs...) where T<:AbstractRNN

All RNN layers have the forward-run functionalities (1,2,3) above.

(1) x is an input array with size (d,[B,T]).

(2) For this you need an RNN with an embedding layer. x is an integer array whose entries correspond to one-hot vector indices. You can give a 2D array for minibatching, where each row corresponds to one instance. You can also give a 1D array with minibatching by specifying the batchSizes argument; see Knet.rnnforw for details.

(3) For this you need an RNN with an embedding layer. x is a vector of integer vectors, where every integer vector corresponds to an instance. Inputs are batched automatically. It is better to give the inputs sorted by length; if they are sorted, you can set the sorted argument to true to increase performance.

see RNNOutput

options

  • embed=nothing: embedding size or an embedding layer
  • numLayers=1: Number of RNN layers.
  • bidirectional=false: Create a bidirectional RNN if true.
  • dropout=0: Dropout probability. Ignored if numLayers==1.
  • skipInput=false: Do not multiply the input with a matrix if true.
  • dataType=eltype(KnetLayers.arrtype): Data type to use for weights. Default is Float32.
  • algo=0: Algorithm to use, see CUDNN docs for details.
  • seed=0: Random number seed for dropout. Uses time() if 0.
  • winit=xavier: Weight initialization method for matrices.
  • binit=zeros: Weight initialization method for bias vectors.
  • usegpu=(KnetLayers.arrtype <: KnetArray): GPU used by default if one exists.

Keywords

  • hx=nothing : initial hidden states
  • cx=nothing : initial memory cells
  • hy=false : if true returns h
  • cy=false : if true returns c
source
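
A hedged sketch of call form (3) with an internal embedding layer; the vocabulary and sizes are illustrative, and the sequences are already sorted by length:

using KnetLayers

lstm = LSTM(input=100, hidden=128, embed=32)   # vocabulary of 100 tokens, 32-dim embeddings
seqs = [[4, 7, 2, 9], [1, 5, 3], [8, 6]]       # three instances of unequal length
out  = lstm(seqs; sorted=true, hy=true)        # out isa RNNOutput
size(out.y)                                    # (128, sum_of_sequence_lengths) for this batch
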
struct RNNOutput
    y
    hidden
    memory
    indices
end

Outputs of the RNN models are always an RNNOutput. The hidden, memory and indices fields may be nothing depending on the keyword arguments you used in the forward run.

y holds the hidden states of the last layer at each time step: size(y)=(H/2H,[B,T]). If you use unequal-length instances in a batch input, y becomes a 2D array with size(y)=(H/2H,sum_of_sequence_lengths). See indices and PadRNNOutput to get the correct time outputs for a specific instance or to pad the whole output.

hidden is the hidden state at the final time step for each layer: size(hidden) = (H,B,L/2L).

memory is the memory cell (for LSTM) at the final time step for each layer: size(memory) = (H,B,L/2L).

indices holds the corresponding instance indices for RNNOutput.y. You may call yi = y[:, indices[i]] to get the outputs of the i-th instance.

source
PadSequenceArray(batch::Vector{Vector{T}}) where T<:Integer

Pads a batch of integer arrays with zeros

julia> PadSequenceArray([[1,2,3],[1,2],[1]])
3×3 Array{Int64,2}:
 1  2  3
 1  2  0
 1  0  0

source
PadRNNOutput(s::RNNOutput)

Pads an RNN output if it is produced by unequal-length batches: size(s.y)=(H/2H,sum_of_sequence_lengths) becomes (H/2H,B,Tmax).

source
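A sketch of recovering per-instance outputs from an unequal-length batch, using the indices field and PadRNNOutput; sizes follow the descriptions above and are illustrative:

using KnetLayers

lstm = LSTM(input=100, hidden=128, embed=32)
out  = lstm([[4, 7, 2, 9], [1, 5, 3], [8, 6]])   # unequal-length batch
y2   = out.y[:, out.indices[2]]      # hidden states of the 2nd instance only, size (128, 3)
pad  = PadRNNOutput(out)             # padded output: y becomes (128, 3, 4), zero-padded to Tmax=4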

Special Layers

KnetLayers.MLP (Type)
MLP(h::Int...;kwargs...)

Creates a multi-layer perceptron according to the given hidden sizes. The first hidden size is equal to the input size, and the last one is equal to the output size.

(m::MLP)(x;prob=0)

Runs MLP with given input x. prob is the dropout probability.

Keywords

  • winit=xavier: weight initialization distribution
  • bias=zeros: bias initialization distribution
  • activation=ReLU(): activation layer or function
  • atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if you have a GPU device, otherwise Array{Float32}.
source
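A usage sketch; hidden sizes and the dropout probability are illustrative:

using KnetLayers

mlp = MLP(100, 50, 20; activation=Sigm())   # input=100, one hidden layer of 50, output=20
x = randn(Float32, 100, 8)                  # 8 instances
y = mlp(x; prob=0.5)                        # dropout with probability 0.5 during the forward run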

Function Index