Reference
Contents
- Core Layers
- Nonlinearities
- Loss Functions
- Convolutional Layers
- Recurrent Layers
- Special Layers
- Function Index
Core Layers
KnetLayers.Multiply — Type.Multiply(input=inputDimension, output=outputDimension, winit=xavier, atype=KnetLayers.arrtype)

Creates a matrix multiplication layer based on inputDimension and outputDimension.

(m::Multiply)(x) = m.w * x

By default, parameters are initialized with xavier; you may change this with the winit argument.
Keywords
- input=inputDimension: input dimension
- output=outputDimension: output dimension
- winit=xavier: weight initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. Default value is KnetArray{Float32} if you have a gpu device, otherwise Array{Float32}
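A minimal usage sketch (the dimensions below are arbitrary and chosen only for illustration):

julia> m = Multiply(input=100, output=10);   # 10×100 weight matrix
julia> x = randn(Float32, 100, 32);          # a batch of 32 column vectors
julia> size(m(x))
(10, 32)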
KnetLayers.Embed — Type.Embed(input=inputSize, output=embedSize, winit=xavier, atype=KnetLayers.arrtype)

Creates an embedding layer according to the given inputSize and embedSize, where inputSize is the number of unique items you want to embed and embedSize is the size of the output vectors. By default, parameters are initialized with xavier; you may change this with the winit argument.
(m::Embed)(x::Array{T}) where T<:Integer
(m::Embed)(x; keepsize=true)

Embed objects are callable with an input which is either an integer array (one-hot encoding indices) or an N-dimensional matrix. For an N-dimensional matrix, size(x,1)==inputSize.
Keywords
- input=inputDimension: input dimension
- output=embeddingDimension: output dimension
- winit=xavier: weight initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. Default value is KnetArray{Float32} if you have a gpu device, otherwise Array{Float32}
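A minimal sketch of the integer-index form (vocabulary size and embedding size are arbitrary):

julia> e = Embed(input=1000, output=64);   # 1000 items embedded into 64-dimensional vectors
julia> size(e([3, 17, 999]))               # three integer indices
(64, 3)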
KnetLayers.Linear — Type.Linear(input=inputSize, output=outputSize, winit=xavier, binit=zeros, atype=KnetLayers.arrtype)

Creates a linear layer according to the given inputSize and outputSize.
Keywords
- input=inputSize: input dimension
- output=outputSize: output dimension
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. Default value is KnetArray{Float32} if you have a gpu device, otherwise Array{Float32}
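A minimal sketch (sizes are arbitrary):

julia> l = Linear(input=20, output=5);
julia> size(l(randn(Float32, 20, 8)))
(5, 8)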
KnetLayers.Dense — Type.Dense(input=inputSize, output=outputSize, activation=ReLU(), winit=xavier, binit=zeros, atype=KnetLayers.arrtype)

Creates a dense layer according to the given input=inputSize and output=outputSize. If activation is nothing, it acts like a Linear layer.
Keywords
- input=inputSize: input dimension
- output=outputSize: output dimension
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- activation=ReLU(): activation function (it does not broadcast) or an activation layer
- atype=KnetLayers.arrtype: array type for parameters. Default value is KnetArray{Float32} if you have a gpu device, otherwise Array{Float32}
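A minimal sketch (sizes are arbitrary); with activation=nothing this reduces to the Linear example above:

julia> d = Dense(input=20, output=5, activation=ReLU());
julia> size(d(randn(Float32, 20, 8)))
(5, 8)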
KnetLayers.BatchNorm — Type.BatchNorm(channels::Int; options...)
(m::BatchNorm)(x; training=false) # forward run

Options

- momentum=0.1: A real number between 0 and 1 to be used as the scale of the last mean and variance. The existing running mean or variance is multiplied by (1-momentum).
- mean=nothing: The running mean.
- var=nothing: The running variance.
- meaninit=zeros: The function used to initialize the running mean. Should either be nothing or of the form (eltype, dims...)->data. zeros is a good option.
- varinit=ones: The function used to initialize the running variance.
- elementtype=eltype(KnetLayers.arrtype): element type ∈ {Float32,Float64} for parameters. Default value is eltype(KnetLayers.arrtype).
Keywords
- training=nothing: When training is true, the mean and variance of x are used, and the moments argument is modified if it is provided. When training is false, the mean and variance stored in the moments argument are used. Default value is true when at least one of x and params is an AutoGrad.Value, false otherwise.
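A minimal sketch for a 4-D convolutional input (shapes are arbitrary); the output has the same shape as the input:

julia> bn = BatchNorm(16);                 # 16 channels
julia> x  = randn(Float32, 8, 8, 16, 4);   # W×H×C×N
julia> size(bn(x; training=true))
(8, 8, 16, 4)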
Nonlinearities
KnetLayers.ReLU — Type.ReLU()
(l::ReLU)(x) = max(0,x)

Rectified Linear Unit function.
KnetLayers.Sigm — Type.Sigm()
(l::Sigm)(x) = sigm(x)

Sigmoid function.
KnetLayers.Tanh — Type.Tanh()
(l::Tanh)(x) = tanh(x)

Hyperbolic tangent function.
KnetLayers.ELU — Type.ELU()
(l::ELU)(x) = elu(x) -> Computes x < 0 ? exp(x) - 1 : x

Exponential Linear Unit nonlinearity.
KnetLayers.LeakyReLU — Type.LeakyReLU(α=0.2)
(l::LeakyReLU)(x) -> Computes x < 0 ? α*x : x

KnetLayers.Dropout — Type.Dropout(p=0)

Dropout layer. p is the dropout probability.
KnetLayers.SoftMax — Type.SoftMax(dims=:)
(l::SoftMax)(x)

Treat entries in x as unnormalized scores and return softmax probabilities.
dims is an optional argument, if not specified the normalization is over the whole x, otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.
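A minimal sketch of column-wise normalization (the matrix size is arbitrary):

julia> s = SoftMax(dims=1);
julia> p = s(randn(Float32, 5, 3));   # columns become probability distributions
julia> all(sum(p; dims=1) .≈ 1)
true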
KnetLayers.LogSoftMax — Type.LogSoftMax(dims=:)
(l::LogSoftMax)(x)

Treat entries in x as unnormalized log probabilities and return normalized log probabilities.
dims is an optional argument, if not specified the normalization is over the whole x, otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.
KnetLayers.LogSumExp — Type.LogSumExp(dims=:)
(l::LogSumExp)(x)

Compute log(sum(exp(x); dims)) in a numerically stable manner.
dims is an optional argument, if not specified the summation is over the whole x, otherwise the summation is performed over the given dimensions. In particular if x is a matrix, dims=1 sums columns of x and dims=2 sums rows of x.
Loss Functions
KnetLayers.CrossEntropyLoss — Type.CrossEntropyLoss(dims=1)
(l::CrossEntropyLoss)(scores, answers)

Calculates negative log likelihood error on your predicted scores. answers should be integers corresponding to the correct class indices. If an answer is 0, the loss from that answer is not included. This is a useful feature when you are working with unequal-length sequences.
if dims==1
- size(scores) = C,[B,T1,T2,...]
- size(answers)= [B,T1,T2,...]
elseif dims==2
- size(scores) = [B,T1,T2,...],C
- size(answers)= [B,T1,T2,...]
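A minimal sketch with dims=1 (class count and batch size are arbitrary); the 0 in answers masks that instance out of the loss:

julia> loss    = CrossEntropyLoss(dims=1);
julia> scores  = randn(Float32, 10, 4);   # C=10 classes, B=4 instances
julia> answers = [1, 3, 0, 7];            # the third instance is ignored
julia> loss(scores, answers);             # returns a scalar negative log likelihood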
KnetLayers.BCELoss — Type.BCELoss(average=true)
(l::BCELoss)(scores, answers)
Computes binary cross entropy given scores (predicted values) and answer labels. answer values should be {0,1}; it then returns the negative of mean|sum(answers * log(p) + (1-answers)*log(1-p)) where p is equal to 1/(1 + exp.(scores)). See also LogisticLoss.

KnetLayers.LogisticLoss — Type.LogisticLoss(average=true)
(l::LogisticLoss)(scores, answers)
Computes logistic loss given scores (predicted values) and answer labels. answer values should be {-1,1}; it then returns mean|sum(log(1 + exp(-answers*scores))). See also BCELoss.

Convolutional Layers
KnetLayers.Conv — Function.Conv(height=filterHeight, width=filterWidth, channels=1, filter=1, kwargs...)

Creates a convolutional layer according to the given filter dimensions.
(m::GenericConv)(x) # forward run

If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where

Yi = 1 + floor((Xi + 2*padding[i] - Wi) / stride[i])

Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions) or a tuple with entries for each spatial dimension.
Keywords
- activation=identity: nonlinear function applied after convolution
- pool=nothing: Pooling layer or window size of pooling
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=1: the number of elements to slide to reach the next filtering window.
- upscale=1: upscale factor for each dimension.
- mode=0: 0 for convolution and 1 for cross-correlation.
- alpha=1: can be used to scale the result.
- handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
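A minimal sketch following the keyword form of the signature above (all sizes are arbitrary); with a 3×3 filter, padding=1 and stride=1 the spatial dimensions are preserved:

julia> c = Conv(height=3, width=3, channels=3, filter=16, padding=1, activation=ReLU());
julia> x = randn(Float32, 32, 32, 3, 8);   # X1,X2,I,N
julia> size(c(x))
(32, 32, 16, 8)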
KnetLayers.DeConv — Function.DeConv(height::Int, width=1, channels=1, filter=1, kwargs...)

Creates a deconvolutional layer according to the given filter dimensions.
(m::GenericConv)(x) # forward run

If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where

Yi = Wi + stride[i]*(Xi-1) - 2*padding[i]

Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions) or a tuple with entries for each spatial dimension.
Keywords
- activation=identity: nonlinear function applied after convolution
- unpool=nothing: Unpooling layer or window size of unpooling
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=1: the number of elements to slide to reach the next filtering window.
- upscale=1: upscale factor for each dimension.
- mode=0: 0 for convolution and 1 for cross-correlation.
- alpha=1: can be used to scale the result.
- handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
KnetLayers.Pool — Function.Pool(kwargs...)
(::GenericPool)(x)

Compute pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.
Currently 4 or 5 dimensional KnetArrays with Float32 or Float64 entries are supported. If x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,I,N) where
Yi=1+floor((Xi+2*padding[i]-window[i])/stride[i])
Here I is the number of input channels, N is the number of instances, and Xi,Yi are spatial dimensions. window, padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions), or an array/tuple with entries for each spatial dimension.
Keywords:
- window=2: the pooling window size for each dimension.
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=window: the number of elements to slide to reach the next pooling window.
- mode=0: 0 for max, 1 for average including padded values, 2 for average excluding padded values.
- maxpoolingNanOpt=0: Nan numbers are not propagated if 0, they are propagated if 1.
- alpha=1: can be used to scale the result.
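A minimal sketch of 2×2 max pooling (sizes are arbitrary):

julia> p = Pool(window=2);                 # mode=0, i.e. max pooling
julia> x = randn(Float32, 32, 32, 16, 8);
julia> size(p(x))
(16, 16, 16, 8)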
KnetLayers.UnPool — Function.UnPool(kwargs...)
(::GenericPool)(x)
Reverse of pooling. It has the same keyword arguments as Pool.

x == pool(unpool(x;o...); o...)

Recurrent Layers
KnetLayers.AbstractRNN — Type.SRNN(;input=inputSize, hidden=hiddenSize, activation=:relu, options...)
LSTM(;input=inputSize, hidden=hiddenSize, options...)
GRU(;input=inputSize, hidden=hiddenSize, options...)
(1) (l::T)(x; kwargs...) where T<:AbstractRNN
(2) (l::T)(x::Array{Int}; batchSizes=nothing, kwargs...) where T<:AbstractRNN
(3) (l::T)(x::Vector{Vector{Int}}; sorted=false, kwargs...) where T<:AbstractRNN

All RNN layers have the forward-run functionalities (1,2,3) above.
(1) x is an input array whose size is d,[B,T].

(2) For this you need an RNN with an embedding layer. x is an integer array whose entries correspond to one-hot vector indices. You can give a 2D array for minibatching, where each row corresponds to one instance. You can also give a 1D array with minibatching by specifying the batchSizes argument; see Knet.rnnforw for details.

(3) For this you need an RNN with an embedding layer. x is a vector of integer vectors, where every integer vector corresponds to an instance. It batches the inputs automatically. It is better to give the inputs sorted; if your inputs are sorted, you can set the sorted argument to true to increase performance.
see RNNOutput
options
- embed=nothing: embedding size or an embedding layer
- numLayers=1: Number of RNN layers.
- bidirectional=false: Create a bidirectional RNN if true.
- dropout=0: Dropout probability. Ignored if numLayers==1.
- skipInput=false: Do not multiply the input with a matrix if true.
- dataType=eltype(KnetLayers.arrtype): Data type to use for weights. Default is Float32.
- algo=0: Algorithm to use, see CUDNN docs for details.
- seed=0: Random number seed for dropout. Uses time() if 0.
- winit=xavier: Weight initialization method for matrices.
- binit=zeros: Weight initialization method for bias vectors.
- usegpu=(KnetLayers.arrtype <: KnetArray): GPU used by default if one exists.
Keywords
- hx=nothing : initial hidden states
- cx=nothing : initial memory cells
- hy=false : if true returns h
- cy=false : if true returns c
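A minimal sketch of form (3) above, assuming the layer is built with an embedding (all sizes are arbitrary); y is 2D because the two instances have unequal lengths:

julia> lstm = LSTM(input=100, hidden=50, embed=16);   # 100-item vocabulary, 16-dim embedding
julia> out  = lstm([[1,2,3,4],[5,6]]; sorted=true);   # two integer sequences, longest first
julia> size(out.y)                                    # (hidden, sum of sequence lengths)
(50, 6)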
KnetLayers.RNNOutput — Type.struct RNNOutput
y
hidden
memory
indices
end

Outputs of the RNN models are always RNNOutput. hidden, memory and indices may be nothing depending on the kwargs you used in the forward run.
y is the hidden states of the last layer at every time step. size(y)=(H/2H,[B,T]). If you use unequal-length instances in a batch input, y becomes a 2D array with size(y)=(H/2H,sum_of_sequence_lengths). See indices and PadRNNOutput to get the correct time outputs for a specific instance or to pad the whole output.

hidden (h) is the hidden states of each layer for the last time step. size(h) = (H,B,L/2L)

memory (c) is the memory cells of each layer for the last time step. size(c) = (H,B,L/2L)

indices are the corresponding instance indices for RNNOutput.y. You may call yi = y[:,indices[i]].
KnetLayers.PadSequenceArray — Function.PadSequenceArray(batch::Vector{Vector{T}}) where T<:Integer

Pads a batch of integer arrays with zeros.
julia> PadSequenceArray([[1,2,3],[1,2],[1]])
3×3 Array{Int64,2}:
 1  2  3
 1  2  0
 1  0  0
KnetLayers.PadRNNOutput — Function.PadRNNOutput(s::RNNOutput)

Pads an RNN output if it is produced by unequal-length batches: size(s.y)=(H/2H,sum_of_sequence_lengths) becomes (H/2H,B,Tmax).
Special Layers
KnetLayers.MLP — Type.MLP(h::Int...; kwargs...)

Creates a multilayer perceptron according to the given hidden sizes. The first hidden size is equal to the input size and the last one is equal to the output size.

(m::MLP)(x; prob=0)

Runs the MLP with the given input x. prob is the dropout probability.
Keywords
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- activation=ReLU(): activation layer or function
- atype=KnetLayers.arrtype: array type for parameters. Default value is KnetArray{Float32} if you have a gpu device, otherwise Array{Float32}
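A minimal sketch (layer sizes are arbitrary); the first size is the input dimension and the last is the output dimension:

julia> mlp = MLP(100, 50, 20);                         # 100 → 50 → 20
julia> size(mlp(randn(Float32, 100, 32); prob=0.2))    # prob is the dropout probability
(20, 32)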
Function Index
KnetLayers.AbstractRNN, KnetLayers.BCELoss, KnetLayers.BatchNorm, KnetLayers.CrossEntropyLoss, KnetLayers.Dense, KnetLayers.Dropout, KnetLayers.ELU, KnetLayers.Embed, KnetLayers.LeakyReLU, KnetLayers.Linear, KnetLayers.LogSoftMax, KnetLayers.LogSumExp, KnetLayers.LogisticLoss, KnetLayers.MLP, KnetLayers.Multiply, KnetLayers.RNNOutput, KnetLayers.ReLU, KnetLayers.Sigm, KnetLayers.SoftMax, KnetLayers.Tanh, KnetLayers.Conv, KnetLayers.DeConv, KnetLayers.PadRNNOutput, KnetLayers.PadSequenceArray, KnetLayers.Pool, KnetLayers.UnPool