Reference
Contents
Core Layers
KnetLayers.Multiply - Type.
Multiply(input=inputDimension, output=outputDimension, winit=xavier, atype=KnetLayers.arrtype)
Creates a matrix multiplication layer based on inputDimension and outputDimension.
(m::Multiply)(x) = m.w * x
By default, parameters are initialized with xavier; you may change this with the winit argument.
Keywords
- input=inputDimension: input dimension
- output=outputDimension: output dimension
- winit=xavier: weight initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if a GPU device is available; otherwise it is Array{Float32}.
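The call rule for this layer can be illustrated in plain Julia. The following is only a sketch, not the KnetLayers implementation: SketchMultiply and its random initializer are hypothetical stand-ins, and the (output, input) weight shape is an assumption that follows from the product m.w * x.

```julia
# Hypothetical stand-in for a matrix multiplication layer; not KnetLayers code.
struct SketchMultiply
    w::Matrix{Float32}   # assumed shape: (output, input)
end

# Keyword constructor mirroring the documented input/output keywords.
# Scaled randn is a placeholder initializer, not xavier.
SketchMultiply(; input::Int, output::Int) =
    SketchMultiply(0.1f0 .* randn(Float32, output, input))

# Calling the layer applies the stored weights to x.
(m::SketchMultiply)(x) = m.w * x

m = SketchMultiply(input=4, output=3)
x = rand(Float32, 4, 5)      # 5 instances with 4 features each
y = m(x)
println(size(y))             # (3, 5)
```

Making the struct callable keeps the parameters and the forward computation together, which is the pattern the signatures in this reference suggest.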
KnetLayers.Embed - Type.
Embed(input=inputSize, output=embedSize, winit=xavier, atype=KnetLayers.arrtype)
Creates an embedding layer according to the given inputSize and embedSize, where inputSize is the number of unique items you want to embed and embedSize is the size of the output vectors. By default, parameters are initialized with xavier; you may change this with the winit argument.
(m::Embed)(x::Array{T}) where T<:Integer
(m::Embed)(x; keepsize=true)
Embed objects are callable with an input that is either an integer array (one-hot encoding) or an N-dimensional matrix. For an N-dimensional matrix, size(x,1)==inputSize.
Keywords
- input=inputDimension: input dimension
- output=embeddingDimension: output dimension
- winit=xavier: weight initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if a GPU device is available; otherwise it is Array{Float32}.
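The two accepted input forms compute the same thing: picking columns by integer index is equivalent to multiplying the weights by one-hot columns. The sketch below shows this in plain Julia; the table w and its (embedSize, inputSize) shape are assumptions made for illustration, not values taken from KnetLayers.

```julia
# Hypothetical embedding table: embedSize = 2 rows, inputSize = 3 columns.
w = Float32[1 2 3;
            4 5 6]

ids = [3, 1, 3]                 # integer input: one index per item

# Lookup: pick the column of w that belongs to each index.
lookup = w[:, ids]

# The same result via explicit one-hot columns, where size(onehot, 1) == inputSize.
onehot = zeros(Float32, 3, length(ids))
for (j, i) in enumerate(ids)
    onehot[i, j] = 1
end
viaproduct = w * onehot

println(lookup == viaproduct)   # true
```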
KnetLayers.Linear - Type.
Linear(input=inputSize, output=outputSize, winit=xavier, binit=zeros, atype=KnetLayers.arrtype)
Creates a linear layer according to the given inputSize and outputSize.
Keywords
- input=inputSize: input dimension
- output=outputSize: output dimension
- winit=xavier: weight initialization distribution
- binit=zeros: bias initialization distribution
- atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if a GPU device is available; otherwise it is Array{Float32}.
KnetLayers.Dense - Type.
Dense(input=inputSize, output=outputSize, activation=ReLU(), winit=xavier, binit=zeros, atype=KnetLayers.arrtype)
Creates a dense layer according to the given input=inputSize and output=outputSize. If activation is nothing, it acts like a Linear layer.
Keywords
- input=inputSize: input dimension
- output=outputSize: output dimension
- winit=xavier: weight initialization distribution
- binit=zeros: bias initialization distribution
- activation=ReLU(): an activation function (it is not broadcast, so it must accept the whole array) or an activation layer
- atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if a GPU device is available; otherwise it is Array{Float32}.
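The relationship between Dense and Linear can be sketched in plain Julia. sketch_dense and relu_all are hypothetical helpers, and the affine form w * x .+ b is an assumption about what a linear layer computes; the only behavior taken from the text is that a missing activation leaves the linear output unchanged and that the activation receives the whole array.

```julia
# Hypothetical helper, not the KnetLayers API: affine map followed by an optional activation.
function sketch_dense(w, b, x; activation=nothing)
    y = w * x .+ b
    return activation === nothing ? y : activation(y)
end

relu_all(y) = max.(y, 0)    # acts on the whole array, so the helper does not broadcast it

w = Float32[1 -1; -2 1]
b = Float32[0, 0.5]
x = Float32[1 2; 3 1]

linear_out = sketch_dense(w, b, x)                       # no activation: plain linear layer
dense_out  = sketch_dense(w, b, x; activation=relu_all)  # ReLU applied to the same values

println(linear_out)
println(dense_out)
```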
KnetLayers.BatchNorm - Type.
BatchNorm(channels::Int; options...)
(m::BatchNorm)(x; training=false) # forward run
Options
- momentum=0.1: A real number between 0 and 1 used as the scale of the last mean and variance. The existing running mean or variance is multiplied by (1-momentum).
- mean=nothing: The running mean.
- var=nothing: The running variance.
- meaninit=zeros: The function used to initialize the running mean. Should either be nothing or of the form (eltype, dims...)->data. zeros is a good option.
- varinit=ones: The function used to initialize the running variance.
- elementtype=eltype(KnetLayers.arrtype): element type ∈ {Float32,Float64} for parameters. Default value is eltype(KnetLayers.arrtype).
Keywords
- training=nothing: When training is true, the mean and variance of x are used and the moments argument is modified if it is provided. When training is false, the mean and variance stored in the moments argument are used. Default value is true when at least one of x and params is AutoGrad.Value, false otherwise.
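The role of momentum can be illustrated with an exponential moving average. The sketch below is an assumption about the update rule, not code from KnetLayers: the text only states that the existing running value is multiplied by (1-momentum), and adding momentum times the batch statistic is the usual way to complete that rule. update_running is a hypothetical helper.

```julia
using Statistics

# Hypothetical update rule: exponential moving average of a batch statistic.
update_running(running, batchstat; momentum=0.1) =
    (1 - momentum) .* running .+ momentum .* batchstat

x = Float64[1 2 3 4;        # 2 channels, 4 instances
            10 10 10 10]

batchmean = vec(mean(x, dims=2))
batchvar  = vec(var(x, dims=2, corrected=false))

runmean = update_running(zeros(2), batchmean)   # running mean starts at zeros
runvar  = update_running(ones(2),  batchvar)    # running variance starts at ones

println(runmean)
println(runvar)
```

With a small momentum the running statistics change slowly, which is why they are the ones used when training is false.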
Nonlinearities
KnetLayers.ReLU - Type.
ReLU()
(l::ReLU)(x) = max(0,x)
Rectified Linear Unit function.
KnetLayers.Sigm - Type.
Sigm()
(l::Sigm)(x) = sigm(x)
Sigmoid function.
KnetLayers.Tanh - Type.
Tanh()
(l::Tanh)(x) = tanh(x)
Hyperbolic tangent function.
KnetLayers.ELU - Type.
ELU()
(l::ELU)(x) = elu(x) -> Computes x < 0 ? exp(x) - 1 : x
Exponential Linear Unit nonlinearity.
KnetLayers.LeakyReLU - Type.
LeakyReLU(α=0.2)
(l::LeakyReLU)(x) -> Computes x < 0 ? α*x : x
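The two piecewise formulas for ELU and LeakyReLU can be tried directly in plain Julia. sketch_elu and sketch_leakyrelu are hypothetical scalar helpers written from those formulas; they are not the KnetLayers layers themselves.

```julia
# Scalar versions of the two formulas; broadcast with a dot to apply them to arrays.
sketch_elu(x) = x < 0 ? exp(x) - 1 : x
sketch_leakyrelu(x; α=0.2) = x < 0 ? α * x : x

xs = [-2.0, -0.5, 0.0, 1.5]
println(sketch_elu.(xs))                  # negatives saturate towards -1
println(sketch_leakyrelu.(xs))            # negatives are scaled by α = 0.2
println(sketch_leakyrelu.(xs; α=0.01))    # a smaller slope for negative inputs
```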
KnetLayers.Dropout - Type.
Dropout(p=0)
Dropout layer. p is the dropout probability.
KnetLayers.SoftMax - Type.
SoftMax(dims=:)
(l::SoftMax)(x)
Treats entries in x as unnormalized scores and returns softmax probabilities.
dims is an optional argument. If it is not specified, the normalization is over the whole x; otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.
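The effect of dims can be checked with a small plain-Julia sketch. sketch_softmax is a hypothetical helper, not the KnetLayers layer; subtracting the maximum before exponentiating is a common stabilization trick and is an assumption here, since the text does not say how the layer computes the result.

```julia
# Hypothetical helper: softmax along dims, shifted by the maximum for stability.
function sketch_softmax(x; dims=:)
    e = exp.(x .- maximum(x, dims=dims))
    return e ./ sum(e, dims=dims)
end

x = [1.0 2.0; 3.0 4.0]
println(sum(sketch_softmax(x)))                   # all entries together sum to about 1
println(sum(sketch_softmax(x, dims=1), dims=1))   # each column sums to about 1
println(sum(sketch_softmax(x, dims=2), dims=2))   # each row sums to about 1
```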
KnetLayers.LogSoftMax - Type.
LogSoftMax(dims=:)
(l::LogSoftMax)(x)
Treats entries in x as unnormalized log probabilities and returns normalized log probabilities.
dims is an optional argument. If it is not specified, the normalization is over the whole x; otherwise the normalization is performed over the given dimensions. In particular, if x is a matrix, dims=1 normalizes columns of x and dims=2 normalizes rows of x.
KnetLayers.LogSumExp - Type.
LogSumExp(dims=:)
(l::LogSumExp)(x)
Computes log(sum(exp(x);dims)) in a numerically stable manner.
dims is an optional argument. If it is not specified, the summation is over the whole x; otherwise the summation is performed over the given dimensions. In particular, if x is a matrix, dims=1 sums columns of x and dims=2 sums rows of x.
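Why numerical stability matters here is easiest to see with large inputs. The following plain-Julia sketch uses the standard max-shift identity; sketch_logsumexp is a hypothetical helper and the shift is an assumption about one way to achieve stability, not a statement about the KnetLayers internals.

```julia
# Hypothetical helper: log(sum(exp(x))) computed as m + log(sum(exp(x - m))).
function sketch_logsumexp(x; dims=:)
    m = maximum(x, dims=dims)
    return m .+ log.(sum(exp.(x .- m), dims=dims))
end

x = [1000.0 1001.0; 1002.0 1003.0]
println(log(sum(exp.(x))))              # the naive formula overflows to Inf
println(sketch_logsumexp(x))            # finite result over the whole array
println(sketch_logsumexp(x, dims=1))    # one value per column
```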
Loss Functions
KnetLayers.CrossEntropyLoss - Type.
CrossEntropyLoss(dims=1)
(l::CrossEntropyLoss)(scores, answers)
Calculates the negative log likelihood error on your predicted scores. answers should be integers corresponding to the correct class indices. If an answer is 0, the loss from that answer is not included. This is a useful feature when you are working with sequences of unequal length.
if dims==1
- size(scores) = C,[B,T1,T2,...]
- size(answers)= [B,T1,T2,...]
elseif dims==2
- size(scores) = [B,T1,T2,...],C
- size(answers)= [B,T1,T2,...]
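The masking rule for answers equal to 0 can be sketched for the dims==1 layout with a 2D score matrix. sketch_nll is a hypothetical helper in plain Julia; averaging over the kept answers is an assumption, since the text does not say whether the layer averages or sums.

```julia
# Hypothetical helper for the dims=1 layout: scores is (C, B), answers has length B.
function sketch_nll(scores::AbstractMatrix, answers::AbstractVector{<:Integer})
    # log-softmax over the class dimension, shifted by the column maximum for stability
    shifted = scores .- maximum(scores, dims=1)
    logp = shifted .- log.(sum(exp.(shifted), dims=1))
    total, count = 0.0, 0
    for (j, a) in enumerate(answers)
        a == 0 && continue          # answer 0 marks padding and is skipped
        total -= logp[a, j]
        count += 1
    end
    return total / count            # assumed: mean over the answers that were kept
end

scores = [2.0 0.0 1.0;
          0.0 2.0 1.0]
println(sketch_nll(scores, [1, 2, 0]))   # the third column is ignored
println(sketch_nll(scores, [1, 2, 1]))   # the third column now contributes log(2)
```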
KnetLayers.BCELoss - Type.
BCELoss(average=true)
(l::BCELoss)(scores, answers)
Computes binary cross entropy given scores (predicted values) and answer labels. Answer values should be in {0,1}. It returns the negative of mean|sum(answers * log(p) + (1-answers)*log(1-p)), where p is equal to 1/(1 + exp.(-scores)). See also LogisticLoss.
KnetLayers.LogisticLoss - Type.
LogisticLoss(average=true)
(l::LogisticLoss)(scores, answers)
Computes logistic loss given scores (predicted values) and answer labels. Answer values should be in {-1,1}. It returns mean|sum(log(1 + exp(-answers*scores))). See also BCELoss.
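BCELoss and LogisticLoss describe the same quantity under two label conventions. The plain-Julia sketch below checks this numerically; the helper names are hypothetical, the mean (average=true) variant is used, and p is taken to be the logistic sigmoid 1/(1 + exp(-score)), which is an assumption stated here so the example stands on its own.

```julia
sketch_sigm(s) = 1 / (1 + exp(-s))

# Mean binary cross entropy; answers are in {0, 1}.
function sketch_bce(scores, answers)
    p = sketch_sigm.(scores)
    return -sum(answers .* log.(p) .+ (1 .- answers) .* log.(1 .- p)) / length(scores)
end

# Mean logistic loss; answers are in {-1, 1}.
sketch_logistic(scores, answers) =
    sum(log.(1 .+ exp.(-answers .* scores))) / length(scores)

scores   = [2.0, -1.0, 0.5]
labels01 = [1, 0, 1]
labelspm = 2 .* labels01 .- 1       # map {0, 1} to {-1, 1}

println(sketch_bce(scores, labels01))
println(sketch_logistic(scores, labelspm))   # agrees with the line above up to rounding
```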
Convolutional Layers
KnetLayers.Conv - Function.
Conv(height=filterHeight, width=filterWidth, channels=1, filter=1, kwargs...)
Creates a convolutional layer according to the given filter dimensions.
(m::GenericConv)(x) # forward run
If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where
Yi=1+floor((Xi+2*padding[i]-Wi)/stride[i])
Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions), or a tuple with entries for each spatial dimension.
Keywords
- activation=identity: nonlinear function applied after convolution
- pool=nothing: pooling layer or pooling window size
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=1: the number of elements to slide to reach the next filtering window.
- upscale=1: upscale factor for each dimension.
- mode=0: 0 for convolution and 1 for cross-correlation.
- alpha=1: can be used to scale the result.
- handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
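The output-size formula for Yi is easy to evaluate by hand, but a tiny helper makes it simpler to plan layer shapes. conv_outsize is a hypothetical plain-Julia function written from that formula; it is not part of KnetLayers.

```julia
# Output length along one spatial dimension: Yi = 1 + floor((Xi + 2*padding - Wi) / stride).
conv_outsize(x, w; padding=0, stride=1) = 1 + fld(x + 2 * padding - w, stride)

println(conv_outsize(28, 5))                          # 24
println(conv_outsize(28, 5; padding=2))               # 28
println(conv_outsize(28, 5; padding=2, stride=2))     # 14
println(conv_outsize.((32, 20), (3, 5); padding=1))   # (32, 18)
```

The last call broadcasts over two spatial dimensions at once, which mirrors passing a single padding value that applies to all dimensions.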
KnetLayers.DeConv - Function.
DeConv(height::Int, width=1, channels=1, filter=1, kwargs...)
Creates a deconvolutional layer according to the given filter dimensions.
(m::GenericConv)(x) # forward run
If m.w has dimensions (W1,W2,...,I,O) and x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,O,N) where
Yi = Wi + stride[i]*(Xi-1) - 2*padding[i]
Here I is the number of input channels, O is the number of output channels, N is the number of instances, and Wi,Xi,Yi are spatial dimensions. padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions), or a tuple with entries for each spatial dimension.
Keywords
- activation=identity: nonlinear function applied after convolution
- unpool=nothing: unpooling layer or unpooling window size
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=1: the number of elements to slide to reach the next filtering window.
- upscale=1: upscale factor for each dimension.
- mode=0: 0 for convolution and 1 for cross-correlation.
- alpha=1: can be used to scale the result.
- handle: handle to a previously created cuDNN context. Defaults to a Knet allocated handle.
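The deconvolution size formula inverts the convolution one when the same filter size, padding and stride are used. The plain-Julia sketch below evaluates both; deconv_outsize and conv_outsize are hypothetical helpers written from the formulas in this reference, not KnetLayers functions.

```julia
# Yi = Wi + stride*(Xi - 1) - 2*padding for a deconvolution.
deconv_outsize(x, w; padding=0, stride=1) = w + stride * (x - 1) - 2 * padding

# Yi = 1 + floor((Xi + 2*padding - Wi) / stride) for a convolution.
conv_outsize(x, w; padding=0, stride=1) = 1 + fld(x + 2 * padding - w, stride)

up = deconv_outsize(14, 5; padding=2, stride=2)
println(up)                                         # 27
println(conv_outsize(up, 5; padding=2, stride=2))   # 14, back to the starting size
println(deconv_outsize(24, 5))                      # 28
```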
KnetLayers.Pool - Function.
Pool(kwargs...)
(::GenericPool)(x)
Computes pooling of input values (i.e., the maximum or average of several adjacent values) to produce an output with smaller height and/or width.
Currently 4 or 5 dimensional KnetArrays with Float32 or Float64 entries are supported. If x has dimensions (X1,X2,...,I,N), the result y will have dimensions (Y1,Y2,...,I,N) where
Yi=1+floor((Xi+2*padding[i]-window[i])/stride[i])
Here I is the number of input channels, N is the number of instances, and Xi,Yi are spatial dimensions. window, padding and stride are keyword arguments that can be specified as a single number (in which case they apply to all dimensions), or an array/tuple with entries for each spatial dimension.
Keywords
- window=2: the pooling window size for each dimension.
- padding=0: the number of extra zeros implicitly concatenated at the start and at the end of each dimension.
- stride=window: the number of elements to slide to reach the next pooling window.
- mode=0: 0 for max, 1 for average including padded values, 2 for average excluding padded values.
- maxpoolingNanOpt=0: NaN values are not propagated if 0; they are propagated if 1.
- alpha=1: can be used to scale the result.
KnetLayers.UnPool - Function.
UnPool(kwargs...)
(::GenericPool)(x)
Reverse of pooling. It accepts the same keyword arguments as Pool, so that
x == pool(unpool(x;o...); o...)
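The round-trip identity between pooling and unpooling can be illustrated on a plain matrix. This is a sketch under simplifying assumptions, not the KnetLayers implementation: sketch_maxpool handles only 2D input, max mode, stride equal to window and no padding, and sketch_unpool assumes that unpooling copies each value into a window-sized block.

```julia
# Hypothetical helper: non-overlapping max pooling of a matrix (stride == window, no padding).
function sketch_maxpool(x::AbstractMatrix; window=2)
    h, w = fld(size(x, 1), window), fld(size(x, 2), window)
    y = similar(x, h, w)
    for j in 1:w, i in 1:h
        rows = (i - 1) * window + 1 : i * window
        cols = (j - 1) * window + 1 : j * window
        y[i, j] = maximum(view(x, rows, cols))
    end
    return y
end

# Hypothetical unpooling: copy every value into a window-by-window block.
sketch_unpool(x::AbstractMatrix; window=2) = repeat(x, inner=(window, window))

x = [1 2; 3 4]
up = sketch_unpool(x)
println(size(up))                  # (4, 4)
println(sketch_maxpool(up) == x)   # true
```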
Recurrent Layers
KnetLayers.AbstractRNN - Type.
SRNN(;input=inputSize, hidden=hiddenSize, activation=:relu, options...)
LSTM(;input=inputSize, hidden=hiddenSize, options...)
GRU(;input=inputSize, hidden=hiddenSize, options...)
(1) (l::T)(x; kwargs...) where T<:AbstractRNN
(2) (l::T)(x::Array{Int}; batchSizes=nothing, kwargs...) where T<:AbstractRNN
(3) (l::T)(x::Vector{Vector{Int}}; sorted=false, kwargs...) where T<:AbstractRNN
All RNN layers support the forward-run forms (1), (2) and (3) above.
(1) x is an input array of size d,[B,T].
(2) This form requires an RNN with an embedding layer. x is an integer array whose entries are one-hot vector indices. You can pass a 2D array for minibatching, where each row corresponds to one instance. You can also pass a 1D array with minibatching by specifying the batchSizes argument; see Knet.rnnforw for details.
(3) This form requires an RNN with an embedding layer. x is a vector of integer vectors, and every integer vector corresponds to one instance. Inputs are batched automatically. It is better to pass the inputs sorted; if your inputs are already sorted, set the sorted argument to true to improve performance.
See RNNOutput.
Options
- embed=nothing: embedding size or an embedding layer.
- numLayers=1: Number of RNN layers.
- bidirectional=false: Create a bidirectional RNN if true.
- dropout=0: Dropout probability. Ignored if numLayers==1.
- skipInput=false: Do not multiply the input with a matrix if true.
- dataType=eltype(KnetLayers.arrtype): Data type to use for weights. Default is Float32.
- algo=0: Algorithm to use, see CUDNN docs for details.
- seed=0: Random number seed for dropout. Uses time() if 0.
- winit=xavier: Weight initialization method for matrices.
- binit=zeros: Weight initialization method for bias vectors.
- usegpu=(KnetLayers.arrtype <: KnetArray): GPU used by default if one exists.
Keywords
- hx=nothing: initial hidden states
- cx=nothing: initial memory cells
- hy=false: if true, returns h
- cy=false: if true, returns c
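The batchSizes argument of form (2) is easier to follow with a concrete packing example. The sketch below is an assumption about the layout, based on the usual convention for packed RNN input (sequences ordered from longest to shortest, values stored time step by time step, and one batch size per time step); sketch_pack is a hypothetical helper, so check Knet.rnnforw before relying on the details.

```julia
# Hypothetical helper: flatten variable-length sequences into time-major order.
function sketch_pack(seqs::Vector{Vector{Int}})
    order = sortperm(length.(seqs), rev=true)     # longest sequence first
    sorted = seqs[order]
    tmax = length(sorted[1])
    # number of sequences that are still running at each time step
    batchsizes = [count(s -> length(s) >= t, sorted) for t in 1:tmax]
    flat = Int[]
    for t in 1:tmax, s in sorted
        length(s) >= t && push!(flat, s[t])
    end
    return flat, batchsizes, order
end

flat, batchsizes, order = sketch_pack([[7], [1, 2, 3], [4, 5]])
println(flat)         # [1, 4, 7, 2, 5, 3]
println(batchsizes)   # [3, 2, 1]
println(order)        # [2, 3, 1]
```

Form (3) performs this kind of batching automatically, which is why passing inputs that are already sorted can save work.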
KnetLayers.RNNOutput - Type.
struct RNNOutput
    y
    hidden
    memory
    indices
end
Outputs of the RNN models are always RNNOutput. hidden, memory and indices may be nothing depending on the kwargs you used in the forward run.
y holds the hidden states of the last layer at each time step. size(y)=(H/2H,[B,T]). If you use instances of unequal length in a batch input, y becomes a 2D array with size(y)=(H/2H,sum_of_sequence_lengths). See indices and PadRNNOutput to get the correct time outputs for a specific instance or to pad the whole output.
hidden holds the final hidden state of each layer. size(hidden)=(H,B,L/2L).
memory holds the final memory cell state of each layer. size(memory)=(H,B,L/2L).
indices holds the instance indices that correspond to the columns of RNNOutput.y. You may call yi = y[:,indices[i]].
KnetLayers.PadSequenceArray - Function.
PadSequenceArray(batch::Vector{Vector{T}}) where T<:Integer
Pads a batch of integer arrays with zeros.
julia> PadSequenceArray([[1,2,3],[1,2],[1]])
3×3 Array{Int64,2}:
 1  2  3
 1  2  0
 1  0  0
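The padding behavior of PadSequenceArray can be reproduced with a few lines of plain Julia. sketch_pad is a hypothetical re-implementation written to match the output documented for the call PadSequenceArray([[1,2,3],[1,2],[1]]); it is not the KnetLayers source.

```julia
# Hypothetical re-implementation for illustration: one row per sequence, zeros on the right.
function sketch_pad(batch::Vector{Vector{T}}) where T<:Integer
    tmax = maximum(length, batch)
    padded = zeros(T, length(batch), tmax)
    for (i, seq) in enumerate(batch)
        padded[i, 1:length(seq)] = seq
    end
    return padded
end

println(sketch_pad([[1, 2, 3], [1, 2], [1]]) == [1 2 3; 1 2 0; 1 0 0])   # true
```

Because CrossEntropyLoss skips answers equal to 0, zero padding of this kind can be used directly as targets for sequences of unequal length.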
KnetLayers.PadRNNOutput - Function.
PadRNNOutput(s::RNNOutput)
Pads an RNN output if it was produced by batches of unequal length: size(s.y)=(H/2H,sum_of_sequence_lengths) becomes (H/2H,B,Tmax).
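The reshaping from a packed 2D output to a padded 3D array can be sketched as a scatter over per-instance column indices. sketch_padoutput is a hypothetical plain-Julia helper; filling the unused positions with zeros and the exact meaning of indices are assumptions made for illustration.

```julia
# Hypothetical helper: scatter packed columns into a zero-padded (H, B, Tmax) array.
# indices[i] lists the columns of y that belong to instance i, in time order.
function sketch_padoutput(y::AbstractMatrix, indices::Vector{Vector{Int}})
    H, B = size(y, 1), length(indices)
    tmax = maximum(length, indices)
    padded = zeros(eltype(y), H, B, tmax)
    for (i, cols) in enumerate(indices), (t, c) in enumerate(cols)
        padded[:, i, t] = y[:, c]
    end
    return padded
end

y = [1.0 2.0 3.0 4.0 5.0 6.0]          # H = 1, six packed time steps
indices = [[1, 4, 6], [2, 5], [3]]     # three instances of lengths 3, 2 and 1
padded = sketch_padoutput(y, indices)
println(size(padded))                  # (1, 3, 3)
println(padded[1, :, :])               # rows are instances, columns are time steps
```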
Special Layers
KnetLayers.MLP - Type.
MLP(h::Int...;kwargs...)
Creates a multi-layer perceptron from the given layer sizes h. The first size is the input size and the last one is the output size.
(m::MLP)(x;prob=0)
Runs the MLP on the given input x. prob is the dropout probability.
Keywords
- winit=xavier: weight initialization distribution
- bias=zeros: bias initialization distribution
- activation=ReLU(): activation layer or function
- atype=KnetLayers.arrtype: array type for parameters. The default is KnetArray{Float32} if a GPU device is available; otherwise it is Array{Float32}.
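How a list of sizes turns into a stack of layers can be sketched in plain Julia. SketchMLP is a hypothetical stand-in, not the KnetLayers type: it uses a placeholder initializer, omits dropout, and assumes that the activation is applied between layers but not after the last one.

```julia
# Hypothetical MLP: one (w, b) pair per consecutive pair of sizes in h.
struct SketchMLP
    layers::Vector{Tuple{Matrix{Float32},Vector{Float32}}}
end

SketchMLP(h::Int...) =
    SketchMLP([(0.1f0 .* randn(Float32, h[i+1], h[i]), zeros(Float32, h[i+1]))
               for i in 1:length(h)-1])

function (m::SketchMLP)(x)
    for (i, (w, b)) in enumerate(m.layers)
        x = w * x .+ b
        i < length(m.layers) && (x = max.(x, 0))   # ReLU between layers, none after the last
    end
    return x
end

m = SketchMLP(4, 8, 3)          # input size 4, one hidden layer of 8, output size 3
y = m(rand(Float32, 4, 5))
println(length(m.layers))       # 2
println(size(y))                # (3, 5)
```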
Function Index
KnetLayers.AbstractRNN
KnetLayers.BCELoss
KnetLayers.BatchNorm
KnetLayers.CrossEntropyLoss
KnetLayers.Dense
KnetLayers.Dropout
KnetLayers.ELU
KnetLayers.Embed
KnetLayers.LeakyReLU
KnetLayers.Linear
KnetLayers.LogSoftMax
KnetLayers.LogSumExp
KnetLayers.LogisticLoss
KnetLayers.MLP
KnetLayers.Multiply
KnetLayers.RNNOutput
KnetLayers.ReLU
KnetLayers.Sigm
KnetLayers.SoftMax
KnetLayers.Tanh
KnetLayers.Conv
KnetLayers.DeConv
KnetLayers.PadRNNOutput
KnetLayers.PadSequenceArray
KnetLayers.Pool
KnetLayers.UnPool