Nodes

Node

class magnet.nodes.Node(*args, **kwargs)
Abstract base class that defines MagNet’s Node implementation.
A Node is a ‘self-aware Module’. It can dynamically parametrize itself at runtime.
For instance, a Linear Node can infer the input features automatically when first called; a Conv Node can infer the dimensionality (1, 2, 3) of the input automatically.
MagNet’s Nodes strive to help the developer as much as possible by finding the right hyperparameter values automatically. Ideally, the developer shouldn’t need to define anything except the basic architecture and the inputs and outputs.
The arguments passed to the constructor are stored in an _args attribute as a dictionary.
This is later modified by the build() method, which is called automatically on the first forward pass.
Keyword Arguments: name (str) – Class Name
build(*args, **kwargs)
Builds the Node. Ideally, it should not be called manually.
When an unbuilt module is first called, this method gets invoked.
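The build-on-first-call behaviour described above can be sketched in plain Python. This is a minimal illustration, not MagNet's actual implementation: the class name LazyNode, the _built flag, and the use of len(x) as the "input features" are all assumptions made for the example.

```python
class LazyNode:
    """Hypothetical stand-in for a Node (not MagNet's code).

    Constructor arguments are kept in a dictionary and any argument left
    as None is filled in from the first input.
    """

    def __init__(self, **kwargs):
        self._args = dict(kwargs)  # constructor arguments stored as a dict
        self._built = False        # hypothetical flag, for illustration only

    def build(self, x):
        # Infer the missing input size from the first input seen.
        if self._args.get("i") is None:
            self._args["i"] = len(x)
        self._built = True

    def __call__(self, x):
        if not self._built:        # build exactly once, on the first call
            self.build(x)
        if len(x) != self._args["i"]:
            raise ValueError("input size changed after the node was built")
        return sum(x)


node = LazyNode(i=None)
first = node([1.0, 2.0, 3.0])      # triggers build(); i is inferred as 3
```

After the first call, node._args["i"] holds the inferred value, and later calls skip build().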
_mul_list(n)
A useful overload of the * operator that can create similar copies of the node.
Parameters: n (tuple or list) – The modifier n should be used to change the arguments of the node in a meaningful way.
For instance, in the case of a Linear node, the items in n can be interpreted as the output dimensions of each layer.
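One way to picture this overload is the sketch below. It assumes a hypothetical SketchNode class whose main argument is o (as for a Linear node); how MagNet dispatches between the integer and list cases internally is not shown in this page, so the __mul__ body here is an assumption.

```python
import copy


class SketchNode:
    """Hypothetical node whose main argument is `o` (not MagNet's code)."""

    def __init__(self, o=1):
        self._args = {"o": o}

    def __mul__(self, n):
        if isinstance(n, int):
            # n independent copies with identical arguments
            return [copy.deepcopy(self) for _ in range(n)]
        if isinstance(n, (tuple, list)):
            # one copy per item, the item replacing the main argument
            return [type(self)(o=item) for item in n]
        return NotImplemented

    def __repr__(self):
        return f"SketchNode(o={self._args['o']})"


same = SketchNode(4) * 3           # three copies, all with o=4
varied = SketchNode() * (10, 50)   # two copies, with o=10 and o=50
```

Because the result is an ordinary list, it can be concatenated with other lists of nodes before being passed to a container.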

Core

class magnet.nodes.Lambda(fn, **kwargs)
Wraps a Node around any function.
Parameters: fn (callable) – The function which gets called in the forward pass
Examples:
>>> import magnet.nodes as mn
>>> import torch
>>> model = mn.Lambda(lambda x: x.mean())
>>> model(torch.arange(5, dtype=torch.float)).item()
2.0
>>> def subtract(x, y):
...     return x - y
>>> model = mn.Lambda(subtract)
>>> model(2 * torch.ones(1), torch.ones(1)).item()
1.0

class magnet.nodes.Conv(c=None, k=3, p='half', s=1, d=1, g=1, b=True, ic=None, act='relu', bn=False, **kwargs)
Applies a convolution over an input tensor.
Parameters:
- c (int) – Number of channels produced by the convolution. Default: Inferred
- k (int or tuple) – Size of the convolving kernel. Default: 3
- p (int, tuple or str) – Zero-padding added to both sides of the input. Default: 'half'
- s (int or tuple) – Stride of the convolution. Default: 1
- d (int or tuple) – Spacing between kernel elements. Default: 1
- g (int) – Number of blocked connections from input channels to output channels. Default: 1
- b (bool) – If True, adds a learnable bias to the output. Default: True
- ic (int) – Number of channels in the input image. Default: Inferred
- act (str or None) – The activation function to use. Default: 'relu'
p can be conveniently used for 'half', 'same' or 'double' padding to half, same or double the image size respectively. The arguments are accordingly inferred at runtime. For 'half' padding, the output channels (if not provided) are set to twice the input channels to make up for the lost information, and vice versa for 'double' padding. For 'same' padding, the output channels are kept equal to the input channels. In all three cases, the dilation is set to 1 and the stride is modified as required.
ic is inferred from the second dimension of the input tensor.
act is set to 'relu' by default, unlike the PyTorch implementation where activation functions need to be separately defined. Take care to manually set the activation to None where needed.
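The 'half' rule can be checked with the standard convolution output-size formula. The sketch below is an assumption about how the inference might work, not MagNet's code: it supposes a stride of 2 and a padding of k // 2 for 'half', and the helper names conv_out_size and half_padding_shape are hypothetical. Under those assumptions it reproduces the 28 to 14 to 7 to 4 progression shown in the summarize example for this class.

```python
def conv_out_size(size, k=3, p=1, s=1, d=1):
    """Standard convolution output size along one spatial dimension."""
    return (size + 2 * p - d * (k - 1) - 1) // s + 1


def half_padding_shape(channels, size, k=3):
    """Assumed 'half' rule: double the channels, halve the size.

    Stride 2 and padding k // 2 are assumptions for this sketch.
    """
    return 2 * channels, conv_out_size(size, k=k, p=k // 2, s=2)


shape = (16, 28)                   # (input channels, spatial size)
trace = []
for _ in range(3):                 # three stacked 'half' convolutions
    shape = half_padding_shape(*shape)
    trace.append(shape)

same_size = conv_out_size(28, k=3, p=1, s=1)   # 'same' keeps the size
```

Here trace holds the (channels, size) pair after each layer, which makes it easy to compare against a model summary.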
Note
The dimensions (1, 2 or 3) of the convolutional kernels are inferred from the corresponding shape of the input tensor.
Note
One can also create multiple Nodes using the convenient multiplication (*) operation.
Multiplication with an integer \(n\) gives \(n\) copies of the Node.
Multiplication with a list or tuple of integers \((c_1, c_2, ..., c_n)\) gives \(n\) copies of the Node with c set to \(c_i\).
Shape:
- Input: \((N, C_{in}, *)\) where \(*\) is any non-zero number of trailing dimensions.
- Output: \((N, C_{out}, *)\)
Variables: layer (nn.Module) – The Conv module built from torch.nn
Examples:
>>> import torch
>>> from torch import nn
>>> import magnet.nodes as mn
>>> from magnet.utils import summarize
>>> # A Conv layer with 32 channels and half padding
>>> model = mn.Conv(32)
>>> model(torch.randn(4, 16, 28, 28)).shape
torch.Size([4, 32, 14, 14])
>>> # Alternatively, the 32 in the constructor may be omitted
>>> # since it is inferred at runtime.
>>> # The same conv layer with 'double' padding
>>> model = mn.Conv(p='double')
>>> model(torch.randn(4, 16, 28, 28)).shape
torch.Size([4, 8, 56, 56])
>>> layers = mn.Conv() * 3
[Conv(), Conv(), Conv()]
>>> model = nn.Sequential(*layers)
>>> summarize(model)
+-------+------------+----------------------+
| Node  |   Shape    | Trainable Parameters |
+-------+------------+----------------------+
| input | 16, 28, 28 |          0           |
+-------+------------+----------------------+
| Conv  | 32, 14, 14 |        4,640         |
+-------+------------+----------------------+
| Conv  |  64, 7, 7  |        18,496        |
+-------+------------+----------------------+
| Conv  | 128, 4, 4  |        73,856        |
+-------+------------+----------------------+
Total Trainable Parameters: 96,992
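The parameter counts in the summary table can be reproduced by hand. The sketch below uses the usual count for a 2-D convolution with a square kernel and a bias (weights plus one bias per output channel); the helper name conv2d_params is hypothetical, and groups are assumed to be 1 as in the defaults above.

```python
def conv2d_params(c_in, c_out, k=3, bias=True):
    """Trainable parameters of a 2-D convolution with a k x k kernel (groups=1)."""
    return c_in * c_out * k * k + (c_out if bias else 0)


channels = [16, 32, 64, 128]       # input channels, then each layer's output
counts = [conv2d_params(a, b) for a, b in zip(channels, channels[1:])]
total = sum(counts)
```

The three entries of counts correspond to the three Conv rows of the table, and total to the last line.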

class magnet.nodes.Linear(o=1, b=True, flat=True, i=None, act='relu', bn=False, **kwargs)
Applies a linear transformation to the incoming tensor.
Parameters:
- o (int or tuple) – Output dimensions. Default: \(1\)
- b (bool) – Whether to include a bias term. Default: True
- flat (bool) – Whether to flatten out the input to 2 dimensions. Default: True
- i (int) – Input dimensions. Default: Inferred
- act (str or None) – The activation function to use. Default: 'relu'
- bn (bool) – Whether to use Batch Normalization immediately after the layer. Default: False
flat is used by default to flatten the input to a vector. This is useful, say, in the case of CNNs where a 3-D image-based output with multiple channels needs to be fed to several dense layers.
i is inferred from the last dimension of the input tensor (after flattening, when flat is True).
act is set to 'relu' by default, unlike the PyTorch implementation where activation functions need to be separately defined. Take care to manually set the activation to None where needed.
Note
One can also create multiple Nodes using the convenient multiplication (*) operation.
Multiplication with an integer \(n\) gives \(n\) copies of the Node.
Multiplication with a list or tuple of integers \((o_1, o_2, ..., o_n)\) gives \(n\) copies of the Node with o set to \(o_i\).
Note
If o is a tuple, the output features are its product and the output is inflated to this shape.
Shape:
- If flat is True
  - Input: \((N, *)\) where \(*\) means any number of trailing dimensions
  - Output: \((N, *)\)
- Else
  - Input: \((N, *, in\_features)\) where \(*\) means any number of trailing dimensions
  - Output: \((N, *, out\_features)\) where all but the last dimension are the same shape as the input.
Variables: layer (nn.Module) – The Linear module built from torch.nn
Examples:
>>> import torch
>>> from torch import nn
>>> import magnet.nodes as mn
>>> from magnet.utils import summarize
>>> # A Linear mapping to 10-dimensional space
>>> model = mn.Linear(10)
>>> model(torch.randn(64, 3, 28, 28)).shape
torch.Size([64, 10])
>>> # Don't flatten the input
>>> model = mn.Linear(10, flat=False)
>>> model(torch.randn(64, 3, 28, 28)).shape
torch.Size([64, 3, 28, 10])
>>> # Make a Deep Neural Network
>>> # Don't forget to turn the activation to None in the final layer
>>> layers = mn.Linear() * (10, 50) + [mn.Linear(10, act=None)]
[Linear(), Linear(), Linear()]
>>> model = nn.Sequential(*layers)
>>> summarize(model)
+------+---------+--------------------+----------------------------------------------------+
| Node |  Shape  |Trainable Parameters|                     Arguments                      |
+------+---------+--------------------+----------------------------------------------------+
|input |3, 28, 28|         0          |                                                    |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   10    |       23,530       |bn=False, act=relu, i=2352, flat=True, b=True, o=10 |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   50    |        550         |bn=False, act=relu, i=10, flat=True, b=True, o=50   |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   10    |        510         |bn=False, act=None, i=50, flat=True, b=True, o=10   |
+------+---------+--------------------+----------------------------------------------------+
Total Trainable Parameters: 24,590
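The i values and parameter counts in this table follow from simple arithmetic. In the sketch below, the helper names linear_params and stack_params are hypothetical. It assumes that with flat=True every non-batch dimension is collapsed into one, which is consistent with i=2352 = 3 * 28 * 28 in the first row.

```python
from math import prod


def linear_params(i, o, bias=True):
    """Trainable parameters of a dense layer mapping i features to o."""
    return i * o + (o if bias else 0)


def stack_params(input_shape, outputs):
    """Per-layer (i, parameter count) for a chain of dense layers."""
    i = prod(input_shape)          # flat=True: collapse non-batch dimensions
    rows = []
    for o in outputs:
        rows.append((i, linear_params(i, o)))
        i = o                      # the next layer infers i from this output
    return rows


rows = stack_params((3, 28, 28), (10, 50, 10))
total = sum(count for _, count in rows)
```

Each entry of rows pairs the inferred i with the layer's parameter count, matching the Arguments and Trainable Parameters columns.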

class magnet.nodes.RNN(h, n=1, b=False, bi=False, act='tanh', d=0, batch_first=False, i=None, **kwargs)
Applies a multi-layer RNN to an input tensor.
Parameters:
- h (int, Required) – The number of features in the hidden state h
- n (int) – Number of layers. Default: 1
- b (bool) – Whether to include a bias term. Default: False
- bi (bool) – If True, becomes a bidirectional RNN. Default: False
- act (str or None) – The activation function to use. Default: 'tanh'
- d (int) – The dropout probability of the outputs of each layer. Default: 0
- batch_first (bool) – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False
- i (int) – Input dimensions. Default: Inferred
i is inferred from the last dimension of the input tensor.
Note
One can also create multiple Nodes using the convenient multiplication (*) operation.
Multiplication with an integer \(n\) gives \(n\) copies of the Node.
Multiplication with a list or tuple of integers \((h_1, h_2, ..., h_n)\) gives \(n\) copies of the Node with h set to \(h_i\).
Variables: layer (nn.Module) – The RNN module built from torch.nn
Examples:
>>> import torch
>>> from torch import nn
>>> import magnet.nodes as mn
>>> from magnet.utils import summarize
>>> # A recurrent layer with 32 hidden dimensions
>>> model = mn.RNN(32)
>>> model(torch.randn(7, 4, 300))[0].shape
torch.Size([7, 4, 32])
>>> # Attach a linear head
>>> model = nn.Sequential(model, mn.Linear(1000, act=None))
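The effect of bi and batch_first on the output shape can be sketched as follows. This assumes the node passes these options to the underlying torch.nn recurrent layer unchanged, in which case the usual torch.nn.RNN shape rules apply; the helper name rnn_output_shape is hypothetical.

```python
def rnn_output_shape(seq, batch, h, bi=False, batch_first=False):
    """Shape of the output sequence of a recurrent layer.

    A bidirectional layer concatenates both directions along the feature
    axis, and batch_first swaps the first two axes.
    """
    features = h * (2 if bi else 1)
    if batch_first:
        return (batch, seq, features)
    return (seq, batch, features)


default_shape = rnn_output_shape(7, 4, 32)
bidirectional_shape = rnn_output_shape(7, 4, 32, bi=True)
batch_first_shape = rnn_output_shape(7, 4, 32, batch_first=True)
```

The default case agrees with the torch.Size([7, 4, 32]) result in the example above.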

class magnet.nodes.LSTM(h, n=1, b=False, bi=False, d=0, batch_first=False, i=None, **kwargs)
Applies a multi-layer LSTM to an input tensor.
See mn.RNN for more details.

class magnet.nodes.GRU(h, n=1, b=False, bi=False, d=0, batch_first=False, i=None, **kwargs)
Applies a multi-layer GRU to an input tensor.
See mn.RNN for more details.

class magnet.nodes.BatchNorm(e=1e-05, m=0.1, a=True, track=True, i=None, **kwargs)
Applies Batch Normalization to the input tensor.
Parameters:
- e (float) – A small value added to the denominator for numerical stability. Default: 1e-5
- m (float or None) – The value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average). Default: 0.1
- a (bool) – Whether to have learnable affine parameters. Default: True
- track (bool) – Whether to track the running mean and variance. Default: True
- i (int) – Input channels. Default: Inferred
i is inferred from the second dimension of the input tensor.
Note
The dimensions (1, 2 or 3) of the running mean and variance are inferred from the corresponding shape of the input tensor.
Note
One can also create multiple Nodes using the convenient multiplication (*) operation.
Multiplication with an integer \(n\) gives \(n\) copies of the Node.
Multiplication with a list or tuple of integers \((i_1, i_2, ..., i_n)\) gives \(n\) copies of the Node with i set to \(i_i\).
Shape:
- Input: \((N, C, *)\) where \(*\) means any number of trailing dimensions
- Output: \((N, C, *)\) (same shape as input)
Variables: layer (nn.Module) – The BatchNorm module built from torch.nn
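The role of m in the running statistics can be illustrated with scalars. The sketch below follows the update rule used by torch.nn batch normalization layers and assumes the node hands m to that layer unchanged; the helper name update_running and the num_batches argument are hypothetical.

```python
def update_running(running, batch_stat, m=0.1, num_batches=1):
    """One update of a running statistic.

    With a float m this is an exponential moving average; with m=None it
    is a cumulative (simple) average over the batches seen so far.
    """
    if m is None:
        return running + (batch_stat - running) / num_batches
    return (1 - m) * running + m * batch_stat


batch_means = [1.0, 2.0, 3.0]

ema = 0.0
for value in batch_means:
    ema = update_running(ema, value, m=0.1)

cma = 0.0
for n, value in enumerate(batch_means, start=1):
    cma = update_running(cma, value, m=None, num_batches=n)
```

With m=None the result is the plain mean of the batch statistics, while m=0.1 weights recent batches more heavily and stays closer to the initial value early on.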