Nodes

Node

class magnet.nodes.Node(*args, **kwargs)

Abstract base class that defines MagNet’s Node implementation.

A Node is a ‘self-aware Module’. It can dynamically parametrize itself at runtime.

For instance, a Linear Node can infer the input features automatically when first called; a Conv Node can infer the dimensionality (1, 2, 3) of the input automatically.

MagNet’s Nodes strive to help the developer as much as possible by finding the right hyperparameter values automatically. Ideally, the developer shouldn’t need to define anything except the basic architecture and the inputs and outputs.

The arguments passed to the constructor are stored in a _args attribute as a dictionary.

This is later modified by the build() method which gets automatically called on the first forward pass.

Keyword Arguments:
	name (str) – Class Name

build(*args, **kwargs)

Builds the Node. Ideally, should not be called manually.

When an unbuilt module is first called, this method gets invoked.
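The build-on-first-call behaviour described above can be illustrated with a minimal plain-Python sketch. LazyNode, its i and o arguments and its toy forward() are hypothetical stand-ins chosen for illustration; this is not MagNet's actual implementation.

```python
class LazyNode:
    """Hypothetical stand-in for a self-parametrizing node (not MagNet's class)."""

    def __init__(self, **kwargs):
        # Constructor arguments are kept in a dictionary, mirroring _args.
        self._args = dict(kwargs)
        self._built = False

    def build(self, x):
        # Fill in any argument left as None using the first input.
        if self._args.get("i") is None:
            self._args["i"] = len(x)
        self._built = True

    def __call__(self, x):
        # The first call triggers build(); later calls skip it.
        if not self._built:
            self.build(x)
        return self.forward(x)

    def forward(self, x):
        # Toy forward pass: report (inferred input size, requested output size).
        return (self._args["i"], self._args["o"])


node = LazyNode(o=10, i=None)
first = node([0.0] * 2352)
print(first)       # (2352, 10)
print(node._args)  # {'o': 10, 'i': 2352}
```

The point of the pattern is that the user never states the input size; it is read from the first input and written back into the stored arguments.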

_mul_list(n)

A useful overload of the * operator that can create similar copies of the node.

Parameters:	n (tuple or list) –

The modifier n should be used to change the arguments of the node in a meaningful way.

For instance, in the case of a Linear node, the items in n can be interpreted as the output dimensions of each layer.
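A rough plain-Python sketch of how such a * overload might behave is given here. ToyNode and its single o argument are hypothetical; the real Node may copy and modify its arguments differently.

```python
import copy


class ToyNode:
    """Hypothetical node with one argument, o (not MagNet's class)."""

    def __init__(self, o=1):
        self._args = {"o": o}

    def __mul__(self, n):
        # An integer gives n identical copies.
        if isinstance(n, int):
            return [copy.deepcopy(self) for _ in range(n)]
        # A tuple or list gives one copy per item, each modified by that item.
        if isinstance(n, (tuple, list)):
            return self._mul_list(n)
        return NotImplemented

    def _mul_list(self, n):
        nodes = []
        for value in n:
            node = copy.deepcopy(self)
            node._args["o"] = value  # interpret each item as an output size
            nodes.append(node)
        return nodes

    def __repr__(self):
        return f"ToyNode(o={self._args['o']})"


print(ToyNode() * 3)         # [ToyNode(o=1), ToyNode(o=1), ToyNode(o=1)]
print(ToyNode() * (10, 50))  # [ToyNode(o=10), ToyNode(o=50)]
```

Because the result is an ordinary list, it can be concatenated with other lists of nodes before being passed to a container.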

Core

class magnet.nodes.Lambda(fn, **kwargs)

Wraps a Node around any function.

Parameters:	fn (callable) – The function which gets called in the forward pass

Examples:

>>> import magnet.nodes as mn

>>> import torch

>>> model = mn.Lambda(lambda x: x.mean())

>>> model(torch.arange(5, dtype=torch.float)).item()
2.0

>>> def subtract(x, y):
...     return x - y

>>> model = mn.Lambda(subtract)

>>> model(2 * torch.ones(1), torch.ones(1)).item()
1.0

class magnet.nodes.Conv(c=None, k=3, p='half', s=1, d=1, g=1, b=True, ic=None, act='relu', bn=False, **kwargs)

Applies a convolution over an input tensor.

Parameters:

c (int) – Number of channels produced by the convolution. Default: Inferred
k (int or tuple) – Size of the convolving kernel. Default: 3
p (int, tuple or str) – Zero-padding added to both sides of the input. Default: 'half'
s (int or tuple) – Stride of the convolution. Default: 1
d (int or tuple) – Spacing between kernel elements. Default: 1
g (int) – Number of blocked connections from input channels to output channels. Default: 1
b (bool) – If True, adds a learnable bias to the output. Default: True
ic (int) – Number of channels in the input image. Default: Inferred
act (str or None) – The activation function to use. Default: 'relu'
bn (bool) – Whether to use Batch Normalization immediately after the layer. Default: False

p accepts 'half', 'same' or 'double' as shorthand for padding that halves, preserves or doubles the image size respectively. The remaining arguments are inferred accordingly at runtime. For 'half' padding, the output channels (if not provided) are set to twice the input channels to make up for the lost information; for 'double' padding, they are set to half the input channels. For 'same' padding, the output channels are kept equal to the input channels. In all three cases, the dilation is set to 1 and the stride is modified as required.
ic is inferred from the second dimension of the input tensor.
act is set to 'relu' by default, unlike the PyTorch implementation where activation functions need to be defined separately. Take care to set the activation to None manually where needed.

Note

The dimensions (1, 2 or 3) of the convolutional kernels are inferred from the corresponding shape of the input tensor.

Note

One can also create multiple Nodes using the convenient multiplication (*) operation.

Multiplication with an integer \(n\), gives \(n\) copies of the Node.

Multiplication with a list or tuple of integers, \((c_1, c_2, ..., c_n)\) gives \(n\) copies of the Node with c set to \(c_i\)

Shape:

Input: \((N, C_{in}, *)\) where \(*\) is any non-zero number of trailing dimensions
Output: \((N, C_{out}, *)\)
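To make the effect of the padding modes on the trailing dimensions concrete, the standard convolution output-size formula can be evaluated by hand. The sketch below is plain Python; the helper name conv_out_size is hypothetical, and the choice of stride 2 with padding 1 for 'half' (and stride 1 with padding 1 for 'same') at k=3 is an assumption about one way to realise those modes, not a statement of what MagNet does internally.

```python
def conv_out_size(size, k=3, p=1, s=1, d=1):
    """Standard output length of a convolution along one spatial dimension."""
    return (size + 2 * p - d * (k - 1) - 1) // s + 1


# Assumed 'half' setting for k=3: stride 2, padding 1.
sizes = []
size = 28
for _ in range(3):
    size = conv_out_size(size, k=3, p=1, s=2)
    sizes.append(size)
print(sizes)  # [14, 7, 4]

# Assumed 'same' setting for k=3: stride 1, padding 1.
print(conv_out_size(28, k=3, p=1, s=1))  # 28
```

Under these assumptions, three successive 'half' layers take a 28-pixel side to 14, 7 and 4.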

Variables:	layer (nn.Module) – The Conv module built from torch.nn

Examples:

>>> import torch

>>> from torch import nn

>>> import magnet.nodes as mn
>>> from magnet.utils import summarize

>>> # A Conv layer with 32 channels and half padding
>>> model = mn.Conv(32)

>>> model(torch.randn(4, 16, 28, 28)).shape
torch.Size([4, 32, 14, 14])

>>> # Alternatively, the 32 in the constructor may be omitted
>>> # since it is inferred at runtime.

>>> # The same conv layer with 'double' padding
>>> model = mn.Conv(p='double')

>>> model(torch.randn(4, 16, 28, 28)).shape
torch.Size([4, 8, 56, 56])

>>> layers = mn.Conv() * 3
>>> layers
[Conv(), Conv(), Conv()]

>>> model = nn.Sequential(*layers)
>>> summarize(model)
+-------+------------+----------------------+
| Node  |   Shape    | Trainable Parameters |
+-------+------------+----------------------+
| input | 16, 28, 28 |          0           |
+-------+------------+----------------------+
| Conv  | 32, 14, 14 |        4,640         |
+-------+------------+----------------------+
| Conv  |  64, 7, 7  |        18,496        |
+-------+------------+----------------------+
| Conv  | 128, 4, 4  |        73,856        |
+-------+------------+----------------------+
Total Trainable Parameters: 96,992
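The parameter counts in the table can be checked by hand. The sketch below assumes a standard 2-D convolution with a 3×3 kernel, a bias term and no grouping, with channels doubling from 16; conv_params is a hypothetical helper, not part of MagNet.

```python
def conv_params(in_channels, out_channels, k=3, ndim=2, bias=True, groups=1):
    """Trainable parameters of a standard convolution layer."""
    weights = (in_channels // groups) * out_channels * k ** ndim
    return weights + (out_channels if bias else 0)


channels = [16, 32, 64, 128]
counts = [conv_params(a, b) for a, b in zip(channels, channels[1:])]
print(counts)       # [4640, 18496, 73856]
print(sum(counts))  # 96992
```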

class magnet.nodes.Linear(o=1, b=True, flat=True, i=None, act='relu', bn=False, **kwargs)

Applies a linear transformation to the incoming tensor.

Parameters:

o (int or tuple) – Output dimensions. Default: \(1\)
b (bool) – Whether to include a bias term. Default: True
flat (bool) – Whether to flatten out the input to 2 dimensions. Default: True
i (int) – Input dimensions. Default: Inferred
act (str or None) – The activation function to use. Default: 'relu'
bn (bool) – Whether to use Batch Normalization immediately after the layer. Default: False

flat is used by default to flatten the input to a vector. This is useful, for example, in CNNs where a 3-D image-based output with multiple channels needs to be fed to several dense layers.
i is inferred from the last dimension of the input tensor.
act is set to 'relu' by default, unlike the PyTorch implementation where activation functions need to be defined separately. Take care to set the activation to None manually where needed.

Note

One can also create multiple Nodes using the convenient multiplication (*) operation.

Multiplication with an integer \(n\), gives \(n\) copies of the Node.

Multiplication with a list or tuple of integers, \((o_1, o_2, ..., o_n)\) gives \(n\) copies of the Node with o set to \(o_i\)

Note

If o is a tuple, the output features are its product and the output is inflated to this shape.

Shape:

If flat is True

Input: \((N, *)\) where \(*\) means any number of trailing dimensions
Output: \((N, *)\)

Else

Input: \((N, *, in\_features)\) where \(*\) means any number of trailing dimensions
Output: \((N, *, out\_features)\) where all but the last dimension are the same shape as the input.
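The shape rules above can be summarised in a small plain-Python helper. linear_shapes is a hypothetical name, and the handling of a tuple o together with flat=False is an assumption; the two integer cases mirror the shapes documented for this class.

```python
import math


def linear_shapes(input_shape, o=1, flat=True):
    """Return (in_features, out_features, output_shape) for a batch-first input."""
    out_dims = tuple(o) if isinstance(o, (tuple, list)) else (o,)
    # A tuple o means: produce prod(o) features, then inflate to that shape.
    out_features = math.prod(out_dims)
    if flat:
        # Everything after the batch dimension is flattened into one vector.
        in_features = math.prod(input_shape[1:])
        output_shape = (input_shape[0], *out_dims)
    else:
        # Only the last dimension is transformed.
        in_features = input_shape[-1]
        output_shape = (*input_shape[:-1], *out_dims)
    return in_features, out_features, output_shape


print(linear_shapes((64, 3, 28, 28), 10))              # (2352, 10, (64, 10))
print(linear_shapes((64, 3, 28, 28), 10, flat=False))  # (28, 10, (64, 3, 28, 10))
print(linear_shapes((64, 3, 28, 28), (2, 5)))          # (2352, 10, (64, 2, 5))
```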

Variables:	layer (nn.Module) – The Linear module built from torch.nn

Examples:

>>> import torch

>>> from torch import nn

>>> import magnet.nodes as mn
>>> from magnet.utils import summarize

>>> # A Linear mapping to 10-dimensional space
>>> model = mn.Linear(10)

>>> model(torch.randn(64, 3, 28, 28)).shape
torch.Size([64, 10])

>>> # Don't flatten the input
>>> model = mn.Linear(10, flat=False)

>>> model(torch.randn(64, 3, 28, 28)).shape
torch.Size([64, 3, 28, 10])

>>> # Make a Deep Neural Network
>>> # Don't forget to turn the activation to None in the final layer
>>> layers = mn.Linear() * (10, 50) + [mn.Linear(10, act=None)]
>>> layers
[Linear(), Linear(), Linear()]

>>> model = nn.Sequential(*layers)
>>> summarize(model)
+------+---------+--------------------+----------------------------------------------------+
| Node |  Shape  |Trainable Parameters|                   Arguments                        |
+------+---------+--------------------+----------------------------------------------------+
|input |3, 28, 28|         0          |                                                    |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   10    |       23,530       |bn=False, act=relu, i=2352, flat=True, b=True, o=10 |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   50    |        550         |bn=False, act=relu, i=10, flat=True, b=True, o=50   |
+------+---------+--------------------+----------------------------------------------------+
|Linear|   10    |        510         |bn=False, act=None, i=50, flat=True, b=True, o=10   |
+------+---------+--------------------+----------------------------------------------------+
Total Trainable Parameters: 24,590
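As a sanity check on the summary, the parameter counts follow from the usual weights-plus-bias formula for a dense layer, with the first layer's input size taken as the flattened 3 × 28 × 28 image. linear_params is a hypothetical helper used only for this check.

```python
def linear_params(in_features, out_features, bias=True):
    """Trainable parameters of a dense layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


sizes = [3 * 28 * 28, 10, 50, 10]
counts = [linear_params(i, o) for i, o in zip(sizes, sizes[1:])]
print(counts)       # [23530, 550, 510]
print(sum(counts))  # 24590
```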

class magnet.nodes.RNN(h, n=1, b=False, bi=False, act='tanh', d=0, batch_first=False, i=None, **kwargs)

Applies a multi-layer RNN to an input tensor.

Parameters:

h (int, Required) – The number of features in the hidden state h
n (int) – Number of layers. Default: 1
b (bool) – Whether to include a bias term. Default: False
bi (bool) – If True, becomes a bidirectional RNN. Default: False
act (str or None) – The activation function to use. Default: 'tanh'
d (float) – The dropout probability of the outputs of each layer. Default: 0
batch_first (bool) – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False
i (int) – Input dimensions. Default: Inferred

i is inferred from the last dimension of the input tensor.
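The relationship between the input shape, h and bi can be sketched in plain Python, assuming the node follows the output convention of torch.nn.RNN (the feature dimension of the output is h times the number of directions). rnn_shapes is a hypothetical helper.

```python
def rnn_shapes(input_shape, h, bi=False):
    """Return (inferred i, output shape) for a 3-D recurrent input.

    The first two dimensions are (seq, batch), or (batch, seq) when
    batch_first is used; either way they carry over to the output unchanged.
    """
    first, second, features = input_shape
    directions = 2 if bi else 1
    return features, (first, second, h * directions)


print(rnn_shapes((7, 4, 300), 32))           # (300, (7, 4, 32))
print(rnn_shapes((7, 4, 300), 32, bi=True))  # (300, (7, 4, 64))
```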

Note

One can also create multiple Nodes using the convenient multiplication (*) operation.

Multiplication with an integer \(n\), gives \(n\) copies of the Node.

Multiplication with a list or tuple of integers, \((h_1, h_2, ..., h_n)\) gives \(n\) copies of the Node with h set to \(h_i\)

Variables:	layer (nn.Module) – The RNN module built from torch.nn

Examples:

>>> import torch

>>> from torch import nn

>>> import magnet.nodes as mn
>>> from magnet.utils import summarize

>>> # A recurrent layer with 32 hidden dimensions
>>> model = mn.RNN(32)

>>> model(torch.randn(7, 4, 300))[0].shape
torch.Size([7, 4, 32])

>>> # Attach a linear head
>>> model = nn.Sequential(model, mn.Linear(1000, act=None))

class magnet.nodes.LSTM(h, n=1, b=False, bi=False, d=0, batch_first=False, i=None, **kwargs)

Applies a multi-layer LSTM to an input tensor.

See mn.RNN for more details

class magnet.nodes.GRU(h, n=1, b=False, bi=False, d=0, batch_first=False, i=None, **kwargs)

Applies a multi-layer GRU to an input tensor.

See mn.RNN for more details

class magnet.nodes.BatchNorm(e=1e-05, m=0.1, a=True, track=True, i=None, **kwargs)

Applies Batch Normalization to the input tensor.

Parameters:

e (float) – A small value added to the denominator for numerical stability. Default: 1e-5
m (float or None) – The value used for the running_mean and running_var computation. Can be set to None for cumulative moving average (i.e. simple average). Default: 0.1
a (bool) – Whether to have learnable affine parameters. Default: True
track (bool) – Whether to track the running mean and variance. Default: True
i (int) – Input channels. Default: Inferred

i is inferred from the second dimension of the input tensor.

Note

The dimensions (1, 2 or 3) of the running mean and variance are inferred from the corresponding shape of the input tensor.

Note

One can also create multiple Nodes using the convenient multiplication (*) operation.

Multiplication with an integer \(n\), gives \(n\) copies of the Node.

Multiplication with a list or tuple of integers, \((i_1, i_2, ..., i_n)\) gives \(n\) copies of the Node with i set to \(i_i\)

Shape:

Input: \((N, C, *)\) where \(*\) means any number of trailing dimensions
Output: \((N, C, *)\) (same shape as input)
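The role of m in the running statistics can be shown with a short plain-Python sketch. It follows the update rule used by PyTorch's batch normalization layers, on the assumption that this node delegates to them; update_running is a hypothetical helper operating on a single scalar statistic.

```python
def update_running(running, batch_stat, m=0.1, num_batches=1):
    """Update one running statistic after seeing a batch.

    With a float m, this is an exponential moving average.
    With m=None, it is a cumulative (simple) average, where num_batches
    counts the batches seen so far, including the current one.
    """
    factor = 1.0 / num_batches if m is None else m
    return (1 - factor) * running + factor * batch_stat


# Exponential moving average with the default m=0.1.
print(update_running(0.0, 1.0))  # 0.1

# Cumulative average over three batch means.
running = 0.0
for count, batch_mean in enumerate([2.0, 4.0, 6.0], start=1):
    running = update_running(running, batch_mean, m=None, num_batches=count)
print(round(running, 6))  # 4.0
```

With m=None every batch contributes equally, so the running value converges to the plain mean of the batch statistics.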

Variables:	layer (nn.Module) – The BatchNorm module built from torch.nn
