hypernet_reconciliation
Bases: fabrication
The hypernet based parameter reconciliation function.
It performs the hypernet based parameter reconciliation, and returns the reconciled parameter matrix of shape (n, D). This class inherits from the reconciliation class (i.e., the fabrication class in the module directory).
...
Notes
Formally, given the input parameter vector \(\mathbf{w} \in \mathbb{R}^l\) of length \(l\), the hypernet based parameter reconciliation function projects it to a high-dimensional parameter matrix of shape (n, D) via a hypernet model, e.g., an MLP, as follows: $$ \begin{equation} \psi(\mathbf{w}) = \text{HyperNet}(\mathbf{w}) = \mathbf{W} \in \mathbb{R}^{n \times D}, \end{equation} $$ where \(\text{HyperNet}(\cdot)\) denotes a randomly initialized MLP model with frozen parameters.
For the hypernet based parameter reconciliation function, the parameter length \(l\) must be assigned manually in the initialization method; it can no longer be calculated from the dimension parameters \(n\) and \(D\).
In the current project, a frozen MLP with one hidden layer serves as the hypernet for parameter reconciliation. The current implementation also allows a dynamic MLP with learnable parameters: setting the "static" parameter to True keeps the MLP frozen, and setting it to False makes the MLP learnable.
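
The mapping \(\psi(\mathbf{w})\) described above can be illustrated with a minimal, framework-free sketch. It is not the tinybig implementation (which builds the hypernet as an MLP model): the helper names `make_frozen_mlp`, `matvec`, and `hypernet_reconcile` are hypothetical, the ReLU activation and the omission of bias terms are assumptions, and plain Python lists stand in for tensors so that the shapes stay explicit.

```python
import random


def make_frozen_mlp(l, hidden_dim, out_dim, seed=0):
    """Randomly initialise the weights of a 1-hidden-layer MLP.

    The weights are created once and never updated, mirroring the
    "frozen" (static) hypernet described in the notes.
    """
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(l)] for _ in range(hidden_dim)]
    w2 = [[rng.gauss(0.0, 1.0) for _ in range(hidden_dim)] for _ in range(out_dim)]
    return w1, w2


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def hypernet_reconcile(w, n, D, mlp):
    """Project w (length l) to an n x D matrix through the frozen MLP."""
    w1, w2 = mlp
    hidden = [max(0.0, h) for h in matvec(w1, w)]  # ReLU is an assumption
    flat = matvec(w2, hidden)                      # vector of length n * D
    return [flat[i * D:(i + 1) * D] for i in range(n)]  # reshape to (n, D)


l, hidden_dim, n, D = 4, 8, 3, 5
mlp = make_frozen_mlp(l, hidden_dim, n * D)
w = [0.1, -0.2, 0.3, 0.4]          # the only learnable quantity when static
W = hypernet_reconcile(w, n, D, mlp)
print(len(W), len(W[0]))           # prints: 3 5
```

Because the MLP weights are fixed, calling `hypernet_reconcile` twice with the same `w` yields the same matrix; only changes to `w` change the reconciled parameters.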
Attributes:
Name | Type | Description |
---|---|---|
name | str, default = 'hypernet_reconciliation' | Name of the hypernet parameter reconciliation function. |
r | int, default = 2 | Submatrix rank parameter. |
Methods:
Name | Description |
---|---|
__init__ | It initializes the hypernet parameter reconciliation function. |
calculate_l | It calculates the length of required parameters for the reconciliation function. |
forward | It implements the abstract forward method declared in the base reconciliation class. |
Source code in tinybig/reconciliation/hypernet_reconciliation.py
__init__(name='hypernet_reconciliation', l=64, hidden_dim=128, static=True, net=None, *args, **kwargs)
The initialization method of the hypernet parameter reconciliation function.
It initializes a hypernet parameter reconciliation function object. This method also calls the initialization method of the base class.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
name | str | Name of the hypernet based parameter reconciliation function. | 'hypernet_reconciliation' |
l | int | The learnable parameter length, which needs to be assigned manually. | 64 |
hidden_dim | int | The hidden layer dimension of the hypernet MLP. | 128 |
static | bool | The static hypernet indicator. If static=True, the hypernet MLP is frozen; if static=False, the hypernet MLP is dynamic and contains learnable parameters as well. | True |
net | | The hypernet MLP model. | None |
Returns:
Type | Description |
---|---|
object | The hypernet parameter reconciliation function object. |
calculate_l(n=None, D=None)
The required parameter number calculation method.
It calculates the number of required learnable parameters, i.e., \(l\), of the parameter reconciliation function.
Notes
For the hypernet based parameter reconciliation function, the parameter length \(l\) must be assigned manually in the initialization method; it can no longer be calculated from the dimension parameters \(n\) and \(D\).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The dimension of the output space. | None |
D | int | The dimension of the intermediate expansion space. | None |
Returns:
Type | Description |
---|---|
int | The number of required learnable parameters. |
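
To make the note above concrete, the sketch below shows one way a manually assigned \(l\) can be reported regardless of \(n\) and \(D\). The class `FixedLengthReconciliation` is hypothetical and exists only for illustration; returning the stored value while ignoring `n` and `D` is an assumption about the behaviour, suggested by the `None` defaults, not a copy of the library code.

```python
class FixedLengthReconciliation:
    """Hypothetical stand-in for a reconciliation function with a fixed l."""

    def __init__(self, l=64):
        # l is chosen by the user at construction time.
        self.l = l

    def calculate_l(self, n=None, D=None):
        # n and D are accepted for interface compatibility but not used
        # (assumed behaviour): l does not depend on the output shape.
        return self.l


rec = FixedLengthReconciliation(l=64)
for n, D in [(10, 784), (100, 4096)]:
    # First column: entries of the (n, D) matrix; second column: l.
    print(n * D, rec.calculate_l(n=n, D=D))
# prints:
# 7840 64
# 409600 64
```

The size of the reconciled matrix grows with \(n \times D\), while the number of parameters reported for \(\mathbf{w}\) stays at the value passed to the constructor.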
forward(n, D, w, device='cpu', *args, **kwargs)
The forward method of the parameter reconciliation function.
It applies the hypernet based parameter reconciliation operation to the input parameter vector \(\mathbf{w}\), and returns the reconciled parameter matrix of shape (n, D) subject to the rank parameter \(r\) as follows: $$ \begin{equation} \psi(\mathbf{w}) = \text{HyperNet}(\mathbf{w}) = \mathbf{W} \in \mathbb{R}^{n \times D}, \end{equation} $$ where \(\text{HyperNet}(\cdot)\) denotes a randomly initialized MLP model with frozen parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n | int | The dimension of the output space. | required |
D | int | The dimension of the intermediate expansion space. | required |
w | Parameter | The learnable parameters of the model. | required |
device | str | Device to perform the parameter reconciliation. | 'cpu' |
Returns:
Type | Description |
---|---|
Tensor | The reconciled parameter matrix of shape (n, D). |
initialize_hypernet(l, n, D, hidden_dim, static=True, device='cpu')
The hypernet MLP initialization method.
It initializes the hypernet MLP model based on the provided parameters, whose architecture dimensions can be denoted as follows: $$ \begin{equation} [l] \to [hidden\_dim] \to [n \times D], \end{equation} $$ which can project any input of length \(l\) to the desired output of length \(n \times D\).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
l | int | The input dimension of the hypernet MLP, which equals the parameter length \(l\). | required |
n | int | The output space dimension, which together with the expansion dimension \(D\) defines the output dimension of the hypernet MLP as \(n \times D\). | required |
D | int | The expansion space dimension, which together with the output space dimension \(n\) defines the output dimension of the hypernet MLP as \(n \times D\). | required |
hidden_dim | int | The hidden layer dimension of the hypernet MLP. | required |
static | bool | The static hypernet indicator. If static=True, the hypernet MLP is frozen; if static=False, the hypernet MLP is dynamic and contains learnable parameters as well. | True |
device | str | The device to host the hypernet and perform the parameter reconciliation. | 'cpu' |
Returns:
Type | Description |
---|---|
None | This function initializes the self.net attribute and has no return value. |
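
The architecture \([l] \to [hidden\_dim] \to [n \times D]\) also determines how many values are trained in the static and dynamic settings. The following back-of-the-envelope sketch counts them; the function names are hypothetical, and the count covers weight matrices only, under the assumption that bias terms are ignored (whether the actual MLP uses biases is not stated here).

```python
def hypernet_weight_count(l, hidden_dim, n, D):
    """Weights in an MLP with layers [l] -> [hidden_dim] -> [n * D].

    Bias terms are left out (assumption).
    """
    return l * hidden_dim + hidden_dim * (n * D)


def learnable_count(l, hidden_dim, n, D, static=True):
    """Learnable values: w always, plus the MLP weights when not static."""
    mlp_weights = 0 if static else hypernet_weight_count(l, hidden_dim, n, D)
    return l + mlp_weights


# Defaults from __init__ (l=64, hidden_dim=128) with an example n=10, D=784.
l, hidden_dim, n, D = 64, 128, 10, 784
print(n * D)                                                # 7840
print(hypernet_weight_count(l, hidden_dim, n, D))           # 1011712
print(learnable_count(l, hidden_dim, n, D, static=True))    # 64
print(learnable_count(l, hidden_dim, n, D, static=False))   # 1011776
```

Under these assumptions, the static setting trains only the 64 entries of \(\mathbf{w}\) to produce a 7840-entry matrix, whereas the dynamic setting makes the MLP weights learnable as well and therefore trains far more values than the matrix contains.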