gromo.growing_module.AdditionGrowingModule#

class gromo.growing_module.AdditionGrowingModule(post_addition_function: Module = Identity(), previous_modules: list[AdditionGrowingModule | GrowingModule] = None, next_modules: list[AdditionGrowingModule | GrowingModule] = None, allow_growing: bool = False, tensor_s_shape: tuple[int, int] = None, device: device | None = None, name: str = None)[source]#

Module to connect multiple modules with an addition operation. This module does not perform the addition operation, it is done by the user.

add_next_module(module: AdditionGrowingModule | GrowingModule) → None[source]#

Add a module to the next modules of the current module.

Parameters:: module – next module to add

add_previous_module(module: AdditionGrowingModule | GrowingModule) → None[source]#

Add a module to the previous modules of the current module.

Parameters:: module – previous module to add

compute_previous_m_update() → tuple[Tensor, int][source]#

Compute the update of the tensor M for the input of all previous modules.

Returns:

torch.Tensor – update of the tensor M
int – number of samples used to compute the update

compute_previous_s_update() → tuple[Tensor, int][source]#

Compute the update of the tensor S for the input of all previous modules.

Returns:

torch.Tensor – update of the tensor S
int – number of samples used to compute the update

compute_s_update() → tuple[Tensor, int][source]#

Compute the update of the tensor S. Should be added to the type of layer.

Returns:

torch.Tensor – update of the tensor S
int – number of samples used to compute the update

delete_update(include_previous: bool = False) → None[source]#: Delete the update of the optimal added parameters.

forward(x: Tensor) → Tensor[source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

grow()[source]#: Function to call after growing previous or next modules.

init_computation() → None[source]#: Initialize the computation of the optimal added parameters.

parameters(recurse: bool = True) → Iterator[Parameter][source]#

Return an iterator over module parameters.

This is typically passed to an optimizer.

Parameters:: recurse (bool) – if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.
Yields:: Parameter – module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)

projected_v_goal() → Tensor[source]#

Compute the projected gradient of the goal with respect to the activity of the layer.

dLoss/dA_proj := dLoss/dA - dW B[-1] where A is the pre-activation vector of the layer, and dW the optimal delta for all the previous layers

Returns:: projected gradient of the goal with respect to the activity of the next layer dLoss/dA - dW B[-1]
Return type:: torch.Tensor

reset_computation() → None[source]#: Reset the computation of the optimal added parameters.

set_next_modules(next_modules: list[AdditionGrowingModule | GrowingModule]) → None[source]#

Set the next modules of the current module.

Parameters:: next_modules – list of next modules

set_previous_modules(previous_modules: list[AdditionGrowingModule | GrowingModule]) → None[source]#

Set the previous modules of the current module.

Parameters:: previous_modules – list of previous modules

sum_in_features(with_bias: bool = False) → int[source]#

Count total in_features of previous modules

Returns:: sum of previous in_features
Return type:: int

sum_out_features() → int[source]#

Count total out_features of next modules

Returns:: sum of next out_features
Return type:: int

update_scaling_factor(scaling_factor: Tensor | float) → None[source]#

Update the scaling factor of all next modules and the _next_module_scaling_factor of the previous modules. Does only work if previous and next modules are GrowingModule.

Parameters:: scaling_factor (torch.Tensor | float) – scaling factor to apply to the optimal delta

update_size() → None[source]#: Update the input and output size of the module