
Parameter sharing in Haiku#

Introduction#

In Haiku, parameter reuse is determined uniquely by module instance names, i.e., if a module instance has the same name as another module instance, they share parameters.

Unless specified, module names are automatically determined by Haiku from the module class name (following a pattern established in TensorFlow 1 with Sonnet V1). In more detail, module naming follows these rules:

  1. Module names are assigned when the module instance is constructed. Unless a module instance name is provided as an argument to the constructor, Haiku generates one from the current module class name (basically: to_snake_case(CurrentClassName)).

  2. If the module instance name doesn’t end in a _N (where N is a number) and another module instance with the same name already exists, Haiku adds an incremental number to the end of the new module instance name (e.g. module_1).

  3. When two modules are nested (i.e., a module instance is constructed inside another module’s class definition), the inner module name is prefixed with the outer module name and, possibly (see the next point), the name of the outer module’s method that is currently being called. The constructor (i.e., __init__) is represented by the tilde ~ symbol.

  4. If the calling method is __call__, its name is omitted: the inner module name is prefixed with the outer module name only.

  5. When there are multiple levels of nesting, the previous rules are applied at each level: each inner module name is built from the name and calling method of the module immediately above it in the hierarchy of calls.

Let’s see how this works with a practical example.

Flat modules (no nesting)#

This section covers parameter sharing when the modules are not nested.

[4]:
#@title Imports and accessory functions
import functools
import haiku as hk
import jax
import jax.numpy as jnp


def parameter_shapes(params):
  """Make printing parameters a little more readable."""
  return jax.tree_util.tree_map(lambda p: p.shape, params)


def transform_and_print_shapes(fn, x_shape=(2, 3)):
  """Print name and shape of the parameters."""
  rng = jax.random.PRNGKey(42)
  x = jnp.ones(x_shape)

  transformed_fn = hk.transform(fn)
  params = transformed_fn.init(rng, x)
  print('\nThe name and shape of the parameters are:')
  print(parameter_shapes(params))

def assert_all_equal(params_1, params_2):
  assert all(jax.tree_util.tree_leaves(
      jax.tree_util.tree_map(lambda a, b: (a == b).all(), params_1, params_2)))
[6]:
w_init = hk.initializers.TruncatedNormal(stddev=1)

class SimpleModule(hk.Module):
  """A simple module class with one variable."""

  def __init__(self, output_channels, name=None):
    super().__init__(name)
    assert isinstance(output_channels, int)
    self._output_channels = output_channels

  def __call__(self, x):
    w_shape = (x.shape[-1], self._output_channels)
    w = hk.get_parameter("w", w_shape, x.dtype, init=w_init)
    return jnp.dot(x, w)
[ ]:
def f(x):
  # This instance will be named `simple_module`.
  simple = SimpleModule(output_channels=2)
  simple_out = simple(x)  # implicitly calls simple.__call__()
  print(f'The name assigned to "simple" is: "{simple.module_name}".')
  return simple_out

transform_and_print_shapes(f)
The name assigned to "simple" is: "simple_module".

The name and shape of the parameters are:
{'simple_module': {'w': (3, 2)}}

Great! Here we see that if we create a SimpleModule instance and do not specify a name, Haiku assigns it the name simple_module. This is also reflected in the parameters associated with the module.
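
As a quick aside (a small sketch, not a cell from the original notebook; the name 'my_module' is just an illustrative choice), rule 1 also lets us pick the name explicitly by passing name= to the constructor:

def f_named(x):
  # Passing `name=` overrides the automatically generated name.
  simple = SimpleModule(output_channels=2, name='my_module')
  print(f'The name assigned to "simple" is: "{simple.module_name}".')
  return simple(x)

transform_and_print_shapes(f_named)
# Expected parameters: {'my_module': {'w': (3, 2)}}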

What happens if we instantiate SimpleModule twice though? Does Haiku assign to both instances the same name?

[ ]:
def f(x):
  # This instance will be named `simple_module`.
  simple_one = SimpleModule(output_channels=2)
  # This instance will be named `simple_module_1`.
  simple_two = SimpleModule(output_channels=2)
  first_out = simple_one(x)
  second_out = simple_two(x)
  print(f'The name assigned to "simple_one" is: "{simple_one.module_name}".')
  print(f'The name assigned to "simple_two" is: "{simple_two.module_name}".')
  return first_out + second_out

transform_and_print_shapes(f)
The name assigned to "simple_one" is: "simple_module".
The name assigned to "simple_two" is: "simple_module_1".

The name and shape of the parameters are:
{'simple_module': {'w': (3, 2)}, 'simple_module_1': {'w': (3, 2)}}

As expected, Haiku is smart enough to differentiate the two instances and avoid accidental parameter sharing: the second instance is named simple_module_1 and each instance has its own set of parameters. Good!

But what if we wanted to share parameters? In this case, we would have to instantiate the module only once and call it multiple times. Let’s see how this works:

[ ]:
def f(x):
  # This instance will be named `simple_module`.
  simple_one = SimpleModule(output_channels=2)
  first_out = simple_one(x)
  second_out = simple_one(x)  # share parameters w/ previous call
  print(f'The name assigned to "simple_one" is: "{simple_one.module_name}".')
  return first_out + second_out

transform_and_print_shapes(f)
The name assigned to "simple_one" is: "simple_module".

The name and shape of the parameters are:
{'simple_module': {'w': (3, 2)}}

Nested modules#

In this section we’ll see what happens when we nest one hk.Module into another.

[ ]:
class NestedModule(hk.Module):
  """A module class with a nested module created in the constructor."""

  def __init__(self, output_channels, name=None):
    super().__init__(name)
    assert isinstance(output_channels, int)
    self._output_channels = output_channels
    self.inner_simple = SimpleModule(self._output_channels)

  def __call__(self, x):
    w_shape = (x.shape[-1], self._output_channels)
    # Another variable that is also called `w`.
    w = hk.get_parameter("w", w_shape, x.dtype, init=w_init)
    return jnp.dot(x, w) + self.inner_simple(x)
[ ]:
def f(x):
  # This will be named `nested_module` and the SimpleModule instance created
  # inside it will be named `nested_module/~/simple_module`.
  nested = NestedModule(output_channels=2)
  nested_out = nested(x)
  print('The name assigned to the outer module (i.e., "nested") is: '
        f'"{nested.module_name}".')
  print('The name assigned to the inner module (i.e., inside "nested") is: "'
        f'{nested.inner_simple.module_name}".')
  return nested_out

transform_and_print_shapes(f)
The name assigned to the outer module (i.e., "nested") is: "nested_module".
The name assigned to the inner module (i.e., inside "nested") is: "nested_module/~/simple_module".

The name and shape of the parameters are:
{'nested_module': {'w': (3, 2)}, 'nested_module/~/simple_module': {'w': (3, 2)}}

As expected, the inner module name depends on: (a) the outer module name; and (b) the outer module’s method being called.

Note also how the outer module’s constructor name __init__ is replaced by a ~ in the parameter names. If the inner module instance had been created inside the __call__ method of the outer module, the inner module instance name would have been 'nested_module/simple_module'.
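
To make this concrete, here is a minimal sketch (a hypothetical variant, not a cell from the original notebook) in which the inner module is created inside __call__ rather than in the constructor:

class NestedInCall(hk.Module):
  """Like NestedModule, but the inner module is created in __call__."""

  def __init__(self, output_channels, name=None):
    super().__init__(name)
    self._output_channels = output_channels

  def __call__(self, x):
    # Created inside __call__, so neither `~` nor a method name appears
    # in the inner module name.
    inner = SimpleModule(self._output_channels)
    return inner(x)

transform_and_print_shapes(lambda x: NestedInCall(output_channels=2)(x))
# Expected parameters: {'nested_in_call/simple_module': {'w': (3, 2)}}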

In this example we defined all the modules from scratch, but the same holds for any of the modules and networks defined in Haiku, e.g., hk.Linear, hk.nets.MLP, etc. If you are curious, see what happens if you assign an instance of hk.Linear instead of SimpleModule to self.inner_simple, as sketched below.
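
A possible sketch of that exercise (again a hypothetical variant, not a cell from the original notebook) could look like this:

class NestedLinear(hk.Module):
  """Like NestedModule, but with an hk.Linear as the inner module."""

  def __init__(self, output_channels, name=None):
    super().__init__(name)
    self.inner_linear = hk.Linear(output_channels)

  def __call__(self, x):
    return self.inner_linear(x)

transform_and_print_shapes(lambda x: NestedLinear(output_channels=2)(x))
# Expected parameters: {'nested_linear/~/linear': {'b': (2,), 'w': (3, 2)}}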

Let’s try now multiple levels of nesting:

[ ]:
class TwiceNestedModule(hk.Module):
  """A module class with a nested module containing a nested module."""

  def __init__(self, output_channels, name=None):
    super().__init__(name)
    assert isinstance(output_channels, int)
    self._output_channels = output_channels
    self.inner_nested = NestedModule(self._output_channels)

  def __call__(self, x):
    w_shape = (x.shape[-1], self._output_channels)
    w = hk.get_parameter("w", w_shape, x.dtype, init=w_init)
    return jnp.dot(x, w) + self.inner_nested(x)
[ ]:
def f(x):
  """Create the module instances and inspect their names."""
  # Instantiate a TwiceNestedModule instance. This will be named
  # `twice_nested_module`; the NestedModule created inside it will be named
  # `twice_nested_module/~/nested_module`, and the SimpleModule inside that
  # `twice_nested_module/~/nested_module/~/simple_module`.
  outer = TwiceNestedModule(output_channels=2)
  outer_out = outer(x)
  print(f'The name assigned to the outermost module is: "{outer.module_name}".')
  print('The name assigned to the module inside "outer" is: "'
        f'{outer.inner_nested.module_name}".')
  print('The name assigned to the module inside it is "'
        f'{outer.inner_nested.inner_simple.module_name}".')
  return outer_out

transform_and_print_shapes(f)
The name assigned to the outermost module is: "twice_nested_module".
The name assigned to the module inside "outer" is: "twice_nested_module/~/nested_module".
The name assigned to the module inside it is "twice_nested_module/~/nested_module/~/simple_module".

The name and shape of the parameters are:
{'twice_nested_module': {'w': (3, 2)}, 'twice_nested_module/~/nested_module': {'w': (3, 2)}, 'twice_nested_module/~/nested_module/~/simple_module': {'w': (3, 2)}}

Great, this also works as expected: the full hierarchy of module names and calls is reflected in the inner module names.

Multitransform: merge the parameters without sharing them#

Sometimes, when we have multiple transformed functions, it can be convenient to merge all their parameters into a single structure, to reduce the number of dictionaries we have to store and pass around. It can happen, though, that some of these functions instantiate the same modules, and we want to make sure that their parameters don’t get shared accidentally.

hk.multi_transform comes to the rescue in this case: it merges the parameters into a single dictionary, making sure that duplicated module names are renamed to avoid accidental sharing.

[ ]:
def f(x):
  """A SimpleModule followed by a Linear layer."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  linear = hk.Linear(40)
  return linear(out)

def g(x):
  """A SimpleModule followed by an MLP."""
  module_instance = SimpleModule(output_channels=2)
  return module_instance(x) * 2  # twice

# Transform both functions, and print their respective parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))
transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)
print('f parameters:', parameter_shapes(params_f))
print('g parameters:', parameter_shapes(params_g))

# Transform both functions at once with hk.multi_transform, and print the
# resulting merged parameter structure.

def multitransform_f_and_g():
  def template(x):
    return f(x), g(x)
  return template, (f, g)
init, (f_apply, g_apply) = hk.multi_transform(multitransform_f_and_g)
merged_params = init(rng, x)

print('\nThe name and shape of the multi-transform parameters are:\n',
      parameter_shapes(merged_params))
f parameters: {'linear': {'b': (40,), 'w': (2, 40)}, 'simple_module': {'w': (3, 2)}}
g parameters: {'simple_module': {'w': (3, 2)}}

The name and shape of the multi-transform parameters are:
 {'linear': {'b': (40,), 'w': (2, 40)}, 'simple_module': {'w': (3, 2)}, 'simple_module_1': {'w': (3, 2)}}

In this example f and g both instantiate a SimpleModule instance with the same arguments, and if we transform them separately we see that both dictionaries contain a 'simple_module' key.

When we transform them together instead, hk.multi_transform takes care of renaming one of them to 'simple_module_1', thus preventing accidental parameter sharing.
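
As a quick usage sketch (not part of the original notebook), the merged parameters can be passed directly to the per-function apply functions returned by hk.multi_transform:

f_out = f_apply(merged_params, rng, x)
g_out = g_apply(merged_params, rng, x)
print('f output shape:', f_out.shape)  # (2, 40): SimpleModule then Linear(40)
print('g output shape:', g_out.shape)  # (2, 2): doubled SimpleModule output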

Sharing parameters between transformed functions#

Now that we understand how module names are assigned and how this affects parameter sharing, let’s see how we can share parameters between transformed functions.

In this section we will consider two functions, f and g, and explore different strategies to share parameters. We will look at a number of cases that differ in how many of the modules instantiated by each function are the same, and in whether their parameters have the same shape.

Case 1: All modules have the same names, and the same shape#

Let’s reuse one of the modules we created before, and try to instantiate it twice inside two different functions:

[ ]:
def f(x):
  """Apply SimpleModule to x."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  return out

def g(x):
  """Like f, but double the output"""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  return out * 2

# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)

print('f parameters:', parameter_shapes(params_f))
print('g parameters:', parameter_shapes(params_g))
f parameters: {'simple_module': {'w': (3, 2)}}
g parameters: {'simple_module': {'w': (3, 2)}}

Great! Since f and g are using exactly the same modules, the sets of initialized variables generated with each have the same name structure (note that the actual values might differ, depending on initialization).

Now, if we wanted to share parameters in this case, we could initialize only one of the two functions (e.g., f) and use the resulting parameters for both functions, i.e., when we call transformed_f.apply and transformed_g.apply.
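
A minimal sketch of that idea (not a cell from the original notebook), reusing the rng, x, transformed_f and transformed_g defined above:

shared_params = transformed_f.init(rng, x)  # initialise f only
f_out = transformed_f.apply(shared_params, rng, x)
g_out = transformed_g.apply(shared_params, rng, x)
# g doubles f's output, so with shared parameters the two results must agree.
assert jnp.allclose(g_out, 2 * f_out)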

Case 2: Common modules have the same names, and the same shape#

This is a nice trick, but what if the functions were not identical? Let’s build two such functions:

[ ]:
def f(x):
  """A SimpleModule followed by a Linear layer."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  linear = hk.Linear(40)
  return linear(out)

def g(x):
  """A SimpleModule followed by an MLP."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  linear = hk.nets.MLP((10, 40))
  return linear(out)

# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)

print('\nThe name and shape of the f parameters are:\n',
      parameter_shapes(params_f))
print('\nThe name and shape of the g parameters are:\n',
      parameter_shapes(params_g))

The name and shape of the f parameters are:
 {'linear': {'b': (40,), 'w': (2, 40)}, 'simple_module': {'w': (3, 2)}}

The name and shape of the g parameters are:
 {'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (40,), 'w': (10, 40)}, 'simple_module': {'w': (3, 2)}}

Now we have a problem! Both sets of parameters have a 'simple_module' component, but they also each contain parameters that are specific only to that function, so we cannot simply initialise only one of the functions and use the returned parameters for both as we did before. But we would still like to share the parameters of 'simple_module'. How can we do that?

One option here is to use haiku.data_structures.merge (https://dm-haiku.readthedocs.io/en/latest/api.html#haiku.data_structures.merge) to combine the two sets of parameters. This merges the two structures, keeping only the value from the last structure when both contain the same parameter (i.e., 'simple_module' in our example). Let’s try that:

[ ]:
merged_params = hk.data_structures.merge(params_f, params_g)
print('\nThe name and shape of the shared parameters are:\n',
      parameter_shapes(merged_params))

The name and shape of the shared parameters are:
 {'linear': {'b': (40,), 'w': (2, 40)}, 'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (40,), 'w': (10, 40)}, 'simple_module': {'w': (3, 2)}}

Brilliant! Now we have a shared set of parameters that contains all the disjoint parameters and a single set of parameters for the shared 'simple_module'. Let’s verify that we can use this set of parameters when calling either function:

[ ]:
f_out = transformed_f.apply(merged_params, rng, x)
g_out = transformed_g.apply(merged_params, rng, x)

print('f_out mean:', f_out.mean())
print('g_out mean:', g_out.mean())
f_out mean: 0.037986994
g_out mean: 0.104857825

This gives us little control over what gets shared though: what if the two functions had parameters with the same name that we don’t want to share?

Case 3: Common modules have the same names, but different shapes#

Let’s modify our previous example so that both functions apply the same MLP, followed by a final hk.Linear layer of a different size in each:

[ ]:
def f(x):
  """A SimpleModule followed by two Linear layers."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  mlp = hk.nets.MLP((10, 5))
  out = mlp(out)
  last_linear = hk.Linear(4)
  return last_linear(out)

def g(x):
  """Same as f, with a bigger final layer."""
  module_instance = SimpleModule(output_channels=2)
  out = module_instance(x)
  mlp = hk.nets.MLP((10, 5))
  out = mlp(out)
  last_linear = hk.Linear(20)  # another Linear, but bigger
  return last_linear(out)

# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)

print('\nThe name and shape of the f parameters are:\n',
      parameter_shapes(params_f))
print('\nThe name and shape of the g parameters are:\n',
      parameter_shapes(params_g))

The name and shape of the f parameters are:
 {'linear': {'b': (4,), 'w': (5, 4)}, 'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'simple_module': {'w': (3, 2)}}

The name and shape of the g parameters are:
 {'linear': {'b': (20,), 'w': (5, 20)}, 'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'simple_module': {'w': (3, 2)}}

Now we have a problem! Both sets of parameters have a 'linear' component, but their respective parameters have different shapes. If we merged them as we did before, the 'linear' parameters from f would be dropped, and we could no longer use the merged parameters to call transformed_f:

merged_params = hk.data_structures.merge(params_f, params_g)
print('\nThe name and shape of the merged parameters are:\n',
      parameter_shapes(merged_params))

f_out = transformed_f.apply(merged_params, rng, x)  # fails
# ValueError: 'linear/w' with retrieved shape (5, 20) does not match shape=[5, 4] dtype=dtype('float32')

How can we share the parameters of 'simple_module' and mlp, but keep the parameters of the two output linear layers separated?

A solution would be to instantiate simple_module and mlp outside of the functions, so that they are created only once, and then use those instances in both functions. But all Haiku modules must be initialized inside an hk.transform, so doing this naively results in an error:

module_instance = SimpleModule(output_channels=2)  # this fails
# ValueError: All `hk.Module`s must be initialized inside an `hk.transform`.
mlp = hk.nets.MLP((10, 5))

def f(x):
  """A SimpleModule followed by a Linear layer."""
  out = module_instance(x)
  out = mlp(out)
  linear = hk.Linear(4)
  return linear(out)

def g(x):
  """A SimpleModule followed by a bigger Linear layer."""
  out = module_instance(x)
  out = mlp(out)
  linear = hk.Linear(20)  # another Linear, but bigger
  return linear(out)

We can work around this by wrapping the shared modules in a small class that creates them lazily on first call, inside the transform, and then reuses the cached instances:

[ ]:
class CachedModule():

  def __call__(self, *inputs):
    # Create the instances if they are not already in the cache.
    if not hasattr(self, 'cached_simple_module'):
      self.cached_simple_module = SimpleModule(output_channels=2)
    if not hasattr(self, 'cached_mlp'):
      self.cached_mlp = hk.nets.MLP((10, 5))

    # Apply the cached instances.
    out = self.cached_simple_module(*inputs)
    out = self.cached_mlp(out)
    return out


def f(x):
  """A SimpleModule followed by a Linear layer."""
  shared_preprocessing = CachedModule()
  out = shared_preprocessing(x)
  linear = hk.Linear(4)
  return linear(out)

def g(x):
  """A SimpleModule followed by a bigger Linear layer."""
  shared_preprocessing = CachedModule()
  out = shared_preprocessing(x)
  linear = hk.Linear(20)  # another Linear, but bigger
  return linear(out)


# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)

print('\nThe name and shape of the f parameters are:\n',
      parameter_shapes(params_f))
print('\nThe name and shape of the g parameters are:\n',
      parameter_shapes(params_g))

# Verify that the MLP parameters are the same for both functions.
assert_all_equal(params_f['mlp/~/linear_0'],
                 params_g['mlp/~/linear_0'])
assert_all_equal(params_f['mlp/~/linear_1'],
                 params_g['mlp/~/linear_1'])
print('\nThe MLP parameters are shared!')

The name and shape of the f parameters are:
 {'linear': {'b': (4,), 'w': (5, 4)}, 'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'simple_module': {'w': (3, 2)}}

The name and shape of the g parameters are:
 {'linear': {'b': (20,), 'w': (5, 20)}, 'mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'simple_module': {'w': (3, 2)}}

The MLP parameters are shared!

If we want to share a large number of modules, it can become tedious to cache each one of them manually inside CachedModule. Furthermore, it would be nice if we didn’t have to define a different CachedModule class for every function we want to cache.

We can use hk.to_module to create a more general CachedModule object that takes an arbitrary Haiku function and caches it:

[ ]:
class CachedModule():
  """Cache one instance of the function and call it multiple times."""
  def __init__(self, fn):
    self._fn = fn

  def __call__(self, *args, **kwargs):
    if not hasattr(self, "_instance"):
      ModularisedFn = hk.to_module(self._fn)
      self._instance = ModularisedFn()
    return self._instance(*args, **kwargs)

def shared_preprocessing_fn(x):
  simple_module = SimpleModule(output_channels=2)
  out = simple_module(x)
  mlp = hk.nets.MLP((10, 5))
  return mlp(out)

def f(x):
  """A SimpleModule followed by a Linear layer."""
  shared_preprocessing = CachedModule(shared_preprocessing_fn)
  out = shared_preprocessing(x)
  linear = hk.Linear(4)
  return linear(out)

def g(x):
  """A SimpleModule followed by a bigger Linear layer."""
  shared_preprocessing = CachedModule(shared_preprocessing_fn)
  out = shared_preprocessing(x)
  linear = hk.Linear(20)  # another Linear, but bigger
  return linear(out)


# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

transformed_f = hk.transform(f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(g)
params_g = transformed_g.init(rng, x)

print('\nThe name and shape of the f parameters are:\n',
      parameter_shapes(params_f))
print('\nThe name and shape of the g parameters are:\n',
      parameter_shapes(params_g))

# Verify that the MLP parameters are the same for both functions.
assert_all_equal(params_f['shared_preprocessing_fn/mlp/~/linear_0'],
                 params_g['shared_preprocessing_fn/mlp/~/linear_0'])
assert_all_equal(params_f['shared_preprocessing_fn/mlp/~/linear_1'],
                 params_g['shared_preprocessing_fn/mlp/~/linear_1'])
print('\nThe MLP parameters are shared!')

The name and shape of the f parameters are:
 {'linear': {'b': (4,), 'w': (5, 4)}, 'shared_preprocessing_fn/mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'shared_preprocessing_fn/mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'shared_preprocessing_fn/simple_module': {'w': (3, 2)}}

The name and shape of the g parameters are:
 {'linear': {'b': (20,), 'w': (5, 20)}, 'shared_preprocessing_fn/mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'shared_preprocessing_fn/mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'shared_preprocessing_fn/simple_module': {'w': (3, 2)}}

The MLP parameters are shared!

When we work with classes, it can also be convenient to define a decorator that does the same:

[7]:
def share_parameters():
  def decorator(fn):
    def wrapper(*args, **kwargs):
      if wrapper.instance is None:
        wrapper.instance = hk.to_module(fn)()
      return wrapper.instance(*args, **kwargs)
    wrapper.instance = None
    return functools.wraps(fn)(wrapper)
  return decorator


class Wrapper():

  @share_parameters()
  def shared_preprocessing(self, x):
    simple_module = SimpleModule(output_channels=2)
    out = simple_module(x)
    mlp = hk.nets.MLP((10, 5))
    return mlp(out)

  def f(self, x):
    """A SimpleModule followed by a Linear layer."""
    out = self.shared_preprocessing(x)
    linear = hk.Linear(4)
    return linear(out)

  def g(self, x):
    """A SimpleModule followed by a bigger Linear layer."""
    out = self.shared_preprocessing(x)
    linear = hk.Linear(20)  # another Linear, but bigger
    return linear(out)

# Transform both functions, and print the parameter shapes.
rng = jax.random.PRNGKey(42)
x = jnp.ones((2, 3))

wrapper = Wrapper()
transformed_f = hk.transform(wrapper.f)
params_f = transformed_f.init(rng, x)
transformed_g = hk.transform(wrapper.g)
params_g = transformed_g.init(rng, x)

print('\nThe name and shape of the f parameters are:\n',
      parameter_shapes(params_f))
print('\nThe name and shape of the g parameters are:\n',
      parameter_shapes(params_g))

# Verify that the MLP parameters are the same for both functions.
assert_all_equal(params_f['shared_preprocessing/mlp/~/linear_0'],
                 params_g['shared_preprocessing/mlp/~/linear_0'])
assert_all_equal(params_f['shared_preprocessing/mlp/~/linear_1'],
                 params_g['shared_preprocessing/mlp/~/linear_1'])
print('\nThe MLP parameters are shared!')

The name and shape of the f parameters are:
 {'linear': {'b': (4,), 'w': (5, 4)}, 'shared_preprocessing/mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'shared_preprocessing/mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'shared_preprocessing/simple_module': {'w': (3, 2)}}

The name and shape of the g parameters are:
 {'linear': {'b': (20,), 'w': (5, 20)}, 'shared_preprocessing/mlp/~/linear_0': {'b': (10,), 'w': (2, 10)}, 'shared_preprocessing/mlp/~/linear_1': {'b': (5,), 'w': (10, 5)}, 'shared_preprocessing/simple_module': {'w': (3, 2)}}

The MLP parameters are shared!
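
As a final sanity check (a short sketch, not part of the original notebook), the transformed methods of Wrapper can be applied with their own parameter sets like any other transformed function:

f_out = transformed_f.apply(params_f, rng, x)
g_out = transformed_g.apply(params_g, rng, x)
print('f output shape:', f_out.shape)  # (2, 4)
print('g output shape:', g_out.shape)  # (2, 20)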