Lazily Importing Python Modules

Recently, I found myself wanting to defer importing a Python module until it was actually used.

After quite a bit of searching, I discovered that TensorFlow has a utility for lazily importing modules, which it uses to load its contrib module.

Since the code doesn’t have any dependencies, I’ll just copy it here and then discuss how it works.

# Code copied from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/util/lazy_loader.py
"""A LazyLoader class."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import importlib
import types


class LazyLoader(types.ModuleType):
  """Lazily import a module, mainly to avoid pulling in large dependencies.

  `contrib`, and `ffmpeg` are examples of modules that are large and not always
  needed, and this allows them to only be loaded when they are used.
  """

  # The lint error here is incorrect.
  def __init__(self, local_name, parent_module_globals, name):  # pylint: disable=super-on-old-class
    self._local_name = local_name
    self._parent_module_globals = parent_module_globals

    super(LazyLoader, self).__init__(name)

  def _load(self):
    # Import the target module and insert it into the parent's namespace
    module = importlib.import_module(self.__name__)
    self._parent_module_globals[self._local_name] = module

    # Update this object's dict so that if someone keeps a reference to the
    #   LazyLoader, lookups are efficient (__getattr__ is only called on lookups
    #   that fail).
    self.__dict__.update(module.__dict__)

    return module

  def __getattr__(self, item):
    module = self._load()
    return getattr(module, item)

  def __dir__(self):
    module = self._load()
    return dir(module)

The lazy loader works by masquerading as a Python module. The first time it is accessed, it imports the actual module and replaces itself with that module in the parent's namespace.

It is called as:

contrib = LazyLoader('contrib', globals(), 'tensorflow.contrib')

You can think of the above call as the lazy version of:

import tensorflow.contrib as contrib
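
To make the replacement step concrete, here is a minimal sketch that does not need TensorFlow installed. It repeats a condensed copy of the class so it runs on its own, uses the standard library's json module as a stand-in for a heavy dependency, and uses a plain dict called namespace (a name invented for this example) in place of a real globals() so the rebinding is easy to inspect.

```python
import importlib
import types


class LazyLoader(types.ModuleType):
    """Condensed copy of the class above, repeated so this snippet runs alone."""

    def __init__(self, local_name, parent_module_globals, name):
        self._local_name = local_name
        self._parent_module_globals = parent_module_globals
        super().__init__(name)

    def _load(self):
        # Import the real module and rebind the name in the parent's namespace.
        module = importlib.import_module(self.__name__)
        self._parent_module_globals[self._local_name] = module
        self.__dict__.update(module.__dict__)
        return module

    def __getattr__(self, item):
        return getattr(self._load(), item)

    def __dir__(self):
        return dir(self._load())


# Assumption for illustration: a plain dict plays the role of globals().
namespace = {}
namespace["json"] = LazyLoader("json", namespace, "json")

before = type(namespace["json"]).__name__    # still the placeholder object
encoded = namespace["json"].dumps({"a": 1})  # first attribute access triggers the import
after = type(namespace["json"]).__name__     # the name now points at the real module

print(before, after, encoded)
```

In real code you would pass globals() instead of a hand-made dict, exactly as in the contrib call above.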

Extending types.ModuleType means the lazy loader is itself a module object, so it can sit in globals like a real module. Implementing __getattr__ supports accessing attributes within the module, and implementing __dir__ supports tab completion. When either __getattr__ or __dir__ is called, the actual module is imported, the entry in globals is updated to point to the actual module, and the lazy loader copies the real module's state (__dict__) into its own. That way, code that still holds a reference to the lazy loader doesn't have to go through the load process on every attribute access.
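
The effect of the __dict__ update is easiest to see by counting how often the load step runs. The sketch below is my own illustration rather than TensorFlow code: CountingLazyLoader, _load_count, namespace and stale are hypothetical names, and json again stands in for a large module.

```python
import importlib
import types


class CountingLazyLoader(types.ModuleType):
    """Same logic as LazyLoader, plus a counter on _load (illustration only)."""

    def __init__(self, local_name, parent_module_globals, name):
        self._local_name = local_name
        self._parent_module_globals = parent_module_globals
        self._load_count = 0  # hypothetical bookkeeping, not in the original class
        super().__init__(name)

    def _load(self):
        self._load_count += 1
        module = importlib.import_module(self.__name__)
        self._parent_module_globals[self._local_name] = module
        # Copy the real module's attributes onto this object.
        self.__dict__.update(module.__dict__)
        return module

    def __getattr__(self, item):
        # Only reached when normal attribute lookup fails.
        return getattr(self._load(), item)

    def __dir__(self):
        return dir(self._load())


namespace = {}  # stand-in for a module's globals()
stale = CountingLazyLoader("json", namespace, "json")
namespace["json"] = stale

stale.dumps  # not in __dict__ yet, so __getattr__ runs and triggers _load
loads_after_first_access = stale._load_count

stale.dumps  # now found directly in __dict__, so __getattr__ is skipped
stale.loads  # likewise copied over by the update
loads_after_more_access = stale._load_count

print(loads_after_first_access, loads_after_more_access)
```

Note that the shortcut only covers attribute lookups that succeed. Calling dir() on the stale reference still goes through _load, although at that point importlib.import_module just returns the module already cached in sys.modules.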

Written on October 27, 2018