Basics of Module Importing

Copyright © Graham Dumpleton

This article provides a basic overview of the key components of the module importing system in mod_python.

<!> Note that aspects of how the module importer works will change in mod_python 3.3, with the introduction of a totally new module importer implementation. This document should therefore only be referred to in relation to mod_python versions up to and including version 3.2.

Module Import Function

The module importing system in mod_python is all based upon the Python function import_module() contained in the mod_python.apache module. This function is used to import modules which have been specified by any of the Python*Handler, PythonHandlerModule, Python*Filter or PythonConnectionHandler directives. It is not however used for any modules imported using the PythonImport directive. In this final case, the standard Python C API import functions are used.

The import_module() function may also be used explicitly from within any Python code executing within the context of mod_python. For example, the import_module() function is, except for mod_python 3.2, used by mod_python.publisher to import individual Python code files held within the document tree. Having loaded such a code file as a module, mod_python.publisher will then map URLs into calls against distinct functions contained within the module.

One of the main reasons for the existance of the import_module() method is to facilitate the provision of a mechanism for the automatic reloading of modules when the code file kept on disk has changed. Unfortunately the effectiveness of this is limited in versions of mod_python prior to version 3.3, due to shortcomings in the design of the module importing system, along with a number of bugs in the current implementation. The introduction of mod_python 3.3 will however see vast improvements in this area and other aspects of the module importing system.

The Handler Directives

The handler directives are placed into the main Apache configuration file or into a ".htaccess" file contained within the appropriate directory in the document tree. These directives are used to specify the module containing the handler function to be executed for a particular phase of processing.

For example, to specify that the mod_python supplied mod_python.publisher module should be used for the content handling phase of any request where a ".py" extension were used, the main Apache configuration file would need to contain:

<Directory /some/path>
AddHandler python-program .py
PythonHandler mod_python.publisher
</Directory>

Alternatively, if you wished to use your own custom handler for the content handling phase of any request made against the directory, and it was stored within the same directory for which mod_python was being enabled, you might instead use:

<Directory /some/path>
SetHandler python-program
PythonHandler myhandler
</Directory>

In the case of mod_python.publisher, the module would normally be installed into the "site-packages" directory of the Python distribution. As such, it will be found by a search of the default Python module search path.

For the case of the custom handler however, it isn't installed into the "site-packages" directory, nor in any other directory contained in the Python module search path. This means that it will not be able to be located when Python is asked to load it.

To avoid this problem, mod_python will automatically insert at the head of the Python module search path, the root directory within the document tree for which the PythonHandler directive was specified. This having been done, the module containing the custom handler will now be able to be found.

If a custom handler is not placed into the root directory of the document tree where mod_python is enabled, nor is it located within any directory listed in the Python module search path, it will be necessary to override the Python module search path to contain the directory where it is located.

The Python module search path can be overridden using the PythonPath directive. For example, if the custom handler were actually located within a subdirectory of the root directory where mod_python was enabled, the main Apache configuration would then need to be defined as:

<Directory /some/path>
PythonPath "['/some/path/modules']+sys.path"
AddHandler python-program .py
PythonHandler myhandler
</Directory>

As the PythonPath directive will replace the prior value of the Python module search path, it is necessary when merely wishing to extend it to reference the prior value. In effect, the value specified for the PythonPath directive is evalulated and thus one can use the Python language itself to construct the new value for the path. The final value must be a list of strings where each string is the path name of a directory to be searched.

Note that setting the PythonPath directive explicitly will result in the root directory of the document tree for which mod_python was enabled, no longer being added into the Python module search path. That is, the value specified by the PythonPath directive takes precedence for that directory and anything below it. Using the PythonHandler directive in a subdirectory will not even undo this, even when used within the context of a distinct interpreter. The PythonPath directive should there before used with great caution and not at all if possible.

Explicit Import of Modules

When using mod_python, instead of using the import statement to import modules, the import_module() function can be used. The benefits of using the import_module() function are that the exact location of the module can be specified, avoiding the need for the directory containing the module to be added into the Python module search path. Using the import_module() function, it is also easier to import arbitrarily named modules the names of which will not be known of in advance, as the name of the module is supplied as an argument of the function and doesn't need to be hard coded as a syntactical argument of the import statement.

The prototype for the import_module() function is as follows:

import_module(module_name, autoreload=1, log=0, path=None)

The only required parameter is that of the module name. In supplying only the module name, a search will be made of the Python module search path. If the "path" parameter is supplied, it should be a list of directories to search for the module. To load a module from a particular location, the "path" parameter would be a list containing just the directory containing the desired module.

The import_module() method can feasibly be used at global scope within a module, or within an actual handler function executed as a consequence of a request. To import a module from the same directory as the handler, the location can be derived from the special __file__ variable present within a module.

   1 from mod_python import apache
   2 import os
   3 
   4 directory = os.path.dirname(__file__)
   5 module1 = apache.import_module("module1", path=[directory])
   6 
   7 def handler(req):
   8     module2 = apache.import_module("module2", path=[directory])
   9     package1 = apache.import_module("package1", path=[directory])
  10     package2 = apache.import_module("package1.package2", path=[directory])
  11     module3 = apache.import_module("package1.module3", path=[directory])
  12     ...

As well as file based modules, the import_module() can also be used to import the root of a package, or a sub package/module of a package. Whereas when using the import statement to import a sub package/module still requires access to the module to be via the root package name, the import_module() function returns a direct reference to the sub package/module.

Auto Reload Mechanism

In the context of a module specified using a handler directive, the intent of the automatic module reloading mechanism is that when the code file corresponding to the handler is changed on disk, that this will be detected the next time a request arrives which would require the handler to be executed. Having detected this change, the module will first be reloaded and only then will the handler be executed.

Whether or not automatic module reloading is enabled for a handler specified using a handler directive is controlled by the PythonAutoReload directive. By default automatic module reloading is enabled at this level and it would need to be explicitly disabled if it wasn't required, such as in a production environment.

When using the import_module() function explicitly from within a handler module, whether the automatic module reloading mechanism is enabled is controlled through the "autoreload" argument. By default, automatic module reloading is enabled at this level as well and would need to be explicitly disabled to stop it from occurring. This can only be done by supplying a value to the "autoreload" argument which evaluates to false. The PythonAutoReload directive has absolutely no affect in this circumstance.

For a module being imported by an explicit call to the import_module() function from within the context of a handler being executed in response to a request, whether the module is up to date will be evaluated for each request. If the module has been changed on disk, it will be reimported at the time of the handler being executed.

If the importing of the module by an explicit call to the import_module() function is performed at global scope within a module, whether the child module has changed and whether it needs to be reimported, is only determined at the time that the parent module is first imported and when the parent module is subsequently reimported. Note that modules imported using the import statement are never reloaded, that which already exists in sys.modules is used even if the parent module were reloaded.

Logging of Module Imports

For a module specified using a handler directive, the fact that the module is imported or subsequently reimported will only be logged to the Apache log file if the PythonDebug option is enabled. By default this option is disabled.

Enabling of the debug option has other consequences though besides logging information to the log file about imports. Specifically, if debugging is enabled and an error occurs within a handler, the details of the Python error and the trace back will be returned in the error page to the client making the request.

Where explicit use is made of the import_module() function, to have whether or not a module is being imported can only be enabled by supplying a value as the "log" argument which evaluates to true. The PythonDebug directive has absolutely no affect in this circumstance.


CategoryModPython

ModPython/Articles/BasicsOfModuleImporting (last edited 2006-11-10 07:44:39 by GrahamDumpleton)