pkgutil — Package Utilities¶
Purpose: | Add to the module search path for a specific package and work with resources included in a package. |
---|
The pkgutil
module includes functions for changing the import
rules for Python packages and for loading non-code resources from
files distributed within a package.
Package Import Paths¶
The extend_path()
function is used to modify the search path and
change the way sub-modules are imported from within a package so that
several different directories can be combined as though they are one.
This can be used to override installed versions of packages with
development versions, or to combine platform-specific and shared
modules into a single package namespace.
The most common way to call extend_path()
is by adding two
lines to the __init__.py
inside the package.
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
extend_path()
scans sys.path
for directories that include a
subdirectory named for the package given as the second argument. The
list of directories is combined with the path value passed as the
first argument and returned as a single list, suitable for use as the
package import path.
An example package called demopkg
includes two files,
__init__.py
and shared.py
. The __init__.py
file in
demopkg1
contains print
statements to show the search
path before and after it is modified, to highlight the difference.
import pkgutil
import pprint
print('demopkg1.__path__ before:')
pprint.pprint(__path__)
print()
__path__ = pkgutil.extend_path(__path__, __name__)
print('demopkg1.__path__ after:')
pprint.pprint(__path__)
print()
The extension
directory, with add-on features for demopkg
,
contains three more source files. There is an __init__.py
at each
directory level, and a not_shared.py
.
$ find extension -name '*.py'
extension/__init__.py
extension/demopkg1/__init__.py
extension/demopkg1/not_shared.py
This simple test program imports the demopkg1
package.
import demopkg1
print('demopkg1 :', demopkg1.__file__)
try:
import demopkg1.shared
except Exception as err:
print('demopkg1.shared : Not found ({})'.format(err))
else:
print('demopkg1.shared :', demopkg1.shared.__file__)
try:
import demopkg1.not_shared
except Exception as err:
print('demopkg1.not_shared: Not found ({})'.format(err))
else:
print('demopkg1.not_shared:', demopkg1.not_shared.__file__)
When this test program is run directly from the command line, the
not_shared
module is not found.
Note
The full file system paths in these examples have been shortened to emphasize the parts that change.
$ python3 pkgutil_extend_path.py
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1']
demopkg1 : .../demopkg1/__init__.py
demopkg1.shared : .../demopkg1/shared.py
demopkg1.not_shared: Not found (No module named 'demopkg1.not_sh
ared')
However, if the extension
directory is added to the
PYTHONPATH
and the program is run again, different results are
produced.
$ PYTHONPATH=extension python3 pkgutil_extend_path.py
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../extension/demopkg1']
demopkg1 : .../demopkg1/__init__.py
demopkg1.shared : .../demopkg1/shared.py
demopkg1.not_shared: .../extension/demopkg1/not_shared.py
The version of demopkg1
inside the extension
directory has
been added to the search path, so the not_shared
module is
found there.
Extending the path in this manner is useful for combining platform-specific versions of packages with common packages, especially if the platform-specific versions include C extension modules.
Development Versions of Packages¶
While developing enhancements to a project, it is common to need to test changes to an installed package. Replacing the installed copy with a development version may be a bad idea, since it is not necessarily correct and other tools on the system are likely to depend on the installed package.
A completely separate copy of the package could be configured in a
development environment using virtualenv
or venv
,
but for small modifications the overhead of setting up a virtual
environment with all of the dependencies may be excessive.
Another option is to use pkgutil
to modify the module search
path for modules that belong to the package under development. In this
case, however, the path must be reversed so development version
overrides the installed version.
Given a package demopkg2
containing an __init__.py
and
overloaded.py
, with the function under development located in
demopkg2/overloaded.py
. The installed version contains
def func():
print('This is the installed version of func().')
and demopkg2/__init__.py
contains
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
__path__.reverse()
reverse()
is used to ensure that any directories added to the
search path by pkgutil
are scanned for imports before the
default location.
This program imports demopkg2.overloaded
and calls func()
.
import demopkg2
print('demopkg2 :', demopkg2.__file__)
import demopkg2.overloaded
print('demopkg2.overloaded:', demopkg2.overloaded.__file__)
print()
demopkg2.overloaded.func()
Running it without any special path treatment produces output from the
installed version of func()
.
$ python3 pkgutil_devel.py
demopkg2 : .../demopkg2/__init__.py
demopkg2.overloaded: .../demopkg2/overloaded.py
This is the installed version of func().
A development directory containing
$ find develop/demopkg2 -name '*.py'
develop/demopkg2/__init__.py
develop/demopkg2/overloaded.py
and a modified version of overloaded
def func():
print('This is the development version of func().')
will be loaded when the test program is run with the develop
directory in the search path.
$ PYTHONPATH=develop python3 pkgutil_devel.py
demopkg2 : .../demopkg2/__init__.py
demopkg2.overloaded: .../develop/demopkg2/overloaded.py
This is the development version of func().
Managing Paths with PKG Files¶
The first example illustrated how to extend the search path using
extra directories included in the PYTHONPATH
. It is also
possible to add to the search path using *.pkg
files containing
directory names. PKG files are similar to the PTH files used by the
site
module. They can contain directory names, one per line, to
be added to the search path for the package.
Another way to structure the platform-specific portions of the
application from the first example is to use a separate directory for
each operating system, and include a .pkg
file to extend the
search path.
This example uses the same demopkg1
files, and also includes
the following files.
$ find os_* -type f
os_one/demopkg1/__init__.py
os_one/demopkg1/not_shared.py
os_one/demopkg1.pkg
os_two/demopkg1/__init__.py
os_two/demopkg1/not_shared.py
os_two/demopkg1.pkg
The PKG files are named demopkg1.pkg
to match the package
being extended. They both contain one line.
demopkg
This demo program shows the version of the module being imported.
import demopkg1
print('demopkg1:', demopkg1.__file__)
import demopkg1.shared
print('demopkg1.shared:', demopkg1.shared.__file__)
import demopkg1.not_shared
print('demopkg1.not_shared:', demopkg1.not_shared.__file__)
A simple wrapper script can be used to switch between the two packages.
#!/bin/sh
export PYTHONPATH=os_${1}
echo "PYTHONPATH=$PYTHONPATH"
echo
python3 pkgutil_os_specific.py
When run with "one"
or "two"
as the arguments, the path is
adjusted.
$ ./with_os.sh one
PYTHONPATH=os_one
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../os_one/demopkg1',
'demopkg']
demopkg1: .../demopkg1/__init__.py
demopkg1.shared: .../demopkg1/shared.py
demopkg1.not_shared: .../os_one/demopkg1/not_shared.py
$ ./with_os.sh two
PYTHONPATH=os_two
demopkg1.__path__ before:
['.../demopkg1']
demopkg1.__path__ after:
['.../demopkg1',
'.../os_two/demopkg1',
'demopkg']
demopkg1: .../demopkg1/__init__.py
demopkg1.shared: .../demopkg1/shared.py
demopkg1.not_shared: .../os_two/demopkg1/not_shared.py
PKG files can appear anywhere in the normal search path, so a single PKG file in the current working directory could also be used to include a development tree.
Nested Packages¶
For nested packages, it is only necessary to modify the path of the top-level package. For example, with this directory structure
$ find nested -name '*.py'
nested/__init__.py
nested/second/__init__.py
nested/second/deep.py
nested/shallow.py
Where nested/__init__.py
contains
import pkgutil
__path__ = pkgutil.extend_path(__path__, __name__)
__path__.reverse()
and a development tree like
$ find develop/nested -name '*.py'
develop/nested/__init__.py
develop/nested/second/__init__.py
develop/nested/second/deep.py
develop/nested/shallow.py
Both the shallow
and deep
modules contain a simple
function to print out a message indicating whether or not they come
from the installed or development version.
This test program exercises the new packages.
import nested
import nested.shallow
print('nested.shallow:', nested.shallow.__file__)
nested.shallow.func()
print()
import nested.second.deep
print('nested.second.deep:', nested.second.deep.__file__)
nested.second.deep.func()
When pkgutil_nested.py
is run without any path manipulation, the
installed version of both modules are used.
$ python3 pkgutil_nested.py
nested.shallow: .../nested/shallow.py
This func() comes from the installed version of nested.shallow
nested.second.deep: .../nested/second/deep.py
This func() comes from the installed version of nested.second.de
ep
When the develop
directory is added to the path, the development
version of both functions override the installed versions.
$ PYTHONPATH=develop python3 pkgutil_nested.py
nested.shallow: .../develop/nested/shallow.py
This func() comes from the development version of nested.shallow
nested.second.deep: .../develop/nested/second/deep.py
This func() comes from the development version of nested.second.
deep
Package Data¶
In addition to code, Python packages can contain data files such as
templates, default configuration files, images, and other supporting
files used by the code in the package. The get_data()
function
gives access to the data in the files in a format-agnostic way, so it
does not matter if the package is distributed as an EGG, part of a
frozen binary, or regular files on the file system.
With a package pkgwithdata
containing a templates
directory
$ find pkgwithdata -type f
pkgwithdata/__init__.py
pkgwithdata/templates/base.html
The file pkgwithdata/templates/base.html
contains a simple HTML
template.
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>
</body>
</html>
This program uses get_data()
to retrieve the template contents
and print them out.
import pkgutil
template = pkgutil.get_data('pkgwithdata', 'templates/base.html')
print(template.decode('utf-8'))
The arguments to get_data()
are the dotted name of the package,
and a filename relative to the top of the package. The return value
is a byte sequence, so it is decoded from UTF-8 before being printed.
$ python3 pkgutil_get_data.py
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>
</body>
</html>
get_data()
is distribution format-agnostic because it uses the
import hooks defined in PEP 302 to access the package contents.
Any loader that provides the hooks can be used, including the ZIP
archive importer in zipfile
.
import pkgutil
import zipfile
import sys
# Create a ZIP file with code from the current directory
# and the template using a name that does not appear on the
# local filesystem.
with zipfile.PyZipFile('pkgwithdatainzip.zip', mode='w') as zf:
zf.writepy('.')
zf.write('pkgwithdata/templates/base.html',
'pkgwithdata/templates/fromzip.html',
)
# Add the ZIP file to the import path.
sys.path.insert(0, 'pkgwithdatainzip.zip')
# Import pkgwithdata to show that it comes from the ZIP archive.
import pkgwithdata
print('Loading pkgwithdata from', pkgwithdata.__file__)
# Print the template body
print('\nTemplate:')
data = pkgutil.get_data('pkgwithdata', 'templates/fromzip.html')
print(data.decode('utf-8'))
This example uses PyZipFile.writepy()
to create a ZIP archive
containing a copy of the pkgwithdata
package, including a
renamed version of the template file. It then adds the ZIP archive to
the import path before using pkgutil
to load the template and
print it. Refer to the discussion of zipfile
for more details
about using writepy()
.
$ python3 pkgutil_get_data_zip.py
Loading pkgwithdata from
pkgwithdatainzip.zip/pkgwithdata/__init__.pyc
Template:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html> <head>
<title>PyMOTW Template</title>
</head>
<body>
<h1>Example Template</h1>
<p>This is a sample data file.</p>
</body>
</html>
See also
- Standard library documentation for pkgutil
- virtualenv – Ian Bicking’s virtual environment script.
distutils
– Packaging tools from Python standard library.- setuptools – Next-generation packaging tools.
- PEP 302 – Import Hooks
zipfile
– Create importable ZIP archives.zipimport
– Importer for packages in ZIP archives.