Python includes several standard programming data structures, such as
part of its built-in types. Many applications do not require other
structures, but when they do, the standard library provides powerful
and well-tested versions that are ready to be used.
enum module provides an implementation of an enumeration
type, with iteration and comparison capabilities. It can be used to
create well-defined symbols for values, instead of using literal
strings or integers.
collections module includes implementations of several data
structures that extend those found in other modules. For example,
Deque is a double-ended queue, which allows the addition or
removal of items from either end. The
defaultdict is a
dictionary that responds with a default value if a key is missing,
OrderedDict remembers the sequence in which items are
added to it.
namedtuple extends the normal
tuple to give each member item an attribute name in addition
to a numeric index.
For large amounts of data, an
array may make more efficient use
of memory than a
list. Since the
array is limited
to a single data type, it can use a more compact memory representation
than a general-purpose
list. At the same time,
array instances can be manipulated using many of the same
methods as a
list, so it may be possible to replace a
list with an
array in an application without a lot
of other changes.
Sorting items in a sequence is a fundamental aspect of data
list includes a
but sometimes it is more efficient to maintain a list in sorted order
without re-sorting it each time its contents are changed. The
heapq modify the contents of a list while
preserving the sort order of the list with low overhead.
Another option for building sorted lists or arrays is
It uses a binary search to find the insertion point for new items, and
is an alternative to repeatedly sorting a list that changes
Although the built-in
list can simulate a queue using the
pop() methods, it is not thread-safe. For
true ordered communication between threads use the
multiprocessing includes a version of a
that works between processes, making it easier to convert a
multithreaded program to use processes instead.
struct is useful for decoding data from another application,
perhaps coming from a binary file or stream of data, into Python’s
native types for easier manipulation.
This chapter covers two modules related to memory management. For
highly interconnected data structures, such as graphs and trees, use
weakref to maintain references while still allowing the garbage
collector to clean up objects after they are no longer needed. Use the
copy for duplicating data structures and
their contents, including making recursive copies with
Debugging data structures can be time consuming, especially when
wading through printed output of large sequences or dictionaries. Use
pprint to create easy-to-read representations that can be
printed to the console or written to a log file for easier debugging.
Finally, if the available types do not meet the requirements,
subclass one of the native types and customize it, or build a new
container type using one of the abstract base classes defined in
collections as a starting point.
- enum – Enumeration Type
- collections — Container Data Types
- array — Sequence of Fixed-type Data
- heapq – Heap Sort Algorithm
- bisect — Maintain Lists in Sorted Order
- queue — Thread-Safe FIFO Implementation
- struct — Binary Data Structures
- weakref — Impermanent References to Objects
- copy — Duplicate Objects
- pprint — Pretty-Print Data Structures