In-Memory Data Structures¶
Python includes several standard programming data structures as built-in types (list, tuple, dictionary, and set). Most applications won’t need any other structures, but when they do the standard library delivers.
For large amounts of data, it may be more efficient to use an array instead of a list. Since the array is limited to a single data type, it can use a more compact memory representation than a general purpose list. As an added benefit, arrays can be manipulated using many of the same methods as a list, so it may be possible to replaces lists with arrays in to your application without a lot of other changes.
If you need to maintain a sorted list as you add and remove values, check out heapq. By using the functions in heapq to add or remove items from a list, you can maintain the sort order of the list with low overhead.
Another option for building sorted lists or arrays is bisect. bisect uses a binary search to find the insertion point for new items, and is an alternative to repeatedly sorting a list that changes frequently.
Although the built-in list can simulate a queue using the insert() and pop() methods, it isn’t thread-safe. For true ordered communication between threads you should use a Queue. multiprocessing includes a version of a Queue that works between processes, making it easier to port between the modules.
collections includes implementations of several data structures that extend those found in other modules. For example, Deque is a double-ended queue, and allows you to add or remove items from either end. The defaultdict is a dictionary that responds with a default value if a key is missing. And namedtuple extends the normal tuple to give each member item an attribute name in addition to a numerical index.
If you are working with data from another application, perhaps coming from a binary file or stream of data, you will find struct useful for decoding the data into Python’s native types for easier manipulation.