==============================================
pipes -- Unix shell command pipeline templates
==============================================

.. module:: pipes
    :synopsis: Unix shell command pipeline templates

:Purpose: Create repeatable Unix shell command pipelines.
:Available In: Python 1.4

The :mod:`pipes` module implements a class to create arbitrarily
complex Unix command pipelines. Inputs and outputs of the commands
can be chained together as with the shell ``|`` operator, even if the
individual commands need to write to or read from files instead of
stdin/stdout.

Passing Standard I/O Through a Pipe
===================================

A very simple example, passing standard input through a pipe and
receiving the results in a file, looks like this:

.. include:: pipes_simple_write.py
    :literal:
    :start-after: #end_pymotw_header

The pipeline Template is created and then a single command, ``cat -``,
is added. The command reads standard input and writes it to standard
output, without modification. The second argument to ``append()``
encodes the input and output sources for the command in two
characters (input, then output). Using ``-`` means the command uses
standard I/O. Using ``f`` means the command needs to read from a file
(as may be the case with an image processing pipeline).

The ``debug()`` method toggles debugging output on and off. When
debugging is enabled, the commands being run are printed and the
shell is given ``set -x`` so it runs verbosely.

After the pipeline is set up, a NamedTemporaryFile is created to give
the pipeline somewhere to write its output. A file must always be
specified as an argument to ``open()``, whether reading or writing.

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_simple_write.py'))
.. }}}

::

    $ python pipes_simple_write.py
    + cat -
    cat - >/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpLXAYxI
    Some text

.. {{{end}}}

Reading from a pipeline works basically the same way, with a few
changes to the arguments. For our example, we need to set up the
contents of the input file before opening the pipeline. Then we can
pass that filename as input to ``open()``.

.. include:: pipes_simple_read.py
    :literal:
    :start-after: #end_pymotw_header

We can read the results from the pipeline directly.

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_simple_read.py'))
.. }}}

::

    $ python pipes_simple_read.py
    + cat -
    cat - </var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpMlX74D
    IN=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpMlX74D; OUT=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmp4mLFvP; cat $IN > $OUT
    rm -f /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpMlX74D
    Some text

.. {{{end}}}
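The example scripts referenced above are pulled in from separate
files, so here is a minimal sketch of the same write-then-read
pattern, assuming the ``cat -`` pass-through command described
earlier; the temporary file handling and variable names are
illustrative rather than copied from ``pipes_simple_write.py`` or
``pipes_simple_read.py``::

    import os
    import pipes
    import tempfile

    # Build a template with a single pass-through command. The kind
    # string '--' says the command reads stdin and writes stdout.
    t = pipes.Template()
    t.append('cat -', '--')
    t.debug(True)  # echo the generated shell commands

    # Writing: open() always needs a real filename, so use a
    # temporary file as the destination of the pipeline.
    outfile = tempfile.NamedTemporaryFile(delete=False)
    f = t.open(outfile.name, 'w')
    try:
        f.write('Some text\n')
    finally:
        f.close()

    # Reading: pass the name of an existing file and read the
    # pipeline's output directly.
    f = t.open(outfile.name, 'r')
    try:
        print(f.read())
    finally:
        f.close()

    outfile.close()
    os.unlink(outfile.name)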
The input and output *kind* values can also be mixed, so that
different steps of the pipeline use files or standard I/O as needed.

.. include:: pipes_mixed_kinds.py
    :literal:
    :start-after: #end_pymotw_header

The trap statements visible in the output here ensure that the
temporary files created by the pipeline are cleaned up even if a task
fails in the middle or the shell is killed.

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_mixed_kinds.py'))
.. }}}

::

    $ python pipes_mixed_kinds.py
    + trap 'rm -f /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa; exit' 1 2 3 13 14 15
    + cat
    + IN=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa
    + cat /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa
    + OUT=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpvccJBL
    + cat -
    + rm -f /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa
    trap 'rm -f /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa; exit' 1 2 3 13 14 15
    cat >/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa
    IN=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa; cat $IN |
    { OUT=/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpvccJBL; cat - > $OUT; }
    rm -f /var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpWSCqXa
    Some text

.. {{{end}}}

A More Complex Example
======================

All of the examples up to this point have been fairly trivial. They
were constructed to illustrate how to use ``pipes.Template()``
without depending on deep knowledge of shell scripting in general.
This example is more complex, and shows how several commands can be
combined to manipulate data before bringing it into Python.

My `virtualenvwrapper `_ script includes a shell function for listing
all of the virtual environments you have created. The function is
used for tab-completion and can be called directly to list the
environments, in case you forget a name. The heart of that function
is a small pipeline that looks in ``$WORKON_HOME`` for directories
that look like virtual environments (i.e., they have an ``activate``
script). That pipeline is:

::

    (cd "$WORKON_HOME"; for f in */bin/activate; do echo $f; done) \
        | sed 's|^\./||' \
        | sed 's|/bin/activate||' \
        | sort

Implemented using :mod:`pipes`, the pipeline looks like:

.. include:: pipes_multistep.py
    :literal:
    :start-after: #end_pymotw_header

Since each sandbox name is written to a separate line, parsing the
output is easy:

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_multistep.py'))
.. }}}

::

    $ python pipes_multistep.py
    SANDBOXES:
    ['AstronomyPictureOfTheDay', 'aspell', 'athensdocket', 'backups',
     'bartender', 'billsapp', 'bpython', 'cliff', 'commandlineapp',
     'csvcat', 'd765e36b407a6270', 'dh-akanda', 'dh-betauser-creator',
     'dh-ceilometer', 'dh-ceilometerclient', 'dh-keystone',
     'dh-openstackclient', 'dictprofilearticle', 'distutils2',
     'docket', 'docket-pyparsing', 'dotfiles', 'dreamhost',
     'dreamhost-lunch-and-learn', 'emacs_tools', 'extensions',
     'feedcache', 'fuzzy', 'git_tools', 'hidden_stdlib', 'ical2org',
     'mytweets', 'ndn-billing-usage', 'ndn-datamodels-python',
     'ndn-dhc-dude', 'ndn-ndn', 'nose-testconfig', 'openstack',
     'personal', 'phonetic-hashing', 'pinboard_tools', 'psfblog',
     'psfboard', 'pyatl', 'pyatl-readlines', 'pycon2012',
     'pycon2013-plugins', 'pycon2013-sphinx', 'pydotorg', 'pymotw',
     'pymotw-book', 'pymotw-ja', 'python-dev', 'pywebdav', 'racemi',
     'racemi_status', 'reporting_server', 'rst2blogger',
     'rst2marsedit', 'sobell-book', 'sphinxcontrib-bitbucket',
     'sphinxcontrib-fulltoc', 'sphinxcontrib-spelling',
     'sphinxcontrib-sqltable', 'stevedore', 'summerfield-book',
     'svnautobackup', 'virtualenvwrapper', 'website', 'wsme',
     'zenofpy']

.. {{{end}}}
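Because ``pipes_multistep.py`` is also included from a separate file,
a rough equivalent of the same multi-step pipeline is sketched below.
It assumes ``$WORKON_HOME`` is set, as virtualenvwrapper expects, and
the use of ``/dev/null`` as a placeholder input file is an
illustrative choice, not necessarily how the real example handles its
input::

    import pipes

    t = pipes.Template()

    # The first step is a source ('.-'): it generates output without
    # reading from the pipeline, listing the activate scripts under
    # $WORKON_HOME.
    t.prepend('(cd "$WORKON_HOME"; for f in */bin/activate; do echo $f; done)',
              '.-')

    # The remaining steps read stdin and write stdout ('--'),
    # stripping the path components and sorting the names.
    t.append(r"sed 's|^\./||'", '--')
    t.append(r"sed 's|/bin/activate||'", '--')
    t.append('sort', '--')

    # open() always takes a filename; the source step ignores it, so
    # /dev/null serves as a placeholder here.
    f = t.open('/dev/null', 'r')
    try:
        sandboxes = [line.strip() for line in f.readlines()]
    finally:
        f.close()

    print('SANDBOXES:')
    print(sandboxes)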
Passing Files Through Pipelines
===============================

If the input to your pipeline already exists in a file on disk,
there's no need to read it into Python simply to pass it to the
pipeline. You can use the ``copy()`` method to pass the file directly
through the pipeline and create an output file for reading.

.. include:: pipes_copy.py
    :literal:
    :start-after: #end_pymotw_header

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_copy.py'))
.. }}}

::

    $ python pipes_copy.py
    + IN=lorem.txt
    + grep -n tortor lorem.txt
    IN=lorem.txt; grep -n tortor $IN >/var/folders/5q/8gk0wq888xlggz008k8dr7180000hg/T/tmpjlfxq6
    3:elementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla
    6:lacus. Praesent placerat tortor sed nisl. Nunc blandit diam egestas
    11:eget velit auctor tortor blandit sollicitudin. Suspendisse imperdiet

.. {{{end}}}

Cloning Templates
=================

Once you have a pipeline template, you may want to use it multiple
times or create variants without reconstructing the entire object.
The ``clone()`` method makes both of these operations easy. This
example constructs a simple word-counter pipeline, then prepends
commands to a couple of clones to make each one look for a different
word.

.. include:: pipes_clone.py
    :literal:
    :start-after: #end_pymotw_header

By prepending a custom command to each clone, we can create separate
pipelines that perform the same basic function with small variations.

.. {{{cog
.. cog.out(run_script(cog.inFile, 'pipes_clone.py'))
.. }}}

::

    $ python pipes_clone.py
    "py": 1381
    "perl": 71

.. {{{end}}}

.. seealso::

    `pipes `_
        The standard library documentation for this module.

    :mod:`tempfile`
        The tempfile module includes classes for managing temporary
        files.

    :mod:`subprocess`
        The subprocess module also supports chaining the inputs and
        outputs of processes together.
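As a closing sketch of the cloning pattern from the last section,
something along these lines reproduces the prepend-to-a-clone idea.
The search words and the ``lorem.txt`` input file are placeholders
rather than the data used by ``pipes_clone.py``, so the counts will
not match the output shown above::

    import pipes

    # Base pipeline: count the lines produced by whatever command is
    # prepended in front of it.
    count_lines = pipes.Template()
    count_lines.append('wc -l', '--')

    for word in ['py', 'perl']:
        # clone() copies the existing steps into an independent
        # Template, so each variant can be modified separately.
        t = count_lines.clone()
        # 'f-' means this step reads from a file ($IN) and writes to
        # stdout.
        t.prepend('grep %s $IN' % word, 'f-')
        f = t.open('lorem.txt', 'r')
        try:
            count = f.read().strip()
        finally:
            f.close()
        print('"%s": %s' % (word, count))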