=============================================
fileinput -- Process lines from input streams
=============================================
.. module:: fileinput
:synopsis: Process lines from input streams.
:Purpose: Create command-line filter programs to process lines from input streams.
:Available In: 1.5.2 and later
The fileinput module is a framework for creating command-line programs
that process text files as filters.
Converting M3U files to RSS
===========================
One example is m3utorss_, an app I wrote for my friend Patrick Bryant
to convert some of his demo recordings into a podcastable format.
The inputs to the program are one or more m3u files listing the mp3 files to be
distributed. The output is a single blob of XML that looks like an RSS feed
(output is written to stdout, for simplicity). To process the input, I need to
iterate over the list of filenames and:
* Open each file.
* Read each line of the file.
* Figure out if the line refers to an mp3 file.
* If it does, extract the information from the mp3 file needed for the RSS feed.
* Print the output.
I could have written all of that file handling out by hand. It isn't that
complicated, and with some testing I'm sure I could even get the error
handling right. But with the fileinput module, I don't need to worry about
that. I just write something like:
.. literalinclude:: fileinput_example.py
:lines: 30-37
The ``fileinput.input()`` function takes a list of filenames to
examine as its argument. If the list is empty, the module reads data
from standard input. The function returns an iterator that yields
individual lines from the text files being processed. So, all I have
to do is loop over each line, skipping blanks and comments, to find
the references to mp3 files.
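As a rough, self-contained sketch of that loop (not the m3utorss code itself): the ``mp3_entries()`` helper and the two throwaway playlist files are names I made up for illustration, and the example assumes Python 3, where ``fileinput.input()`` can be used as a context manager.

```python
import fileinput
import os
import tempfile


def mp3_entries(paths):
    """Return the non-blank, non-comment lines from the given m3u files."""
    entries = []
    # fileinput chains the files together, so one loop covers all of them.
    with fileinput.input(files=paths) as lines:
        for line in lines:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip blanks and m3u comments
            entries.append(line)
    return entries


# Build two throwaway playlists so the sketch runs on its own.
workdir = tempfile.mkdtemp()
first = os.path.join(workdir, 'first.m3u')
second = os.path.join(workdir, 'second.m3u')
with open(first, 'w') as f:
    f.write('#EXTM3U\n\nepisode-one.mp3\n')
with open(second, 'w') as f:
    f.write('# a comment\nepisode-two.mp3\n')

print(mp3_entries([first, second]))
```

The helper never opens or closes a file itself; fileinput moves from one playlist to the next behind the iterator.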
Here's the complete program:
.. include:: fileinput_example.py
:literal:
:start-after: #end_pymotw_header
and its output:
.. {{{cog
.. cog.out(run_script(cog.inFile, 'fileinput_example.py sample_data.m3u'))
.. }}}
::
$ python fileinput_example.py sample_data.m3u
Sample podcast feed
Generated for PyMOTW
Thu Feb 21 06:35:49 2013
http://www.doughellmann.com/PyMOTW/
-
episode-one.mp3
-
episode-two.mp3
.. {{{end}}}
Progress Meta-data
==================
The previous example did not need to know which file or line number
was being processed. Other tools (grep-like searching, for example)
might. The fileinput module includes functions for accessing that
information (``filename()``, ``filelineno()``, ``lineno()``, etc.).
.. include:: fileinput_grep.py
:literal:
:start-after: #end_pymotw_header
We can use this basic pattern-matching loop to find the occurrences of
"fileinput" in the source for the examples.
.. {{{cog
.. cog.out(run_script(cog.inFile, 'fileinput_grep.py fileinput *.py'))
.. }}}
::
$ python fileinput_grep.py fileinput *.py
fileinput_change_subnet.py:10:import fileinput
fileinput_change_subnet.py:17:for line in fileinput.input(files, inplace=True):
fileinput_change_subnet_noisy.py:10:import fileinput
fileinput_change_subnet_noisy.py:18:for line in fileinput.input(files, inplace=True):
fileinput_change_subnet_noisy.py:19: if fileinput.isfirstline():
fileinput_change_subnet_noisy.py:20: sys.stderr.write('Started processing %s\n' % fileinput.filename())
fileinput_example.py:6:"""Example for fileinput module.
fileinput_example.py:10:import fileinput
fileinput_example.py:30:for line in fileinput.input(sys.argv[1:]):
fileinput_grep.py :10:import fileinput
fileinput_grep.py :16:for line in fileinput.input(sys.argv[2:]):
fileinput_grep.py :18: if fileinput.isstdin():
fileinput_grep.py :22: print fmt.format(filename=fileinput.filename(),
fileinput_grep.py :23: lineno=fileinput.filelineno(),
.. {{{end}}}
We can also pass input to the script through stdin, in which case no
filename is printed.
.. {{{cog
.. cog.out(run_script(cog.inFile, 'cat *.py | python fileinput_grep.py fileinput', interpreter=None))
.. }}}
::
$ cat *.py | python fileinput_grep.py fileinput
10:import fileinput
17:for line in fileinput.input(files, inplace=True):
29:import fileinput
37:for line in fileinput.input(files, inplace=True):
38: if fileinput.isfirstline():
39: sys.stderr.write('Started processing %s\n' % fileinput.filename())
51:"""Example for fileinput module.
55:import fileinput
75:for line in fileinput.input(sys.argv[1:]):
96:import fileinput
102:for line in fileinput.input(sys.argv[2:]):
104: if fileinput.isstdin():
108: print fmt.format(filename=fileinput.filename(),
109: lineno=fileinput.filelineno(),
.. {{{end}}}
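The difference between ``filelineno()`` and ``lineno()`` is easy to miss in the output above, so here is a small sketch that records both for every line. The file names ``a.txt`` and ``b.txt`` and the ``records`` list are placeholders of mine, and the code assumes Python 3.

```python
import fileinput
import os
import tempfile

# Two small input files: two lines in the first, one in the second.
workdir = tempfile.mkdtemp()
paths = []
for name, body in [('a.txt', 'one\ntwo\n'), ('b.txt', 'three\n')]:
    path = os.path.join(workdir, name)
    with open(path, 'w') as f:
        f.write(body)
    paths.append(path)

records = []
with fileinput.input(files=paths) as lines:
    for line in lines:
        records.append((
            os.path.basename(fileinput.filename()),  # file being read
            fileinput.filelineno(),   # line number within that file
            fileinput.lineno(),       # cumulative line number
            fileinput.isfirstline(),  # True on the first line of a file
        ))

for record in records:
    print(record)
```

``filelineno()`` starts over at 1 when the second file begins, while ``lineno()`` keeps counting across files, which is why the stdin run above shows only the cumulative numbers.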
In-place Filtering
==================
Another common file processing operation is to modify the contents.
For example, a Unix hosts file might need to be updated if a subnet
range changes.
.. include:: etc_hosts
:literal:
The safe way to make the change automatically is to create a new file
based on the input and then replace the original with the edited copy.
fileinput supports this automatically using the *inplace* option.
.. include:: fileinput_change_subnet.py
:literal:
:start-after: #end_pymotw_header
.. {{{cog
.. path('PyMOTW/fileinput/etc_hosts').copy('PyMOTW/fileinput/etc_hosts.txt')
.. cog.out(run_script(cog.inFile, 'fileinput_change_subnet.py 172.16.177 172.16.178 etc_hosts.txt'))
.. }}}
::
$ python fileinput_change_subnet.py 172.16.177 172.16.178 etc_hosts.txt
.. {{{end}}}
Although the script uses ``print``, no output is produced to stdout
because fileinput maps stdout to the file being overwritten.
.. include:: etc_hosts.txt
:literal:
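For anyone who wants to try the same idea without the sample files, here is a minimal sketch of in-place filtering. It is not the script included above: the ``hosts.txt`` contents are invented for the example, and it assumes Python 3.

```python
import fileinput
import os
import tempfile

# A made-up hosts file to edit.
workdir = tempfile.mkdtemp()
hosts = os.path.join(workdir, 'hosts.txt')
with open(hosts, 'w') as f:
    f.write('172.16.177.128  hubert\n10.0.0.1  gateway\n')

with fileinput.input(files=[hosts], inplace=True) as lines:
    for line in lines:
        # While inplace is active, print() writes into hosts.txt,
        # not to the terminal.
        print(line.rstrip('\n').replace('172.16.177', '172.16.178'))

# stdout is restored once the input is closed.
with open(hosts) as f:
    updated = f.read()
print(updated, end='')
```

Every line has to be printed back, changed or not; anything the loop does not print is dropped from the rewritten file.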
Before processing begins, a backup file is created using the original
name plus ``.bak``. The backup file is removed when the input is
closed.
.. include:: fileinput_change_subnet_noisy.py
:literal:
:start-after: #end_pymotw_header
.. {{{cog
.. path('PyMOTW/fileinput/etc_hosts').copy('PyMOTW/fileinput/etc_hosts.txt')
.. cog.out(run_script(cog.inFile, 'fileinput_change_subnet_noisy.py 172.16.177 172.16.178 etc_hosts.txt'))
.. }}}
::
$ python fileinput_change_subnet_noisy.py 172.16.177 172.16.178 etc_host\
s.txt
Started processing etc_hosts.txt
Directory contains: ['etc_hosts.txt', 'etc_hosts.txt.bak']
Finished processing
Directory contains: ['etc_hosts.txt']
.. {{{end}}}
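If the backup copy should survive, ``fileinput.input()`` also accepts a *backup* argument giving the extension to use, and a backup made that way is left in place. The sketch below compares the two behaviors; ``uppercase_in_place()``, the file names, and the ``.orig`` suffix are my own choices for illustration, and the code assumes Python 3.

```python
import fileinput
import os
import tempfile


def uppercase_in_place(path, backup=''):
    """Rewrite path in upper case; a non-empty backup suffix keeps a copy."""
    with fileinput.input(files=[path], inplace=True, backup=backup) as lines:
        for line in lines:
            print(line.upper(), end='')


workdir = tempfile.mkdtemp()
for name in ('default.txt', 'kept.txt'):
    with open(os.path.join(workdir, name), 'w') as f:
        f.write('hello\n')

# Default: a temporary .bak file is made and removed on close.
uppercase_in_place(os.path.join(workdir, 'default.txt'))
# Explicit suffix: the original is preserved as kept.txt.orig.
uppercase_in_place(os.path.join(workdir, 'kept.txt'), backup='.orig')

remaining = sorted(os.listdir(workdir))
print(remaining)
```

Only the file processed with an explicit suffix should still have its original next to it afterwards.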
.. seealso::
`fileinput `_
The standard library documentation for this module.
`Patrick Bryant `_
Atlanta-based singer/song-writer.
m3utorss_
Script to convert m3u files listing MP3s to an RSS file
suitable for use as a podcast feed.
:ref:`xml.etree.ElementTree.creating`
More details of using ElementTree to produce XML.
:ref:`article-file-access`
Other modules for working with files.
:ref:`article-text-processing`
Other modules for working with text.
.. _m3utorss: http://www.doughellmann.com/projects/m3utorss