base64 – Encode binary data into ASCII characters

Purpose: The base64 module contains functions for translating binary data into a subset of ASCII suitable for transmission using plaintext protocols.
Available In: 1.4 and later

The base64, base32, and base16 encodings convert 8-bit bytes to values with 6, 5, or 4 bits of useful data per output byte, allowing non-ASCII bytes to be encoded as ASCII characters for transmission over protocols that require plain ASCII, such as SMTP. The base values correspond to the length of the alphabet used in each encoding. There are also URL-safe variations of the original encodings that use slightly different alphabets.
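
As a rough check of those figures, the sketch below encodes the same 15 bytes with each function and works out how many input bits each output character carries. The sample data and the results dictionary are made up for illustration, and the code is written to run under both Python 2.7 and Python 3, unlike the Python 2 listings in the rest of this article.

```python
import base64

# 15 bytes is a multiple of both 3 and 5, so no encoding needs padding here.
data = b'0123456789abcde'

encoders = [
    ('base64', base64.b64encode),
    ('base32', base64.b32encode),
    ('base16', base64.b16encode),
]

results = {}
for name, encode in encoders:
    encoded = encode(data)
    # Input bits divided by output characters gives bits per character.
    bits_per_char = (8 * len(data)) // len(encoded)
    results[name] = bits_per_char
    print('%s: %d bytes -> %d characters, %d bits per character' % (
        name, len(data), len(encoded), bits_per_char))
```

The smaller the alphabet, the fewer bits each character carries and the longer the encoded output becomes.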

Base 64 Encoding

A basic example of encoding some text looks like this:

import base64

# Load this source file and strip the header.
initial_data = open(__file__, 'rt').read().split('#end_pymotw_header')[1]

encoded_data = base64.b64encode(initial_data)

num_initial = len(initial_data)
padding = { 0:0, 1:2, 2:1 }[num_initial % 3]

print '%d bytes before encoding' % num_initial
print 'Expect %d padding bytes' % padding
print '%d bytes after encoding' % len(encoded_data)
print
# Print the encoded data in 40-character chunks.
for i in xrange(0, len(encoded_data), 40):
    print encoded_data[i:i+40]

The output shows the 113 bytes of the original source expand to 152 bytes after being encoded.

Note

There are no line breaks in the output produced by the library, so I have broken the encoded data up artificially to make it fit better on the page.

$ python base64_b64encode.py

113 bytes before encoding
Expect 1 padding bytes
152 bytes after encoding

CgppbXBvcnQgYmFzZTY0CgojIExvYWQgdGhpcyBz
b3VyY2UgZmlsZSBhbmQgc3RyaXAgdGhlIGhlYWRl
ci4KaW5pdGlhbF9kYXRhID0gb3BlbihfX2ZpbGVf
XywgJ3J0JykucmVhZCgpLnNwbGl0KCc=
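
The relationship between input size, output size, and padding can be sketched as a small helper. The function name expected_b64_length is hypothetical, not part of the base64 module, and the code is written to run under both Python 2.7 and Python 3.

```python
import base64


def expected_b64_length(num_bytes):
    """Return (encoded length, number of '=' padding characters)."""
    full_groups, leftover = divmod(num_bytes, 3)
    padding = {0: 0, 1: 2, 2: 1}[leftover]
    # A partial group still produces a full 4-character block.
    groups = full_groups + (1 if leftover else 0)
    return groups * 4, padding


for n in (0, 1, 2, 3, 113):
    encoded = base64.b64encode(b'x' * n)
    length, padding = expected_b64_length(n)
    # '=' is not in the base64 alphabet, so it only appears as padding.
    assert len(encoded) == length
    assert encoded.count(b'=') == padding
    print('%3d bytes -> %3d characters, %d padding' % (n, length, padding))
```

For the 113-byte input above, the helper gives 152 characters with one padding character, matching the program output.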

Base 64 Decoding

The encoded string can be converted back to the original form by taking 4 bytes and converting them to the original 3, using a reverse lookup. The b64decode() function does that for you.

import base64

original_string = 'This is the data, in the clear.'
print 'Original:', original_string

encoded_string = base64.b64encode(original_string)
print 'Encoded :', encoded_string

decoded_string = base64.b64decode(encoded_string)
print 'Decoded :', decoded_string

The encoding process looks at each sequence of 24 bits in the input (3 bytes) and spreads those same 24 bits over 4 bytes in the output, 6 bits per byte. The last two characters, the ==, are padding because the 31-byte original string (248 bits) does not divide evenly into 24-bit groups.

$ python base64_b64decode.py

Original: This is the data, in the clear.
Encoded : VGhpcyBpcyB0aGUgZGF0YSwgaW4gdGhlIGNsZWFyLg==
Decoded : This is the data, in the clear.
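
To make the 24-bit regrouping concrete, here is a minimal sketch that encodes a single 3-byte group by hand and compares the result with the library. The encode_group function and ALPHABET constant are illustrative names, not part of the base64 module, and padding is deliberately not handled. The code is written to run under both Python 2.7 and Python 3.

```python
import base64
import string

# The standard base64 alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = (string.ascii_uppercase + string.ascii_lowercase +
            string.digits + '+/')


def encode_group(three_bytes):
    """Encode exactly 3 bytes as 4 base64 characters (no padding)."""
    b0, b1, b2 = bytearray(three_bytes)
    # Pack the three bytes into one 24-bit value.
    bits = (b0 << 16) | (b1 << 8) | b2
    # Slice the 24 bits into four 6-bit indexes, high bits first.
    indexes = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return ''.join(ALPHABET[i] for i in indexes)


manual = encode_group(b'Thi')
library = base64.b64encode(b'Thi').decode('ascii')
print('manual : ' + manual)
print('library: ' + library)
```

Both lines show VGhp, the same four characters that begin the encoded string in the output above.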

URL-safe Variations

Because the default base64 alphabet may use + and /, and those two characters have special meanings in URLs (+ can stand for a space in a query string, and / separates path segments), it became necessary to specify an alternate encoding with substitutes for those characters. The + is replaced with a -, and / is replaced with underscore (_). Otherwise, the alphabet is the same.

import base64

for original in [ chr(251) + chr(239), chr(255) * 2 ]:
    print 'Original         :', repr(original)
    print 'Standard encoding:', base64.standard_b64encode(original)
    print 'URL-safe encoding:', base64.urlsafe_b64encode(original)
    print

$ python base64_urlsafe.py

Original         : '\xfb\xef'
Standard encoding: ++8=
URL-safe encoding: --8=

Original         : '\xff\xff'
Standard encoding: //8=
URL-safe encoding: __8=
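
The following sketch checks that the two alphabets really differ only in those two characters, and that each variant decodes with its matching function. It is written to run under both Python 2.7 and Python 3, and the sample inputs are the same two byte strings used above.

```python
import base64

for original in (b'\xfb\xef', b'\xff\xff'):
    standard = base64.standard_b64encode(original)
    url_safe = base64.urlsafe_b64encode(original)

    # Swapping the two characters by hand gives the URL-safe form.
    assert standard.replace(b'+', b'-').replace(b'/', b'_') == url_safe

    # Each variant has its own matching decoder.
    assert base64.standard_b64decode(standard) == original
    assert base64.urlsafe_b64decode(url_safe) == original

    print('%s -> %s' % (standard.decode('ascii'), url_safe.decode('ascii')))
```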

Other Encodings

Besides base 64, the module provides functions for working with base 32 and base 16 (hex) encoded data.

import base64

original_string = 'This is the data, in the clear.'
print 'Original:', original_string

encoded_string = base64.b32encode(original_string)
print 'Encoded :', encoded_string

decoded_string = base64.b32decode(encoded_string)
print 'Decoded :', decoded_string

$ python base64_base32.py

Original: This is the data, in the clear.
Encoded : KRUGS4ZANFZSA5DIMUQGIYLUMEWCA2LOEB2GQZJAMNWGKYLSFY======
Decoded : This is the data, in the clear.
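
The six = characters at the end of the encoded value come from base 32 working on 5-byte input groups that each produce 8 output characters. The 31-byte string leaves one byte over, and a single leftover byte needs six padding characters. The sketch below checks the padding count for each possible remainder. The B32_PADDING table is an illustrative name, and the code is written to run under both Python 2.7 and Python 3.

```python
import base64

# '=' characters needed for each possible remainder of len(data) % 5.
B32_PADDING = {0: 0, 1: 6, 2: 4, 3: 3, 4: 1}

for n in range(6):
    encoded = base64.b32encode(b'x' * n)
    # '=' is not in the base32 alphabet, so it only appears as padding.
    assert encoded.count(b'=') == B32_PADDING[n % 5]
    # Output always comes in whole 8-character blocks.
    assert len(encoded) % 8 == 0
    print('%d bytes -> %s' % (n, encoded.decode('ascii')))
```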

The base 16 functions work with the hexadecimal alphabet.

import base64

original_string = 'This is the data, in the clear.'
print 'Original:', original_string

encoded_string = base64.b16encode(original_string)
print 'Encoded :', encoded_string

decoded_string = base64.b16decode(encoded_string)
print 'Decoded :', decoded_string

$ python base64_base16.py

Original: This is the data, in the clear.
Encoded : 546869732069732074686520646174612C20696E2074686520636C6561722E
Decoded : This is the data, in the clear.
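
Since base 16 is plain hexadecimal, the result can be compared with binascii.hexlify() from the standard library. A minimal sketch, written to run under both Python 2.7 and Python 3: b16encode() produces uppercase digits, while hexlify() produces lowercase ones, so decoding lowercase input with b16decode() needs the casefold argument.

```python
import base64
import binascii

data = b'This is the data, in the clear.'

upper_hex = base64.b16encode(data)   # uppercase hex digits
lower_hex = binascii.hexlify(data)   # lowercase hex digits

# The two encodings differ only in letter case.
assert upper_hex == lower_hex.upper()

# Uppercase input decodes directly; lowercase needs casefold=True.
assert base64.b16decode(upper_hex) == data
assert base64.b16decode(lower_hex, casefold=True) == data

print(upper_hex.decode('ascii'))
print(lower_hex.decode('ascii'))
```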

See also

base64
The standard library documentation for this module.
RFC 3548
The Base16, Base32, and Base64 Data Encodings