Python: Converting string to bytes object

Introduction

In this post, we will see how to convert a Python string to a bytes object. Bytes objects are immutable sequences of single bytes [1], each with a value between 0 and 255 (inclusive) [2].

One way of performing this conversion is to call the bytes class constructor, passing the string as input. When the input is a string, we also need to specify the encoding of the text (as a string) as the second argument of the constructor [3].

Alternatively, we can obtain an encoded version of a string as a bytes object by calling the string's encode method [4]. This method also receives the encoding of the text as a string, although, unlike in the constructor mentioned above, this parameter is optional and defaults to UTF-8 [4]. You can read more about Python's standard encodings in the documentation of the codecs module.
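
As a minimal sketch of the difference between the two approaches, the snippet below shows that the constructor rejects a string when no encoding is given, whereas encode falls back to UTF-8. The variable names are illustrative and are not part of the tutorial's code.

```python
text = "Hello world"  # illustrative sample string

# The bytes constructor requires an encoding when its input is a string.
try:
    bytes(text)
    constructor_error = None
except TypeError as exc:
    constructor_error = exc

print(type(constructor_error).__name__)

# encode() can be called without arguments; the encoding defaults to UTF-8.
default_encoded = text.encode()
explicit_encoded = text.encode('utf-8')
print(default_encoded == explicit_encoded)
```

The first print shows the name of the exception raised by the constructor, and the second confirms that both calls to encode produce the same bytes.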

This tutorial was tested on Python version 3.6.

The code

We will start by declaring a string, which we will convert to bytes using the two procedures mentioned in the introductory section.

string = "Hello world"

Then, using the first approach, we will create a bytes object from the previous string. To do so, we pass the string as the first argument of the bytes class constructor. As the second argument, we need to specify the encoding, which will be utf-8.

We will store the result in a variable and then print its type, so we can confirm that it is indeed a bytes object. To print the type of a variable, we can simply use Python’s type function.

We will also print our bytes object, created from the string.

bytes1 = bytes(string, 'utf-8')
print(type(bytes1))
print(bytes1)

The result is shown below in figure 1. As can be seen, we obtain an object of class bytes, as expected. Note that although printing the object shows a user-friendly textual representation, the data it contains are actually bytes, as we will see below.

Figure 1 – String to bytes, using the bytes object constructor.
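
To illustrate why the printed form is only a representation, here is a small sketch using a string with one non-ASCII character. The string "Olá" is an assumption chosen for illustration and does not appear in the original example. Bytes that correspond to printable ASCII characters are displayed as those characters, while the remaining bytes are displayed as \x escapes.

```python
ascii_text = "Hello world"
accented_text = "Ol\u00e1"  # "Olá", written with an escape to avoid ambiguity

ascii_bytes = bytes(ascii_text, 'utf-8')
accented_bytes = bytes(accented_text, 'utf-8')

print(ascii_bytes)     # every byte is printable ASCII
print(accented_bytes)  # the accented character takes two bytes in UTF-8

# The string has 3 characters but its UTF-8 encoding has 4 bytes.
print(len(accented_text), len(accented_bytes))
```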

Moving on, we will now use the second conversion procedure, which is calling the encode method on the string. As stated in the introductory section, this method uses UTF-8 as the default encoding when no argument is given, so we will not pass any input to it.

We will again print the object returned by this method, to confirm that this is also a bytes object.

bytes2 = string.encode()
print(type(bytes2))
print(bytes2)

The result for this portion of the code can be seen in figure 2.

Figure 2 – String to bytes, using the string encode method.

To check the actual value of each byte of both bytes objects, we can iterate over them with a for loop and print each element.

Note that in Python 3 print is a function with an argument called end, which defaults to "\n" and is appended after the printed values [5]. Thus, if we set this argument to a space (" "), all the bytes of each object will be printed with a space between them, rather than each one being printed on a new line.

for b1 in bytes1:
    print(b1, end=' ')

print()

for b2 in bytes2:
    print(b2, end=' ')

As shown in figure 3, both objects have the same sequence of bytes.

Figure 3 – Sequence of bytes from both objects.
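
As an alternative to comparing the printed sequences by eye, the following sketch converts each bytes object to a list of integers and compares the two objects directly. The names values1 and values2 are introduced here for illustration only.

```python
string = "Hello world"
bytes1 = bytes(string, 'utf-8')
bytes2 = string.encode()

# Iterating a bytes object yields integers between 0 and 255.
values1 = list(bytes1)
values2 = list(bytes2)
print(values1)

# Two bytes objects are equal when they hold the same sequence of bytes.
print(bytes1 == bytes2)
```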

The final source code can be seen below.

string = "Hello world"

bytes1 = bytes(string, 'utf-8')
print(type(bytes1))
print(bytes1)

bytes2 = string.encode()
print(type(bytes2))
print(bytes2)

for b1 in bytes1:
    print(b1, end=' ')

print()

for b2 in bytes2:
    print(b2, end=' ')

Just as a final note, if we try to assign a value to a byte of one of the bytes objects (by using the [] operator), we will get the exception shown in figure 4, because bytes objects are immutable.
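
A minimal sketch of that failed assignment is shown below. The exception is caught so the script keeps running, and the names data and assignment_error are illustrative.

```python
data = "Hello world".encode()

try:
    data[0] = 104  # attempt to overwrite the first byte
    assignment_error = None
except TypeError as exc:
    assignment_error = exc

# The assignment is rejected and the object is left unchanged.
print(type(assignment_error).__name__)
print(data)
```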