Python syntax to correctly handle string data type
A string is a sequence of one or more characters, which can be letters, numbers or symbols, along with any spaces between them. Python identifies string data, and distinguishes it from other data types, by the single or double quotation marks that enclose it. In Python 3, strings are stored as Unicode text. Unicode is a standard that assigns a unique number (a code point) to every character of written text, so that text can be encoded, processed and displayed consistently.
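As a small illustration of the Unicode point above, the sketch below shows that a string is counted in characters, and that bytes only appear once the string is encoded. The variable names and the sample text are illustrative assumptions, not part of the original article.

```python
# A Python 3 str holds Unicode characters (code points), not raw bytes.
greeting = "Hi! ñ €"  # sample text chosen for illustration

# len() counts characters, including spaces and symbols.
char_count = len(greeting)

# ord() returns the Unicode code point of a single character.
euro_code_point = ord("€")

# Encoding turns the characters into bytes; UTF-8 is one common encoding.
encoded = greeting.encode("utf-8")

print(char_count)        # 7
print(euro_code_point)   # 8364
print(len(encoded))      # 10
```

The byte count is larger than the character count because ñ and € each need more than one byte in UTF-8.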
In order to handle and manipulate string data in Python, the programmer needs to be aware of the correct syntax and formatting options that apply. This article focuses on the Python syntax applicable for correctly handling string data, and the operations that can be used to format the string data.
The syntax for strings differs in places from the syntax used for other data types, so a distinct set of rules applies when working with string data.
Using quotes in Python for string data
Python treats data as a string if it is enclosed in double (" ") or single (' ') quotes. For example, to assign the message Hi! to a variable x, we simply do the following:
>>> x = 'Hi!'
>>> print(x)
Hi!
As seen in figure 1 below, the Python interpreter reads string data in the same way, and gives the same output, whether single or double quotation marks are used.
Aside from text and symbols, Python also treats numbers, decimals and even Boolean values as strings when they are enclosed in quotes. In figure 2 below, integer, float and Boolean values are shown as strings.
The importance of quotes becomes clear when we compare the output for the variables z and m, where z = '5+7' and m = 5+7 (without quotes). The Python interpreter reads z as a string and prints it exactly as written. For m, which has no quotes, it evaluates the + operator and performs the addition, giving an output of 12 instead of the original input 5+7.
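The behaviour described above can be checked with a short script. This is a minimal sketch; apart from z and m, the variable names are illustrative and do not come from the article.

```python
# Single and double quotes produce the same string.
single = 'Hi!'
double = "Hi!"
print(single == double)   # True

# Anything inside quotes is a str, even if it looks like a number or a Boolean.
quoted_int = '5'
quoted_float = '3.14'
quoted_bool = 'True'
print(type(quoted_int).__name__, type(quoted_float).__name__, type(quoted_bool).__name__)  # str str str

# With quotes the expression is kept as text; without quotes it is evaluated.
z = '5+7'
m = 5 + 7
print(z)   # 5+7
print(m)   # 12
```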
Along with single and double quotes, Python also supports triple quotes ('''Hi!''' or """Hi!"""), which can be used for multi-line strings, as shown in figure 3 below.
You can also add spaces within your string sequences, as shown in Image 3(b), by inserting the desired number of spaces before the required character.
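A triple-quoted string might look like the sketch below. The message text is made up for illustration; the point is that line breaks and leading spaces typed inside the quotes become part of the string.

```python
# Triple quotes let a string literal span several lines;
# the line breaks become part of the string.
message = """Hi!
How are you?
    This line starts with four spaces."""

print(message)

# Splitting on the newline character recovers the individual lines.
lines = message.split("\n")
print(len(lines))                    # 3
print(lines[2].startswith("    "))   # True
```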
Formatting strings with backslashes or escape sequences
Backslashes (\) give programmers a way to format strings, such as starting a new line or inserting a character that cannot otherwise be typed directly into the string. Sequences that begin with a backslash are called escape sequences, because they let the programmer escape characters that have a special meaning in Python, including quotes, the backslash itself and the newline. For example, suppose we want to store a sentence that contains double quotes in a string object: He asked, "How have you been?".
Wrapping this sentence in double quotes does not work: Python reads the second double quote as the end of the string and raises a SyntaxError.
This can be avoided by using single quotes as the outer delimiters:
>>> a = 'He asked, "How have you been?"'
This won’t give rise to an error.
However, when both single and double quotes appear within a string object (for example My height is 5'7", which contains both), the backslash (\) comes to the rescue. A backslash signals to the Python interpreter that the character immediately after it is to be treated as an ordinary character. Therefore, \' tells Python to insert a ' at that position, while \\ tells Python to insert a backslash itself.
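The quoting options discussed above can be compared side by side. This is a minimal sketch; the variable names b, height and path, and the example path itself, are assumptions added for illustration.

```python
# Double quotes inside a double-quoted string must be escaped.
a = "He asked, \"How have you been?\""

# Using single quotes as the outer delimiters avoids the escapes.
b = 'He asked, "How have you been?"'
print(a == b)   # True

# When the text contains both kinds of quote, escape the one
# that matches the outer delimiters.
height = 'My height is 5\'7"'
print(height)   # My height is 5'7"

# A doubled backslash produces one literal backslash.
path = 'C:\\Users'   # hypothetical path, for illustration only
print(path)          # C:\Users
print(len(path))     # 8
```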
Types of Escape Sequences in Python
Besides printable characters, the backslash can also format string sequences by adding new lines, tabs or even backspaces. Table 1 below describes the different escape sequences used in Python, along with their definitions.
|Escape sequence|Description|
|\n|Newline: starts a new line within a string sequence|
|\r|Carriage return: moves the cursor back to the start of the line, so that the text after \r overwrites the text before it when printed to a terminal (how much is overwritten depends on the length of the text after \r)|
|\t|Tab: adds a horizontal tab within the string sequence|
|\b|Backspace: moves the cursor back one position, so that the character before \b is overwritten when printed to a terminal|
|\'|Single quote: inserts a single quote (')|
|\"|Double quote: inserts a double quote (")|
|\\|Backslash: inserts a backslash (\)|
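To see how Python stores the escape sequences in Table 1, the sketch below prints the repr() and the length of a few sample strings. The samples dictionary and its contents are illustrative assumptions. How \r and \b look when printed depends on the terminal, so the sketch inspects the strings instead of relying on their printed appearance.

```python
# Each sample is the letter a, one escape sequence, and the letter b.
samples = {
    "newline": "a\nb",
    "carriage return": "a\rb",
    "tab": "a\tb",
    "backspace": "a\bb",
    "single quote": 'a\'b',
    "double quote": "a\"b",
    "backslash": "a\\b",
}

for name, text in samples.items():
    # repr() shows the string as a literal; len() confirms that
    # every escape sequence counts as a single character.
    print(f"{name:16} {text!r:8} length={len(text)}")
```

Every sample has a length of three, which shows that an escape sequence is two characters in the source code but one character in the resulting string.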
Combinations of Escape Sequences
Programmers can use these escape sequences alone or in combination, depending on the output they want. For example, to create a list of options on an agreement scale for a multiple-choice question in a survey form, we can do the following:
>>> x = '1.\tStrongly Agree\n2.\tAgree\n3.\tNeutral\n4.\tDisagree\n5.\tStrongly Disagree'
>>> print(x)
1.      Strongly Agree
2.      Agree
3.      Neutral
4.      Disagree
5.      Strongly Disagree
The output for this string variable, as displayed by the interpreter, is shown in Image 5 below.
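The same survey list can also be assembled from separate labels instead of being typed as one long literal. This is a sketch of one possible approach, not the article's method; the names options and literal are assumptions introduced for the example.

```python
# Labels taken from the survey example above.
options = ["Strongly Agree", "Agree", "Neutral", "Disagree", "Strongly Disagree"]

# \t separates the number from the label; \n separates the entries.
x = "\n".join(f"{number}.\t{label}" for number, label in enumerate(options, start=1))

print(x)

# The result is identical to the hand-written literal.
literal = '1.\tStrongly Agree\n2.\tAgree\n3.\tNeutral\n4.\tDisagree\n5.\tStrongly Disagree'
print(x == literal)   # True
```

Building the string this way keeps the escape sequences in one place, which makes it easier to change the separator later.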
Raw string data in Python
The escape sequences listed in Table 1 above allow programmers to format string data. However, there are situations where we do not want the Python interpreter to process any of the escape sequences within our string data, but instead to keep the backslashes as they are. In such cases, we create raw strings by placing the prefix r before the opening quote of the string sequence. A raw string tells the Python interpreter to treat every backslash as a literal character rather than as the start of an escape sequence.
For example, if we have a variable, a,
>>> a = 'Everyone\'s\nhappiness\nis\nchocolates!'
Printing the variable a places each word on its own line, because every \n is interpreted as a newline.
However, if the prefix r is placed before the opening quotation mark of this variable, a, the output will be as shown in figure 6 below.
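The difference between the two forms can be sketched as follows. The example assumes that every word in the string is meant to be separated by \n, and the name raw_a is introduced here for illustration. The raw string uses double quotes as delimiters so that the apostrophe needs no backslash.

```python
# Regular string: \' is a quote and \n is a newline.
a = 'Everyone\'s\nhappiness\nis\nchocolates!'
print(a)

# Raw string: every backslash is kept as a literal character.
raw_a = r"Everyone's\nhappiness\nis\nchocolates!"
print(raw_a)   # Everyone's\nhappiness\nis\nchocolates!

# The regular string contains real line breaks; the raw string does not.
print(len(a.splitlines()))       # 4
print(len(raw_a.splitlines()))   # 1
```

Raw strings are commonly used for text that contains many backslashes, such as regular expressions.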