Python syntax to correctly handle string data type

By Chandrika Kapagunta & Priya Chetty on July 23, 2021

A string is a sequence of characters, which can be letters, numbers or symbols, along with spaces between the characters. Python identifies string data, and distinguishes it from other data types, by the single or double quotation marks that enclose it. In Python 3, strings are sequences of Unicode characters. Unicode is a standardized system for encoding, processing and displaying the characters of written text.
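As a rough illustration of the Unicode point, the sketch below prints the code point of each character in a sample string and then encodes the string into bytes. The sample text and the choice of UTF-8 are assumptions made for this example only.

```python
# A Python 3 string is a sequence of Unicode characters.
text = "Hi! € ✓"

for ch in text:
    # ord() returns the Unicode code point of a single character.
    print(repr(ch), ord(ch))

# Encoding turns the characters into bytes; UTF-8 is one common choice.
encoded = text.encode("utf-8")
print(len(text))     # 7 characters
print(len(encoded))  # 11 bytes, because € and ✓ take three bytes each
```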

In order to handle and manipulate string data in Python, the programmer needs to be aware of the correct syntax and formatting options that apply. This article focuses on the Python syntax applicable for correctly handling string data, and the operations that can be used to format the string data.

Python syntax for strings often differs from the basic syntax because the type of data is different. Therefore, a specific set of rules applies when dealing with string data.

Using quotes in Python for string data

Python identifies data as a string if it is enclosed in double (" ") or single (' ') quotes. For example, to assign the message Hi! to a variable x, we simply do the following:

 >>> x = 'Hi!'
 >>> print(x)
 Hi!

As seen in figure 1 below, by using single or double quotations for string data, the Python interpreter will read the data in the same way and give the same output.

Figure 1: Quotations in string data in Python
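The same point can be sketched as a short script rather than at the interactive prompt. The variable names here are illustrative only.

```python
# Single and double quotes create the same kind of object.
single = 'Hi!'
double = "Hi!"

print(single)            # Hi!
print(double)            # Hi!
print(single == double)  # True: the quote style is not part of the value
print(type(single))      # <class 'str'>
```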

Aside from text and symbols, Python also treats numbers, decimals and even Boolean values as strings when they are enclosed in quotes. In figure 2 below, integer, float and Boolean values are shown as strings.

Figure 2: Quotations in string data for other data formats

The importance of quotes becomes clear when we compare the output for the variables z and m, where z = '5+7' and m = 5+7 (without quotes). The Python interpreter reads z as a string and returns it unchanged. For m, which has no quotes, it performs the addition indicated by the + operator and gives 12 instead of the original input 5+7.
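A minimal sketch of this behaviour is given here. The values assigned to n, f and b are assumptions chosen for illustration; z and m are the variables discussed above.

```python
# Quoted values are strings, whatever they look like.
n = '25'        # looks like an integer
f = '3.14'      # looks like a float
b = 'True'      # looks like a Boolean
z = '5+7'       # quoted: stored as the three characters 5, + and 7
m = 5 + 7       # unquoted: evaluated as arithmetic

print(type(n), type(f), type(b))  # all three are <class 'str'>
print(z)                          # 5+7
print(m)                          # 12
print(type(z).__name__, type(m).__name__)  # str int
```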

Along with single and double quotes, Python also accepts triple quotes ('''Hi!''' or """Hi!"""), which can be used for multi-line strings, as shown in figure 3 below.

You can also add spaces within your string sequences, as shown in figure 3(b), by inserting the desired number of spaces before the required character.

Figure 3: Using triple quotes in Python
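As a small sketch of triple quotes, the example below stores a three-line message with extra leading spaces on the second line. The message text is invented for illustration.

```python
# A triple-quoted string keeps its line breaks and leading spaces.
message = """Hi!
    How are you?
Bye!"""

print(message)
print(message.count("\n"))  # 2: two line breaks are stored in the string
```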

Formatting strings with backslashes or escape sequences

Backslashes (\) give programmers shortcuts for formatting strings, such as introducing a new line or a special character, which is not otherwise possible. A backslash followed by a character is called an escape sequence, because it lets the programmer escape characters that have a special meaning in Python. These include quotes, the backslash itself and a new line within a string sequence. For example, suppose we want to include double quotes within a string object: He asked, "How have you been?".

Enclosing this whole sentence in double quotes raises a syntax error, as shown in figure 4:

Figure 4: Syntax Error for introducing quotes within string object

However, this error can be avoided by enclosing the string in single quotes:

>>> a = 'He asked, "How have you been?"'

This does not raise an error.

However, when both single and double quotes have to appear within a string object (for example, My height is 5'7", which contains both), the backslash (\) comes to the rescue. A backslash signals the Python interpreter that the character immediately after \ is to be treated as a normal character. Therefore, \' tells Python to insert a ' at that position, while \\ tells Python to insert a backslash itself.
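The height example can be sketched as follows. The variable names and the sample path are assumptions added for illustration.

```python
# Escaping quotes so that both kinds can appear in one string.
height = 'My height is 5\'7"'   # single-quoted: escape the inner '
same = "My height is 5'7\""     # double-quoted: escape the inner "
path = 'C:\\temp'               # \\ produces one literal backslash

print(height)          # My height is 5'7"
print(height == same)  # True
print(path)            # C:\temp
print(len(path))       # 7
```

Either quoting style yields the same string, so the choice mainly depends on which quote character appears more often in the text.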

Types of Escape Sequences in Python

Besides printable characters, backslashes can also format string sequences by adding new lines, tabs or even backspaces. Table 1 below describes the different escape sequences used in Python, along with their definitions and relevant examples.

Backslash | Definition | Example
\n | Newline: adds a new line within a string sequence | [image: Newline]
\r | Carriage return: moves back to the start of the line, so the text after \r overwrites the existing text (how much is replaced depends on the length of the text after \r) | [image: Carriage return]
\t | Tab: adds a horizontal tab within the string sequence | [image: Tab]
\b | Backspace: adds a backspace within the string sequence (removing the character before the \b from the displayed output) | [image: Backspace]
\' | Single quote: inserts a single quote (') | [image: Single quote]
\" | Double quote: inserts a double quote (") | [image: Double quote]
\\ | Backslash: inserts a backslash (\) | [image: Backslash]
Table 1: Backslashes used in Python
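The following sketch runs through the entries of Table 1. Because the visible effect of \r and \b depends on the terminal, it prints the repr() and the length of each sample instead of relying on the display. The sample strings and the name samples are hypothetical.

```python
# Each escape sequence is stored as a single character.
samples = [
    ("newline", "a\nb"),
    ("carriage return", "a\rb"),
    ("tab", "a\tb"),
    ("backspace", "a\bb"),
    ("single quote", 'a\'b'),
    ("double quote", "a\"b"),
    ("backslash", "a\\b"),
]

for label, text in samples:
    # repr() shows the stored character; len() confirms it counts as one.
    print(f"{label:16} {text!r:10} length {len(text)}")
```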

Combinations of Escape Sequences

Programmers can use these escape sequences alone or in combinations, depending upon the outputs they desire. For example, if we want to create a list of options on the scale of Agreeability, for a Multiple Choice question in a survey form, we can do the following:

 >>> x = '1.\tStrongly Agree\n2.\tAgree\n3.\tNeutral\n4.\tDisagree\n5.\tStrongly Disagree'
 >>> print(x)
 1.      Strongly Agree
 2.      Agree
 3.      Neutral
 4.      Disagree
 5.      Strongly Disagree

The actual output for this string variable is shown in figure 5 below.

Figure 5: Combination of escape sequences in Python

Raw string data in Python

The escape sequences discussed in Table 1 above allow programmers to format string data. However, there are situations where we don't want the Python interpreter to process any of the escape sequences within our string data, but instead to keep the backslashes as they are. In such cases, we create raw strings by placing the prefix r before the opening quote of the string sequence. A raw string tells the Python interpreter not to process escape sequences, so every backslash is kept as a literal character.

For example, if we have a variable, a,

>>> a = 'Everyone\'s\nhappiness\nis\nchocolates!'

Printing the variable a will give the output,

Everyone's
happiness
is
chocolates!

However, if the prefix r is placed before the opening quotation mark for this variable, a, the output will be as shown in figure 6 below.

Figure 6: Raw strings in Python
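The difference can be sketched by defining the same text twice, once as a normal string and once as a raw string. This assumes the intended text has a \n before each word, and the names normal and raw are illustrative.

```python
# The same text as a normal string and as a raw string.
normal = 'Everyone\'s\nhappiness\nis\nchocolates!'
raw = r'Everyone\'s\nhappiness\nis\nchocolates!'

print(normal)  # printed on four lines
print(raw)     # Everyone\'s\nhappiness\nis\nchocolates!
print(len(normal.splitlines()), len(raw.splitlines()))  # 4 1
```

Note that in the raw string the backslash before the quote is kept as well, so the output contains \'s rather than 's.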