get_text() is a BeautifulSoup method that returns all the child strings of an element, concatenated using the given separator. In this tutorial, we will learn how to use get_text() with examples, and we'll also look at the difference between get_text() and the .string property.
Let's get started.
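get_text() takes the arguments separator and strip (both used in the examples below), and all of its arguments are optional.
Let's see an example to understand how to use the get_text() method. In the following example, we'll get all the child text of the <div> element.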
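```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div>
<p>child 1</p>
<p>child 2</p>
<p>child 3</p>
</div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
g_txt = el.get_text()  # 👉️ Get the text of the <div>
print(g_txt)  # 👉️ Print output
```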
As you can see in the code, we've used get_text() with no arguments.
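To see exactly which characters get_text() returns when called with no arguments, it can help to print the repr() of the result. This is a minimal sketch that assumes a small <div> whose children are <p> tags on separate lines; the markup and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Assumed markup: a <div> with three <p> children, one per line
html_source = "<div>\n<p>child 1</p>\n<p>child 2</p>\n<p>child 3</p>\n</div>"

soup = BeautifulSoup(html_source, "html.parser")
el = soup.find("div")

raw_text = el.get_text()

# repr() makes the newlines between the tags visible
print(repr(raw_text))  # '\nchild 1\nchild 2\nchild 3\n'
```

The newlines come from the whitespace between the tags in the source, which BeautifulSoup keeps as text.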
If you want to remove the newlines (\n) from the output, pass strip=True, as in the example below. Each string is stripped, and because no separator is set, the strings are joined directly.
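```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div>
<p>child 1</p>
<p>child 2</p>
<p>child 3</p>
</div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
g_txt = el.get_text(strip=True)  # 👉️ Get the text of the <div> and remove the newlines
print(g_txt)  # 👉️ Print output
```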
To add a space between the strings, set the separator parameter, as in the example below.
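```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div>
<p>child 1</p>
<p>child 2</p>
<p>child 3</p>
</div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
g_txt = el.get_text(strip=True, separator=" ")  # 👉️ Set separator and strip
print(g_txt)  # 👉️ Print output
```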
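To make the interaction between strip and separator easier to see, here is a small side-by-side sketch. It assumes a <div> with three <p> children on separate lines, and it uses "|" as the separator purely so that the joins are visible.

```python
from bs4 import BeautifulSoup

# Assumed markup: a <div> with three <p> children, one per line
html_source = "<div>\n<p>child 1</p>\n<p>child 2</p>\n<p>child 3</p>\n</div>"
el = BeautifulSoup(html_source, "html.parser").find("div")

strip_only = el.get_text(strip=True)            # strip each string, join with ""
separator_only = el.get_text(separator="|")     # keep whitespace strings, join with "|"
both = el.get_text(strip=True, separator="|")   # strip each string, join with "|"

print(repr(strip_only))      # 'child 1child 2child 3'
print(repr(separator_only))  # '\n|child 1|\n|child 2|\n|child 3|\n'
print(repr(both))            # 'child 1|child 2|child 3'
```

Without strip=True, the separator is also placed around the whitespace-only strings between the tags, which is why the two arguments are usually combined.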
Now, we'll strip each string and separate the strings with \n, so that every child's text ends up on its own line.
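```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div>
<p>child 1</p>
<p>child 2</p>
<p>child 3</p>
</div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
g_txt = el.get_text(strip=True, separator="\n")  # 👉️ Set separator and strip
print(g_txt)  # 👉️ Print output
```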
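If the goal is a Python list of the child strings rather than one newline-separated string, the result can be split on "\n". The sketch below also shows the stripped_strings generator, which BeautifulSoup provides for the same purpose; the markup is an assumed three-child <div>.

```python
from bs4 import BeautifulSoup

# Assumed markup: a <div> with three <p> children, one per line
html_source = "<div>\n<p>child 1</p>\n<p>child 2</p>\n<p>child 3</p>\n</div>"
el = BeautifulSoup(html_source, "html.parser").find("div")

# Option 1: join with "\n", then split back into a list
lines = el.get_text(strip=True, separator="\n").split("\n")

# Option 2: iterate over the stripped strings directly
same_lines = list(el.stripped_strings)

print(lines)       # ['child 1', 'child 2', 'child 3']
print(same_lines)  # ['child 1', 'child 2', 'child 3']
```

Option 2 avoids surprises when a child's own text contains a newline, since nothing is split after the fact.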
The difference between get_text() and .string
Let's see some examples to figure out the difference between the get_text() method and the .string property.
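Example 1:
```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div>
<p>child 1</p>
<p>child 2</p>
<p>child 3</p>
</div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
print(el.get_text())  # 👉️ Get the content of the <div> using get_text()
print(el.string)  # 👉️ Get the content of the <div> using .string
```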
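Output of get_text(): the text of the three children (child 1, child 2, child 3), each on its own line.
Output of .string: None
As you can see, get_text() returns the text of the div's children, while .string returns None. That is because .string only returns a value when a tag contains a single string (or a single child tag that itself contains a single string). The div here has several children, so .string is None.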
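The following sketch illustrates when .string does return a value. The markup is made up for illustration: one tag holding a single string, one tag holding a single child tag, and one tag holding several children.

```python
from bs4 import BeautifulSoup

# Illustrative markup (not from the examples above)
html_source = "<p>only text</p><div><b>nested</b></div><ul><li>a</li><li>b</li></ul>"
soup = BeautifulSoup(html_source, "html.parser")

single_string = soup.find("p").string    # tag contains exactly one string
single_child = soup.find("div").string   # tag contains exactly one child tag
many_children = soup.find("ul").string   # tag contains several children

print(single_string)               # only text
print(single_child)                # nested
print(many_children)               # None
print(soup.find("ul").get_text())  # ab
```

In the last case get_text() still returns the combined text, which is why it is the safer choice when the number of children is not known in advance.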
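Example 2:
```python
from bs4 import BeautifulSoup

# 👉️ HTML source
html_source = '''
<div></div>
'''

soup = BeautifulSoup(html_source, 'html.parser')  # 👉️ Parsing
el = soup.find("div")  # 👉️ Find the <div> tag
print(el.get_text())  # 👉️ Get the content of the empty <div> using get_text()
print(el.string)  # 👉️ Get the content of the empty <div> using .string
```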
Output:
- get_text() returns an empty string
- .string returns None
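Because .string can be None, calling string methods on it directly may raise an AttributeError. A minimal sketch of one way to guard against that, using an empty <div> and illustrative variable names:

```python
from bs4 import BeautifulSoup

el = BeautifulSoup("<div></div>", "html.parser").find("div")

text = el.get_text()  # always a str, here an empty one
single = el.string    # None for an empty tag

# Check for None before calling str methods on .string
cleaned = single.strip() if single is not None else ""

print(repr(text))     # ''
print(single)         # None
print(repr(cleaned))  # ''
```

With get_text() no such check is needed, since it returns a string even when the element is empty.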
Conclusion
To summarize, use the get_text() method when you want all the text inside an element, including the text of its children, and use .string only when the element holds a single string.