Objective
In this unit, we will explore common functions that operate on strings. Strings are a fundamental data type in Python, and understanding how to manipulate them is a crucial skill. By the end of this unit, you will understand how to perform basic string operations, use string methods, and format strings.
Basic String Operations
Strings in Python are sequences of characters. Python treats single quotes the same as double quotes. Creating strings is as simple as assigning a value to a variable. For example:
str1 = "Hello"
str2 = 'World'
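As a quick check of the two points above, the following sketch shows that the quote style does not affect the value and that a string behaves like any other sequence. The variable names are illustrative only and do not appear elsewhere in this unit.

```python
# Single and double quotes produce the same string value.
single = 'Hello'
double = "Hello"
print(single == double)  # True

# Because a string is a sequence of characters, it supports len()
# and indexing.
print(len(double))  # 5
print(double[0])    # H
print(double[-1])   # o

# One practical use of having both quote styles: embedding the other
# kind of quote without escaping it.
quote = "It's a string"
print(quote)        # It's a string
```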
Strings that span multiple lines can also be defined using triple quotes (''' ''' or """ """).
Here's an example:
text = """
Four score and seven years ago our fathers brought forth on this continent,
a new nation, conceived in Liberty, and dedicated to the proposition that all
men are created equal.
"""
In this example, the text variable is assigned a string that spans multiple lines. The string starts and ends with triple quotes, which tells Python to include everything, including the newlines, until the closing triple quotes.
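To make the embedded newlines visible, here is a minimal sketch using a shorter, made-up string. It relies on repr(), strip(), and splitlines(), which are standard Python but are not otherwise covered in this unit.

```python
text = """
first line
second line
"""

# repr() makes the newline characters visible as \n.
print(repr(text))  # '\nfirst line\nsecond line\n'

# The string starts and ends with a newline because the content begins
# on the line after the opening quotes and the closing quotes sit on
# their own line.
print(text.startswith("\n"))  # True
print(text.endswith("\n"))    # True

# strip() removes leading and trailing whitespace; splitlines() breaks
# the remaining text into one string per line.
lines = text.strip().splitlines()
print(lines)  # ['first line', 'second line']
```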
Once a string is defined, you can perform a variety of operations on it. For example, you can concatenate (join together) two strings using the + operator:
greeting = "Hello"
name = "Alice"
message = greeting + ", " + name + "!"
print(message) # Prints: Hello, Alice!
You can also repeat a string a certain number of times using the * operator:
repeat = "ha"
laugh = repeat * 5
print(laugh) # Prints: hahahahaha
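One detail worth seeing in practice is that + only joins strings to other strings. The sketch below, with illustrative variable names, shows the error raised when a number is mixed in and one way to avoid it by converting with str().

```python
count = 3

# "+" only joins strings to strings; mixing in an int raises TypeError.
try:
    message = "You have " + count + " new messages"
except TypeError as error:
    print(f"TypeError: {error}")

# Converting the number with str() makes the concatenation valid.
message = "You have " + str(count) + " new messages"
print(message)  # You have 3 new messages

# Repetition combines naturally with concatenation, e.g. for a divider.
divider = "-" * 10
print(divider)  # ----------
```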
String Methods
Python has a set of built-in methods that you can use on strings. Here are some commonly used string methods:
- lower(): Converts a string into lower case.
- upper(): Converts a string into upper case.
- split(): Splits the string at the specified separator and returns a list of strings.
- replace(): Replaces a specified phrase with another specified phrase.
Here's an example of how to use these methods:
text = "Hello, World!"
print(text.lower()) # "hello, world!"
print(text.upper()) # "HELLO, WORLD!"
print(text.split(",")) # ['Hello', ' World!']
print(text.replace("World", "Python")) # "Hello, Python!"
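A few behaviors of these methods are easy to miss, so the following sketch illustrates them: each method returns a new string rather than changing the original, split() with no argument splits on any whitespace, and replace() accepts an optional count. The extra sample strings are made up for illustration.

```python
text = "Hello, World!"

# String methods return a new string; the original is left unchanged.
shouted = text.upper()
print(shouted)  # HELLO, WORLD!
print(text)     # Hello, World!

# split() with no argument splits on any run of whitespace.
pieces = "a  b\tc\nd".split()
print(pieces)  # ['a', 'b', 'c', 'd']

# replace() changes every occurrence unless a count is given.
print("ha ha ha".replace("ha", "ho"))     # ho ho ho
print("ha ha ha".replace("ha", "ho", 1))  # ho ha ha

# Methods can be chained because each call returns a string.
chained = text.lower().replace("world", "python")
print(chained)  # hello, python!
```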
Formatting Strings
Python provides several ways to format strings, to interpolate variables into strings, and to concatenate smaller strings into larger strings. Here are two common ways to format strings:
- f-strings: Introduced in Python 3.6, f-strings offer several benefits over the older .format() string method. With f-strings, you can embed expressions inside string literals, using curly braces {}.
name = "Alice"
print(f"Hello, {name}!") # "Hello, Alice!"
- The format() method: This is an older method for formatting strings, but it's still widely used. It's a good method to know if you're working with older Python code.
name = "Alice"
print("Hello, {}!".format(name)) # "Hello, Alice!"
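Both approaches go beyond inserting a single variable. The sketch below shows an expression inside an f-string and a format specifier that works the same way in both styles. The variables items, price, and total are made up for this example.

```python
name = "Alice"
items = 3
price = 4.5

# f-strings can hold arbitrary expressions, not just variable names.
summary = f"{name.upper()} bought {items} items."
print(summary)  # ALICE bought 3 items.

# A format spec after a colon controls the presentation;
# ".2f" means fixed-point notation with two decimal places.
total = items * price
with_fstring = f"Total: {total:.2f}"
print(with_fstring)  # Total: 13.50

# str.format() accepts the same format specs, with positional
# or named placeholders.
with_format = "Total: {:.2f}".format(total)
print(with_format)  # Total: 13.50
named = "{who} owes {amount:.2f}".format(who=name, amount=total)
print(named)  # Alice owes 13.50
```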
Project: Word Processing and Analysis
In this project, we will use string operations to process and analyze the Gettysburg Address. We'll count the number of words, the number of occurrences of a specific word, replace certain words in the text, and find the longest word.
# The text to analyze
text = """
Four score and seven years ago our fathers brought forth on this continent,
a new nation, conceived in Liberty, and dedicated to the proposition that all
men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any
nation so conceived and so dedicated, can long endure. We are met on a great
battlefield of that war. We have come to dedicate a portion of that field, as
a final resting place for those who here gave their lives that that nation
might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we cannot dedicate -- we cannot consecrate -- we
cannot hallow -- this ground. The brave men, living and dead, who struggled
here, have consecrated it, far above our poor power to add or detract. The
world will little note, nor long remember what we say here, but it can never
forget what they did here. It is for us the living, rather, to be dedicated
here to the unfinished work which they who fought here have thus far so nobly
advanced. It is rather for us to be here dedicated to the great task remaining
before us -- that from these honored dead we take increased devotion to that
cause for which they gave the last full measure of devotion -- that we here
highly resolve that these dead shall not have died in vain -- that this nation,
under God, shall have a new birth of freedom -- and that government of the
people, by the people, for the people, shall not perish from the earth.
"""
# Convert the text to lower case
text = text.lower()
# Count the number of sentences (assuming each sentence ends with a period)
num_sentences = text.count(".")
print(f"\nThe text contains {num_sentences} sentences.")
# Remove punctuation from the text for the rest of the analysis
text = text.replace(",", "")
text = text.replace(".", "")
text = text.replace("--", "")
# Split the text into words
words = text.split()
# Count the number of words
num_words = len(words)
print(f"The text contains {num_words} words.")
# Count the number of occurrences of a specific word
word_to_count = "nation"
count = words.count(word_to_count)
print(f"The word '{word_to_count}' occurs {count} times.")
# Replace a word in the text
text = text.replace("nation", "country")
print("\nText after replacing 'nation' with 'country':\n")
print(text)
# Find the longest word
longest_word = max(words, key=len)
print(f"\nThe longest word is '{longest_word}'.")
In this program, we first convert the text to lower case to ensure that word counting is case-insensitive. Then we count the number of sentences by counting the number of periods in the text.
To perform the rest of the analysis, we remove punctuation from the text using the replace() method. We split the text into words using the split() method, which by default splits on any run of whitespace (spaces, tabs, and newlines). We count the number of words by using the len() function on the list of words.
Next, we count the number of occurrences of a specific word using the count() method of the list of words. We replace all occurrences of "nation" with "country" in the text using the replace() method.
Finally, we find the longest word in the list of words using the max() function with the key=len argument, which tells max() to compare the strings by length and return the longest one (the first such word, if several share the maximum length).
Try to modify this program to analyze different texts and perform different types of word processing!
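One possible starting point for that exercise is to wrap the analysis in a function so it can be reused on any text. This is only a sketch: analyze_text is a hypothetical helper, not part of the program above, and it swaps the three replace() calls for str.translate() with string.punctuation. That removes all ASCII punctuation, including apostrophes and hyphens, so a word like "it's" becomes "its".

```python
import string


def analyze_text(text, word_to_count):
    """Return basic word statistics for text (hypothetical helper)."""
    lowered = text.lower()
    # Count periods before stripping punctuation, as in the program above.
    num_sentences = lowered.count(".")
    # Assumption: removing every ASCII punctuation character is acceptable.
    table = str.maketrans("", "", string.punctuation)
    words = lowered.translate(table).split()
    return {
        "sentences": num_sentences,
        "words": len(words),
        "occurrences": words.count(word_to_count.lower()),
        "longest": max(words, key=len) if words else "",
    }


# A short made-up sample; any string can be passed in instead.
sample = "The cat sat. The cat ran -- quickly, too."
stats = analyze_text(sample, "cat")
print(stats)
# {'sentences': 2, 'words': 8, 'occurrences': 2, 'longest': 'quickly'}
```

Returning a dictionary keeps the analysis separate from the printing, which makes it easier to compare results across several texts.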
In the next unit, we'll look at lists and explore how we can store, access, and manipulate multiple values.