Strings, Sets, and Collection Choice
Key Takeaways
- Strings are immutable sequences of characters, so string indexing works but item assignment raises TypeError.
- String methods such as strip, lower, replace, and split return new values rather than modifying the original string.
- Sets store unique hashable elements, so duplicate elements are collapsed into one set member.
- Set membership is useful for checking whether a value is present, but set iteration order should not be relied on for exam output.
- Choose lists for ordered mutable data, tuples for fixed records, dictionaries for key-value lookup, strings for text, and sets for uniqueness.
Strings as immutable sequences
A string is an ordered sequence of characters. That means indexing, slicing, len(), iteration, and membership all work with strings.
word = 'python'
print(word[0]) # p
print(word[-1]) # n
print(word[1:4]) # yth
print('th' in word) # True
The same slice stop-exclusion rule applies: word[1:4] includes indexes 1, 2, and 3, but excludes index 4. Strings differ from lists because strings are immutable. You cannot replace a character in place.
word = 'python'
word[0] = 'P' # TypeError
To change text, build a new string and assign the name to that result.
word = 'python'
word = 'P' + word[1:]
print(word) # Python
String methods return new values
String methods do not mutate the original string. Methods such as lower(), upper(), strip(), replace(), and split() return new values.
| Method | Return value idea | Original string changed? |
|---|---|---|
lower() | New lowercase string | No |
strip() | New string without surrounding whitespace | No |
replace(old, new) | New string with replacements | No |
split(sep) | New list of pieces | No |
join(parts) | New combined string | No |
text = ' Exam '
clean = text.strip()
print(text) # original still has spaces
print(clean) # Exam
This contrasts with list methods such as append() and sort(), which mutate and return None.
Sets and uniqueness
A set is a mutable collection of unique hashable elements. It is useful when duplicates should disappear or when membership is the main operation.
seen = {'p', 'c', 'e', 'p'}
print(seen) # contains p, c, and e once each
print('p' in seen) # True
The printed order of a set is not a reliable exam-tracing target. Treat sets as unordered for PCEP purposes. Sets do not support indexing or slicing because there is no stable position to ask for.
letters = {'a', 'b'}
letters[0] # TypeError
Set elements must be hashable. Strings, numbers, booleans, and tuples of hashable elements can be set elements. Lists cannot.
Set operations
PCEP may expect basic set operation vocabulary.
| Operation | Operator | Meaning |
|---|---|---|
| Union | ` | ` |
| Intersection | & | Values in both sets |
| Difference | - | Values in left set but not right set |
| Symmetric difference | ^ | Values in exactly one set |
Sets also have mutating methods such as add(), remove(), discard(), and clear(). add() changes the set and returns None. Adding an element already in the set does not create a duplicate and does not raise an error.
Choosing the right collection
Use the collection type that matches the job:
- Use a list when order matters and the data will change.
- Use a tuple for a fixed-size record or multiple return values.
- Use a dictionary when each value should be found by a meaningful key.
- Use a string for text, remembering that text changes create new strings.
- Use a set when uniqueness and membership checks matter more than order.
For exam snippets, ask three quick questions: does position matter, can it mutate, and must values be unique? Those answers usually identify the correct collection and predict the behavior of the code.
What happens when this code runs? s = 'cat'; s[0] = 'b'
What does set([1, 1, 2, 2, 3]) contain?
Which collection is the best fit for fast membership checks where duplicates should be ignored?