Strings, Sets, and Collection Choice

Key Takeaways

  • Strings are immutable sequences of characters, so string indexing works but item assignment raises TypeError.
  • String methods such as strip, lower, replace, and split return new values rather than modifying the original string.
  • Sets store unique hashable elements, so duplicate elements are collapsed into one set member.
  • Set membership is useful for checking whether a value is present, but set iteration order should not be relied on for exam output.
  • Choose lists for ordered mutable data, tuples for fixed records, dictionaries for key-value lookup, strings for text, and sets for uniqueness.
Last updated: June 2026

Strings as immutable sequences

A string is an ordered sequence of characters. Because it is a sequence, indexing, slicing, len(), iteration, and membership all work.

word = 'python'
print(word[0])       # p
print(word[-1])      # n
print(word[1:4])     # yth
print('th' in word)  # True
print(len(word))     # 6

The same slice stop-exclusion rule applies: word[1:4] includes indexes 1, 2, and 3 but excludes index 4. Strings differ from lists because strings are immutable: you cannot replace a character in place.

word = 'python'
word[0] = 'P'     # TypeError

To change text, build a new string and rebind the name to that result. Concatenation with + and repetition with * both create new strings.

word = 'python'
word = 'P' + word[1:]
print(word)       # Python
print('ab' * 3)   # ababab

String methods return new values

String methods never mutate the original string; each call returns a new value (a string, a list, or an int, depending on the method). This contrasts with list methods such as append() and sort(), which mutate the list in place and return None.

Method              Returns                                      Original changed?
lower() / upper()   New cased string                             No
strip()             New string without surrounding whitespace    No
replace(old, new)   New string with replacements                 No
split(sep)          New list of pieces                           No
join(parts)         New combined string                          No
find(sub)           int index, or -1 if absent                   No
count(sub)          int number of occurrences                    No

text = '  Exam  '
clean = text.strip()
print(text)   # '  Exam  '  (unchanged)
print(clean)  # 'Exam'

print('a,b,c'.split(','))   # ['a', 'b', 'c']
print('-'.join(['1','2']))  # '1-2'
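
The other methods in the table follow the same pattern. The sketch below is only illustrative: title is a hypothetical sample string, and the variable names are assumptions chosen for readability.

```python
# Hypothetical sample text; any string would behave the same way.
title = 'Python Exam Prep'

lowered = title.lower()                   # new lowercase string
swapped = title.replace('Exam', 'Test')   # new string with the replacement
position = title.find('Exam')             # index where 'Exam' starts
missing = title.find('Java')              # -1 because 'Java' is absent
p_count = title.lower().count('p')        # occurrences of 'p' after lowercasing

print(lowered)    # python exam prep
print(swapped)    # Python Test Prep
print(position)   # 7
print(missing)    # -1
print(p_count)    # 3
print(title)      # Python Exam Prep  (unchanged)
```

Note that find() signals absence with -1 rather than an exception, so a result of -1 should not be used as an index without checking it first.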

Because methods return new values, chaining is common on PCEP: ' Yes '.strip().lower() evaluates to 'yes'. A frequent trap is writing text.strip() on its own line and expecting text to change; the result is discarded unless you assign it.
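
The discarded-result trap can be sketched as follows. The names answer, normalized, scores, and result are hypothetical; the list at the end is included only to show the contrasting behavior of a mutating list method.

```python
answer = ' Yes '

answer.strip()                        # result is computed, then discarded
unchanged = answer                    # still ' Yes '

normalized = answer.strip().lower()   # chained calls, result kept

# Contrast: a list method mutates in place and returns None.
scores = [3, 1, 2]
result = scores.sort()

print(repr(unchanged))   # ' Yes '
print(normalized)        # yes
print(scores)            # [1, 2, 3]
print(result)            # None
```

Writing text = text.strip() is the usual fix, whereas writing scores = scores.sort() is the mirror-image mistake, because it rebinds the name to None.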

Sets and uniqueness

A set is a mutable, unordered collection of unique hashable elements, written with curly braces (but {} alone is an empty dictionary; use set() for an empty set). Sets are useful when duplicates should disappear or when membership is the main operation.

seen = {'p', 'c', 'e', 'p'}
print(len(seen))    # 3  (one duplicate 'p' collapses)
print('p' in seen)  # True

The printed order of a set is not a reliable exam-tracing target; treat sets as unordered. Sets do not support indexing or slicing because there is no stable position.

letters = {'a', 'b'}
letters[0]        # TypeError: 'set' object is not subscriptable

Set elements must be hashable: strings, numbers, booleans, and tuples of hashable elements qualify; lists do not.
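
As a small illustration of the hashability rule, the sketch below builds one set from tuples and then attempts one from lists. The names coords and error_name are hypothetical.

```python
# Tuples of hashable values are allowed as set elements.
coords = {(0, 0), (1, 2), (0, 0)}
print(len(coords))        # 2  (the duplicate tuple collapses)

# Lists are mutable and unhashable, so this raises TypeError.
error_name = None
try:
    bad = {[1, 2], [3, 4]}
except TypeError as exc:
    error_name = type(exc).__name__

print(error_name)         # TypeError
```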

Set operations

Operation              Operator   Meaning
Union                  |          Values in either set
Intersection           &          Values in both sets
Difference             -          Values in left set but not right
Symmetric difference   ^          Values in exactly one set
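
The four operators can be traced on two small sets. This is a minimal sketch with hypothetical sets a and b; sorted() is used only to get a predictable print order, since sets themselves are unordered.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b          # values in either set
common = a & b         # values in both sets
only_a = a - b         # values in a but not in b
exactly_one = a ^ b    # values in exactly one of the sets

print(sorted(union))         # [1, 2, 3, 4]
print(sorted(common))        # [3]
print(sorted(only_a))        # [1, 2]
print(sorted(exactly_one))   # [1, 2, 4]
```

Difference is the only one of the four where operand order matters: b - a here would give {4}, not {1, 2}.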

Mutating methods include add(), remove() (KeyError if absent), discard() (no error if absent), and clear(). add() changes the set and returns None. Adding an element already present neither duplicates it nor raises an error.
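
A short sketch of these methods in action, using a hypothetical set named tags:

```python
tags = {'core', 'exam'}

returned = tags.add('sets')   # mutates the set; add() returns None
tags.add('core')              # already present: no duplicate, no error
tags.discard('missing')       # absent: silently ignored

remove_failed = False
try:
    tags.remove('missing')    # absent: raises KeyError
except KeyError:
    remove_failed = True

snapshot = sorted(tags)       # sorted list for a predictable order
print(returned)               # None
print(snapshot)               # ['core', 'exam', 'sets']
print(remove_failed)          # True

tags.clear()
print(len(tags))              # 0
```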

Choosing the right collection

  • Use a list when order matters and the data will change.
  • Use a tuple for a fixed-size record or multiple return values.
  • Use a dictionary when each value should be found by a meaningful key.
  • Use a string for text, remembering that text changes create new strings.
  • Use a set when uniqueness and fast membership matter more than order.

For exam snippets, ask three quick questions: does position matter, can it mutate, and must values be unique? Those answers usually identify the correct collection and predict the code's behavior.
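
One way to see the three questions at work is to store similar data in each collection. The data and names below are hypothetical and chosen only for illustration.

```python
attempts = [72, 85, 85]                  # list: ordered, mutable, duplicates kept
attempts.append(90)

point = (3, 4)                           # tuple: fixed-size record
scores_by_name = {'ana': 85, 'li': 90}   # dict: lookup by meaningful key
label = 'PCEP'                           # str: immutable text
distinct = set(attempts)                 # set: unique values only

print(attempts)               # [72, 85, 85, 90]
print(point[0])               # 3
print(scores_by_name['li'])   # 90
print(label[0])               # P
print(len(distinct))          # 3
```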

Iterating strings, indexing characters, and set construction traps

Because a string is a sequence of characters, looping with for ch in 'abc' yields the single-character strings 'a', 'b', then 'c'; Python has no separate character type, so each item is itself a length-one string. Indexing a string always returns a one-character string, never an integer code, so 'cat'[0] is 'c', and to obtain the numeric code point you must call ord('c'), with chr(99) converting back. This matters when a snippet compares characters: 'a' < 'b' is True because comparison uses code-point order, and uppercase letters sort before lowercase because their code points are smaller.
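
These character rules can be checked directly. The sketch below uses hypothetical names pieces and first.

```python
pieces = []
for ch in 'abc':          # each item is a one-character string
    pieces.append(ch)
print(pieces)             # ['a', 'b', 'c']

first = 'cat'[0]
print(first)              # c
print(len(first))         # 1

print(ord('c'))           # 99
print(chr(99))            # c

print('a' < 'b')          # True
print('Z' < 'a')          # True  (uppercase code points are smaller)
```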

For sets, the construction rules generate frequent exam traps. Writing s = {} creates an empty dictionary, not an empty set, so an empty set must be created with set(). Passing a string to the constructor splits it into characters: set('hello') produces the set of unique characters {'h', 'e', 'l', 'o'}, collapsing the repeated 'l'. Because sets are unordered, never predict a specific print order for a set on the exam; instead, reason about which unique elements survive.
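
Both construction traps can be confirmed with a few lines. The variable names are hypothetical, and sorted() is used only so the printed order is predictable.

```python
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)   # dict
print(type(empty_set).__name__)      # set

letters = set('hello')
print(len(letters))                  # 4
print(sorted(letters))               # ['e', 'h', 'l', 'o']
```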

Combining these collection types is common in real questions: converting a list to a set with set(my_list) removes duplicates, and wrapping that in len() counts the distinct values. The exam uses this pattern to test whether you understand that uniqueness and membership, not position, are the defining traits of a set.
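
A minimal sketch of the distinct-count pattern, assuming a hypothetical list named votes:

```python
votes = ['red', 'blue', 'red', 'green', 'blue']

unique_votes = set(votes)             # duplicates collapse
distinct_count = len(unique_votes)    # number of distinct values

print(len(votes))                # 5
print(distinct_count)            # 3
print('red' in unique_votes)     # True
```

The original list is untouched by the conversion, so len(votes) still reports all five entries.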

Test Your Knowledge

What happens when this code runs? s = 'cat'; s[0] = 'b'

Test Your Knowledge

What does set([1, 1, 2, 2, 3]) contain?

Test Your Knowledge

Which collection is the best fit for fast membership checks where duplicates should be ignored?
