Python 101
DATA STRUCTURES

Sets

Unordered collections of unique elements.

SECTION 01

The set model

A set is an unordered collection that stores each element only once. Adding an element that is already present does nothing. Iteration order is not guaranteed, unlike dicts, which preserve insertion order (Python 3.7 and later).
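
As a small illustration of the "stored only once" rule, the sketch below adds one existing and one new element to a set. The colors are made-up example data, not anything from the course.

```python
# Hypothetical example data, chosen only for illustration.
colors = {"red", "green"}

colors.add("red")     # already present, so the set is unchanged
colors.add("blue")    # new element, so the set grows by one

size = len(colors)    # 3
```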

There are two main reasons to reach for a set. The first is to deduplicate: set(items) collapses any repeats. The second is to answer "is this thing in this collection?" quickly. Membership tests with in run in O(1) time on average for sets, versus O(n) for lists.

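Like dict keys, set elements must be hashable. Strings, numbers, and tuples (as long as everything inside the tuple is hashable) are fine. Lists, dicts, and other sets are not, because they are mutable. There is no literal for the empty set: write set(), since {} already means an empty dict.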
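
The hashability rule and the empty-set quirk can be sketched as follows. The variable names are illustrative only, and frozenset (the built-in immutable set type) is shown as one possible way to nest sets, not as something this lesson covers.

```python
points = {(0, 0), (1, 2)}         # tuples of numbers are hashable

try:
    bad = {[0, 0]}                # a list is mutable, so it cannot be hashed
except TypeError as exc:
    error = str(exc)              # the message mentions an unhashable type

nested = {frozenset({1, 2})}      # frozenset is immutable, so it can be an element

empty_set = set()                 # an empty set
empty_dict = {}                   # an empty dict, not a set
```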

python
nums = [1, 2, 2, 3, 1]
unique = set(nums)      # {1, 2, 3}
2 in unique             # True
SECTION 02

Union, intersection, difference

Sets support the math operators directly. a | b is union (everything in either). a & b is intersection (only what is in both). a - b is difference (in a but not b). a ^ b is symmetric difference (in either but not both).

These are the same operations you would do with lists and a few for loops, except faster, shorter, and more obvious. If you find yourself writing nested loops to compare collections of unique things, switch to sets.
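
To make the comparison concrete, here is a minimal sketch that finds the common elements of two collections both ways. The user IDs and variable names are invented for the example.

```python
# Hypothetical data: user IDs from two sources.
signed_up = [101, 102, 103, 104]
purchased = [103, 104, 105]

# Loop version: every `in purchased` check scans the whole list.
both_loop = []
for user in signed_up:
    if user in purchased and user not in both_loop:
        both_loop.append(user)

# Set version: convert once, then take the intersection.
both_set = set(signed_up) & set(purchased)
```

Both versions find the same users. The set version is shorter, and it avoids rescanning a list for every element.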

The operators always return new sets and leave a and b unchanged. They also require a set on both sides. The matching methods (union, intersection, difference, symmetric_difference) accept any iterable as an argument, which is useful when you do not have a set yet. When both operands are already sets, the operators read better.
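
The difference between methods and operators shows up when the right-hand side is not a set. A short sketch, with made-up values:

```python
a = {1, 2, 3}
extra = [3, 4, 4]                  # a list, not a set

merged = a.union(extra)            # methods accept any iterable
common = a.intersection(extra)

try:
    a | extra                      # operators need a set on both sides
except TypeError:
    operator_failed = True
else:
    operator_failed = False

# `a` itself is untouched: union and intersection return new sets.
```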

python
a = {1, 2, 3}
b = {2, 3, 4}
a | b    # {1, 2, 3, 4}   union
a & b    # {2, 3}         intersection
a - b    # {1}            difference
a ^ b    # {1, 4}         symmetric difference