Python – Dictionaries & Sets

Dictionaries and sets are two of the most powerful and frequently used data structures in Python. Dictionaries give you a fast, flexible way to associate keys with values — think of them as a lookup table where you can instantly retrieve data by its label. Sets give you an unordered collection of unique elements with blazing-fast membership testing. Together, they solve a huge range of real-world programming problems: configuration management, deduplication, counting, caching, grouping, and more. If you are writing Python professionally, you will reach for dicts and sets daily.

In this tutorial, we will cover both data structures thoroughly — from creation and basic operations through advanced patterns like defaultdict, Counter, set algebra, and frozensets. By the end, you will understand not just the syntax, but when and why to choose each structure.

Part 1: Dictionaries

Introduction to Dictionaries

A dictionary (dict) is a mutable collection of key-value pairs. Since Python 3.7, dictionaries are guaranteed to preserve insertion order; in earlier versions, order was not guaranteed. Each key must be unique and hashable (strings, numbers, and tuples that contain only hashable items), and each key maps to exactly one value. Dictionaries are implemented as hash tables, which means lookups, insertions, and deletions all run in O(1) average time, regardless of how many entries the dictionary contains.
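One consequence of hash-based lookup is worth seeing once: keys are matched by hash and equality, not by type. The short sketch below uses an illustrative variable name (flags) to show three different-looking keys landing in the same slot.

```python
# 1, 1.0, and True compare equal and hash identically,
# so they all refer to the same dictionary entry.
flags = {1: "int"}
flags[1.0] = "float"   # overwrites the value stored under 1
flags[True] = "bool"   # overwrites it again

print(flags)       # {1: 'bool'}
print(len(flags))  # 1
```

The original key object (the integer 1) is kept; only the value changes.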

Use a dictionary when you need to:

  • Map identifiers to data (user ID to user record, config key to value)
  • Count occurrences of items
  • Group data by category
  • Build lookup tables for fast retrieval
  • Represent structured data (similar to JSON objects)

Creating Dictionaries

There are several ways to create a dictionary in Python. Choose the one that best fits your situation.

Literal syntax (most common)

# Curly braces with key: value pairs
user = {
    "name": "Folau",
    "age": 30,
    "city": "Salt Lake City",
    "is_active": True
}
print(user)
# {'name': 'Folau', 'age': 30, 'city': 'Salt Lake City', 'is_active': True}

The dict() constructor

# From keyword arguments (keys must be valid identifiers)
user = dict(name="Folau", age=30, city="Salt Lake City")
print(user)
# {'name': 'Folau', 'age': 30, 'city': 'Salt Lake City'}

# From a list of tuples
pairs = [("host", "localhost"), ("port", 5432), ("db", "myapp")]
config = dict(pairs)
print(config)
# {'host': 'localhost', 'port': 5432, 'db': 'myapp'}

# From two parallel lists using zip
keys = ["name", "language", "level"]
values = ["Folau", "Python", "Senior"]
profile = dict(zip(keys, values))
print(profile)
# {'name': 'Folau', 'language': 'Python', 'level': 'Senior'}

Dictionary comprehension

# Create a dict of squares
squares = {x: x ** 2 for x in range(1, 6)}
print(squares)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

dict.fromkeys() — initialize with a default value

# All keys get the same default value
statuses = dict.fromkeys(["Alice", "Bob", "Charlie"], "pending")
print(statuses)
# {'Alice': 'pending', 'Bob': 'pending', 'Charlie': 'pending'}

# Without a default, values are None
placeholders = dict.fromkeys(["name", "email", "phone"])
print(placeholders)
# {'name': None, 'email': None, 'phone': None}
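One caveat with dict.fromkeys(): the default value is not copied, so a mutable default such as a list is shared by every key. A small sketch of the problem and a comprehension-based alternative (the names shared and independent are illustrative):

```python
# Every key points at the SAME list object
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)
print(shared)  # {'a': [1], 'b': [1]}

# A comprehension creates a fresh list for each key
independent = {key: [] for key in ["a", "b"]}
independent["a"].append(1)
print(independent)  # {'a': [1], 'b': []}
```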

Accessing Values

You can retrieve values from a dictionary using bracket notation or the get() method. The key difference: brackets raise a KeyError if the key does not exist, while get() returns a default value (defaulting to None).

user = {"name": "Folau", "age": 30, "city": "Salt Lake City"}

# Bracket notation
print(user["name"])    # Folau
# print(user["email"])  # KeyError: 'email'

# get() with default - safer for uncertain keys
print(user.get("name"))           # Folau
print(user.get("email"))          # None (no KeyError)
print(user.get("email", "N/A"))   # N/A (custom default)

You can also retrieve all keys, values, or key-value pairs as view objects. These views are dynamic — they reflect changes to the dictionary in real time.

user = {"name": "Folau", "age": 30, "city": "Salt Lake City"}

# Keys
print(user.keys())    # dict_keys(['name', 'age', 'city'])

# Values
print(user.values())  # dict_values(['Folau', 30, 'Salt Lake City'])

# Key-value pairs as tuples
print(user.items())   # dict_items([('name', 'Folau'), ('age', 30), ('city', 'Salt Lake City')])

# Check if a key exists
print("name" in user)    # True
print("email" in user)   # False
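To make the "dynamic view" behavior concrete, here is a short sketch. It also shows that key views support set-style operations, and that list() takes a snapshot when you need a copy that will not change.

```python
user = {"name": "Folau", "age": 30}
keys = user.keys()  # a live view, not a copy

user["city"] = "Salt Lake City"
print(list(keys))  # ['name', 'age', 'city']

# Key views support set-style operations
required = {"name", "email"}
common = keys & required
missing = required.difference(keys)
print(common)   # {'name'}
print(missing)  # {'email'}

# Use list() to take a snapshot that will not change
snapshot = list(keys)
del user["age"]
print(snapshot)    # ['name', 'age', 'city']
print(list(keys))  # ['name', 'city']
```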

Modifying Dictionaries

Dictionaries are mutable. You can add new key-value pairs, update existing ones, and merge dictionaries together.

Add or update a single key

user = {"name": "Folau", "age": 30}

# Add a new key
user["email"] = "folau@example.com"

# Update an existing key
user["age"] = 31

print(user)
# {'name': 'Folau', 'age': 31, 'email': 'folau@example.com'}

update() — merge another dictionary or key-value pairs

config = {"host": "localhost", "port": 5432}

# Merge from another dict (existing keys are overwritten)
config.update({"port": 3306, "database": "myapp"})
print(config)
# {'host': 'localhost', 'port': 3306, 'database': 'myapp'}

# Merge from keyword arguments
config.update(user="admin", password="secret")
print(config)
# {'host': 'localhost', 'port': 3306, 'database': 'myapp', 'user': 'admin', 'password': 'secret'}

setdefault() — set a key only if it does not exist

user = {"name": "Folau", "age": 30}

# Key does not exist - sets it and returns the value
email = user.setdefault("email", "folau@example.com")
print(email)  # folau@example.com
print(user)   # {'name': 'Folau', 'age': 30, 'email': 'folau@example.com'}

# Key already exists - does nothing, returns existing value
name = user.setdefault("name", "Unknown")
print(name)   # Folau (not overwritten)

Merge operators | and |= (Python 3.9+)

# The | operator creates a new merged dictionary
defaults = {"theme": "dark", "language": "en", "page_size": 25}
overrides = {"theme": "light", "page_size": 50}

final = defaults | overrides
print(final)
# {'theme': 'light', 'language': 'en', 'page_size': 50}

# The |= operator updates in place
defaults |= overrides
print(defaults)
# {'theme': 'light', 'language': 'en', 'page_size': 50}
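If your code has to run on Python versions older than 3.9, dictionary unpacking gives the same "right-hand side wins" result as the | operator. A minimal sketch:

```python
defaults = {"theme": "dark", "language": "en", "page_size": 25}
overrides = {"theme": "light", "page_size": 50}

# Dictionary unpacking works on Python 3.5+
merged = {**defaults, **overrides}
print(merged)
# {'theme': 'light', 'language': 'en', 'page_size': 50}

# Neither input is modified
print(defaults["theme"])  # dark
```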

Removing Items

Python provides several ways to remove entries from a dictionary, each with different behavior.

user = {"name": "Folau", "age": 30, "city": "Salt Lake City", "email": "folau@example.com"}

# del - remove a specific key (raises KeyError if missing)
del user["email"]
print(user)
# {'name': 'Folau', 'age': 30, 'city': 'Salt Lake City'}

# pop() - remove and return the value (with optional default)
age = user.pop("age")
print(age)    # 30
print(user)   # {'name': 'Folau', 'city': 'Salt Lake City'}

# pop() with default avoids KeyError
missing = user.pop("phone", "not found")
print(missing)  # not found

# popitem() - remove and return the last inserted key-value pair
user["role"] = "developer"
user["level"] = "senior"
last = user.popitem()
print(last)   # ('level', 'senior')
print(user)   # {'name': 'Folau', 'city': 'Salt Lake City', 'role': 'developer'}

# clear() - remove all entries
user.clear()
print(user)   # {}

Iterating Over Dictionaries

Dictionaries support several iteration patterns. The default behavior iterates over keys.

user = {"name": "Folau", "age": 30, "city": "Salt Lake City"}

# Iterate over keys (default)
for key in user:
    print(key)
# name
# age
# city

# Iterate over values
for value in user.values():
    print(value)
# Folau
# 30
# Salt Lake City

# Iterate over key-value pairs (most common)
for key, value in user.items():
    print(f"{key}: {value}")
# name: Folau
# age: 30
# city: Salt Lake City

# With enumerate (when you also need an index)
for index, (key, value) in enumerate(user.items()):
    print(f"{index}. {key} = {value}")
# 0. name = Folau
# 1. age = 30
# 2. city = Salt Lake City

Dictionary Comprehensions

Dictionary comprehensions let you build dictionaries in a single expression, similar to list comprehensions. They are concise, readable, and often more performant than building a dict with a loop.

Basic comprehension

# Square numbers
squares = {n: n ** 2 for n in range(1, 8)}
print(squares)
# {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49}

# Swap keys and values
original = {"a": 1, "b": 2, "c": 3}
flipped = {v: k for k, v in original.items()}
print(flipped)
# {1: 'a', 2: 'b', 3: 'c'}

With conditions

scores = {"Alice": 92, "Bob": 67, "Charlie": 85, "Diana": 45, "Eve": 78}

# Only passing scores (>= 70)
passing = {name: score for name, score in scores.items() if score >= 70}
print(passing)
# {'Alice': 92, 'Charlie': 85, 'Eve': 78}

# Categorize scores
grades = {
    name: ("A" if score >= 90 else "B" if score >= 80 else "C" if score >= 70 else "F")
    for name, score in scores.items()
}
print(grades)
# {'Alice': 'A', 'Bob': 'F', 'Charlie': 'B', 'Diana': 'F', 'Eve': 'C'}

Nested comprehension

# Multiplication table as a nested dict
table = {
    i: {j: i * j for j in range(1, 4)}
    for i in range(1, 4)
}
print(table)
# {1: {1: 1, 2: 2, 3: 3}, 2: {1: 2, 2: 4, 3: 6}, 3: {1: 3, 2: 6, 3: 9}}
print(table[2][3])  # 6

Nested Dictionaries

Dictionaries can contain other dictionaries as values, creating a tree-like structure. This is the natural way to represent JSON data, configuration files, and hierarchical records in Python.

# A nested structure representing company data (JSON-like)
company = {
    "name": "Tech Corp",
    "founded": 2015,
    "departments": {
        "engineering": {
            "head": "Folau",
            "team_size": 25,
            "technologies": ["Python", "Java", "AWS"]
        },
        "marketing": {
            "head": "Sarah",
            "team_size": 10,
            "budget": 500000
        }
    },
    "locations": [
        {"city": "Salt Lake City", "is_hq": True},
        {"city": "San Francisco", "is_hq": False}
    ]
}

# Accessing nested data
print(company["departments"]["engineering"]["head"])          # Folau
print(company["departments"]["engineering"]["technologies"])  # ['Python', 'Java', 'AWS']
print(company["locations"][0]["city"])                        # Salt Lake City

# Safe nested access with get()
budget = company.get("departments", {}).get("sales", {}).get("budget", 0)
print(budget)  # 0 (no KeyError even though 'sales' does not exist)

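For deeply nested structures, chaining get() calls is a common defensive pattern. Each call falls back to an empty dict when the key is missing, so the next get() still works without raising an error. The pattern assumes every intermediate value is a dict: if a key exists but holds None or another non-dict value, the next get() raises AttributeError.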
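When the chain gets long, a small helper can walk the path for you. The sketch below is one possible approach, not a standard library function: deep_get and the _MISSING sentinel are hypothetical names introduced for illustration.

```python
_MISSING = object()  # sentinel, so a stored None is still returned as-is

def deep_get(data, path, default=None):
    """Hypothetical helper: follow a dotted path through nested dicts."""
    current = data
    for key in path.split("."):
        # Stop as soon as we reach something that is not a dict
        if not isinstance(current, dict):
            return default
        current = current.get(key, _MISSING)
        if current is _MISSING:
            return default
    return current

company = {"departments": {"engineering": {"head": "Folau", "budget": None}}}

print(deep_get(company, "departments.engineering.head"))        # Folau
print(deep_get(company, "departments.sales.budget", 0))         # 0
print(deep_get(company, "departments.engineering.head.first"))  # None
print(deep_get(company, "departments.engineering.budget", 0))   # None (stored value)
```

Using a sentinel rather than None lets the helper tell "key is absent" apart from "key is present with the value None".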

DefaultDict, OrderedDict, and Counter

The collections module provides specialized dictionary subclasses that handle common patterns more elegantly than a plain dict.

defaultdict — auto-initialize missing keys

from collections import defaultdict

# With a regular dict, you must check if a key exists before appending
groups = {}
words = ["apple", "banana", "avocado", "blueberry", "cherry", "apricot"]
for word in words:
    first_letter = word[0]
    if first_letter not in groups:
        groups[first_letter] = []
    groups[first_letter].append(word)

# With defaultdict, the factory function handles initialization
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word)

print(dict(groups))
# {'a': ['apple', 'avocado', 'apricot'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# defaultdict with int (perfect for counting)
word_count = defaultdict(int)
for word in ["apple", "banana", "apple", "cherry", "banana", "apple"]:
    word_count[word] += 1

print(dict(word_count))
# {'apple': 3, 'banana': 2, 'cherry': 1}
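One behavior of defaultdict that can surprise you: merely reading a missing key with brackets inserts it. The sketch below shows the effect and two lookups that do not trigger the factory.

```python
from collections import defaultdict

counts = defaultdict(int)
counts["apple"] += 1

# Reading a missing key with [] creates it
print(counts["banana"])  # 0
print(dict(counts))      # {'apple': 1, 'banana': 0}

# `in` and get() do not call the factory
print("cherry" in counts)    # False
print(counts.get("cherry"))  # None
print(len(counts))           # 2
```

If phantom keys would be a problem, test with in or use get() when you only want to look.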

OrderedDict — dictionary with guaranteed order

from collections import OrderedDict

# Since Python 3.7, regular dicts preserve insertion order.
# OrderedDict is still useful for two reasons:
# 1. It supports move_to_end() and popitem(last=False)
# 2. Order matters in equality comparison

od = OrderedDict()
od["first"] = 1
od["second"] = 2
od["third"] = 3

# Move an item to the end
od.move_to_end("first")
print(list(od.keys()))  # ['second', 'third', 'first']

# Move to the beginning
od.move_to_end("third", last=False)
print(list(od.keys()))  # ['third', 'second', 'first']

# Pop from the front (FIFO behavior)
od.popitem(last=False)  # Removes 'third'
print(list(od.keys()))  # ['second', 'first']

# Equality comparison considers order
dict1 = OrderedDict(a=1, b=2)
dict2 = OrderedDict(b=2, a=1)
print(dict1 == dict2)  # False (order differs)

# Regular dicts ignore order in comparison
print({"a": 1, "b": 2} == {"b": 2, "a": 1})  # True
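The move_to_end() and popitem(last=False) methods are enough to sketch a small least-recently-used cache. The LRUCache class below is a hypothetical illustration of that idea, not a production implementation.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical sketch of a least-recently-used cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # over capacity: evicts "b"
print(list(cache._items))  # ['a', 'c']
print(cache.get("b"))      # None
```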

Counter — count occurrences effortlessly

from collections import Counter

# Count elements in a list
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple", "date"]
fruit_count = Counter(fruits)
print(fruit_count)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1, 'date': 1})

# Most common elements
print(fruit_count.most_common(2))
# [('apple', 3), ('banana', 2)]

# Count characters in a string
char_count = Counter("mississippi")
print(char_count)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

# Arithmetic with Counters
inventory = Counter(apples=5, oranges=3, bananas=2)
sold = Counter(apples=2, oranges=1)
remaining = inventory - sold
print(remaining)
# Counter({'apples': 3, 'oranges': 2, 'bananas': 2})

# Combine inventories
new_stock = Counter(apples=10, grapes=5)
total = remaining + new_stock
print(total)
# Counter({'apples': 13, 'grapes': 5, 'oranges': 2, 'bananas': 2})
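A detail of Counter arithmetic worth knowing: the - operator silently drops results that are zero or negative, while the subtract() method keeps them. A short sketch with made-up stock numbers:

```python
from collections import Counter

inventory = Counter(apples=2, oranges=3)
sold = Counter(apples=5, oranges=1)

# The - operator keeps only positive counts
print(inventory - sold)  # Counter({'oranges': 2})

# subtract() updates in place and keeps zero and negative counts
inventory.subtract(sold)
print(inventory)  # Counter({'oranges': 2, 'apples': -3})

# Missing keys count as zero instead of raising KeyError
print(inventory["grapes"])  # 0
```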

Practical Dictionary Examples

Word frequency counter

from collections import Counter

def word_frequency(text):
    """
    Count the frequency of each word in a text.
    Returns a dictionary sorted by frequency (descending).
    """
    # Normalize: lowercase and split on whitespace
    words = text.lower().split()
    # Strip leading and trailing punctuation from each word
    cleaned = [word.strip(".,!?;:\"'()") for word in words]
    # Drop tokens that were pure punctuation, then count
    counts = Counter(word for word in cleaned if word)
    # most_common() returns (word, count) pairs sorted by frequency
    return dict(counts.most_common())

sample = """Python is great. Python is powerful.
Python is used by developers who love Python."""

result = word_frequency(sample)
for word, count in result.items():
    print(f"  {word}: {count}")
# python: 4
# is: 3
# great: 1
# powerful: 1
# used: 1
# by: 1
# developers: 1
# who: 1
# love: 1

Configuration manager

class ConfigManager:
    """
    A simple configuration manager that supports defaults,
    environment-specific overrides, and dot-notation-style access.
    """

    def __init__(self, defaults=None):
        self._config = defaults.copy() if defaults else {}

    def load_env(self, env_name, overrides):
        """Apply environment-specific overrides."""
        self._config["environment"] = env_name
        self._config.update(overrides)

    def get(self, key, default=None):
        """Retrieve a config value with an optional default."""
        keys = key.split(".")
        value = self._config
        for k in keys:
            if isinstance(value, dict):
                value = value.get(k)
            else:
                return default
            if value is None:
                return default
        return value

    def set(self, key, value):
        """Set a config value."""
        self._config[key] = value

    def to_dict(self):
        return self._config.copy()


# Usage
defaults = {
    "app_name": "MyApp",
    "debug": False,
    "database": {
        "host": "localhost",
        "port": 5432,
        "name": "myapp_db"
    },
    "cache_ttl": 300
}

config = ConfigManager(defaults)
config.load_env("production", {
    "debug": False,
    "database": {
        "host": "db.production.com",
        "port": 5432,
        "name": "myapp_prod"
    },
    "cache_ttl": 3600
})

print(config.get("app_name"))          # MyApp
print(config.get("database.host"))     # db.production.com
print(config.get("missing_key", 42))   # 42
print(config.get("environment"))       # production
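One thing to keep in mind with the ConfigManager above: load_env() relies on dict.update(), which is a shallow merge. A nested dict in the overrides replaces the whole nested dict in the defaults. If partial overrides are needed, a recursive merge is one option. The deep_merge function below is a hypothetical helper sketched for illustration.

```python
def deep_merge(base, overrides):
    """Hypothetical helper: return a new dict with overrides merged recursively."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"debug": False, "database": {"host": "localhost", "port": 5432}}
overrides = {"database": {"host": "db.production.com"}}

# update() replaces the whole nested dict, losing "port"
shallow = dict(defaults)
shallow.update(overrides)
print(shallow["database"])  # {'host': 'db.production.com'}

# The recursive merge keeps keys that were not overridden
deep = deep_merge(defaults, overrides)
print(deep["database"])      # {'host': 'db.production.com', 'port': 5432}
print(defaults["database"])  # {'host': 'localhost', 'port': 5432} (unchanged)
```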

Caching with memoization

import functools
import time

def memoize(func):
    """
    A simple memoization decorator using a dictionary cache.
    Caches results of expensive function calls.
    Supports positional, hashable arguments only.
    """
    cache = {}

    @functools.wraps(func)  # Keep the original function's name and docstring
    def wrapper(*args):
        if args in cache:
            print(f"  Cache hit for {args}")
            return cache[args]
        print(f"  Computing result for {args}")
        result = func(*args)
        cache[args] = result
        return result

    wrapper.cache = cache  # Expose cache for inspection
    return wrapper


@memoize
def expensive_computation(n):
    """Simulate an expensive operation."""
    time.sleep(0.1)  # Simulate delay
    return n ** 3 + n ** 2 + n + 1


# First call - computes and caches
result1 = expensive_computation(10)
print(f"Result: {result1}")  # Result: 1111

# Second call - returns from cache instantly
result2 = expensive_computation(10)
print(f"Result: {result2}")  # Result: 1111

# Different argument - computes and caches
result3 = expensive_computation(5)
print(f"Result: {result3}")  # Result: 156

# Inspect the cache
print(f"Cache contents: {expensive_computation.cache}")
# Cache contents: {(10,): 1111, (5,): 156}

For production code, Python provides functools.lru_cache, which handles this pattern and adds features such as a maximum cache size, hit/miss statistics via cache_info(), and a cache that stays consistent when accessed from multiple threads.
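As a rough sketch of how that looks in practice, the example below applies lru_cache to a stand-in function (cube_sum is an illustrative name) and inspects the cache statistics.

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def cube_sum(n):
    """Stand-in for an expensive computation."""
    return n ** 3 + n ** 2 + n + 1

print(cube_sum(10))  # 1111 (computed)
print(cube_sum(10))  # 1111 (served from the cache)

info = cube_sum.cache_info()
print(info.hits, info.misses)  # 1 1

cube_sum.cache_clear()  # empty the cache when results may be stale
print(cube_sum.cache_info().currsize)  # 0
```

As with the hand-written decorator, all arguments must be hashable.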

Part 2: Sets

Introduction to Sets

A set is an unordered collection of unique, hashable elements. Sets are implemented as hash tables (like dictionary keys without values), which gives them O(1) average time for membership testing, insertion, and deletion. If you need to check whether something is “in” a collection, a set is almost always the right choice — it is dramatically faster than scanning a list.

Sets are ideal when you need to:

  • Eliminate duplicate entries from a collection
  • Perform mathematical set operations (union, intersection, difference)
  • Test membership efficiently
  • Find common or unique elements between collections

Creating Sets

Sets can be created using curly braces or the set() constructor.

# Literal syntax with curly braces
fruits = {"apple", "banana", "cherry"}
print(fruits)       # {'cherry', 'apple', 'banana'} (order may vary)
print(type(fruits)) # <class 'set'>

# Using the set() constructor
numbers = set([1, 2, 3, 4, 5])
print(numbers)  # {1, 2, 3, 4, 5}

# From a string (each character becomes an element)
letters = set("hello")
print(letters)  # {'h', 'e', 'l', 'o'} (duplicates removed)

# IMPORTANT: empty set must use set(), not {}
empty_set = set()     # Correct: empty set
empty_dict = {}       # This is an empty DICTIONARY, not a set!
print(type(empty_set))   # <class 'set'>
print(type(empty_dict))  # <class 'dict'>

Deduplication — removing duplicates from a list

# The simplest way to remove duplicates
numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
unique = list(set(numbers))
print(unique)  # e.g. [1, 2, 3, 4, 5, 6, 9] (original order is lost; set order is arbitrary)

# To preserve original order (works on any Python version)
def deduplicate(items):
    """Remove duplicates while preserving insertion order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

print(deduplicate(numbers))  # [3, 1, 4, 5, 9, 2, 6]

# Or use dict.fromkeys() (preserves order, Python 3.7+)
print(list(dict.fromkeys(numbers)))  # [3, 1, 4, 5, 9, 2, 6]

Set comprehension

# Create a set using comprehension syntax
even_squares = {x ** 2 for x in range(1, 11) if x % 2 == 0}
print(even_squares)  # {4, 16, 36, 64, 100} (order may vary)

Set Operations

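Sets support all the standard mathematical set operations. Each operation is available as both a method and an operator. Because sets are unordered, the element order in the printed results below may differ on your machine.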

python_devs = {"Alice", "Bob", "Charlie", "Diana"}
java_devs = {"Bob", "Diana", "Eve", "Frank"}

# UNION - all elements from both sets
# Method: .union() | Operator: |
all_devs = python_devs.union(java_devs)
print(all_devs)
# {'Alice', 'Bob', 'Charlie', 'Diana', 'Eve', 'Frank'}

all_devs = python_devs | java_devs  # Same result

# INTERSECTION - elements in both sets
# Method: .intersection() | Operator: &
both = python_devs.intersection(java_devs)
print(both)  # {'Bob', 'Diana'}

both = python_devs & java_devs  # Same result

# DIFFERENCE - elements in first set but not in second
# Method: .difference() | Operator: -
python_only = python_devs.difference(java_devs)
print(python_only)  # {'Alice', 'Charlie'}

python_only = python_devs - java_devs  # Same result

java_only = java_devs - python_devs
print(java_only)  # {'Eve', 'Frank'}

# SYMMETRIC DIFFERENCE - elements in either set but not both
# Method: .symmetric_difference() | Operator: ^
exclusive = python_devs.symmetric_difference(java_devs)
print(exclusive)  # {'Alice', 'Charlie', 'Eve', 'Frank'}

exclusive = python_devs ^ java_devs  # Same result

The operator forms (|, &, -, ^) require both operands to be sets (or frozensets). The method forms accept any iterable as the argument, which can be more flexible.

# Method accepts any iterable
my_set = {1, 2, 3}
result = my_set.union([4, 5, 6])        # Works with a list
print(result)  # {1, 2, 3, 4, 5, 6}

# Operator requires a set
# my_set | [4, 5, 6]  # TypeError: unsupported operand type(s)
result = my_set | set([4, 5, 6])         # Must convert to set first

Set Methods

Beyond set operations, sets provide methods for adding, removing, and testing relationships between sets.

skills = {"Python", "Java", "SQL"}

# add() - add a single element
skills.add("Docker")
print(skills)  # {'Python', 'Java', 'SQL', 'Docker'}

# Adding a duplicate has no effect
skills.add("Python")
print(skills)  # {'Python', 'Java', 'SQL', 'Docker'}

# remove() - remove an element (raises KeyError if missing)
skills.remove("Java")
print(skills)  # {'Python', 'SQL', 'Docker'}
# skills.remove("Go")  # KeyError: 'Go'

# discard() - remove an element (NO error if missing)
skills.discard("Go")      # No error
skills.discard("Docker")  # Removes Docker
print(skills)  # {'Python', 'SQL'}

# pop() - remove and return an arbitrary element
skills = {"Python", "Java", "SQL", "Docker"}
removed = skills.pop()
print(f"Removed: {removed}")  # Removed: (arbitrary element)

# clear() - remove all elements
skills.clear()
print(skills)  # set()

Subset and superset testing

backend_skills = {"Python", "Java", "SQL", "Docker", "AWS"}
my_skills = {"Python", "SQL"}

# issubset() - is every element of my_skills in backend_skills?
print(my_skills.issubset(backend_skills))    # True
print(my_skills <= backend_skills)            # True (operator form)

# issuperset() - does backend_skills contain all of my_skills?
print(backend_skills.issuperset(my_skills))  # True
print(backend_skills >= my_skills)            # True (operator form)

# Proper subset (subset but not equal)
print(my_skills < backend_skills)  # True
print(backend_skills < backend_skills)  # False (equal, not proper subset)

# isdisjoint() - do the sets share NO elements?
frontend = {"React", "CSS", "JavaScript"}
print(frontend.isdisjoint(backend_skills))  # True (no overlap)
print(my_skills.isdisjoint(backend_skills)) # False (overlap exists)

Frozen Sets

A frozenset is an immutable version of a set. Once created, you cannot add or remove elements. Because frozensets are immutable and hashable, they can be used as dictionary keys or as elements of another set — something regular sets cannot do.

# Create a frozenset
immutable_skills = frozenset(["Python", "Java", "SQL"])
print(immutable_skills)  # frozenset({'Python', 'Java', 'SQL'})

# All read operations work
print("Python" in immutable_skills)  # True
print(len(immutable_skills))          # 3

# Set operations return new frozensets
more_skills = frozenset(["Docker", "Python"])
combined = immutable_skills | more_skills
print(combined)  # frozenset({'Python', 'Java', 'SQL', 'Docker'})

# Mutation is not allowed
# immutable_skills.add("Go")     # AttributeError
# immutable_skills.remove("SQL") # AttributeError

# Use as dictionary keys (regular sets cannot do this)
permissions = {
    frozenset(["read"]): "viewer",
    frozenset(["read", "write"]): "editor",
    frozenset(["read", "write", "admin"]): "admin"
}

user_perms = frozenset(["read", "write"])
print(permissions[user_perms])  # editor

# Use as elements of another set
set_of_sets = {frozenset([1, 2]), frozenset([3, 4])}
print(set_of_sets)  # {frozenset({1, 2}), frozenset({3, 4})}

Use frozenset when you need a set that should never change after creation — for example, representing a fixed set of permissions, a cache key based on a combination of values, or a constant lookup table.
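The "cache key based on a combination of values" case can be sketched as follows. Because a frozenset ignores order, the pair (a, b) and the pair (b, a) map to the same key. The distance values and the distance_between helper are made up for illustration.

```python
# Distances keyed by an unordered pair of cities (values are made up)
distances = {
    frozenset(["Salt Lake City", "Denver"]): 520,
    frozenset(["Salt Lake City", "Las Vegas"]): 420,
}

def distance_between(a, b):
    """Hypothetical lookup: the order of a and b does not matter."""
    return distances.get(frozenset([a, b]))

print(distance_between("Denver", "Salt Lake City"))  # 520
print(distance_between("Salt Lake City", "Denver"))  # 520
print(distance_between("Denver", "Las Vegas"))       # None
```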

Practical Set Examples

Remove duplicates while tracking what was removed

def find_duplicates(items):
    """
    Split a list into its unique items and the items that repeat.
    Returns a tuple of (unique_items, duplicates), both in first-seen order.
    """
    seen = set()
    unique = []
    duplicates = {}  # dict used as an insertion-ordered set
    for item in items:
        if item in seen:
            duplicates[item] = None
        else:
            seen.add(item)
            unique.append(item)
    return unique, list(duplicates)

names = ["Alice", "Bob", "Charlie", "Alice", "Diana", "Bob", "Eve", "Alice"]
unique, dupes = find_duplicates(names)
print(f"Unique: {unique}")
print(f"Duplicates: {dupes}")
# Unique: ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
# Duplicates: ['Alice', 'Bob']

Find common elements across multiple collections

def common_elements(*collections):
    """
    Find elements common to all provided collections.
    Accepts any number of iterables.
    """
    if not collections:
        return set()
    result = set(collections[0])
    for collection in collections[1:]:
        result &= set(collection)
    return result

team_a_skills = ["Python", "Java", "SQL", "Docker"]
team_b_skills = ["Python", "Go", "SQL", "Kubernetes"]
team_c_skills = ["Python", "Rust", "SQL", "AWS"]

shared = common_elements(team_a_skills, team_b_skills, team_c_skills)
print(f"Skills all teams share: {shared}")
# Skills all teams share: {'Python', 'SQL'}

Membership testing performance

import time

# Build a large dataset
data_list = list(range(1_000_000))
data_set = set(data_list)

target = 999_999  # Worst case for a list (last element)

# List lookup - O(n)
start = time.perf_counter()
for _ in range(1000):
    _ = target in data_list
list_time = time.perf_counter() - start

# Set lookup - O(1)
start = time.perf_counter()
for _ in range(1000):
    _ = target in data_set
set_time = time.perf_counter() - start

print(f"List lookup (1000x): {list_time:.4f}s")
print(f"Set lookup  (1000x): {set_time:.6f}s")
print(f"Set is ~{list_time / set_time:.0f}x faster")
# Example output (exact timings vary by machine):
# List lookup (1000x): 8.1234s
# Set lookup  (1000x): 0.000045s
# Set is ~180520x faster

This performance difference is exactly why you should convert a list to a set when you need to check membership repeatedly. The conversion itself is O(n), but each lookup after that is O(1).
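For quick measurements like this, the standard library's timeit module is often more convenient than a hand-written loop, since it handles the repetition for you. A rough sketch using a smaller dataset than above so it finishes quickly; actual numbers depend on your machine, so no output is shown.

```python
import timeit

data_list = list(range(100_000))
data_set = set(data_list)
target = 99_999  # last element: worst case for the list

# Each call runs the membership test 100 times and returns total seconds
list_time = timeit.timeit(lambda: target in data_list, number=100)
set_time = timeit.timeit(lambda: target in data_set, number=100)

print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```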

Shared Topics

Common Pitfalls

1. Empty dict vs empty set — the {} trap

# This is a dict, NOT a set!
empty = {}
print(type(empty))  # <class 'dict'>

# To create an empty set, you must use set()
empty_set = set()
print(type(empty_set))  # <class 'set'>

2. Unhashable types as dictionary keys or set elements

# Lists and dicts are mutable, so they are NOT hashable
# my_dict = {[1, 2, 3]: "value"}   # TypeError: unhashable type: 'list'
# my_set = {[1, 2], [3, 4]}         # TypeError: unhashable type: 'list'

# Use tuples instead (they are immutable and hashable)
my_dict = {(1, 2, 3): "value"}     # Works
my_set = {(1, 2), (3, 4)}          # Works
print(my_dict[(1, 2, 3)])  # value

# But tuples containing mutable objects are also unhashable
# bad = {([1, 2], [3, 4]): "value"}  # TypeError: unhashable type: 'list'

3. Dictionary ordering assumptions

# Since Python 3.7, dicts preserve INSERTION order.
# But do not assume dicts are sorted by key or value.
d = {"banana": 2, "apple": 1, "cherry": 3}
print(list(d.keys()))  # ['banana', 'apple', 'cherry'] (insertion order)

# If you need sorted keys, sort explicitly
for key in sorted(d.keys()):
    print(f"{key}: {d[key]}")
# apple: 1
# banana: 2
# cherry: 3
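Sorting by value follows the same idea: sort the items explicitly, then rebuild a dict if you need one. A minimal sketch (by_value and ranked are illustrative names):

```python
d = {"banana": 2, "apple": 1, "cherry": 3}

# Sort by value, highest first
by_value = sorted(d.items(), key=lambda pair: pair[1], reverse=True)
print(by_value)  # [('cherry', 3), ('banana', 2), ('apple', 1)]

# Rebuild a dict in that order (insertion order is preserved)
ranked = dict(by_value)
print(ranked)  # {'cherry': 3, 'banana': 2, 'apple': 1}
```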

4. Modifying a dict while iterating over it

scores = {"Alice": 90, "Bob": 45, "Charlie": 72, "Diana": 38}

# BAD: modifying during iteration causes RuntimeError
# for name, score in scores.items():
#     if score < 50:
#         del scores[name]  # RuntimeError: dictionary changed size during iteration

# GOOD: collect keys to remove, then delete
to_remove = [name for name, score in scores.items() if score < 50]
for name in to_remove:
    del scores[name]
print(scores)  # {'Alice': 90, 'Charlie': 72}

# Or use a dict comprehension to create a new dict
scores = {"Alice": 90, "Bob": 45, "Charlie": 72, "Diana": 38}
passing = {k: v for k, v in scores.items() if v >= 50}
print(passing)  # {'Alice': 90, 'Charlie': 72}
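A third option is to iterate over a snapshot. The sketch below also shows that changing the value of an existing key during iteration is safe (only adding or removing keys is a problem), and that sets follow the same rule.

```python
scores = {"Alice": 90, "Bob": 45, "Charlie": 72, "Diana": 38}

# list(...) copies the items first, so deleting from scores is safe
for name, score in list(scores.items()):
    if score < 50:
        del scores[name]
print(scores)  # {'Alice': 90, 'Charlie': 72}

# Changing the VALUE of an existing key during iteration is fine
for name in scores:
    scores[name] += 5
print(scores)  # {'Alice': 95, 'Charlie': 77}

# Sets raise the same RuntimeError if resized mid-loop; iterate over a copy
tags = {"python", "tmp_1", "tmp_2"}
for tag in tags.copy():
    if tag.startswith("tmp_"):
        tags.discard(tag)
print(tags)  # {'python'}
```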

Best Practices

1. Use get() with defaults instead of bracket notation

When a missing key is a normal possibility (not an error), use get() to avoid try/except blocks or pre-checks with in.

user = {"name": "Folau", "age": 30}

# Instead of this:
if "email" in user:
    email = user["email"]
else:
    email = "not provided"

# Do this:
email = user.get("email", "not provided")

2. Use dict comprehensions over manual loops

data = {"a": 3, "b": -1, "c": 5}

# Instead of this:
result = {}
for key, value in data.items():
    if value > 0:
        result[key] = value * 2

# Do this:
result = {k: v * 2 for k, v in data.items() if v > 0}

3. Use sets for membership testing

If you repeatedly test x in collection inside a loop, convert collection to a set first. The speedup can be orders of magnitude on large datasets.

# Instead of searching a list:
valid_codes = ["US", "CA", "MX", "UK", "DE", "FR", "JP"]  # O(n) per lookup

# Use a set:
valid_codes = {"US", "CA", "MX", "UK", "DE", "FR", "JP"}  # O(1) per lookup

country_code = "CA"
if country_code in valid_codes:
    print(f"{country_code} is supported")

4. Use defaultdict or setdefault to avoid key-existence checks

from collections import defaultdict

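items = [
    {"name": "apple", "category": "fruit"},
    {"name": "carrot", "category": "vegetable"},
    {"name": "banana", "category": "fruit"},
]

# Instead of: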
groups = {}
for item in items:
    key = item["category"]
    if key not in groups:
        groups[key] = []
    groups[key].append(item)

# Do this:
groups = defaultdict(list)
for item in items:
    groups[item["category"]].append(item)

5. Use the merge operator for combining dicts (Python 3.9+)

# Clean, readable dict merging
defaults = {"timeout": 30, "retries": 3, "verbose": False}
user_config = {"timeout": 60, "verbose": True}

final = defaults | user_config  # user_config wins on conflicts

6. Prefer frozenset when immutability is needed

If a set should not change after creation, use frozenset. This communicates intent, prevents accidental modification, and enables use as a dict key or set element.

Key Takeaways

  • Dictionaries store key-value pairs with O(1) average lookup, insertion, and deletion. They are the go-to structure for mappings, lookups, and structured data.
  • Create dicts with literals {}, dict(), comprehensions, or fromkeys(). Use whichever is most readable for your situation.
  • Always prefer get() with a default over bracket notation when a missing key is a normal case, not an error.
  • update() and the |= operator (Python 3.9+) merge dictionaries. The right-hand side wins on key conflicts.
  • defaultdict, Counter, and OrderedDict from the collections module handle specialized patterns more cleanly than a plain dict.
  • Dictionary comprehensions are concise, readable, and usually faster than manual loops.
  • Sets store unique, hashable elements with O(1) membership testing. Use them when duplicates are not allowed or when you need fast "is this in the collection?" checks.
  • Set operations (|, &, -, ^) correspond to union, intersection, difference, and symmetric difference. Both operator and method forms are available.
  • frozenset is an immutable set that can be used as a dictionary key or an element of another set.
  • Remember: {} creates an empty dict, not a set. Use set() for an empty set.
  • Mutable objects (lists, dicts, sets) cannot be dictionary keys or set elements because they are unhashable. Use their immutable counterparts (tuples, frozensets) instead.
  • Convert lists to sets when you need repeated membership checks — the performance difference is dramatic.
  • Never add or remove entries in a dictionary or set while iterating over it. Collect the changes first and then apply them, or iterate over a copy.


