Dictionaries
An almost complete guide to Python's key:value store

Dictionaries are key-value stores, meaning they store data (values) and allow it to be retrieved through a unique key. This is analogous to a real dictionary, where you look up definitions (data) using a given key: the word. Unlike a language dictionary, however, keys in Python dictionaries are not alphabetically sorted.

From Python 3.6 onwards dictionaries are ordered, in that elements are stored and retrieved in the order in which they were added (this was an implementation detail of CPython 3.6 and became a language guarantee in Python 3.7). This usually only has consequences for iterating (see later).
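
As a small illustration of insertion ordering (the variable names are just for this sketch): reassigning an existing key keeps its original position, while deleting and re-adding a key moves it to the end.

```python
# Insertion order is preserved when listing keys.
d = {}
d['b'] = 1
d['a'] = 2
d['c'] = 3
order_after_insert = list(d)      # ['b', 'a', 'c']

# Reassigning an existing key does not move it.
d['b'] = 99
order_after_update = list(d)      # ['b', 'a', 'c']

# Deleting and re-adding a key places it at the end.
del d['b']
d['b'] = 1
order_after_readd = list(d)       # ['a', 'c', 'b']
```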

Anything which can be stored in a Python variable can be stored in a dictionary value. That includes mutable types such as list and even dict, meaning you can nest dictionaries inside one another. In contrast, keys must be hashable [1]: the object's hash must not change once calculated, which in practice means keys should be immutable. This means list or dict objects cannot be used for dictionary keys; however, a tuple is fine as long as everything inside it is also hashable.
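
A quick sketch of the hashability rule, using made-up coordinate keys: a tuple of numbers works as a key, while a list (or a tuple containing a list) is rejected with a TypeError.

```python
# A tuple of hashable items can be used as a key.
grid = {}
grid[(0, 1)] = 'north'
grid[(1, 0)] = 'east'

# A list is mutable and unhashable, so it is rejected as a key.
try:
    grid[[0, 1]] = 'oops'
    list_key_error = None
except TypeError as e:
    list_key_error = str(e)

# A tuple that contains a list is not hashable either.
try:
    grid[(0, [1])] = 'oops'
    nested_key_ok = True
except TypeError:
    nested_key_ok = False
```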

Creating

Dictionaries can be defined using either literal or constructor syntax. Literal syntax is a bit cleaner, but there are situations where dict() is useful.

d = {}        # An empty dictionary, using literal syntax
d = dict()    # An empty dictionary, using constructor syntax

You can add initial items to a dictionary by passing the key-value pairs at creation time. The following two syntaxes are equivalent, and will produce an identical dictionary.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> d = dict(key1='value1', key2='value2', key3=3)
>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

However, note that keys in the dict() keyword syntax are limited to valid keyword parameter names only. For example, you cannot use anything which would not be a valid variable name (including numbers, number-initial alphanumeric names or punctuation).

>>> dict(1='hello')
SyntaxError: invalid syntax

>>> dict(1a='hello')
SyntaxError: invalid syntax

As always in Python, keyword parameters are interpreted as string names, ignoring any variables defined with the same name.

>>> a = 12345
>>> {a:'test'}
{12345: 'test'}

>>> dict(a='test')
{'a': 'test'}

For this reason dict() is only really useful where you have very restricted key names. This is often the case, but you can avoid these annoyances completely by sticking with the literal {} syntax.

Adding

You can add items to a dictionary by assigning a value to a key, using the square bracket [] syntax.

>>> d = {}
>>> d['this'] = 'that'
>>> d
{'this': 'that'}

Assigning to keys which already exist will replace the existing value for that key.

>>> d = {}
>>> d['this'] = 'that'
>>> d['this'] = 'the other'
>>> d
{'this': 'the other'}

Retrieving

The value for a given key can be retrieved using the square bracket [] syntax.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
>>> d['key1']
'value1'

Retrieving an item does not remove it from the dictionary.

>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

The value returned is the same object stored in the dictionary, not a copy. This is important to bear in mind when using mutable objects such as lists as values.

>>> d = {'key1': [1,2,3,4]}
>>> l = d['key1']
>>> l
[1, 2, 3, 4]

>>> l.pop()
4

>>> d
{'key1': [1, 2, 3]}

Notice that changes made to the returned list continue to be reflected in the dictionary. The retrieved list and the value in the dictionary are the same object.

Removing

To remove an item from a dictionary you can use del, with the square bracket syntax and the key identifying the element to remove.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> del d['key1']
>>> d
{'key2': 'value2', 'key3': 3}

You can also remove items from a dictionary by using .pop(<key>). This removes the given key from the dictionary, and returns the value.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> d.pop('key1')
'value1'

>>> d
{'key2': 'value2', 'key3': 3}
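
Both del and .pop() raise a KeyError if the key is missing. .pop() also accepts an optional second argument which is returned instead, as the sketch below shows (the 'missing' key and the None fallback are just for illustration).

```python
d = {'key1': 'value1', 'key2': 'value2'}

# Popping a key that exists returns its value and removes it.
removed = d.pop('key1')

# Popping a missing key raises KeyError...
try:
    d.pop('missing')
    raised = False
except KeyError:
    raised = True

# ...unless a default is supplied as the second argument.
fallback = d.pop('missing', None)
```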

Counting

The number of elements in a dictionary can be found by using len().

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
>>> len(d)
3

The lengths of a dictionary's .keys(), .values() and .items() views are always equal.

View objects

The keys, values and items from a dictionary can be accessed using the .keys(), .values() and .items() methods. These methods return view objects which provide a view on the source dictionary.

There are separate view objects for each of keys, values and items: dict_keys, dict_values and dict_items respectively.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
>>> d.keys()
dict_keys(['key1', 'key2', 'key3'])

>>> d.values()
dict_values(['value1', 'value2', 3])

dict_items provides a view over tuples of (key, value) pairs.

>>> d.items()
dict_items([('key1', 'value1'), ('key2', 'value2'), ('key3', 3)])

These view objects are all iterable. They are also dynamic — changes to the original dictionary continue to be reflected in the view after it is created.

>>> k = d.keys()
>>> k
dict_keys(['key1', 'key2', 'key3'])

>>> d['key4'] = 'value4'
>>> k
dict_keys(['key1', 'key2', 'key3', 'key4'])

This is different to Python 2.7, where .keys(), .values() and .items() returned a static list.

Membership

To determine if a given key is present in a dictionary, you can use the in keyword. This will return True if the given key is found, False if it is not.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> 'key2' in d
True

>>> 'key5' in d
False

You can also check whether a given value or key-value pair is in a dictionary by using the .values() and .items() views.

>>> 'value1' in d.values()
True

>>> 'value5' in d.values()
False

>>> ('key1', 'value1') in d.items()
True

>>> ('key3', 'value5') in d.items()
False

Checking .values() is less efficient than a key-based lookup because the values may have to be scanned one by one, and needing to look up values or items is often an indication that a dict is not a good store for your data.
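
If you do need frequent lookups by value, one option is to build a second dictionary that maps values back to keys. This is only a sketch: it assumes the values are hashable and unique, and the name by_value is made up for this example.

```python
d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

# Invert the mapping once, so later lookups by value become key lookups.
# Assumes every value is hashable and appears only once.
by_value = {v: k for k, v in d.items()}

key_for_value2 = by_value['value2']   # 'key2'
has_value5 = 'value5' in by_value     # False
```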

Lists from dictionaries

To convert a dictionary's keys, values or items to lists, we can take the dict_keys, dict_values or dict_items view objects and pass them to list().

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> list(d.keys())
['key1', 'key2', 'key3']

>>> list(d.values())
['value1', 'value2', 3]

>>> list(d.items())
[('key1', 'value1'), ('key2', 'value2'), ('key3', 3)]

Converting the view objects to lists breaks the link to the original dictionary, so further updates to the dictionary will not be reflected in the list.
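
The difference between a live view and a list snapshot can be sketched as follows (variable names are illustrative):

```python
d = {'key1': 'value1', 'key2': 'value2'}

keys_view = d.keys()          # dynamic view on the dictionary
keys_list = list(d.keys())    # static snapshot taken now

d['key3'] = 3                 # modify the dictionary afterwards

view_now = list(keys_view)    # ['key1', 'key2', 'key3']
list_now = keys_list          # ['key1', 'key2']
```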

Dictionaries from lists

Similarly, lists can be used to generate dictionaries. The simplest approach is using a list of 2-tuples, where the first element in each tuple is used for the key and the second for the value.

>>> l = [('key1', 'value1'), ('key2', 'value2'), ('key3', 3)]
>>> d = dict(l) # Pass the list to the dict constructor

>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

You can pass in other iterables, not just lists. The only restriction is that the iterable needs to produce a pair (2 items) on each iteration.
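
For instance, a generator that yields pairs works just as well as a list. The pairs() function below is a hypothetical helper written for this sketch; the second part shows that elements which are not pairs are rejected.

```python
def pairs():
    """Yield (key, value) pairs one at a time."""
    for n in range(1, 4):
        yield 'key%d' % n, n * 10

d = dict(pairs())

# Elements that are not 2-item pairs raise a ValueError.
try:
    dict([('key1', 'value1', 'extra')])
    bad_pairs_rejected = False
except ValueError:
    bad_pairs_rejected = True
```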

If you have your key and value elements in separate lists, you can use zip to combine them into tuples before creating the dictionary.

>>> keys = ['key1', 'key2', 'key3']
>>> vals = ['value1', 'value2', 3]

>>> l = zip(keys, vals)
>>> l
<zip object>

>>> dict(l)
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

If key and value lists are not of the same length, the behaviour of zip is to silently drop any extra items from the longer list.

>>> keys = ['key1', 'key2', 'oops']
>>> vals = ['value1', 'value2']

>>> dict(zip(keys, vals))
{'key1': 'value1', 'key2': 'value2'}
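
If silently losing items would be a problem, you can compare the lengths before zipping. A minimal sketch, where dict_from_lists is a hypothetical helper name:

```python
def dict_from_lists(keys, vals):
    """Build a dict from two lists, refusing mismatched lengths."""
    if len(keys) != len(vals):
        raise ValueError(
            'got %d keys but %d values' % (len(keys), len(vals))
        )
    return dict(zip(keys, vals))

d = dict_from_lists(['key1', 'key2'], ['value1', 'value2'])

# Mismatched lengths are reported instead of being truncated.
try:
    dict_from_lists(['key1', 'key2', 'oops'], ['value1', 'value2'])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```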

Iterating

By default iterating over a dictionary iterates over the keys.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> for k in d:
...     print(k)
key1
key2
key3

This is functionally equivalent to iterating over the .keys() view.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> for k in d.keys():
...     print(k)
key1
key2
key3

The dictionary is unaffected by iterating over it, and you can use the key within your loop to access the value from the dictionary.

>>> d = {'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> for k in d:
...     print(k, d[k])  # Access value by key.
key1 value1
key2 value2
key3 3

If you want access to dictionary values within your loop, you can iterate over items to have them returned in the for loop. The keys and values are returned as a 2-tuple.

>>> d = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> for kv in d.items():
...     print(kv)
('key1', 'value1')
('key2', 'value2')
('key3', 3)

You can unpack the key and value to separate variables in the loop, making them available without indexing. This is the most common loop structure used with dictionaries.

>>> d = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> for k, v in d.items():
...     print(k, v)
key1 value1
key2 value2
key3 3

If you are only interested in the dictionary values you can also iterate over these directly.

>>> d = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> for v in d.values():
...     print(v)
value1
value2
3

If you want to count as you iterate you can use enumerate as with any iterator, but you must nest the unpacking.

>>> d = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> for n, (k, v) in enumerate(d.items()):
...     print(n, k, v)
0 key1 value1
1 key2 value2
2 key3 3

Dictionary comprehensions

Dictionary comprehensions are shorthand iterations which can be used to construct dictionaries, while filtering or altering keys or values.

Iterating over a list of (key, value) tuples and assigning to keys and values will create a new dictionary.

>>> l = [('key1','value1'), ('key2','value2'), ('key3',3)]

>>> {k:v for k,v in l} 
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

You can filter elements by using a trailing if clause. If this expression evaluates to False the element will be skipped (if it evaluates True it will be added).

>>> l = [('key1','value1'), ('key2','value2'), ('key3',3)]

>>> {k:v for k,v in l if isinstance(v, str)}  # Only add strings.
{'key1': 'value1', 'key2': 'value2'}

Any valid expression can be used for the comparison, as long as it returns truthy or falsy values.

>>> l = [('key1','value1'), ('key2','value2'), ('key3',3)]

>>> {k:v for k,v in l if v != 'value1'}
{'key2': 'value2', 'key3': 3}

Comparisons can be performed against keys, values, or both.

>>> l = [('key1','value1'), ('key2','value2'), ('key3',3)]

>>> {k:v for k,v in l if v != 'value1' and k != 'key3'}
{'key2': 'value2'}

Since an empty string evaluates as False in Python, testing the value alone can be used to strip empty string values from a dictionary.

>>> d = {'key1':'value1', 'key2':'value2', 'key3':'', 'another-empty':''}

>>> {k:v for k,v in d.items() if v}
{'key1': 'value1', 'key2': 'value2'}

Separate lists of keys and values can be zipped, and filtered using a dictionary comprehension.

>>> keys = ['key1', 'key2', 'key3']
>>> vals = ['value1', 'value2', 3]

>>> {k:v for k,v in zip(keys, vals) if k != 'key1'}
{'key2': 'value2', 'key3': 3}

Expressions can also be used in the k:v construct to alter keys or values that are generated for the dictionary.

>>> l = [('key1', 1), ('key2', 2), ('key3', 3)]

>>> {k:v**2 for k,v in l}
{'key1': 1, 'key2': 4, 'key3': 9}

Any expressions are valid, for both keys and values, including calling functions.

>>> l = [('key1', 1), ('key2', 2), ('key3', 3)]

>>> def cube(v):
...     return v**3

>>> def reverse(k):
...     return k[::-1]

>>> {reverse(k):cube(v) for k,v in l}
{'1yek': 1, '2yek': 8, '3yek': 27}

You can use a ternary if-else in the k:v expression to selectively replace values. In the following example values are replaced with None if they don't match 'value1'.

>>> l = [('key1','value1'), ('key2','value2'), ('key3',3)]

>>> {k:v if v=='value1' else None for k,v in l}
{'key1': 'value1', 'key2': None, 'key3': None}

You can also use ternary syntax to process keys. Any expression is valid here; in the following example we replace missing keys with the current iteration number (1-indexed).

>>> l = [(None,'value1'), (None,'value2'), ('key3',3)]

>>> {k if k else n:v for n,(k,v) in enumerate(l, 1)}
{1: 'value1', 2: 'value2', 'key3': 3}

If your expressions generate duplicate keys, the later value will take precedence for that key.

>>> l = [(None,'value1'), (None,'value2'), ('key3',3)]

>>> {k if k else 0:v for n,(k,v) in enumerate(l)}
{0: 'value2', 'key3': 3} # 0:value1 has been overwritten by 0:value2

You can use nested loops within dictionary comprehensions, although you often won't want to since it can get pretty confusing. One useful application of this, however, is flattening nested dictionaries. The following example unnests 2-deep dictionaries, discarding the outer keys.

>>> d = {'a': {'naa':1, 'nab':2, 'nac':3}, 'b': {'nba':4, 'nbb':5, 'nbc':6}}

>>> {k:v for di in d.values() for k,v in di.items()}
{'naa': 1, 'nab': 2, 'nac': 3, 'nba': 4, 'nbb': 5, 'nbc': 6}

The left-hand loop is the outer loop, which iterates over the values of the d dictionary, producing each nested dictionary as di. The inner loop on the right iterates over that dictionary's keys and values as k and v, which are used to construct the new dictionary in the k:v expression on the far left.
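
Written out as ordinary nested for loops, the same flattening looks like this (the names flat and same are just for this sketch):

```python
d = {'a': {'naa': 1, 'nab': 2, 'nac': 3}, 'b': {'nba': 4, 'nbb': 5, 'nbc': 6}}

flat = {}
for di in d.values():          # outer loop: each nested dictionary
    for k, v in di.items():    # inner loop: its keys and values
        flat[k] = v

# The comprehension form produces the same dictionary.
same = flat == {k: v for di in d.values() for k, v in di.items()}
```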

Merging

There are a number of ways to merge dictionaries. The major difference between the approaches is in how (or whether) they handle duplicate keys.

Update

Each dictionary object has an .update() method, which can be used to add a set of keys and values to an existing dictionary, using another dictionary as the source.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key4':'value4', 'key5':'value5'}

>>> d1.update(d2)
>>> d1
{'key1': 'value1', 'key2': 'value2', 'key3': 3, 'key4': 'value4', 'key5': 'value5'}

This updates the original dictionary, and does not return a copy.

If there are duplicate keys in the dictionary being updated from, the values from that dictionary will replace those in the dictionary being updated.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}

>>> d1.update(d2)
>>> d1
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3-new', 'key5': 'value5'}

If you do not want to replace already existing keys, you can use a dictionary comprehension to pre-filter.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}

>>> d1.update({k:v for k, v in d2.items() if k not in d1})
>>> d1
{'key1': 'value1', 'key2': 'value2', 'key3': 3, 'key5': 'value5'}
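
An alternative that avoids building an intermediate dictionary is to loop over the source and use .setdefault(), which only assigns when the key is not already present. A minimal sketch with the same example data:

```python
d1 = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
d2 = {'key3': 'value3-new', 'key5': 'value5'}

for k, v in d2.items():
    d1.setdefault(k, v)   # no effect if k already exists in d1
```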

Unpacking

Dictionaries can be unpacked to key=value keyword pairs, which is used to pass parameters to functions or constructors. This can be used to combine multiple dictionaries by unpacking them consecutively.

This requires Python 3.5 or later.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key4':'value4', 'key5':'value5'}

>>> d = {**d1, **d2}
>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 3, 'key4': 'value4', 'key5': 'value5'}

Unpacking using this syntax handles duplicate keys, with the later dictionary taking precedence over the earlier.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}

>>> d = {**d1, **d2}
>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3-new', 'key5': 'value5'}

You can use this same syntax to merge multiple dictionaries together.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}
>>> d3 = {'key4':'value4', 'key6':'value6'}

>>> d = {**d1, **d2, **d3}
>>> d
{'key1': 'value1', 'key2': 'value2', 'key3': 'value3-new', 'key5': 'value5', 'key4': 'value4', 'key6': 'value6'}

You can also unpack into a dict() call.

>>> dict(**d1, **d3)
{'key1': 'value1', 'key2': 'value2', 'key3': 3, 'key4': 'value4', 'key6': 'value6'}

However, in this case duplicate keys are not supported, and you are limited by the keyword naming restrictions described earlier.

>>> dict(**d1, **d2)
TypeError: type object got multiple values for keyword argument 'key3'

>>> dict(**{3:'value3'})
TypeError: keyword arguments must be strings

There is no such restriction for {} unpacking.

>>> {**{3:'value3'}}
{3: 'value3'}
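
The same ** unpacking is what passes a dictionary as keyword arguments to a function. A minimal sketch, where connect and its parameters are made up for illustration:

```python
def connect(host, port=80, secure=False):
    """Hypothetical function that just reports what it received."""
    return '%s:%d secure=%s' % (host, port, secure)

options = {'host': 'example.com', 'port': 443, 'secure': True}

# Equivalent to connect(host='example.com', port=443, secure=True)
result = connect(**options)
```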

Addition (Python 2.7 only)

In Python 2.7 dict.items() returns a list of (key, value) tuples. Lists can be concatenated using the + operator, and the resulting list can be converted back to a new dictionary by passing to the dict constructor.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}

>>> l = d1.items() + d2.items()
>>> l
[('key3', 3), ('key2', 'value2'), ('key1', 'value1'), ('key3', 'value3-new'), ('key5', 'value5')]

>>> dict(l)
{'key3': 'value3-new', 'key2': 'value2', 'key1': 'value1', 'key5': 'value5'}

You can add together multiple dictionaries using this method. The later dictionary keys take precedence over the former.

Union (set merge)

If both the keys and values of a dictionary are hashable, the dict_items view supports set-like operations.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5':'value5'}
>>> d3 = {'key4':'value4', 'key6':'value6'}

>>> dict(d1.items() | d2.items() | d3.items())
{'key4': 'value4', 'key5': 'value5', 'key2': 'value2', 'key6': 'value6', 'key3': 3, 'key1': 'value1'}

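Be careful when the same key appears with different values: the union is a set of (key, value) pairs, so both ('key3', 3) and ('key3', 'value3-new') are present in it, and which one ends up in the final dictionary depends on the set's arbitrary iteration order. This approach is only predictable when the dictionaries have no conflicting keys.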

If the values are not hashable this will raise a TypeError.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}
>>> d2 = {'key3':'value3-new', 'key5': []}  # list is unhashable

>>> d1.items() | d2.items()
TypeError: unhashable type: 'list'

All standard set operations are possible on dict_keys and dict_items.
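
A short sketch of these set operations on the key and item views (variable names are illustrative). The last step collects the item pairs for 'key3' to show what the union holds for a key that appears in both dictionaries.

```python
d1 = {'key1': 'value1', 'key2': 'value2', 'key3': 3}
d2 = {'key3': 'value3-new', 'key5': 'value5'}

common_keys = d1.keys() & d2.keys()        # keys in both
only_in_d1 = d1.keys() - d2.keys()         # keys in d1 but not d2
either_not_both = d1.keys() ^ d2.keys()    # keys in exactly one

# The union of items is a set of (key, value) pairs, so a key with
# two different values contributes two separate pairs.
key3_pairs = {
    pair for pair in d1.items() | d2.items() if pair[0] == 'key3'
}
```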

Copying

To make a copy of an existing dictionary you can use .copy(). This results in an identical dictionary which is a distinct object.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> d2 = d1.copy()
>>> d2
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> id(d1) == id(d2)
False

You can also make a copy of a dictionary by passing an existing dictionary to the dict constructor. This is functionally equivalent to .copy().

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':3}

>>> d2 = dict(d1)
>>> d2
{'key1': 'value1', 'key2': 'value2', 'key3': 3}

>>> id(d1) == id(d2)
False

In both cases these are shallow copies, meaning nested objects within the dictionary are not also copied. Changes to these nested objects will also be reflected in the original dictionary.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':{'nested':'dictionary'}}

>>> d2 = d1.copy()
>>> d2
{'key1': 'value1', 'key2': 'value2', 'key3': {'nested': 'dictionary'}}

>>> id(d1) == id(d2)
False

>>> id(d1['key3']) == id(d2['key3'])
True

>>> d2['key3']['nested'] = 'I changed in d1'
>>> d1
{'key1': 'value1', 'key2': 'value2', 'key3': {'nested': 'I changed in d1'}}

If you want nested objects to also be copied, you need to create a deepcopy of your dictionary.

>>> d1 = {'key1':'value1', 'key2':'value2', 'key3':{'nested':'dictionary'}}

>>> from copy import deepcopy
>>> d2 = deepcopy(d1)
>>> d2
{'key1': 'value1', 'key2': 'value2', 'key3': {'nested': 'dictionary'}}

>>> id(d1) == id(d2)
False

>>> id(d1['key3']) == id(d2['key3'])
False

>>> d2['key3']['nested'] = ['I did not change in d1']
>>> d1
{'key1': 'value1', 'key2': 'value2', 'key3': {'nested': 'dictionary'}}

Since a deepcopy copies all nested objects it is slower and uses more memory. Only use it when it's actually necessary.


  1. A hash is a reproducible, compact, representation of an original value. Reproducible means that hashing the same input will always produce the same output. This is essential for dictionary keys where hashes are used to store and look up values: if the hash changed each time we hashed the key, we'd never find anything! 
