Dictionaries vs Arrays in Python - Deep Dive

Introduction

In this guide, we'll walk through the differences between two of Python's most popular data structures - dictionaries and arrays. Each provides a specific way of arranging your data, with pros and cons for certain tasks. Knowing when to use which will allow you to make the most of their built-in functionality.

Note: This guide assumes Python 3.x. We will, however, also note some key differences for Python 2.x.

Guide to Python Arrays

An Array is one of the fundamental data structures in computer science - a sequence of 0..n elements, where each element has an index.

Most arrays have a fixed size, so they take up a contiguous chunk of memory every time a new one is created.

Consider a simple array consisting of 7 elements. Indexing typically starts at 0, and each element has a positional index that we can use to access it. This makes the array's access time complexity O(1).

Some of Python's arrays, such as lists and tuples, are dynamically typed, which means that the objects in an array have a type, but the array itself is not restricted to only one type - you can have an array consisting of an integer, a string, an object, or even another array that's heterogeneously mixed as well.

There are 6 important types of arrays in Python: list, tuple, str, bytes, bytearray, and array.array.

When talking about each of them, there are a few key properties we'll take into account:

  • Whether they're dynamic or not dynamic
  • Whether they're statically or dynamically typed
  • Whether they're mutable or immutable

Python Lists

A list in Python is dynamic (non-fixed size), dynamically typed (elements not restricted to a single type), and mutable (elements can be changed in-place).

In Python, a list is defined by declaring its elements within square brackets []. Let's go ahead and define a list:

my_list = [1, 2, 3, "Mark", "John", "Emma"]
print(my_list)

It contains a few integers and a few strings, denoting names. Since lists are dynamically typed, this is allowed:

[1, 2, 3, 'Mark', 'John', 'Emma']

Since lists are dynamic, we can change the number of elements by adding a new one, for example:

my_list.append(4)
my_list.append("Peter")
print(my_list)

This results in our list having 8 elements, instead of the 6 we defined in the beginning:

[1, 2, 3, 'Mark', 'John', 'Emma', 4, 'Peter']

Now, let's try replacing an element and adding a new one. We'll check the ID of the list (reference in memory) to confirm that it's not switched out under the hood with a new copy that contains either added elements or replaced ones:

my_list = [1, 2, 3, "Mark", "John", "Emma", 4, "Peter"]
# Print original list and its ID
print('Original list: ', my_list)
print('ID of object in memory: ', id(my_list))

# Modify existing element and add a new one
my_list[4] = "Anna"
my_list.append("Dan")

# Print changed list and its ID
print('Changed list: ', my_list)
print('ID of object in memory: ', id(my_list))

Running this code results in:

Original list:  [1, 2, 3, 'Mark', 'John', 'Emma', 4, 'Peter']
ID of object in memory:  140024176315840
Changed list:  [1, 2, 3, 'Mark', 'Anna', 'Emma', 4, 'Peter', 'Dan']
ID of object in memory:  140024176315840

The fact that my_list points to the same object in memory (140024176315840) further goes to show how lists are mutable.

Python's lists can even store functions in a sequence:

def f1():
    return "Function one"

def f2():
    return "Function two"

def f3():
    return "Function three"

list_of_functions = [f1, f2, f3]
print(list_of_functions)

Which will result in:

[<function f1 at 0x0000016531807488>, <function f2 at 0x00000165318072F0>, <function f3 at 0x0000016531807400>]

Our output consists of functions at the given addresses. Now let's try and access a function and run it:

print(list_of_functions[0]())

Since the first element of this list is f1, calling it returns its string, which print() then outputs:

Function one

Lists are the most commonly used type of arrays in Python. They are easy to use and intuitive. Additionally, their time complexity for accessing elements is O(1).

Python Tuples

A tuple in Python is non-dynamic (fixed size), dynamically typed (elements not restricted to a single type), and immutable (elements cannot be changed in-place).

In addition to that, we use parentheses () when defining them:

my_tuple = (1, 2, 3, "Mark", "John", "Emma")
print(my_tuple)

Since tuples are dynamically typed, we can have elements of different types present within them:

(1, 2, 3, 'Mark', 'John', 'Emma')

Since tuples are non-dynamic, they have a fixed size, and we can't append() elements to them in-place, since this changes their size. Thus, tuples don't have an append() method.
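To make this concrete, here's a minimal sketch of what happens when we try to modify a tuple in place. The variable names has_append and assignment_failed are only illustrative:

```python
my_tuple = (1, 2, 3, "Mark", "John", "Emma")

# Tuples don't expose an append() method at all
has_append = hasattr(my_tuple, "append")
print(has_append)

# Assigning to an index is rejected with a TypeError
try:
    my_tuple[0] = 100
    assignment_failed = False
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)
```

The tuple itself is left untouched after the failed assignment.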

We can, however, create a new tuple consisting of smaller tuples, which again is of fixed size:

my_tuple = (1, 2, 3)
another_tuple = ("Mark", "John", "Emma")
print('Original tuple: ', my_tuple)
print('ID of object in memory: ', id(my_tuple))

my_tuple = my_tuple + another_tuple
print('New tuple: ', my_tuple)
print('ID of object in memory: ', id(my_tuple))

We've assigned the same variable reference to a new object created to contain both of these tuples together - even though the reference variable is the same, it points to a totally different object in memory:

Original tuple:  (1, 2, 3)
ID of object in memory:  139960147395136

New tuple:  (1, 2, 3, 'Mark', 'John', 'Emma')
ID of object in memory:  139960147855776

The time complexity for accessing items in a tuple is also O(1).

Python Strings

In Python 3, the str type (short for string) was overhauled compared to Python 2. In Python 2, it used to represent both text and bytes, but in Python 3 these two are totally different data types.

A string in Python is non-dynamic (fixed size), statically typed (elements restricted to a single type), and immutable (elements cannot be changed in-place).

A sequence of Unicode characters enclosed within quotation marks "" (or '') is used to define a string:

my_str = "qwerty"
print(my_str)

This will result in:

qwerty

We can access elements via standard array indexing, but can't change them:

print(my_str[0])
my_str[0] = "p"

This will result in:

q
TypeError: 'str' object does not support item assignment

In fact, Python has no separate character type. Indexing a string returns another string of length 1, so each 'character' is itself a str.

In the following example, my_str has a length of 5, and each of its five elements is an individual string of length 1:

my_str = "abcde"
print(len(my_str)) # Check the length of our str
print(type(my_str)) # Check the type of our str

print(my_str[0]) # Letter 'a'
print(len(my_str[0])) # Check the length of our letter
print(type(my_str[0])) # Check the type of our letter 'a'

This results in:

5
<class 'str'>
a
1
<class 'str'>

Both our 'character' and string are of the same class - str.

Similar to tuples, we can concatenate strings - which results in a new string consisting of the two smaller ones:

my_str = "qwerty"
my_str2 = "123"

result = my_str + my_str2
print(result)

And the result is:

qwerty123

Again, strings only support characters and we cannot mix in other types:

my_str = "qwerty"
my_str2 = 123

result = my_str + my_str2
print(result)

Which will result in:

TypeError: can only concatenate str (not "int") to str

However, int, as well as every other type, can be cast (converted) into a string representation:

my_str = "qwerty"
my_str2 = str(123) # int 123 is now converted to str

result = my_str + my_str2
print(result)

This will result in:

qwerty123

With this method, you can get away with printing, for example, ints and strings in the same line:

my_str = "qwerty"
# print("my_str's length is: " + len(my_str)) # Uncommenting this line raises a TypeError

print("my_str's length is: " + str(len(my_str))) # String concatenation resulting in 'my_str's length is: 6'
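As an alternative to calling str() by hand, formatted string literals (f-strings, available since Python 3.6) convert embedded values for us. A short sketch, with message as an illustrative name:

```python
my_str = "qwerty"

# The expression inside the braces is evaluated and converted to text
message = f"my_str's length is: {len(my_str)}"
print(message)
```

This avoids the TypeError because no direct str + int concatenation takes place.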

Python Bytes

Bytes in Python are non-dynamic (fixed size), statically typed (elements restricted to a single type), and immutable (elements cannot be changed in-place).

A bytes object consists of multiple single bytes or integers, ranging from 0 to 255 (8-bit).

One way to define a bytes object is to explicitly convert a tuple of integers into bytes (bytes literals such as b'abc' also exist):

my_bytes = bytes((0, 1, 2))
print(my_bytes)

This will result in:

b'\x00\x01\x02'

If the tuple contains an element that isn't an integer, a TypeError is raised:

my_bytes = bytes((0, 1, 2, 'string'))
TypeError: 'str' object cannot be interpreted as an integer

When converting a str object to bytes, we must specify an encoding (charset); otherwise, it's ambiguous which bytes the characters should map to:

my_str = "This is a string"

# my_bytes = bytes(my_str) # Raises TypeError: string argument without an encoding

my_bytes = bytes(my_str, 'utf-8')
print(my_bytes) # Prints b'This is a string'

If you're unfamiliar with how encoding bytes works - read our guide on How to Convert Bytes to String in Python.
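As a quick illustration of encoding, here's a small round trip between str and bytes using the built-in encode() and decode() methods. The names encoded, decoded, and text are only illustrative:

```python
my_str = "This is a string"

# str -> bytes, and back again with the same encoding
encoded = my_str.encode("utf-8")
decoded = encoded.decode("utf-8")
print(encoded)
print(decoded == my_str)

# A non-ASCII character can take more than one byte in UTF-8
text = "café"
print(len(text), len(text.encode("utf-8")))
```

The last line shows why an encoding has to be named explicitly: the number of characters and the number of bytes don't always match.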

Furthermore, the contents of a bytes object can be made mutable by converting it to another array type called bytearray.

Python Bytearray

A bytearray in Python is dynamic (non-fixed size), statically typed (elements restricted to a single type), and mutable (elements can be changed in-place).

my_byte_array = bytearray((0, 1, 2))

Now, we can try to add elements to this array, as well as change an element:

my_byte_array = bytearray((0, 1, 2))
print(my_byte_array)
print("ByteArray ID: ", id(my_byte_array))

my_byte_array.append(3)
print(my_byte_array)
print("ByteArray ID: ", id(my_byte_array))

my_byte_array[3] = 50
print(my_byte_array)
print("ByteArray ID: ", id(my_byte_array))

This results in:

bytearray(b'\x00\x01\x02')
ByteArray ID:  140235112668272

bytearray(b'\x00\x01\x02\x03')
ByteArray ID:  140235112668272

bytearray(b'\x00\x01\x022')
ByteArray ID:  140235112668272

These all have the same object ID - pointing to the same object in memory being changed.

A bytearray can be converted back to a bytes object; keep in mind, though, that this copies the data, so it's an operation that takes O(n) time.
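A minimal sketch of that conversion, assuming the same three-byte array as before (frozen is an illustrative name):

```python
my_byte_array = bytearray((0, 1, 2))

# bytes() copies the contents into a new, immutable object
frozen = bytes(my_byte_array)

# Changing the bytearray afterwards doesn't affect the copy
my_byte_array[0] = 255
print(frozen)
print(my_byte_array)
```

Because the data is copied, the two objects are independent from that point on.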

Python array.array

So far, we've been working with built-in types. However, another type of array exists, in the array module.

This array is dynamic (non-fixed size), statically typed (elements restricted to a single type), and mutable (can be changed in-place). We need to explicitly note the type we'll be using in an array and these types are C-style types: 32-bit integers, floating point numbers, doubles, etc.

Each of these has a type code - i for signed integers, f for floats, and d for doubles. Let's make an integer array via the array module:

import array

my_array = array.array("i", (1, 2, 3, 4))
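To see the static typing in action, here's a short sketch that appends a valid and an invalid element to the same kind of array. The rejected flag is just an illustrative name:

```python
import array

my_array = array.array("i", (1, 2, 3, 4))

# Appending another integer works, since the array is dynamic
my_array.append(5)

# Appending a value of a different type is refused
try:
    my_array.append("six")
    rejected = False
except TypeError:
    rejected = True

print(my_array)
print(my_array.typecode, rejected)
```

The typecode attribute reports the type code the array was created with.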

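Some of the more commonly used type codes are b (signed char), B (unsigned char), h (signed short), i (signed int), l (signed long), f (float), and d (double).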

Guide to Python Dictionaries

The Dictionary is a central data structure in Python. It stores data in key-value pairs.

Due to this, it can also be called a map, hash map, or a lookup table.

There are a few different variants of a dictionary:

  • dict
  • collections.defaultdict
  • collections.OrderedDict
  • collections.ChainMap

Dictionaries rely on hash values that identify keys for the lookup operation. A key's hash value must never change while the key is stored in the hash table.

To learn more about dictionaries in Python, read our "Guide to Dictionaries in Python".

Hashable Type and Hash Values

Every hashable object has a hash value, and the built-in hash() function can be used to retrieve it. This value is calculated at runtime and, for strings, can differ between runs of the interpreter, though if a == b, hash(a) will always be equal to hash(b):

random_string = "This is a random string"
a = 23
b = 23.5
print(hash(random_string))
print(hash(a))
print(hash(b))

This code will result in something along the lines of:

4400833007061176223
23
1152921504606846999

Numeric values that are equal have the same hash value, regardless of their type:

a = 23
b = 23.0
print(hash(a))
print(hash(b))

Results in:

23
23

This mechanism is what makes dictionaries blazingly fast in Python - a key's hash value leads almost directly to its entry, giving lookups an average time of O(1).
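Not every object can be hashed, though. The following sketch compares a tuple with a list and shows that two equal numeric keys address the same dictionary entry. The names tuple_hash, list_is_hashable, and lookup are illustrative:

```python
# Immutable built-ins such as tuples of hashable items are hashable
tuple_hash = hash((1, 2, 3))

# Mutable containers such as lists are not
try:
    hash([1, 2, 3])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print(list_is_hashable)

# 23 and 23.0 are equal and share a hash, so they map to one entry
lookup = {23: "int key"}
lookup[23.0] = "float key"
print(lookup)
```

This is why lists can't be used as dictionary keys, while tuples of hashable items can.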

Python Dictionary

The contents of a dictionary (dict type) are defined within curly braces {}. The syntax resembles JSON, given the key-value pairs:

my_dict = {
    "name": "Mike James",
    "age": 32,
    "country": "United Kingdom"
}

A dictionary can have an arbitrary number of pairs. Keys must be hashable and unique (equal keys produce the same hash). If a key is repeated, the dictionary keeps a single entry for it, and the last value assigned overwrites the earlier ones.
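Here's a small sketch of what a repeated key does in practice, using a separate, illustrative duplicate_dict:

```python
duplicate_dict = {
    "name": "Mike James",
    "age": 32,
    "name": "James Mike"  # Same key again: this value replaces the first one
}

print(duplicate_dict)
print(len(duplicate_dict))
```

Only two entries remain, and "name" holds the value that was written last.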

Since dictionaries are mutable, we can add a new key-value pair just by 'accessing' a non-existent key and setting its value:

my_dict["countries_visited"] = ["Spain", "Portugal", "Russia"]
print(my_dict)

This will result in:

{'name': 'Mike James', 'age': 32, 'country': 'United Kingdom', 'countries_visited': ['Spain', 'Portugal', 'Russia']}

Python's core dict will probably solve most of your problems, but if not, there are a few dictionary types that can be imported from the standard library's collections module.

Python DefaultDict

A problem that you can encounter when using a dict is trying to access the value of a key that doesn't exist.

For example, in our previous demonstration, if we accessed print(my_dict["zip_code"]), we would get a KeyError: 'zip_code' as the zip_code key doesn't exist.

This is when defaultdict comes into play, as it accepts a default_factory - a function that returns the default value if a key is not present. This way, looking up a missing key with [] on a defaultdict doesn't raise a KeyError:

from collections import defaultdict 

# default_factory
def safe_function():
    return "Value not defined"

my_dict = defaultdict(safe_function)
my_dict["name"] = "Mark James"
my_dict["age"] = 32

print(my_dict["country"]) # This will output Value not defined and not raise a KeyError

This, as expected, results in:

Value not defined

A defaultdict can be filled key by key, as above, or initialized from an existing dict passed after the factory, as in defaultdict(safe_function, {"name": "Mark James"}).
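A common use of defaultdict is grouping values, with list as the default_factory. The sketch below is one such example under assumed data (the names list and by_initial are illustrative). It also shows that get() doesn't call the factory:

```python
from collections import defaultdict

names = ["Mark", "John", "Emma", "Mia", "Jack"]

# A missing key starts out as an empty list, so we can append right away
by_initial = defaultdict(list)
for name in names:
    by_initial[name[0]].append(name)

print(dict(by_initial))

# get() neither calls default_factory nor inserts the key
missing = by_initial.get("Z")
print(missing, "Z" in by_initial)
```

Only the [] lookup triggers the factory, so it's worth keeping in mind that reading a missing key this way also adds it to the dictionary.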

Python ChainMap

This type of dictionary allows us to connect multiple dictionaries into one - to chain them. When accessing data, it searches the dictionaries one by one, in order, until it finds the first one that contains the key:

from collections import ChainMap

my_dict1 = {
    "name": "Mike James",
    "age": 32
}

my_dict2 = {
    "name": "James Mike",
    "country": "United Kingdom",
    "countries_visited": ["Spain", "Portugal", "Russia"]
}

my_dict_result = ChainMap(my_dict1, my_dict2)
print(my_dict_result)

This results in a ChainMap:

ChainMap({'name': 'Mike James', 'age': 32}, {'name': 'James Mike', 'country': 'United Kingdom', 'countries_visited': ['Spain', 'Portugal', 'Russia']})

We can also define duplicate keys. 'name' is present in both dictionaries. However, when we try to access the 'name' key:

print(my_dict_result['name'])

It finds the first matching key:

Mike James

Also, keep in mind that a ChainMap can still raise a KeyError if none of the underlying dictionaries contains the key, just like a core dict.
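The following sketch illustrates both points: a missing key raises a KeyError, and writes through a ChainMap land in the first dictionary only. The chained and found names are illustrative:

```python
from collections import ChainMap

my_dict1 = {"name": "Mike James", "age": 32}
my_dict2 = {"name": "James Mike", "country": "United Kingdom"}
chained = ChainMap(my_dict1, my_dict2)

# A key that's missing from every dictionary raises a KeyError
try:
    chained["zip_code"]
    found = True
except KeyError:
    found = False
print(found)

# Assignments go to the first dictionary, leaving the others untouched
chained["country"] = "Spain"
print(my_dict1)
print(my_dict2)
print(chained["country"])
```

After the assignment, the lookup finds 'country' in my_dict1 first, so the value from my_dict2 is shadowed rather than changed.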

Python OrderedDict

Note: As of Python 3.7, dictionaries are guaranteed to preserve insertion order. In CPython 3.6, this was an implementation detail.

The OrderedDict is used when you'd like to maintain the order of insertion of key-value pairs in a dictionary. In Python versions before 3.7, dict doesn't guarantee this, and you may end up iterating over the pairs in a different order than they were inserted.

If this isn't an important thing - you can comfortably use a dictionary. If this is important, though, such as when dealing with dates, you'll want to use an OrderedDict instead:

from collections import OrderedDict

ordered_dict = OrderedDict()
ordered_dict['a'] = 1
ordered_dict['b'] = 2
ordered_dict['c'] = 3
ordered_dict['d'] = 4
  
print(ordered_dict)

This results in:

OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

Note: Even though dict objects preserve the insertion order as of Python 3.7 (and CPython 3.6) - use OrderedDict if insertion order is required and your code has to run on older versions. A regular dict won't guarantee insertion order on those versions.
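Beyond ordering itself, OrderedDict still differs from dict in a couple of ways. This sketch shows two of them, order-sensitive equality and the move_to_end() method, using illustrative first and second dictionaries:

```python
from collections import OrderedDict

first = OrderedDict([("a", 1), ("b", 2)])
second = OrderedDict([("b", 2), ("a", 1)])

# Two OrderedDicts are equal only if their order matches as well
print(first == second)

# Plain dicts ignore order when comparing
print(dict(first) == dict(second))

# move_to_end() reorders an existing key
first.move_to_end("a")
print(list(first))
```

These differences can be a reason to reach for OrderedDict even on recent Python versions.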

Dictionary Methods vs Array Methods

Now that we've got the hang of things, let's cover the main methods that these two types implement. There are four basic operations that can be performed on data: access (get), update, add, and delete.

Let's define an array and dictionary that we'll be experimenting with:

example_dict = {
    "id": 101,
    "name": "Marc Evans",
    "date_of_birth": "13.02.1993.",
    "city": "Chicago",
    "height": 185,
}

example_array = [1, 2, 3, "red", "green", "yellow", "blue", 4]

Getting Data

Dictionary

There are multiple ways to access data in a dictionary:

  • Referring to a key name - my_dict["key_name"]:

    print(example_dict["name"]) 
    # Output: Marc Evans
    
  • Calling the get() method - my_dict.get("key_name"):

    print(example_dict.get("city")) 
    # Output: Chicago
    
  • Accessing all keys in a dictionary - my_dict.keys() - returns a view of the keys:

    print(example_dict.keys()) 
    # Output: dict_keys(['id', 'name', 'date_of_birth', 'city', 'height'])
    
  • Accessing all values in a dictionary - my_dict.values() - returns a view of the values:

    print(example_dict.values()) 
    # Output: dict_values([101, 'Marc Evans', '13.02.1993.', 'Chicago', 185])
    
  • Accessing all key-value pairs: my_dict.items() - returns a view of (key, value) tuples:

    print(example_dict.items()) 
    # Output: dict_items([('id', 101), ('name', 'Marc Evans'), ('date_of_birth', '13.02.1993.'), ('city', 'Chicago'), ('height', 185)])
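The access methods above differ in how they treat missing keys and later changes. Here's a small sketch on a separate, illustrative person dictionary:

```python
person = {"id": 101, "name": "Marc Evans", "city": "Chicago"}

# get() returns None, or a supplied default, instead of raising a KeyError
zip_code = person.get("zip_code")
zip_or_default = person.get("zip_code", "unknown")
print(zip_code, zip_or_default)

# keys() returns a view that reflects later changes to the dictionary
keys_view = person.keys()
person["height"] = 185
print(list(keys_view))
```

Wrapping a view in list() gives an independent snapshot if one is needed.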
    
Array

There is one basic way to get data from an array:

  • By referring to an element's index - my_array[index_number]:

    print(example_array[3]) 
    # Output: red
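The same index syntax also accepts negative indexes and slices. A brief sketch on an illustrative copy of the array, named sample_array so the original stays untouched:

```python
sample_array = [1, 2, 3, "red", "green", "yellow", "blue", 4]

# A negative index counts from the end of the array
last = sample_array[-1]
print(last)

# A slice returns a new list with the elements from index 3 up to, but not including, 7
colors = sample_array[3:7]
print(colors)
```

Slicing creates a new list, so changing colors doesn't modify sample_array.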
    

Updating Data

Dictionary

There are 2 ways to update data in a dictionary:

  • Directly setting a new value to a certain key - my_dict["key"] = new_value:

    example_dict["height"] = 190
    print(example_dict["height"]) 
    # Output: 190
    
  • Calling the update() method - my_dict.update({"key": new_value}) - the argument can be a dictionary (an iterable of key-value pairs or keyword arguments also work):

    example_dict.update({"height": 190})
    print(example_dict["height"]) 
    # Output: 190
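Besides a dictionary, update() also takes keyword arguments or an iterable of key-value pairs. A quick sketch on an illustrative profile dictionary:

```python
profile = {"height": 185}

# Keyword arguments work when the keys are valid identifiers
profile.update(height=190)

# An iterable of (key, value) pairs is accepted too
profile.update([("city", "Chicago"), ("age", 45)])

print(profile)
```

Existing keys are overwritten and new keys are added in a single call.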
    
Array

If an array is mutable, it can be changed in a similar fashion as getting data:

  • By referring to an element's index and setting a different value: my_array[index_number] = new_value

    example_array[3] = "purple" 
    print(example_array) 
    # Output: [1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4]
    

Adding Data

Dictionary

There are 2 ways to add data to a dictionary:

  • Setting a value to a new key, which will automatically create a key-value pair and add it: my_dict["new_key"] = value:

    example_dict["age"] = 45
    print(example_dict) 
    # Output: {'id': 101, 'name': 'Marc Evans', 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 190, 'age': 45}
    
  • Calling the update() method - my_dict.update({"new_key": value}):

    example_dict.update({"age": 45}) 
    
Array

There are a couple of ways to add data to an array (though, an array must be mutable):

  • Calling the append() method - my_array.append(new_element) - it adds new_element to the end of my_array:

    example_array.append("gray")
    print(example_array) 
    # Output: [1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'gray']
    
  • Calling a method insert() - my_array.insert(index_number, new_element) - inserts a new_element at the position index_number:

    example_array.insert(0, 0) 
    print(example_array)
    # Output: [0, 1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'gray']
    
  • Calling the extend() method - my_array.extend(my_array2) - inserts elements of my_array2 to the end of my_array:

    example_array2 = [5, 6]
    example_array.extend(example_array2)
    print(example_array)
    # Output: [0, 1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'gray', 5, 6]
    

Deleting Data

Dictionary

There are multiple ways to delete data from a dictionary:

  • Calling a method pop() - my_dict.pop("key_name") - takes the name of the key to be deleted

    example_dict.pop("name")
    print(example_dict)
      
    # {'id': 101, 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 185}
    
  • Calling the popitem() method - my_dict.popitem() - in Python 3.7+, it deletes the last added key-value pair, and in Python versions below 3.7 it deletes an arbitrary key-value pair:

    example_dict.popitem()
    print(example_dict)
      
    #{'id': 101, 'name': 'Marc Evans', 'date_of_birth': '13.02.1993.', 'city': 'Chicago'}
    
  • Using del keyword - del my_dict["key_name"]

    del example_dict['name']
    print(example_dict)
      
    # {'id': 101, 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 185}
      
    # del dict deletes the entire dictionary
    del example_dict
    print(example_dict)
      
    # NameError: name 'example_dict' is not defined
    
  • Calling the clear() method - my_dict.clear() - it empties the dictionary, but it will still exist as an empty one {}

    example_dict.clear()
    print(example_dict)
      
    # {}
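One detail worth knowing about pop() is that it returns the removed value and raises a KeyError for a missing key unless a default is supplied. A short sketch on an illustrative profile dictionary:

```python
profile = {"id": 101, "name": "Marc Evans"}

# pop() hands back the value it removed
removed = profile.pop("name")
print(removed)

# With a default, a missing key doesn't raise
fallback = profile.pop("zip_code", None)
print(fallback)

# Without a default, a missing key raises a KeyError
try:
    profile.pop("zip_code")
    raised = False
except KeyError:
    raised = True
print(raised)
```

The del keyword behaves like pop() without a default here: deleting a missing key raises a KeyError as well.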
    
Array

There are a few ways to delete data from an array:

  • Calling a method pop() - my_array.pop(index_number) - deletes an element at the specified index_number:

    example_array.pop(2)
    print(example_array)
      
    # [1, 2, 'red', 'green', 'yellow', 'blue', 4]
    
  • Calling the remove() method - my_array.remove(value) - deletes the first item with the specified value:

    example_array.remove(2)
    print(example_array)
      
    # [1, 3, 'red', 'green', 'yellow', 'blue', 4]
    
  • Calling a method clear() - my_array.clear() - just like in a dictionary, it removes all the elements from an array, leaving an empty one []:

    example_array.clear()
    print(example_array)
      
    # []
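The del keyword also works on lists, by index or by slice. A minimal sketch on an illustrative sample_array:

```python
sample_array = [1, 2, 3, "red", "green", "yellow", "blue", 4]

# Delete a single element by its index
del sample_array[0]
print(sample_array)

# Delete a range of elements with a slice
del sample_array[2:4]
print(sample_array)
```

Unlike pop(), del doesn't return the removed elements.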
    

Conclusion

In this comprehensive guide, we embarked on a deep dive into Python's diverse data structures, specifically focusing on arrays and dictionaries. Our exploration led us through Python's list, tuple, str, bytes, bytearray, and array.array, each with its own strengths and use cases. On the dictionary side, we navigated the intricacies of hashable types, Python's native dictionary, and other dictionary-like structures such as defaultdict, ChainMap, and OrderedDict.

The choice between dictionaries and arrays isn't purely binary; rather, it's contingent upon the problem at hand. Dictionaries are exceptionally efficient when it comes to associating keys with values and quickly retrieving data using those keys. They are versatile, allowing for the use of various data types as keys, provided they are hashable. On the other hand, arrays (and array-like structures) are sequential and indexed, making them ideal for ordered data, numerical operations, and when the order of elements is paramount.

Furthermore, the methods associated with both structures serve distinct purposes, from data retrieval and updating to addition and deletion. As developers, understanding the subtleties of these methods empowers us to write cleaner, more efficient code.

In summary, both arrays and dictionaries are foundational to Python and have their specific niches. As with any tool, understanding when and how to use them is key. Whether you're performing iterative operations on a list or mapping keys to values in a dictionary, Python's rich standard library has you covered.

Last Updated: September 29th, 2023

© 2013-2024 Stack Abuse. All rights reserved.
