Dictionaries vs Arrays in Python - Deep Dive - Stack Abuse

Introduction

In this guide, we'll take a look at two of Python's most popular data structures - Dictionaries and Arrays. Each of these provides a specific way of arranging your data, with pros and cons for certain tasks, and knowing when to use which will let you make the most of their built-in functionality.

Note: This guide assumes Python 3.x, and most of it is aimed at recent 3.x versions. We will, however, also note some key differences from Python 2.x.

Guide to Python Arrays

An Array is one of the fundamental data structures in computer science - a sequence of 0..n elements, where each element has an index.

Most arrays have a fixed size, so they take a chunk of memory every time a new one is created:

[Figure: array indexing]

Here, we've got a simple array consisting of 7 elements. Indexing typically starts at 0, and each element has a positional index that we can use to access it. This gives array access a time complexity of O(1).

Python's general-purpose arrays are dynamically typed, which means that the objects in an array have a type, but the array itself is not restricted to only one type - you can have an array consisting of an integer, a string and an object, or even of another array that's heterogeneously mixed as well.

There are 6 important types of arrays in Python: list, tuple, str, bytes, bytearray and array.array.

When talking about each of them, there are a few key properties we'll take into account:

  • Whether they're dynamic or not dynamic
  • Whether they're statically or dynamically typed
  • Whether they're mutable or immutable
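As a rough sketch of how two of these properties can be probed at runtime, the snippet below checks each of the six types for an append() method (a stand-in for "dynamic") and for item assignment support (a stand-in for "mutable"). The helper names is_dynamic and is_mutable are made up for this illustration, and the hasattr() checks are a heuristic rather than an official classification:

```python
import array

# One small sample object per array type discussed in this guide
samples = {
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "str": "abc",
    "bytes": bytes((1, 2, 3)),
    "bytearray": bytearray((1, 2, 3)),
    "array.array": array.array("i", (1, 2, 3)),
}

def is_dynamic(seq):
    # Heuristic (assumption): types that can grow expose append()
    return hasattr(seq, "append")

def is_mutable(seq):
    # Heuristic (assumption): in-place item assignment needs __setitem__
    return hasattr(seq, "__setitem__")

summary = {name: (is_dynamic(seq), is_mutable(seq)) for name, seq in samples.items()}

for name, (dynamic, mutable) in summary.items():
    print(f"{name:12} dynamic={dynamic} mutable={mutable}")
```

With these checks, list, bytearray and array.array come out as both dynamic and mutable, while tuple, str and bytes come out as neither.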

Python Lists

A list in Python is dynamic (non-fixed size), dynamically typed (elements not restricted to a single type) and mutable (elements can be changed in-place).

In Python, a list is defined by declaring its elements within square brackets []. Let's go ahead and define a list:

myList = [1, 2, 3, "Mark", "John", "Emma"]
print(myList)

It contains a few integers and a few strings, denoting names. Since lists are dynamically typed, this is allowed:

[1, 2, 3, 'Mark', 'John', 'Emma']    

Since lists are dynamic, we can change the number of elements by adding a new one, for example:

myList.append(4)
myList.append("Peter")
print(myList)

This results in our list having 8 elements, instead of the 6 we've defined in the beginning:

[1, 2, 3, 'Mark', 'John', 'Emma', 4, 'Peter']

Now, let's try replacing an element and adding a new one. We'll check the ID of the list (reference in memory) to confirm that it's not switched out under the hood with a new copy that contains either added elements or replaced ones:

myList = [1, 2, 3, "Mark", "John", "Emma", 4, "Peter"]
# Print original list and its ID
print('Original list: ', myList)
print('ID of object in memory: ', id(myList))

# Modify existing element and add a new one
myList[4] = "Anna"
myList.append("Dan")

# Print changed list and its ID
print('Changed list: ', myList)
print('ID of object in memory: ', id(myList))

Running this code results in:

Original list:  [1, 2, 3, 'Mark', 'John', 'Emma', 4, 'Peter']
ID of object in memory:  140024176315840
Changed list:  [1, 2, 3, 'Mark', 'Anna', 'Emma', 4, 'Peter', 'Dan']
ID of object in memory:  140024176315840

The fact that myList points to the same object in-memory (140024176315840) further goes to show how lists are mutable.

Note: Python's lists can even store functions in a sequence:

def f1():
    return "Function one"

def f2():
    return "Function two"

def f3():
    return "Function three"

listOfFunctions = [f1, f2, f3]
print(listOfFunctions)

Which will result in:

[<function f1 at 0x0000016531807488>, <function f2 at 0x00000165318072F0>, <function f3 at 0x0000016531807400>]

Our output consists of functions at the given addresses. Now let's try and access a function and run it:

print(listOfFunctions[0]())

Since the first element of this list is f1, calling it returns its string, which print() then outputs:

Function one

Lists are the most commonly used type of arrays in Python. They are easy to use and intuitive. Additionally, their time complexity for accessing elements is O(1).

Python Tuples

A tuple in Python is non-dynamic (fixed size), dynamically typed (elements not restricted to a single type) and immutable (elements cannot be changed in-place).

In addition to that, we use parentheses () when defining them:

myTuple = (1, 2, 3, "Mark", "John", "Emma")
print(myTuple)

Since tuples are dynamically typed, we can have elements of different types present within them:

(1, 2, 3, 'Mark', 'John', 'Emma')

Since tuples are non-dynamic, they have a fixed size, and we can't append() elements to them in-place, since this changes their size. Thus, tuples don't have an append() method.

We can, however, create a new tuple consisting of smaller tuples, which again is of fixed size:

myTuple = (1, 2, 3)
anotherTuple = ("Mark", "John", "Emma")
print('Original tuple: ', myTuple)
print('ID of object in memory: ', id(myTuple))

myTuple = myTuple + anotherTuple
print('New tuple: ', myTuple)
print('ID of object in memory: ', id(myTuple))

We've assigned the same variable reference to a new object created to contain both of these tuples together - even though the reference variable is the same, it points to a totally different object in memory:

Original tuple:  (1, 2, 3)
ID of object in memory:  139960147395136

New tuple:  (1, 2, 3, 'Mark', 'John', 'Emma')
ID of object in memory:  139960147855776

The time complexity for accessing items in a tuple is also O(1).
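To make the fixed size and immutability described above concrete, here is a minimal sketch that attempts both an item assignment and an append() on a tuple and records which exception each attempt raises. The variable names are illustrative only:

```python
my_tuple = (1, 2, 3)
errors = []

try:
    my_tuple[0] = 99        # item assignment is not supported by tuples
except TypeError as exc:
    errors.append(type(exc).__name__)

try:
    my_tuple.append(4)      # tuples have no append() method at all
except AttributeError as exc:
    errors.append(type(exc).__name__)

print(errors)    # ['TypeError', 'AttributeError']
print(my_tuple)  # (1, 2, 3) - unchanged
```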

Python Strings

In Python 3, the str type (short for String) was overhauled compared to Python 2. In Python 2, it represented both text and bytes, but in Python 3 these two are totally different data types.

A string in Python is non-dynamic (fixed size), statically typed (elements restricted to a single type) and immutable (elements cannot be changed in-place).

A sequence of Unicode characters, enclosed within quotation marks "", is used to define a string:

myStr = "qwerty"
print(myStr)

This will result in:

qwerty

We can access elements via standard array indexing, but can't change them:

print(myStr[0])
myStr[0] = "p"

This will result in:

q
TypeError: 'str' object does not support item assignment

In fact - strings are recursive in a sense. Python has no separate character type, so indexing into a string returns another string, of length 1.

myStr has the length of 5, and is made up of five individual strings, of length 1:

myStr = "abcde"
print(len(myStr)) # Check the length of our str
print(type(myStr)) # Check the type of our str

print(myStr[0]) # Letter 'a'
print(len(myStr[0])) # Check the length of our letter
print(type(myStr[0])) # Check the type of our letter 'a'

This results in:

5
<class 'str'>
a
1
<class 'str'>

Both our 'character' and string are of the same class - str.

Similar to tuples, we can concatenate strings - which results in a new string consisting of the two smaller ones:

myStr = "qwerty"
myStr2 = "123"

result = myStr + myStr2
print(result)

And the result is:

qwerty123

Again, strings only support characters and we cannot mix in other types:

myStr = "qwerty"
myStr2 = 123

result = myStr + myStr2
print(result)

Which will result in:

TypeError: can only concatenate str (not "int") to str

However, int, as well as every other type, can be cast (converted) into a string representation:

myStr = "qwerty"
myStr2 = str(123) # int 123 is now cast to str

result = myStr + myStr2
print(result)

This will result in:

qwerty123

With this method you can print, for example, ints and strings in the same line:

myStr = "qwerty"
# print("myStr's length is: " + len(myStr)) # Uncommenting this raises a TypeError

print("myStr's length is: " + str(len(myStr))) # String concatenation resulting in 'myStr's length is: 6'
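As an alternative sketch to explicit str() conversion, formatted string literals (f-strings, available from Python 3.6) and str.format() convert the embedded values for you. The variable names below are illustrative:

```python
my_str = "qwerty"

# Three equivalent ways of mixing text and an int in one string
message_concat = "my_str's length is: " + str(len(my_str))
message_fstring = f"my_str's length is: {len(my_str)}"
message_format = "my_str's length is: {}".format(len(my_str))

print(message_fstring)  # my_str's length is: 6
```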

Python Bytes

Bytes in Python are non-dynamic (fixed size), statically typed (elements restricted to a single type) and immutable (elements cannot be changed in-place).

A bytes object consists of multiple single bytes, represented as integers ranging from 0 to 255 (8-bit).

Defining a bytes object is slightly different from other arrays. One way is to explicitly cast a tuple of integers into bytes:

myBytes = bytes((0, 1, 2))
print(myBytes)

This will result in:

b'\x00\x01\x02'

If the tuple contains elements that aren't integers, a TypeError is raised:

myBytes = bytes((0, 1, 2, 'string'))
TypeError: 'str' object cannot be interpreted as an integer

When working with str's, an array of bytes must be encoded with a charset, otherwise it'll be ambiguous as to what they represent:

myStr = "This is a string"

# myBytes = bytes(myStr) # this would raise TypeError: string argument without an encoding

myBytes = bytes(myStr, 'utf-8')
print(myBytes) # this will print out b'This is a string'

If you're unfamiliar with how encoding bytes works - read our guide on How to Convert Bytes to String in Python.
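As a minimal sketch of the round trip between text and bytes, str.encode() produces the same result as calling bytes() with an encoding, and bytes.decode() turns the bytes back into text. UTF-8 is assumed here purely for illustration:

```python
text = "This is a string"

encoded = text.encode("utf-8")     # equivalent to bytes(text, "utf-8")
decoded = encoded.decode("utf-8")  # back to a str

print(encoded)          # b'This is a string'
print(encoded[0])       # 84 - indexing bytes yields an int (the code for 'T')
print(decoded == text)  # True
```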

Furthermore, a bytes array can be made mutable by casting it to another array type called the bytearray.

Python Bytearray

A Bytearray in Python is dynamic (non-fixed size), statically typed (elements restricted to a single type) and mutable (elements can be changed in-place).

myByteArray = bytearray((0, 1, 2))

Now, we can try to add elements to this array, as well as change an element:

myByteArray = bytearray((0, 1, 2))
print(myByteArray)
print("ByteArray ID: ", id(myByteArray))

myByteArray.append(3)
print(myByteArray)
print("ByteArray ID: ", id(myByteArray))

myByteArray[3] = 50
print(myByteArray)
print("ByteArray ID: ", id(myByteArray))

This results in:

bytearray(b'\x00\x01\x02')
ByteArray ID:  140235112668272

bytearray(b'\x00\x01\x02\x03')
ByteArray ID:  140235112668272

bytearray(b'\x00\x01\x022')
ByteArray ID:  140235112668272

These all have the same object ID - pointing to the same object in-memory being changed.

A bytearray can be cast back to a bytes array; though, keep in mind that it's an expensive operation which takes O(n) time, since the contents are copied.
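A short sketch of that conversion: bytes() builds a new, immutable object from the bytearray's contents, so later changes to the bytearray don't affect it. The names are illustrative:

```python
my_byte_array = bytearray((0, 1, 2))

frozen = bytes(my_byte_array)  # new immutable object holding a copy of the data
my_byte_array[0] = 255         # mutate the original bytearray afterwards

print(frozen)         # b'\x00\x01\x02'
print(my_byte_array)  # bytearray(b'\xff\x01\x02')
```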

Python array.array

So far, we've been working with built-in types. However, another type of array exists, in the array module.

This array is dynamic (non-fixed size), statically typed (elements restricted to a single type) and mutable (can be changed in-place). We need to explicitly note the type we'll be using in an array and these types are C-style types: 32-bit integers, floating point numbers, doubles, etc.

Each of these has a type code - i for integers, f for floats and d for doubles. Let's make an integer array via the array module:

import array

myArray = array.array("i", (1, 2, 3, 4))

Some of the more used C-like types:

[Figure: table of commonly used C types]
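To illustrate the static typing of array.array, the sketch below appends a valid integer and then tries to append a float to an "i" array, which is rejected. The item size in bytes depends on the platform, so no value is claimed for it here:

```python
import array

my_array = array.array("i", (1, 2, 3, 4))
my_array.append(5)        # fine: 5 fits the declared integer type

try:
    my_array.append(2.5)  # a float doesn't match the "i" type code
    rejected = False
except TypeError:
    rejected = True

print(my_array)           # array('i', [1, 2, 3, 4, 5])
print(my_array.typecode)  # i
print(my_array.itemsize)  # size of one element in bytes (platform-dependent)
print(rejected)           # True
```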

Guide to Python Dictionaries

The Dictionary is a central data structure in Python. It stores data in key-value pairs.

Due to this, it can also be called a map, hash map or a lookup table.

There are a few different variants of a dictionary:

  • dict
  • collections.defaultdict
  • collections.OrderedDict
  • collections.ChainMap

Dictionaries rely on hash values, which are used to locate keys during the lookup operation. The hash value of a key must never change while the key is stored in the hash table.

Hashable Type and Hash Values

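Every hashable object has a hash value, and the built-in hash() function can be used to retrieve it. Immutable built-ins such as int, float and str are hashable, while mutable ones such as list and dict are not. The value is calculated at runtime, and for strings it changes between interpreter runs by default, though if a == b, hash(a) will always be equal to hash(b):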

randomString = "This is a random string"
a = 23
b = 23.5
print(hash(randomString))
print(hash(a))
print(hash(b))

This code will result in something along the lines of:

4400833007061176223
23
1152921504606846999

Note: Numeric values that are equal have the same hash value, regardless of their type:

a = 23
b = 23.0
print(hash(a))
print(hash(b))

Results in:

23
23

This mechanism is what makes dictionaries blazingly fast in Python - a key's hash leads almost directly to its slot in the table, giving dictionaries an average lookup time of O(1).
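The following sketch shows two consequences of hashing for dictionary keys: equal numbers address the same entry regardless of their type, and a mutable list can't be hashed at all, whereas a tuple of hashable items can serve as a key. The data is made up for illustration:

```python
# Equal numbers hash equally, so they address the same dictionary entry
ages = {23: "int key"}
ages[23.0] = "float key"       # 23 == 23.0, so this overwrites the value
print(ages)                    # {23: 'float key'}

# Mutable built-ins such as list are not hashable ...
try:
    hash([1, 2, 3])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)        # False

# ... but an immutable tuple of hashable items works as a key
capitals = {(51.5, -0.1): "London"}
print(capitals[(51.5, -0.1)])  # London
```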

Python Dictionary

The contents of a dictionary (dict type) are defined within curly braces {}. The syntax resembles JSON, given the key-value pairs:

myDict = {
    "name": "Mike James",
    "age": 32,
    "country": "United Kingdom"
}

A dictionary can have an arbitrary number of pairs. Keys must be hashable and unique (duplicate keys result in the same hash). If the same key is defined more than once, no error is raised - the key is kept once and the value given last overwrites the earlier one.
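A quick sketch of what happens when a key is repeated in a dictionary literal. The key order shown assumes Python 3.7+, where insertion order is preserved:

```python
# "name" appears twice; no error is raised
my_dict = {"name": "Mike James", "age": 32, "name": "James Mike"}

print(my_dict)       # {'name': 'James Mike', 'age': 32}
print(len(my_dict))  # 2
```

The key keeps its original position, but its value is the one assigned last.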

Since dictionaries are mutable, we can add a new key-value pair just by 'accessing' a nonexistent key and setting its value:

myDict["countries_visited"] = ["Spain", "Portugal", "Russia"]
print(myDict)

This will result in:

{'name': 'Mike James', 'age': 32, 'country': 'United Kingdom', 'countries_visited': ['Spain', 'Portugal', 'Russia']}

Python's core dict will probably solve most of your problems, but if not, there are a few other dictionary types that can be imported from the collections module.

Python DefaultDict

A problem that you can encounter when using a dict is trying to access the value of a key that doesn't exist.

For example, in our previous demonstration, if we ran print(myDict["zip_code"]), we would get a KeyError: 'zip_code' as zip_code doesn't exist.

This is where defaultdict comes into play. It takes a default_factory - a function which returns the default value if a key is not present. This way, looking up a missing key with square brackets inserts and returns the default value instead of raising a KeyError:

from collections import defaultdict 

def safe_function(): # default_factory
    return "Value not defined"

myDict = defaultdict(safe_function)
myDict["name"] = "Mark James"
myDict["age"] = 32

print(myDict["country"]) # This will output Value not defined and not raise a KeyError

This, as expected, results in:

Value not defined

In this example, we've defined every key-value pair 'manually', which is more tedious than the JSON-like syntax. A defaultdict can also be initialized from an existing dictionary passed after the default_factory, e.g. defaultdict(safe_function, {"name": "Mark James"}).
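A common use of defaultdict is grouping values under keys without first checking whether the key exists. The sketch below uses list as the default_factory and made-up data. It also shows a side effect worth knowing: a square-bracket lookup of a missing key inserts that key, while get() does not call the factory:

```python
from collections import defaultdict

visits = [("Mike", "Spain"), ("Anna", "Portugal"), ("Mike", "Russia")]

countries_by_person = defaultdict(list)  # missing keys start as an empty list
for person, country in visits:
    countries_by_person[person].append(country)

print(dict(countries_by_person))
# {'Mike': ['Spain', 'Russia'], 'Anna': ['Portugal']}

_ = countries_by_person["Dan"]           # lookup inserts 'Dan' with []
print("Dan" in countries_by_person)      # True

print(countries_by_person.get("Emma"))   # None - get() bypasses the factory
print("Emma" in countries_by_person)     # False
```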

Python ChainMap

This type of dictionary allows us to connect multiple dictionaries into one - to chain them. When accessing data, it searches the underlying dictionaries one by one, in order, until it finds the first one containing the key:

from collections import ChainMap

myDict1 = {
    "name": "Mike James",
    "age": 32
}

myDict2 = {
    "name": "James Mike",
    "country": "United Kingdom",
    "countries_visited": ["Spain", "Portugal", "Russia"]    
}

myDictResult = ChainMap(myDict1, myDict2)
print(myDictResult)

This results in a ChainMap:

ChainMap({'name': 'Mike James', 'age': 32}, {'name': 'James Mike', 'country': 'United Kingdom', 'countries_visited': ['Spain', 'Portugal', 'Russia']})

Note: We can define duplicate keys. 'name' is present in both dictionaries. Though, when we try to access the 'name' key:

print(myDictResult['name'])

It finds the first matching key:

Mike James

Also keep in mind that a ChainMap can still raise a KeyError if the key isn't present in any of the underlying dictionaries, since those are core dict objects.
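Here is a small sketch of ChainMap used for layered settings, with hypothetical overrides and defaults dictionaries. Lookups fall through from the first mapping to the later ones, writes go to the first mapping only, and a key missing from every mapping raises a KeyError:

```python
from collections import ChainMap

defaults = {"theme": "light", "language": "en"}
overrides = {"theme": "dark"}

settings = ChainMap(overrides, defaults)  # overrides is searched first

print(settings["theme"])     # dark - found in overrides
print(settings["language"])  # en   - falls through to defaults

settings["font"] = "mono"    # writes only affect the first mapping
print(overrides)             # {'theme': 'dark', 'font': 'mono'}

try:
    settings["timezone"]
    missing_key_raised = False
except KeyError:
    missing_key_raised = True
print(missing_key_raised)    # True
```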

Python OrderedDict

Note: As of Python 3.7, dictionaries are guaranteed to be insertion-ordered. In CPython 3.6, this was already the case as an implementation detail.

The OrderedDict is used when you'd like to maintain the order of insertion of key-value pairs in a dictionary. Before Python 3.7, dict didn't guarantee this, and you could end up iterating over the pairs in a different order than the one they were inserted in.

If this isn't an important thing - you can comfortably use a dictionary. If this is important, though, such as when dealing with dates, you'll want to use an OrderedDict instead:

from collections import OrderedDict

orderedDict = OrderedDict()
orderedDict['a'] = 1
orderedDict['b'] = 2
orderedDict['c'] = 3
orderedDict['d'] = 4
  
print(orderedDict)

This results in:

OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

Note: Even though dict objects preserve the insertion order as of Python 3.7 (and CPython 3.6) - use OrderedDict if insertion order is required and your code has to run on older versions. A regular dict won't guarantee insertion order across prior Python versions.
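Beyond ordering itself, OrderedDict still behaves differently from dict in a couple of ways. As a brief sketch with illustrative data: equality between two OrderedDicts takes order into account, and move_to_end() lets you reorder keys in place:

```python
from collections import OrderedDict

pairs = [("a", 1), ("b", 2), ("c", 3)]
reversed_pairs = list(reversed(pairs))

# Plain dicts compare equal regardless of insertion order
plain_equal = dict(pairs) == dict(reversed_pairs)
# OrderedDicts also compare the order of their items
ordered_equal = OrderedDict(pairs) == OrderedDict(reversed_pairs)

print(plain_equal)    # True
print(ordered_equal)  # False

ordered = OrderedDict(pairs)
ordered.move_to_end("a")  # move key 'a' to the last position
print(list(ordered))      # ['b', 'c', 'a']
```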

Dictionary Methods vs Array Methods

Now that we've got the hang of things, we should cover the main methods that these two types implement. There are four basic operations that can be performed on data: access (get), update, add and delete.

Let's define an array and dictionary that we'll be experimenting on:

exampleDict = {
    "id": 101,
    "name": "Marc Evans",
    "date_of_birth": "13.02.1993.",
    "city": "Chicago",
    "height": 185,
}

exampleArray = [1, 2, 3, "red", "green", "yellow", "blue", 4]

Getting Data

Dictionary: There are multiple ways to access data in a dictionary:

  • Referring to a key name - myDict["key_name"]:

    • print(exampleDict["name"]) 
      # Output: Marc Evans
      
  • Calling the get() method - myDict.get("key_name"):

    • print(exampleDict.get("city")) 
      # Output: Chicago
      
  • Accessing all keys in a dictionary - myDict.keys() - returns a view of the keys:

    • print(exampleDict.keys()) 
      # Output: dict_keys(['id', 'name', 'date_of_birth', 'city', 'height'])
      
  • Accessing all values in a dictionary - myDict.values() - returns a view of the values:

    • print(exampleDict.values()) 
      # Output: dict_values([101, 'Marc Evans', '13.02.1993.', 'Chicago', 185])
      
  • Accessing all key-value pairs: myDict.items() - returns a view of (key, value) tuples:

    • print(exampleDict.items()) 
      # Output: dict_items([('id', 101), ('name', 'Marc Evans'), ('date_of_birth', '13.02.1993.'), ('city', 'Chicago'), ('height', 185)])
      

Array: The basic way to get data from an array is by position:

  • By referring to an element's index - myArray[index_number]:

    • print(exampleArray[3]) 
      # Output: red
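Two details of the access operations above are sketched here with illustrative data: get() returns None (or a fallback you supply) instead of raising a KeyError for a missing key, and index-based access on a list also accepts negative indices and slices:

```python
example_dict = {"id": 101, "name": "Marc Evans", "city": "Chicago"}
example_array = [1, 2, 3, "red", "green", "yellow", "blue", 4]

# get() never raises KeyError for a missing key
zip_code = example_dict.get("zip_code")                 # None
zip_fallback = example_dict.get("zip_code", "unknown")  # 'unknown'

# Negative indices count from the end; slices return a new list
last_item = example_array[-1]   # 4
colors = example_array[3:7]     # ['red', 'green', 'yellow', 'blue']

print(zip_code, zip_fallback, last_item, colors)
```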
      

Updating Data

Dictionary: There are 2 ways to update data in a dictionary:

  • Directly setting a new value to a certain key - myDict["key"] = new_value:

    • exampleDict["height"] = 190
      print(exampleDict["height"]) 
      # Output: 190
      
  • Calling the update() method - myDict.update({"key": new_value}) - the argument is typically a dictionary of the pairs to update:

    • exampleDict.update({"height": 190})
      print(exampleDict["height"]) 
      # Output: 190
      

Array: If an array is mutable, it can be changed in a similar fashion as getting data:

  • By referring to an element's index and setting a different value: myArray[index_number] = new_value

    • exampleArray[3] = "purple" 
      print(exampleArray) 
      # Output: [1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4]
      

Add data

Dictionary: There are 2 ways to add data to a dictionary:

  • Setting a value to a new key, which will automatically create a key-value pair and add it: myDict["new_key"] = value:

    • exampleDict["age"] = 45
      print(exampleDict) 
      # Output: {'id': 101, 'name': 'Marc Evans', 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 190, 'age': 45}
      
  • Calling the update() method - myDict.update({"new_key": value}):

    • exampleDict.update({"age": 45}) 
      

Array: There are a couple of ways to add data to an array (though, an array must be mutable):

  • Calling the append() method - myArray.append(new_element) - it adds new_element to the end of myArray:

    • exampleArray.append("grey")
      print(exampleArray) 
      # Output: [1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'grey']
      
  • Calling a method insert() - myArray.insert(index_number, new_element) - inserts a new_element at the position index_number:

    • exampleArray.insert(0, 0) 
      print(exampleArray)
      # Output: [0, 1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'grey']
      
  • Calling the extend() method - myArray.extend(myArray2) - inserts elements of myArray2 to the end of myArray:

    • exampleArray2 = [5, 6]
      exampleArray.extend(exampleArray2)
      print(exampleArray)
      # Output: [0, 1, 2, 3, 'purple', 'green', 'yellow', 'blue', 4, 'grey', 5, 6]
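The difference between append() and extend() is easy to trip over when the argument is itself a list, so here is a short sketch with made-up data. It also shows update() adding several pairs at once, either from a dictionary or from keyword arguments:

```python
# append() adds its argument as ONE element, even if it is a list
colors = ["red", "green"]
colors.append(["blue", "grey"])
print(colors)       # ['red', 'green', ['blue', 'grey']]

# extend() adds each element of its argument separately
more_colors = ["red", "green"]
more_colors.extend(["blue", "grey"])
print(more_colors)  # ['red', 'green', 'blue', 'grey']

# update() can add several key-value pairs in one call
person = {"id": 101}
person.update({"age": 45, "city": "Chicago"})
person.update(height=185)  # keyword arguments work for string keys
print(person)       # {'id': 101, 'age': 45, 'city': 'Chicago', 'height': 185}
```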
      

Deleting Data

Dictionary: There are multiple ways to delete data from a dictionary:

  • Calling the pop() method - myDict.pop("key_name") - takes the name of the key to be deleted and returns its value:

    • exampleDict.pop("name")
      print(exampleDict)
      
      # {'id': 101, 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 185}
      
  • Calling the popitem() method - myDict.popitem() - in Python 3.7+, it deletes the last added key-value pair and in Python versions below 3.7 it deletes an arbitrary key-value pair:

    • exampleDict.popitem()
      print(exampleDict)
      
      #{'id': 101, 'name': 'Marc Evans', 'date_of_birth': '13.02.1993.', 'city': 'Chicago'}
      
  • Using del keyword - del myDict["key_name"]

    • del exampleDict['name']
      print(exampleDict)
      
      # {'id': 101, 'date_of_birth': '13.02.1993.', 'city': 'Chicago', 'height': 185}
      
      # del dict deletes the entire dictionary
      del exampleDict
      print(exampleDict)
      
      # NameError: name 'exampleDict' is not defined
      
  • Calling the clear() method - myDict.clear() - it empties the dictionary, but it will still exist as an empty one {}

    • exampleDict.clear()
      print(exampleDict)
      
      # {}
      

Array: There are a few ways to delete data from an array:

  • Calling a method pop() - myArray.pop(index_number) - deletes an element at the specified index_number:

    • exampleArray.pop(2)
      print(exampleArray)
      
      # [1, 2, 'red', 'green', 'yellow', 'blue', 4]
      
  • Calling the remove() method - myArray.remove(value) - deletes the first item with the specified value:

    • exampleArray.remove(2)
      print(exampleArray)
      
      # [1, 3, 'red', 'green', 'yellow', 'blue', 4]
      
  • Calling a method clear() - myArray.clear() - just like in dictionary, it removes all the elements from an array, leaving an empty one []:

    • exampleArray.clear()
      print(exampleArray)
      
      # []
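The deletion methods above raise exceptions when the target doesn't exist: pop() on a missing dictionary key raises a KeyError unless a default is given, and remove() on a missing list value raises a ValueError. A minimal sketch with illustrative data, which also shows del working on a list index:

```python
example_dict = {"id": 101, "name": "Marc Evans"}
example_array = [1, 2, 3, "red"]

# A second argument to pop() is returned instead of raising KeyError
missing = example_dict.pop("zip_code", None)
print(missing)        # None
print(example_dict)   # {'id': 101, 'name': 'Marc Evans'}

# del also removes a list element by index
del example_array[0]
print(example_array)  # [2, 3, 'red']

# remove() raises ValueError if the value isn't in the list
try:
    example_array.remove("purple")
    removed = True
except ValueError:
    removed = False
print(removed)        # False
```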
      
Last Updated: June 15th, 2021

    © 2013-2021 Stack Abuse. All rights reserved.