Introduction
Imagine you have a playlist of your favorite songs on your phone. This playlist is a list where each song is placed in a specific order. You can play the first song, skip to the second, jump to the fifth, and so on. This playlist is a lot like an array in computer programming.
Arrays stand as one of the most fundamental and widely used data structures.
In essence, an array is a structured way to store multiple items (like numbers, characters, or even other arrays) in a specific order, and you can quickly access, modify, or remove any item if you know its position (index).
In this guide, we'll give you a comprehensive overview of the array data structure. First, we'll take a look at what arrays are and what their main characteristics are. We'll then transition into the world of Python, exploring how arrays are implemented, manipulated, and applied in real-world scenarios.
Understanding the Array Data Structure
Arrays are among the oldest and most fundamental data structures used in computer science and programming. Their simplicity, combined with their efficiency in certain operations, makes them a staple topic for anyone delving into the realm of data management and manipulation.
An array is a collection of items, typically of the same type, stored in contiguous memory locations.
This contiguous storage allows arrays to provide constant-time access to any element, given its index. Each item in an array is called an element, and the position of an element in the array is defined by its index, which usually starts from zero.
For instance, consider an array of integers: [10, 20, 30, 40, 50]. Here, the element 20 has an index of 1.
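To make the link between contiguous storage and constant-time access more concrete, here is a minimal sketch that uses Python's standard `array` module (covered later in this guide). The helper name `element_address` is made up for illustration; `buffer_info()` and `itemsize` are real attributes of `array` objects.

```python
from array import array

numbers = array('i', [10, 20, 30, 40, 50])

# buffer_info() returns (memory address of the first element, number of elements)
base_address, length = numbers.buffer_info()

def element_address(index):
    # Hypothetical helper: one multiplication and one addition,
    # no matter how many elements the array holds
    return base_address + index * numbers.itemsize

# Byte offset of every element, measured from the start of the buffer
offsets = [element_address(i) - base_address for i in range(length)]
print(offsets)      # evenly spaced, one itemsize apart
print(numbers[1])   # 20
```

Because every element has the same size and sits right after the previous one, finding element `i` never requires walking through the elements before it.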
There are multiple advantages of using arrays to store our data. For example, due to their memory layout, arrays allow for O(1) (constant) time complexity when accessing an element by its index. This is particularly beneficial when we need random access to elements. Additionally, arrays are stored in contiguous memory locations, which can lead to better cache locality and overall performance improvements in certain operations. Another notable advantage of using arrays is that, since arrays have a fixed size once declared, it's easier to manage memory and avoid unexpected overflows or out-of-memory errors.
Note: Arrays are especially useful in scenarios where the size of the collection is known in advance and remains constant, or where random access is more frequent than insertions and deletions.
On the other hand, arrays come with their own set of limitations. One of the primary limitations of traditional arrays is their fixed size. Once an array is created, its size cannot be changed. This can lead to issues like wasted memory (if the array is too large) or the need for resizing (if the array is too small). Besides that, inserting or deleting an element in the middle of an array requires shifting all the elements that follow it, leading to O(n) time complexity for these operations.
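The cost of inserting into the middle can be sketched in a few lines. This is only an illustration: it simulates a fixed-capacity array with a plain Python list, and the function name `insert_with_shift` is hypothetical rather than part of any library.

```python
def insert_with_shift(slots, count, index, value):
    """Insert value at index in a fixed-capacity buffer that holds `count` items."""
    if count >= len(slots):
        raise OverflowError("array is full")
    if not 0 <= index <= count:
        raise IndexError("index out of range")
    # Move elements one slot to the right, starting from the end,
    # so that no value is overwritten before it has been copied
    for i in range(count, index, -1):
        slots[i] = slots[i - 1]
    slots[index] = value
    return count + 1

slots = [10, 20, 30, 40, None, None]   # capacity 6, 4 slots in use
count = insert_with_shift(slots, 4, 1, 15)
print(slots[:count])  # [10, 15, 20, 30, 40]
```

Inserting at index 1 moved three elements; inserting at index 0 of a full-length array would move all of them, which is where the O(n) bound comes from.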
To sum this all up, let's illustrate the main characteristics of arrays using the song playlist example from the beginning of this guide. An array is a data structure that:

Is Indexed: Just like each song on your playlist has a number (1, 2, 3, ...), each element in an array has an index. But, in most programming languages, the index starts at 0. So, the first item is at index 0, the second at index 1, and so on.

Has Fixed Size: When you create a playlist for, say, 10 songs, you can't add an 11th song without removing one first. Similarly, arrays have a fixed size. Once you create an array of a certain size, you can't add more items than its capacity.

Is Homogeneous: All songs in your playlist are music tracks. Similarly, all elements in an array are of the same type. If you have an array of integers, you can't suddenly store a text string in it.

Has Direct Access: If you want to listen to the 7th song in your playlist, you can jump directly to it. Similarly, with arrays, you can instantly access any element if you know its index.

Uses Contiguous Memory: This is a bit more technical. When an array is created in a computer's memory, it occupies a continuous block of memory. Think of it like a row of adjacent lockers in school. Each locker is next to the other, with no gaps in between.
Python and Arrays
Python, known for its flexibility and ease of use, offers multiple ways to work with arrays. While Python does not have a native array data structure like some other languages, it provides powerful alternatives that can function similarly and even offer extended capabilities.
At first glance, Python's list might seem synonymous with an array, but there are subtle differences and nuances to consider:
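| List | Array |
|---|---|
| A built-in Python data structure | Not native in Python; they come from the `array` module |
| Dynamic size | Fixed (predefined) size in the classic definition (objects from Python's `array` module can still grow and shrink) |
| Can hold items of different data types | Hold items of the same type |
| Provide a range of built-in methods for manipulation | Require an import (the standard-library `array` module or an external library such as NumPy) |
| O(1) time complexity for access operations | O(1) time complexity for access operations |
| Consume more memory | More memory-efficient |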
Looking at this table, it's natural to ask: "When to use which?" Well, if you need a collection that can grow or shrink dynamically and can hold mixed data types, Python's list is the way to go. However, for scenarios requiring a more memory-efficient collection with elements of the same type, you might consider using Python's array module or external libraries like NumPy.
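As a rough way to see the memory difference, the sketch below compares a list and an `array` holding the same 1,000 integers. It assumes CPython, where `sys.getsizeof` is available; the exact byte counts vary by platform and Python version, so none are shown.

```python
import sys
from array import array

values = list(range(1000))
as_array = array('i', values)

# getsizeof on a list counts only the list's own table of references,
# so the size of each int object it points to is added separately
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw C values inline, so one call covers everything
array_bytes = sys.getsizeof(as_array)

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
print(array_bytes < list_bytes)
```

The list total is an upper estimate, since CPython shares some small integer objects between containers, but the gap is large enough that the comparison still holds.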
The array Module in Python
When most developers think of arrays in Python, they often default to thinking about lists. However, Python offers a more specialized array structure through its built-in array module. This module provides space-efficient storage of basic C-style data types in Python.
While Python lists are incredibly versatile and can store any type of object, they can sometimes be overkill, especially when you only need to store a collection of basic data types, like integers or floats. The array module provides a way to create arrays that are more memory-efficient than lists for specific data types.
Creating an Array
To use the array module, you first need to import it:
from array import array
Once imported, you can create an array using the array() constructor:
arr = array('i', [1, 2, 3, 4, 5])
print(arr)
Here, the 'i' argument is the type code, which indicates that the array will store signed integers. There are several other type codes available, such as 'f' for single-precision floats and 'd' for double-precision floats.
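The type code determines how many bytes each element occupies and, for floats, how much precision is kept. A small sketch of the three codes mentioned above (the variable names are only for illustration):

```python
from array import array

ints = array('i', [1, 2, 3])
singles = array('f', [0.1])
doubles = array('d', [0.1])

# typecode and itemsize (bytes per element) are attributes of every array
for a in (ints, singles, doubles):
    print(a.typecode, a.itemsize)

# 'd' keeps the full precision of a Python float, 'f' rounds to single precision
print(doubles[0] == 0.1)  # True
print(singles[0] == 0.1)  # False
```

The size of 'i' depends on the platform's C compiler, so it is worth checking `itemsize` rather than assuming a value.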
Accessing and Modifying Elements
You can access and modify elements in an array just like you would with a list:
print(arr[2]) # Outputs: 3
And now, let's modify the element by changing its value to 6:
arr[2] = 6
print(arr) # Outputs: array('i', [1, 2, 6, 4, 5])
Array Methods
The array module provides several methods to manipulate arrays:

append(): Adds an element to the end of the array:
arr.append(7)
print(arr) # Outputs: array('i', [1, 2, 6, 4, 5, 7])

extend(): Appends the elements of an iterable to the end:
arr.extend([8, 9])
print(arr) # Outputs: array('i', [1, 2, 6, 4, 5, 7, 8, 9])

pop(): Removes and returns the element at the given position:
arr.pop(2)
print(arr) # Outputs: array('i', [1, 2, 4, 5, 7, 8, 9])

remove(): Removes the first occurrence of the specified value:
arr.remove(2)
print(arr) # Outputs: array('i', [1, 4, 5, 7, 8, 9])

reverse(): Reverses the order of the array:
arr.reverse()
print(arr) # Outputs: array('i', [9, 8, 7, 5, 4, 1])
Note: There are more methods than we listed here. Refer to the official Python documentation to see a list of all available methods in the array module.
While the array module offers a more memory-efficient way to store basic data types, it's essential to remember its limitations. Unlike lists, arrays are homogeneous. This means all elements in the array must be of the same type. Also, you can only store basic C-style data types in arrays. If you need to store custom objects or other Python types, you'll need to use a list or another data structure.
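The homogeneity rule is enforced at runtime. As a short sketch, trying to append values of the wrong type to an integer array raises a `TypeError`, while a list accepts the same values:

```python
from array import array

arr = array('i', [1, 2, 3])

rejected = []
for value in (4.5, "four", None):
    try:
        arr.append(value)
    except TypeError:
        # The array refuses anything that is not an integer
        rejected.append(value)

print(arr)       # array('i', [1, 2, 3])
print(rejected)  # [4.5, 'four', None]

# A list takes the same values without complaint
mixed = [1, 2, 3]
mixed.extend([4.5, "four", None])
print(mixed)     # [1, 2, 3, 4.5, 'four', None]
```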
NumPy Arrays
NumPy, short for Numerical Python, is a foundational package for numerical computations in Python. One of its primary features is its powerful N-dimensional array object, which offers fast operations on arrays, including mathematical, logical, shape manipulation, and more.
NumPy arrays are more versatile than Python's built-in array module and are a staple in data science and machine learning projects.
Why Use NumPy Arrays?
The first thing that comes to mind is performance. NumPy arrays are implemented in C and allow for efficient memory storage and faster operations due to optimized algorithms and the benefits of contiguous memory storage.
While Python's built-in arrays are one-dimensional, NumPy arrays can be multi-dimensional, making them ideal for representing matrices or tensors.
Finally, NumPy provides a vast array of functions to operate on these arrays, from basic arithmetic to advanced mathematical operations, reshaping, splitting, and more.
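To give a feel for what operating on whole arrays looks like, here is a brief sketch of element-wise arithmetic. The variable names and values are invented for the example; the operators and functions used are standard NumPy.

```python
import numpy as np

prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

totals = prices * quantities   # element-wise multiplication, no explicit loop
discounted = totals * 0.5      # a scalar is applied to every element

print(totals)                        # [10. 40. 90.]
print(discounted.sum())              # 70.0
print(np.sqrt(np.array([1, 4, 9])))  # [1. 2. 3.]
```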
Note: When you know the size of the data in advance, preallocating memory for arrays (especially in NumPy) can lead to performance improvements.
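One way to read this note, sketched below under the assumption that the final size `n` is known up front: allocate the array once with `np.zeros()` and fill it in place, rather than growing it with `np.append()`, which builds a new array on every call.

```python
import numpy as np

n = 5

# Allocate the full array once, then fill it in place
squares = np.zeros(n, dtype=np.int64)
for i in range(n):
    squares[i] = i * i

print(squares)  # [ 0  1  4  9 16]

# Growing with np.append copies the existing data on every call
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i * i)

print(np.array_equal(squares, grown))  # True
```

Both approaches produce the same values; the difference only shows up in running time as `n` gets large.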
Creating a NumPy Array
To use NumPy, you first need to install it (pip install numpy) and then import it:
import numpy as np
Once imported, you can create a NumPy array using the array() function:
arr = np.array([1, 2, 3, 4, 5])
print(arr) # Outputs: [1 2 3 4 5]
You can also create multidimensional arrays:
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix)
This will give us:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Besides these basic approaches, NumPy provides other convenient ways to create arrays. One of them is the arange() function, which creates arrays with regularly incrementing values:
arr = np.arange(10)
print(arr) # Outputs: [0 1 2 3 4 5 6 7 8 9]
Another one is the linspace() function, which creates arrays with a specified number of elements, spaced equally between specified beginning and end values:
even_space = np.linspace(0, 1, 5)
print(even_space) # Outputs: [0.   0.25 0.5  0.75 1.  ]
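Both functions take a few more arguments than the examples above use. As a short sketch, `arange()` also accepts a start and a step, and `linspace()` has an `endpoint` parameter that controls whether the end value is included:

```python
import numpy as np

# arange(start, stop, step): the stop value itself is excluded
evens = np.arange(2, 10, 2)
print(evens)  # [2 4 6 8]

# linspace includes the end value by default
closed = np.linspace(0, 1, 5)
print(closed[-1])  # 1.0

# endpoint=False leaves the end value out and narrows the spacing
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)
```

A rule of thumb: reach for `arange()` when you know the step, and for `linspace()` when you know how many points you want.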
Accessing and Modifying Elements
Accessing and modifying elements in a NumPy array is intuitive:
arr = np.array([1, 2, 3, 4, 5])
print(arr[2]) # Outputs: 3
arr[2] = 6
print(arr) # Outputs: [1 2 6 4 5]
The same works for multi-dimensional arrays:
print(matrix[1, 2]) # Outputs: 6
matrix[1, 2] = 10
print(matrix)
This changes the value of the element in the second row (index 1) and the third column (index 2):
[[ 1  2  3]
 [ 4  5 10]
 [ 7  8  9]]
Changing the Shape of an Array
NumPy offers many functions and methods to manipulate and operate on arrays. For example, you can use the reshape() method to change the shape of an array. Say we have a simple array:
import numpy as np
# Create a 1D array
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
print("Original Array:")
print(arr)
# Output:
# Original Array:
# [ 1  2  3  4  5  6  7  8  9 10 11 12]
And we want to reshape it to a 3x4 matrix. All you need to do is use the reshape() method with the desired dimensions passed as arguments:
# Reshape it to a 3x4 matrix
reshaped_arr = arr.reshape(3, 4)
print("Reshaped Array (3x4):")
print(reshaped_arr)
This will result in:
Reshaped Array (3x4):
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
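Two details of `reshape()` are worth a quick sketch: the new shape must contain exactly as many elements as the original, and passing `-1` for one dimension asks NumPy to work that dimension out for you.

```python
import numpy as np

arr = np.arange(1, 13)   # 12 elements: 1 through 12

# -1 means "infer this dimension": 12 elements / 4 columns = 3 rows
auto_rows = arr.reshape(-1, 4)
print(auto_rows.shape)  # (3, 4)

# 12 elements cannot fill a 5x5 matrix, so NumPy raises a ValueError
try:
    arr.reshape(5, 5)
except ValueError as error:
    print("Cannot reshape:", error)
```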
Matrix Multiplication
The numpy.dot() function is used for matrix multiplication. It returns the dot product of two arrays. For one-dimensional arrays, it is the inner product of the arrays. For two-dimensional arrays, it is equivalent to matrix multiplication, and for N-dimensional arrays, it is a sum product over the last axis of the first array and the second-to-last axis of the second array.
Let's see how it works. First, let's compute the dot product of two 1D arrays (the inner product of the vectors):
import numpy as np
# 1D array dot product
vec1 = np.array([1, 2, 3])
vec2 = np.array([4, 5, 6])
dot_product_1d = np.dot(vec1, vec2)
print("Dot product of two 1D arrays:")
print(dot_product_1d) # Outputs: 32 (1*4 + 2*5 + 3*6)
This will result in:
Dot product of two 1D arrays:
32
32 is, in fact, the inner product of the two arrays (1*4 + 2*5 + 3*6). Next, we can perform matrix multiplication of two 2D arrays:
# 2D matrix multiplication
mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[2, 0], [1, 3]])
matrix_product = np.dot(mat1, mat2)
print("Matrix multiplication of two 2D arrays:")
print(matrix_product)
Which will give us:
Matrix multiplication of two 2D arrays:
[[ 4  6]
 [10 12]]
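For two-dimensional arrays, the `@` operator gives the same result as `np.dot()`. The sketch below reuses the two matrices from above and also contrasts matrix multiplication with the element-wise `*` operator, which is easy to confuse with it:

```python
import numpy as np

mat1 = np.array([[1, 2], [3, 4]])
mat2 = np.array([[2, 0], [1, 3]])

# The @ operator performs matrix multiplication, like np.dot for 2D arrays
print(np.array_equal(mat1 @ mat2, np.dot(mat1, mat2)))  # True

# Matrix multiplication is not commutative: the order of operands matters
print(np.array_equal(mat1 @ mat2, mat2 @ mat1))  # False

# The * operator multiplies element by element instead
elementwise = mat1 * mat2
print(elementwise.tolist())  # [[2, 0], [3, 12]]
```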
NumPy arrays are a significant step up from Python's built-in lists and the array module, especially for scientific and mathematical computations. Their efficiency, combined with the rich functionality provided by the NumPy library, makes them an indispensable tool for anyone looking to do numerical operations in Python.
Advice: This is just a quick overview of what you can do with the NumPy library. For more information about the library, you can read our "NumPy Tutorial: A Simple Example-Based Guide".
Conclusion
Arrays, a cornerstone of computer science and programming, have proven their worth time and again across various applications and domains. In Python, this fundamental data structure, through its various incarnations like lists, the array module, and the powerful NumPy arrays, offers developers a blend of efficiency, versatility, and simplicity.
Throughout this guide, we've journeyed from the foundational concepts of arrays to their practical applications in Python. We've seen how arrays, with their memory-contiguous nature, provide rapid access times, and how Python's dynamic lists bring an added layer of flexibility. We've also delved into the specialized world of NumPy, where arrays transform into powerful tools for numerical computation.