Splitting strings and lists is a common programming task in Python and other languages. Sometimes we have to split our data in peculiar ways, but more commonly we split it into even chunks.
Python does not have a built-in function to do this, so in this tutorial we'll take a look at how to split a list into even chunks.
For most cases, you can get by using generators:
def chunk_using_generators(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
There are other interesting ways to do this, though, each with its own pros and cons!
Split a List Into Even Chunks of N Elements
A list can be split into chunks of a size we choose. If the final chunk has fewer elements than that size, fillers need to be inserted in place of the missing elements. We will be using None in those cases.
Let's create a new file called chunk_based_on_size.py and add the following contents:
def chunk_based_on_size(lst, n):
    for x in range(0, len(lst), n):
        each_chunk = lst[x: n+x]

        if len(each_chunk) < n:
            each_chunk = each_chunk + [None for y in range(n-len(each_chunk))]
        yield each_chunk

print(list(chunk_based_on_size([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], 7)))
The chunk_based_on_size() function takes two arguments: lst for the list and n for the chunk size. The function iterates through the list in steps of n. Each chunk is expected to have the size given as an argument. If there aren't enough elements left to fill the last chunk, the empty slots are filled with None.
Running this script returns the following list of lists:
$ python3 chunk_based_on_size.py
[[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, None]]
The list has been split into equal chunks of 7 elements each, with the last chunk padded with None.
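The function above always pads with None. As a minimal sketch, assuming you want to choose the filler yourself, the same loop can take the padding value as a parameter. The name chunk_with_fill and its fill parameter are hypothetical and not part of the original tutorial:

```python
def chunk_with_fill(lst, n, fill=None):
    """Yield chunks of exactly n items, padding the last one with `fill`.

    Hypothetical variant of chunk_based_on_size() with a configurable filler.
    """
    if n < 1:
        raise ValueError("chunk size must be at least 1")
    for start in range(0, len(lst), n):
        chunk = lst[start:start + n]          # slicing creates a new list
        chunk += [fill] * (n - len(chunk))    # adds nothing when the chunk is full
        yield chunk


print(list(chunk_with_fill([1, 2, 3, 4, 5], 2, fill=0)))
# [[1, 2], [3, 4], [5, 0]]
```

The explicit check on n is an added assumption: without it, a chunk size of 0 would make range() raise a less descriptive error.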
Python has utilities to simplify this process. We can use the zip_longest function from itertools to simplify the previous function. Let's create a new file chunk_using_itertools.py and add the following code:
from itertools import zip_longest

def chunk_using_itertools(lst):
    iter_ = iter(lst)
    return list(zip_longest(iter_, iter_, iter_, iter_, iter_, iter_, iter_))

print(chunk_using_itertools([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]))
This code creates a single iterator over the list and passes it to zip_longest() seven times. The zip_longest() function takes one element from each of its arguments on every step. Because all seven arguments are the same iterator, each step consumes seven consecutive elements from the list and groups them into one tuple. When the list runs out partway through a group, the missing positions are filled with None, the default fill value. The resulting tuples are then collected into a list and returned.
When you execute this snippet, it'll result in:
$ python3 chunk_using_itertools.py
[(1, 2, 3, 4, 5, 6, 7), (8, 9, 10, 11, 12, 13, None)]
This shorter function produces the same chunks. However, it's much more limited, as we have to hard-code how many elements we want in each chunk, and it's a bit awkward to repeat iter_ that many times in the zip_longest() call.
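One way around the hard-coded chunk size, sketched here as an assumption rather than as part of the original tutorial, is to build the repeated arguments with list multiplication and unpack them into zip_longest(). The function name chunk_using_zip_longest and its fillvalue parameter are hypothetical; zip_longest() itself accepts a fillvalue keyword argument.

```python
from itertools import zip_longest


def chunk_using_zip_longest(lst, n, fillvalue=None):
    """Group lst into chunks of n items using one shared iterator.

    Hypothetical generalization of chunk_using_itertools().
    """
    # [iter(lst)] * n is a list holding the SAME iterator object n times,
    # so every step of zip_longest() consumes n consecutive elements.
    iterators = [iter(lst)] * n
    # zip_longest() yields tuples; convert each one to a list.
    return [list(group) for group in zip_longest(*iterators, fillvalue=fillvalue)]


print(chunk_using_zip_longest([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], 7))
# [[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13, None]]
```

Converting each tuple to a list is optional; it is done here only so the result has the same shape as the other functions in this tutorial.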
For most cases, the simplest solution is a generator. Let's create a new file, chunk_using_generators.py, and add the following code:
def chunk_using_generators(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

print(list(chunk_using_generators([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], 7)))
This generator yields sublists of n elements each. The last sublist may be shorter if the length of the list isn't a multiple of n. Running this code produces this output:
$ python3 chunk_using_generators.py
[[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13]]
This solution works best if you don't need padding with None or any other filler value.
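The slicing approach relies on len() and indexing, so it only works on sequences such as lists. As a rough sketch, assuming the input might be any iterable (a generator or a file object, for example), itertools.islice can pull n items at a time instead. The name chunk_any_iterable is hypothetical:

```python
from itertools import islice


def chunk_any_iterable(iterable, n):
    """Yield lists of up to n items from any iterable, without padding.

    Hypothetical helper; works on inputs that do not support len() or slicing.
    """
    if n < 1:
        raise ValueError("chunk size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, n))  # take the next n items, or fewer at the end
        if not chunk:                      # an empty chunk means the input is exhausted
            return
        yield chunk


print(list(chunk_any_iterable(range(1, 14), 7)))
# [[1, 2, 3, 4, 5, 6, 7], [8, 9, 10, 11, 12, 13]]
```

For a plain list, the slicing generator is simpler; this variant is mainly useful when the data does not fit in memory or arrives lazily.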
Split a List Into N Even Chunks
In the previous section, we split the list based on the size of individual chunks so that each chunk has the same number of elements. There's another way to interpret this problem: what do we do when we want to split a list based not on the number of elements in each chunk, but on the number of chunks we want to create?
For example, instead of splitting a list into chunks where every chunk has 7 elements, we want to split a list into 7 even chunks. In this case, we may not know the size of each chunk.
The logic is similar to the previous solutions. However, the size of each chunk is now the ceiling of the length of the list divided by the number of chunks required. As in the previous code samples, if a chunk happens to have vacant spots, those will be filled by the filler value None:
import math

def chunk_based_on_number(lst, chunk_numbers):
    n = math.ceil(len(lst)/chunk_numbers)

    for x in range(0, len(lst), n):
        each_chunk = lst[x: n+x]

        if len(each_chunk) < n:
            each_chunk = each_chunk + [None for y in range(n-len(each_chunk))]
        yield each_chunk

print(list(chunk_based_on_number([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13], chunk_numbers=7)))
We determine how many elements each chunk should hold and store that value in n. Here, math.ceil(13 / 7) is 2, so we create sublists of two elements at a time, padding the last one with None if it is shorter than n.
When we execute that file we'll see:
$ python3 chunk_based_on_number.py
[[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12], [13, None]]
As seen in the output above, the list has been split into 7 individual lists of equal size, based on the chunk_numbers argument.
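One caveat with the ceiling-based approach is that it can return fewer chunks than requested. For 9 elements and 4 chunks, the chunk size is math.ceil(9 / 4), which is 3, so only 3 chunks are produced. If you need exactly the requested number of chunks and can accept sizes that differ by at most one element, a possible alternative is to distribute the remainder with divmod(). This is a sketch under those assumptions, without padding; the name split_into_n_chunks is hypothetical:

```python
def split_into_n_chunks(lst, k):
    """Split lst into exactly k chunks whose lengths differ by at most one.

    Hypothetical alternative to chunk_based_on_number(); no padding is added.
    """
    if k < 1:
        raise ValueError("number of chunks must be at least 1")
    size, extra = divmod(len(lst), k)  # base chunk size and leftover elements
    chunks = []
    start = 0
    for i in range(k):
        # The first `extra` chunks each take one of the leftover elements.
        end = start + size + (1 if i < extra else 0)
        chunks.append(lst[start:end])
        start = end
    return chunks


print(split_into_n_chunks([1, 2, 3, 4, 5, 6, 7, 8, 9], 4))
# [[1, 2, 3], [4, 5], [6, 7], [8, 9]]
```

If k is larger than the length of the list, the trailing chunks are empty lists, which keeps the number of chunks equal to k.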
In this article, we have seen several ways to split a list into even-sized chunks, using both custom functions and the built-in itertools module.
The solutions covered in this tutorial are not the only options; there are many other creative ways to split a list into even chunks.