Introduction
In Python, a set is a data structure that stores unordered items. The set items are also unindexed. Like a list, a set allows the addition and removal of elements. However, there are a few unique characteristics that define a set and separate it from other data structures:
- A set does not hold duplicate items
- The elements of a set must be hashable (in practice, immutable types such as numbers, strings, and tuples), but the set itself is mutable, that is, elements can be added to it and removed from it
- Since set items are not indexed, sets don't support any slicing or indexing operations.
In this guide, we'll be taking a look at how to create and use sets in Python, alongside some of the common operations you'd run against them.
How To Create a Set in Python
A set can hold any number of items and the items can be of different types (heterogeneous collection) such as integers, strings, tuples, etc.
Note: A set does not accept mutable elements, such as lists and dictionaries.
We can create a set by placing all the set elements inside curly braces ({}) and separating them with commas (,):
num_set = {1, 2, 3, 4, 5, 6}
print(num_set)
This will result in:
{1, 2, 3, 4, 5, 6}
We just created a set of numbers. We can also create a set of string values:
string_set = {"Nicholas", "Michelle", "John", "Mercy"}
print(string_set)
Resulting in:
{'Michelle', 'Nicholas', 'John', 'Mercy'}
Note: Notice how the elements in the output are not in the order in which we added them to the set. The reason for this is that set items are not ordered. If you run the same code again, you may well see the elements arranged in a different order.
We can also create a set with elements of different types:
mixed_set = {2.0, "Nicholas", (1, 2, 3)}
print(mixed_set)
Let's verify this yields a valid set:
{2.0, 'Nicholas', (1, 2, 3)}
All the elements of the set above belong to different types. We can also create a set from a list. This can be done by calling Python's built-in set() function:
num_set = set([1, 2, 3, 4, 5, 6])
print(num_set)
This results in:
{1, 2, 3, 4, 5, 6}
As stated above, sets do not hold duplicate items. Suppose our list had duplicate items:
num_set = set([1, 2, 3, 1, 2])
print(num_set)
The set will store only unique values from the list:
{1, 2, 3}
The set has essentially removed the duplicates and returned only one of each duplicate item. This also happens when we are creating a set from scratch:
num_set = {1, 2, 3, 1, 2}
print(num_set)
Again, the set has removed the duplicates and returned only one of the duplicate items:
{1, 2, 3}
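One practical consequence of this behavior is that converting a list to a set is a quick way to detect or drop repeated values. The sketch below assumes a hypothetical list named readings; the variable names are illustrative and do not come from the examples above.

```python
# Hypothetical sample data containing repeated values.
readings = [3, 1, 3, 2, 1]

# Converting to a set keeps one copy of each value.
unique_readings = set(readings)

# If the set is smaller than the list, the list had duplicates.
has_duplicates = len(unique_readings) != len(readings)

print(has_duplicates)           # True
print(sorted(unique_readings))  # [1, 2, 3]
```

Note that sorted() returns a list, which is used here only to print the values in a predictable order.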
If you want to create an empty set and use empty curly braces ({}), you'll create an empty dictionary rather than an empty set:
x = {}
print(type(x)) # <class 'dict'>
To create an empty set in Python, we simply call set() without passing any values:
x = set()
print(type(x)) # <class 'set'>
How To Access Set Items in Python
Python does not provide us with a way of accessing an individual set item using the subscripting notation (set[index]). However, we can use a for loop to iterate through all the items of a set:
months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
for m in months:
    print(m)
This will print each element of the months set, in no particular order:
March
Feb
Dec
Jan
May
Nov
Oct
Apr
June
Aug
Sep
July
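To make the lack of subscripting concrete, here is a minimal sketch that tries to index a set and then falls back to a sorted list. The set colors is a hypothetical example, not one used elsewhere in this guide.

```python
colors = {"red", "green", "blue"}  # hypothetical example data

# Sets have no positions, so subscripting raises TypeError.
index_error = None
try:
    colors[0]
except TypeError as exc:
    index_error = str(exc)
    print("Indexing failed:", index_error)

# If positional access is needed, convert to a sorted list first.
ordered = sorted(colors)
print(ordered[0])  # blue
```

The conversion costs an extra pass over the data, so it is best reserved for cases where a stable order really matters.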
We can also check for the presence of an element in a set using the in keyword:
months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
print("May" in months)
Since May is present in the months set, this will return True:
True
Similarly, searching for an element that doesn't exist in the set returns False:
months = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
print("Nicholas" in months)
This will result in:
False
How To Add Items to a Python Set
Python allows us to add new items to a set using the add() method:
months = set(["Jan", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
months.add("Feb")
print(months)
The item Feb will be successfully added to the set:
{'Oct', 'Dec', 'Feb', 'July', 'May', 'Jan', 'June', 'March', 'Sep', 'Aug', 'Nov', 'Apr'}
If it were a set of numbers, we would pass the new element without quotes, since quotes are only needed for strings:
num_set = {1, 2, 3}
num_set.add(4)
print(num_set)
Which will add 4 to num_set:
{1, 2, 3, 4}
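Because sets do not hold duplicates, adding an element that is already present leaves the set unchanged. The short sketch below illustrates this with the same num_set values; size_before and size_after are illustrative names.

```python
num_set = {1, 2, 3}
size_before = len(num_set)

# 2 is already in the set, so this call has no visible effect.
num_set.add(2)
size_after = len(num_set)

print(size_before == size_after)  # True
print(num_set)                    # {1, 2, 3}
```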
In the next section, we will be discussing how to remove elements from sets.
How To Remove Items From a Python Set
Python also allows us to remove an item from a set, but not by index, because set elements are not indexed. Instead, we pass the element itself to either the discard() or the remove() method.
Note: Keep in mind that the discard() method will not raise an error if the item is not found in the set. However, if the remove() method is used and the item is not found, a KeyError will be raised.
discard()
Let's demonstrate how to remove an element using the discard() method:
num_set = {1, 2, 3, 4, 5, 6}
num_set.discard(3)
print(num_set)
The element 3 will be removed from the set:
{1, 2, 4, 5, 6}
remove()
Similarly, the remove() method can be used as follows:
num_set = {1, 2, 3, 4, 5, 6}
num_set.remove(3)
print(num_set)
This will yield the same result:
{1, 2, 4, 5, 6}
Removing Non-Existent Elements
Now, let us try to remove an element that does not exist in the set. Let's first use the discard() method:
num_set = {1, 2, 3, 4, 5, 6}
num_set.discard(7)
print(num_set)
Running the code above won't affect the set in any way:
{1, 2, 3, 4, 5, 6}
Now, let's see what happens when we use the remove() method in the same scenario:
num_set = {1, 2, 3, 4, 5, 6}
num_set.remove(7)
print(num_set)
In this case, trying to remove a non-existing element will raise an error:
Traceback (most recent call last):
  File "C:\Users\admin\sets.py", line 2, in <module>
    num_set.remove(7)
KeyError: 7
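If you prefer remove() but the element might be missing, you can either catch the KeyError or check membership first. The sketch below shows both options; the removed flag is an illustrative name added for this example.

```python
num_set = {1, 2, 3, 4, 5, 6}

# Option 1: attempt the removal and handle the KeyError.
try:
    num_set.remove(7)
    removed = True
except KeyError:
    removed = False
    print("7 was not in the set, nothing removed")

# Option 2: check membership first, then remove.
if 3 in num_set:
    num_set.remove(3)

print(num_set)  # {1, 2, 4, 5, 6}
```

When a missing element is not an error in your program, discard() is usually the simpler choice.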
pop()
With the pop() method, we can remove an element from a set and get it back as the return value. Since the elements are unordered, we cannot predict which item will be removed:
num_set = {1, 2, 3, 4, 5, 6}
print(num_set.pop())
This will print the element that was removed from the set, for example:
1
You can also call pop() without using its return value and then print the set to see which elements remain:
num_set = {1, 2, 3, 4, 5, 6}
num_set.pop()
print(num_set)
Which will print out the elements remaining in the set:
{2, 3, 4, 5, 6}
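A common use of pop() is to process the elements of a set one at a time until it is empty. Calling pop() on an empty set raises a KeyError, so the sketch below loops only while the set still has elements. The task names and the processed list are hypothetical.

```python
pending = {"parse", "validate", "save"}  # hypothetical task names
processed = []

# A non-empty set is truthy, so the loop stops once everything is popped.
while pending:
    processed.append(pending.pop())

print(sorted(processed))  # ['parse', 'save', 'validate']

# Popping from an empty set raises KeyError.
try:
    pending.pop()
    pop_failed = False
except KeyError:
    pop_failed = True
    print("Cannot pop from an empty set")
```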
clear()
Python's clear() method removes all elements from a set:
num_set = {1, 2, 3, 4, 5, 6}
num_set.clear()
print(num_set)
The output is an empty set, which Python displays as set():
set()
Union of Python Sets
Suppose we have two sets, A and B. The union of the two sets is a set with all the elements from both sets. Such an operation is accomplished using Python's union() method.
For example, let's assume we have two sets containing month names:
months_a = set(["Jan", "Feb", "March", "Apr", "May", "June"])
months_b = set(["July", "Aug", "Sep", "Oct", "Nov", "Dec"])
all_months = months_a.union(months_b)
print(all_months)
After running this code, the all_months set will contain the union of sets months_a and months_b:
{'Oct', 'Jan', 'Nov', 'May', 'Aug', 'Feb', 'Sep', 'March', 'Apr', 'Dec', 'June', 'July'}
A union can also be performed on more than two sets, and all their elements will be combined into a single set:
x = {1, 2, 3}
y = {4, 5, 6}
z = {7, 8, 9}
output = x.union(y, z)
print(output)
This will result in:
{1, 2, 3, 4, 5, 6, 7, 8, 9}
During the union operation, duplicates are ignored, and only one of the duplicate items is shown:
x = {1, 2, 3}
y = {4, 3, 6}
z = {7, 4, 9}
output = x.union(y, z)
print(output)
This will result in the set containing only unique values from the starting sets:
{1, 2, 3, 4, 6, 7, 9}
The | operator can also be used to find the union of two or more sets:
months_a = set(["Jan","Feb", "March", "Apr", "May", "June"])
months_b = set(["July", "Aug", "Sep", "Oct", "Nov", "Dec"])
print(months_a | months_b)
This will yield the same result as using the union() method:
{'Feb', 'Apr', 'Sep', 'Dec', 'Nov', 'June', 'May', 'Oct', 'Jan', 'July', 'March', 'Aug'}
If you want to perform a union on more than two sets, separate the set names using the | operator:
x = {1, 2, 3}
y = {4, 3, 6}
z = {7, 4, 9}
print(x | y | z)
This will result in:
{1, 2, 3, 4, 6, 7, 9}
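One difference between the two forms is worth knowing: the union() method accepts any iterable, while the | operator requires both operands to be sets. Neither form modifies the original set. The sketch below uses hypothetical names (evens, more_numbers) to show this.

```python
evens = {2, 4, 6}
more_numbers = [6, 8, 10]  # a list, not a set

# union() accepts any iterable and returns a new set.
combined = evens.union(more_numbers)
print(sorted(combined))  # [2, 4, 6, 8, 10]

# The | operator only works between sets, so this raises TypeError.
operator_failed = False
try:
    evens | more_numbers
except TypeError:
    operator_failed = True
    print("| requires both operands to be sets")

# The original set is left untouched.
print(evens)  # {2, 4, 6}
```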
Intersection of Python Sets
Suppose you have two sets, A and B. Their intersection is a set with elements that are present both in A and B.
The intersection operation in sets can be achieved using either the & operator or the intersection() method:
x = {1, 2, 3}
y = {4, 3, 6}
print(x & y)
The only common element is 3:
{3}
The same can also be achieved with the intersection() method:
x = {1, 2, 3}
y = {4, 3, 6}
z = x.intersection(y)
print(z)
This will also result in:
{3}
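Like union(), the intersection() method can take more than one argument, and the & operator can be chained. When the sets share no elements, the result is an empty set. The following sketch uses illustrative values that differ from the examples above.

```python
x = {1, 2, 3, 4}
y = {2, 3, 4, 5}
z = {3, 4, 5, 6}

# Elements present in all three sets.
common = x.intersection(y, z)
print(common)  # {3, 4}

# Chaining & gives the same result.
print(common == (x & y & z))  # True

# Sets with nothing in common intersect to an empty set.
no_overlap = {1, 2} & {3, 4}
print(no_overlap)  # set()
```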
Difference Between Python Sets
Suppose you have two sets, A and B. The difference between A and B (A - B) is the set with all elements that are in A but not in B. Conversely, (B - A) is the set with all the elements that are in B but not in A.
To determine set differences in Python, we can use either the difference() method or the - operator:
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
diff_set = set_a.difference(set_b)
print(diff_set)
The code above calculates the difference between set_a and set_b. The elements 1, 2, and 3 are in set_a but not in set_b, hence they form our output:
{1, 2, 3}
The minus operator (-) can also be used to find the difference between the two sets as shown below:
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
print(set_a - set_b)
Which will result in the same output as using the difference() method:
{1, 2, 3}
The symmetric difference of the sets A and B is the set of all elements that are in either A or B, but not in both. It is determined using Python's symmetric_difference() method or the ^ operator:
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
symm_diff = set_a.symmetric_difference(set_b)
print(symm_diff)
This will result in:
{1, 2, 3, 6, 7, 8}
As we've stated before, the symmetric difference can also be found using the ^ operator:
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
print(set_a ^ set_b)
Which will yield the same output as before:
{1, 2, 3, 6, 7, 8}
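The symmetric difference can also be expressed with the operations covered earlier, which is a handy way to check your understanding of the definition. The sketch below reuses set_a and set_b; the other variable names are illustrative.

```python
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}

symm_diff = set_a ^ set_b

# Everything in either set, minus what the two sets share.
union_minus_common = (set_a | set_b) - (set_a & set_b)

# What is only in set_a, combined with what is only in set_b.
either_difference = (set_a - set_b) | (set_b - set_a)

print(symm_diff == union_minus_common)  # True
print(symm_diff == either_difference)   # True
```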
Comparison of Python Sets
We can compare sets depending on the elements they have. This way, we can tell whether a set is a superset or a subset of another set. The result from such a comparison will be either True or False.
To check whether set A is a subset of set B, we can use the following operation:
A <= B
To check whether B is a superset of A, we can use the following operation:
B >= A
For example:
months_a = set(["Jan", "Feb", "March", "Apr", "May", "June"])
months_b = set(["Jan", "Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
subset_check = months_a <= months_b
superset_check = months_b >= months_a
print(subset_check)
print(superset_check)
Every element of months_a is also in months_b, so months_a is a subset of months_b and months_b is a superset of months_a. Therefore, running the code above will yield:
True
True
Subsets and supersets can also be checked using the issubset() and issuperset() methods, as shown below:
months_a = set(["Jan","Feb", "March", "Apr", "May", "June"])
months_b = set(["Jan","Feb", "March", "Apr", "May", "June", "July", "Aug", "Sep", "Oct", "Nov", "Dec"])
subset_check = months_a.issubset(months_b)
superset_check = months_b.issuperset(months_a)
print(subset_check)
print(superset_check)
Which yields the same output as in the example above:
True
True
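Note that <= and >= also return True when the two sets are equal. If you need a strict (proper) subset check, Python provides the < and > operators as well. The sketch below uses shortened, hypothetical month sets to show the distinction.

```python
first_quarter = {"Jan", "Feb", "March"}
same_months = {"March", "Feb", "Jan"}
first_four = {"Jan", "Feb", "March", "Apr"}

# Equal sets are subsets of each other, but not proper subsets.
equal_check = first_quarter == same_months
subset_check = first_quarter <= same_months
proper_subset_of_equal = first_quarter < same_months

# A proper subset must be missing at least one element of the other set.
proper_subset_check = first_quarter < first_four

print(equal_check)             # True
print(subset_check)            # True
print(proper_subset_of_equal)  # False
print(proper_subset_check)     # True
```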
Python Set Methods
In the following sections, we will discuss some of the most commonly used set methods provided by Python that we have not already discussed.
copy()
This method returns a copy of the set in question:
string_set = {"Nicholas", "Michelle", "John", "Mercy"}
x = string_set.copy()
print(x)
The output shows that x is a copy of the set string_set:
{'John', 'Michelle', 'Nicholas', 'Mercy'}
isdisjoint()
This method checks whether the sets in question have an intersection or not. If the sets don't have common items, this method returns True, otherwise it returns False:
names_a = {"Nicholas", "Michelle", "John", "Mercy"}
names_b = {"Jeff", "Bosco", "Teddy", "Milly"}
x = names_a.isdisjoint(names_b)
print(x)
The two sets don't have common items, hence the output is True:
True
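For contrast, here is a sketch of the False case, together with the intersection that explains it. The set names_c is a hypothetical addition to the example above.

```python
names_a = {"Nicholas", "Michelle", "John", "Mercy"}
names_b = {"Jeff", "Bosco", "Teddy", "Milly"}
names_c = {"John", "Teddy"}  # hypothetical set that overlaps with both

disjoint_ab = names_a.isdisjoint(names_b)
disjoint_ac = names_a.isdisjoint(names_c)

print(disjoint_ab)        # True, no shared names
print(disjoint_ac)        # False, both contain John
print(names_a & names_c)  # {'John'}
```

In other words, two sets are disjoint exactly when their intersection is empty.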
len()
Strictly speaking, len() is a built-in function rather than a set method. It returns the length of a set, which is the total number of elements in the set:
names_a = {"Nicholas", "Michelle", "John", "Mercy"}
print(len(names_a))
The output shows that the set has a length of 4:
4
Python Frozen Set
Frozenset is a class with the characteristics of a set, but once its elements have been assigned, they cannot be changed. Tuples can be seen as immutable lists, while frozen sets can be seen as immutable sets.
Note: Sets are mutable and unhashable, which means we cannot use them as dictionary keys. Frozen sets are hashable and we can use them as dictionary keys.
To create frozen sets, we use the frozenset() constructor. Let us create two frozen sets, X and Y:
X = frozenset([1, 2, 3, 4, 5, 6])
Y = frozenset([4, 5, 6, 7, 8, 9])
print(X)
print(Y)
This will result in:
frozenset({1, 2, 3, 4, 5, 6})
frozenset({4, 5, 6, 7, 8, 9})
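Frozen sets support the Python set methods that do not modify the set, such as copy(), difference(), symmetric_difference(), isdisjoint(), issubset(), intersection(), issuperset(), and union(). Methods that change a set in place, such as add(), remove(), and clear(), are not available.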
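The following sketch illustrates the two properties described in this section: a frozen set can serve as a dictionary key, and it cannot be modified after creation. The distances dictionary and the route names are hypothetical.

```python
# A frozenset is hashable, so it can be used as a dictionary key.
route_ab = frozenset(["A", "B"])
distances = {route_ab: 12}  # hypothetical distance between A and B

# Element order does not matter when looking the key up.
lookup = distances[frozenset(["B", "A"])]
print(lookup)  # 12

# A regular set is unhashable and cannot be used as a key.
try:
    distances[{"A", "C"}] = 7
    set_key_failed = False
except TypeError:
    set_key_failed = True
    print("A regular set cannot be a dictionary key")

# Frozen sets have no add() method, so this raises AttributeError.
try:
    route_ab.add("C")
    add_failed = False
except AttributeError:
    add_failed = True
    print("A frozenset cannot be modified")
```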
Conclusion
This guide provided a detailed introduction to sets in Python. Python sets closely follow the mathematical notion of a set: a collection of unique, unordered items. The set itself is mutable, so we can add and remove elements freely, but each element must be hashable, which in practice means an immutable type. Unlike most other data structures, sets are not indexed, so we cannot access or modify an element by its position.