Python: How to Remove a Character from a String - Stack Abuse

Introduction

In this guide, we'll take a look at how to remove a character from a string in Python.

Strings, especially user-generated input, may contain unwanted characters, such as special characters in a username field that we don't want to store. In those cases, we might prefer to remove specific characters from a given string.

The most common way to remove a character from a string is with the replace() method, but we can also use the translate() method, or limit replace() to a fixed number of occurrences of a given character.

Remove Character in Python Using replace()

The string class provides a replace() method that replaces a character (or any substring) with another. It's worth noting that this method returns a new string with the characters replaced, since strings are immutable. The original string remains unchanged, and the returned string is lost unless we keep a reference to it. Typically, you'll assign the returned value either to the same variable or to a new one.

The method replaces all occurrences of a character with a new one. For example, any_string.replace('a', 'b') will replace all occurrences of 'a' in any_string with the character 'b'. To remove a character from a string via replace(), we'll replace it with an empty string:

original_string = "stack abuse"
# Removing character 'a' by replacing it with an empty string
new_string = original_string.replace('a', '')
print("String after removing the character 'a':", new_string)

Once we run this code, we're greeted with:

String after removing the character 'a': stck buse
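Because strings are immutable, it can help to see that the original really is left untouched. The following is a minimal sketch of that behavior; the variable name result is only illustrative:

```python
original_string = "stack abuse"

# replace() returns a new string; the original is left untouched
result = original_string.replace('a', '')
print(original_string)  # stack abuse
print(result)           # stck buse

# Rebinding the same name is how the change is usually kept
original_string = original_string.replace('a', '')
print(original_string)  # stck buse
```

Calling replace() without assigning the return value has no visible effect, which is a common source of confusion.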

Remove Character in Python Using translate()

Python strings have a translate() method which replaces the characters with other characters specified in a translation table.

A translation table is just a dictionary of key-value mappings where each key will be replaced with a value.

For this method to work, the keys of the table have to be Unicode code points (integers) rather than the characters themselves. We can get a character's code point via the ord() function.

For example, any_string.translate({ord('a'): ord('z'), ord('b'): ord('y')}) will replace occurrences of 'a' with 'z' and 'b' with 'y'.
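As a quick sketch, here is that mapping applied to a concrete string. The names any_string, table and swapped are placeholders chosen for illustration:

```python
any_string = "stack abuse"

# Keys and values are Unicode code points obtained with ord()
table = {ord('a'): ord('z'), ord('b'): ord('y')}

swapped = any_string.translate(table)
print(swapped)  # stzck zyuse
```

Characters that don't appear as keys in the table are copied to the result unchanged.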

To remove a character from a string using translate(), you'll need to map the Unicode code point of the character to None in the translation table:

original_string = "stack abuse"
# removing character 'a'
new_string = original_string.translate({ord('a'): None})
print("String after removing the character 'a':", new_string)

This code results in:

String after removing the character 'a': stck buse
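When several different characters need to go at once, the table can be built with the built-in str.maketrans(), whose third argument lists characters to delete. This is a small sketch rather than the only way to do it; the choice of 'a' and 's' as the characters to remove is arbitrary:

```python
original_string = "stack abuse"

# The third argument of str.maketrans() lists characters to delete
removal_table = str.maketrans('', '', 'as')
new_string = original_string.translate(removal_table)
print(new_string)  # tck bue

# Equivalent hand-written table, mapping each code point to None
manual_table = {ord('a'): None, ord('s'): None}
print(original_string.translate(manual_table))  # tck bue
```

Both tables describe the same mapping, so either form can be passed to translate().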

Remove a Number of Occurrences of a Character

By default, the replace() and translate() methods act on all occurrences of a given character. However, the replace() method takes an optional count argument. If it is given, only the first count occurrences of the given character are replaced.

Let's try only removing the first 'a' from the string, instead of all occurrences:

original_string = "stack abuse"
# removing only the first occurrence of character 'a'
new_string = original_string.replace('a', '', 1)
print("String after removing the character 'a':", new_string)

The output of the above code will look like this:

String after removing the character 'a': stck abuse

As the count is set to 1, only the first occurrence of 'a' is replaced - this is useful when you want to remove one and only one character.
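To see how different count values behave, here is a short sketch that tries a few of them on the same string. The specific counts are chosen only for illustration:

```python
original_string = "stack abuse"

# A count of 0 replaces nothing; a count larger than the number
# of occurrences simply replaces all of them
for count in (0, 1, 2, 5):
    print(count, repr(original_string.replace('a', '', count)))

# 0 'stack abuse'
# 1 'stck abuse'
# 2 'stck buse'
# 5 'stck buse'
```

Occurrences are always consumed from the left, so count cannot be used on its own to remove only the last occurrence.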

Manually Create a New String Without a Character

A somewhat esoteric, but straightforward technique would be to create an empty string and loop through the original string. In the loop, we'll write every character into the new string except the one to remove.

This is roughly what happens under the hood, with some extra validation. Since CPython, the reference implementation of Python, is written in C, we can take a peek at the stringobject.c source code (from Python 2; Python 3 implements str in unicodeobject.c), which defines the replace() method. It ultimately calls either replace_single_character() or replace_single_character_in_place():

    start = self_s;
    end = self_s + self_len;
    while (count-- > 0) {
        next = findchar(start, end-start, from_c);
        if (next == NULL)
            break;

        if (next == start) {
            /* replace with the 'to' */
            Py_MEMCPY(result_s, to_s, to_len);
            result_s += to_len;
            start += 1;
        } else {
            /* copy the unchanged old then the 'to' */
            Py_MEMCPY(result_s, start, next-start);
            result_s += (next-start);
            Py_MEMCPY(result_s, to_s, to_len);
            result_s += to_len;
            start = next+1;
        }
    }
    /* Copy the remainder of the remaining string */
    Py_MEMCPY(result_s, start, end-start);

    return result;

To gain an appreciation for how much logic is abstracted behind simple, intuitive, high-level APIs, we can do this process manually:

def remove_character(original_string, character, occurrence_num):
    new_string = ""
    for char in original_string:
        if char == character and occurrence_num > 0:
            # Skip this occurrence and count it as removed
            occurrence_num -= 1
            continue
        # Keep every other character
        new_string += char
    return new_string


string = 'stack abuse'
print(remove_character(string, 'a', 0))
print(remove_character(string, 'a', 1))
print(remove_character(string, 'a', 2))

The above piece of code will produce the following output:

stack abuse
stck abuse
stck buse

As we can see, our own method produces the same results as the replace() method, but it's a lot less efficient:

import timeit

print("Time taken by manual method: {}"
    .format(timeit.timeit("remove_character('stack abuse', 'a', 1)", "from __main__ import remove_character")))

print("Time taken by replace(): {}"
    .format(timeit.timeit("'stack abuse'.replace('a', '', 1)")))

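Timing these methods (timeit runs each statement one million times by default) results in something like the following, though exact numbers vary by machine: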

Time taken by manual method: 1.3785062030074187
Time taken by replace(): 0.13279212499037385
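Part of the manual method's cost comes from building the result with repeated string concatenation. As a hedged sketch, a variant that collects characters in a list and joins them once could look like the following. The name remove_character_join is hypothetical and not part of the original code, and no timing figures are claimed for it:

```python
import timeit


def remove_character_join(original_string, character, occurrence_num):
    """Hypothetical variant of remove_character() that joins once at the end."""
    kept = []
    for char in original_string:
        if char == character and occurrence_num > 0:
            occurrence_num -= 1  # skip this occurrence
        else:
            kept.append(char)
    return "".join(kept)


# Same result as str.replace() with a count
print(remove_character_join("stack abuse", "a", 1))  # stck abuse

# Passing a callable avoids the "from __main__ import ..." setup string
elapsed = timeit.timeit(
    lambda: remove_character_join("stack abuse", "a", 1), number=100_000
)
print("Time taken by join-based variant:", elapsed)
```

Even with this change, the built-in replace() should be expected to remain the faster option, since its loop runs in C rather than in Python bytecode.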

Conclusion

In this tutorial, we explored how we can remove characters from a string in Python. We have seen how to use the replace() and translate() methods to remove characters, either by replacing them with an empty string or by mapping their Unicode code points to None.

We then used replace() to remove a limited number of occurrences of the given character, and even recreated that behavior with a good old for loop. The translate() method is useful if we have to remove a set of characters, as we can give it a translation table, while the replace() method is handy if we want to remove a fixed number of occurrences of a given character.

Last Updated: July 11th, 2021

Author: Naazneen Jatu, Freelance Python Developer
