Remove Punctuation From a String in Java

When processing text, whether you're searching for certain words, writing pattern-matching rules, or counting the frequency of elements, punctuation can throw a wrench in your plans.

Oftentimes, you'll want to remove stopwords, punctuation, digits, or some other category of characters, depending on your end goal.

In this short tutorial, we'll take a look at how to remove punctuation from a string in Java.

Remove Punctuation from String with RegEx (Regular Expressions)

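Regular expressions are a natural fit here, both because they're likely already part of your processing pipeline and because they're efficient pattern matchers. Java offers two character classes for punctuation, and they are not equivalent: \p{Punct} is the POSIX class, which by default matches the 32 ASCII punctuation and symbol characters, while \p{P} is the Unicode punctuation category, which covers punctuation from every script but excludes symbols such as $, + and ~.

In a Java string literal the backslash itself must be escaped, so the pattern is written as "\\p{P}". Removing all punctuation then amounts to matching it and replacing each match with an empty string: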

String.replaceAll("\\p{P}", "")

Let's apply it to a simple sentence:

String text = "Hi! This is, in effect, a synthetic sentence. It's meant to have several punctuation characters!";
String clean = text.replaceAll("\\p{P}", "");
System.out.println(clean);

This results in:

Hi This is in effect a synthetic sentence Its meant to have several punctuation characters
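
If the same cleanup runs over many strings, keep in mind that String.replaceAll() compiles its regular expression on every call. A minimal sketch of compiling the pattern once and reusing it follows; the class name PrecompiledPunctuation and the helper strip() are made up for illustration and are not part of the original example:

```java
import java.util.regex.Pattern;

class PrecompiledPunctuation {
    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern PUNCTUATION = Pattern.compile("\\p{P}");

    // Hypothetical helper: removes every character in Unicode category P.
    static String strip(String text) {
        return PUNCTUATION.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String[] lines = {
            "Hi! This is, in effect, a test.",
            "It's short."
        };
        for (String line : lines) {
            System.out.println(strip(line));
        }
    }
}
```

For a one-off call the difference is negligible, so the plain replaceAll() form is fine there.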

Let's take a look at what characters are treated as punctuation here:

String text = "!#$%&'()*+,-./:;<=>?@[]^_`{|}~";
String clean = text.replaceAll("\\p{P}", "");
System.out.println(clean);

Which of these special characters are left after removing punctuation? The ones that Unicode classifies as symbols rather than punctuation:

$+<=>^`|~
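
The leftovers above come from the way \p{P} is defined: it covers Unicode punctuation but not currency, math, or modifier symbols. The POSIX class \p{Punct} behaves differently. By default it matches all ASCII punctuation and symbols, but nothing outside ASCII. The sketch below contrasts the two on a made-up sample string; the class and method names are hypothetical:

```java
class PunctuationClasses {
    // POSIX class: ASCII punctuation and symbols only (default mode).
    static String stripPosix(String text) {
        return text.replaceAll("\\p{Punct}", "");
    }

    // Unicode category P: punctuation from any script, but no symbols.
    static String stripUnicode(String text) {
        return text.replaceAll("\\p{P}", "");
    }

    public static void main(String[] args) {
        // \u00A1 is the inverted exclamation mark used in Spanish.
        String text = "\u00A1Hola! $5 + 3 = 8";
        System.out.println(stripPosix(text));   // drops ! $ + = and keeps \u00A1
        System.out.println(stripUnicode(text)); // drops \u00A1 and ! and keeps $ + =
    }
}
```

Which one to pick depends on the input: \p{Punct} suits ASCII-only text where symbols should go too, while \p{P} suits multilingual text where symbols should stay.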

Remove Punctuation from String without RegEx

If you don't want to employ regular expressions, you can check each character manually while iterating through the string. Build the result in a StringBuffer rather than concatenating onto a String: strings are immutable, so every concatenation copies the whole string, and you'd end up creating roughly one intermediate string per character. (In single-threaded code, the unsynchronized StringBuilder offers the same methods without the locking overhead.)

StringBuffer is mutable, and can be easily converted into an immutable string at the end of the process:

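public static String removePunctuations(String s) {
    StringBuffer buffer = new StringBuffer();
    // Iterate over primitive chars to avoid boxing each one into a Character
    for (char c : s.toCharArray()) {
        // Keep only letters and digits; everything else, including whitespace, is dropped
        if (Character.isLetterOrDigit(c)) {
            buffer.append(c);
        }
    }
    return buffer.toString();
}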

Let's create a string and clean it:

String text = "Hello! \nHere are some special characters: !#$%&'()*+,-./:;<=>?@[]^_`{|}~ \nWhere are they? :(\n";
System.out.println(text);
String clean = removePunctuations(text);
System.out.println(clean);

This results in:

Hello! 
Here are some special characters: !#$%&'()*+,-./:;<=>?@[]^_`{|}~ 
Where are they? :(

HelloHerearesomespecialcharactersWherearethey

This approach is easier to customize, but as written it keeps only letters and digits, so whitespace and line breaks are stripped along with the punctuation. Alternatively, you can test each character against a specific set of punctuation characters and exclude only those, leaving whitespace, line breaks, and everything else in place.
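
One way to exclude only punctuation, sketched here under the assumption that "punctuation" means the Unicode punctuation categories, is to inspect Character.getType() for each character. The class PunctuationFilter and its method names are hypothetical, and StringBuilder is used in place of StringBuffer since no synchronization is needed:

```java
class PunctuationFilter {
    // True if the character falls into one of the Unicode punctuation categories.
    static boolean isPunctuation(char c) {
        switch (Character.getType(c)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    // Drops punctuation only; letters, digits, symbols, and whitespace are kept.
    static String removePunctuationOnly(String s) {
        StringBuilder builder = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            if (!isPunctuation(c)) {
                builder.append(c);
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.print(removePunctuationOnly("Hello! \nWhere are they? :(\n"));
    }
}
```

These categories correspond to what \p{P} matches, so symbols such as $ and + are left in place here as well. To remove them too, the check could be widened with extra cases or replaced with an explicit set of characters.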

Conclusion

In this short tutorial, we took a look at how you can remove punctuation or certain special characters from a string in Java, using regular expressions or a manual check in an enhanced for loop.

Last Updated: October 12th, 2022
Author: David Landup

    © 2013-2022 Stack Abuse. All rights reserved.
