Java Regular Expressions - Validate Phone Number

Regular Expressions (RegEx) are a powerful tool that help us match patterns in a flexible, dynamic and efficient way, as well as perform operations based on the results.

In this tutorial, we'll take a look at how to validate a phone number in Java, using Regular Expressions (RegEx).

If you'd like to read more about Regular Expressions and the regex package, read our Guide to Regular Expressions in Java!

Validating Phone Numbers in Java with RegEx

Phone numbers aren't easy to validate, and they're notoriously flexible. Different countries have different formats, and some countries even use multiple formats and country codes.

To validate a phone number with Regular Expressions, you'll have to make a couple of assertions that generalize well to phone numbers, unless you want to write many different expressions and validate through a list of them.

These assertions will depend on your project, its localization and the countries you wish to apply it to - but keep in mind that with an international project, you might have to be loose on the restrictions lest you end up not allowing a valid phone number through to the system.
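As a rough illustration of a deliberately loose check, the sketch below accepts a plus sign followed by at most 15 digits, which is the maximum length allowed by the ITU-T E.164 numbering plan. The class and method names are hypothetical, and the check assumes that separators such as spaces and hyphens have already been removed from the input:

```java
import java.util.regex.Pattern;

class LooseInternationalCheck {
    // Assumption: input has no separators. A plus sign, a non-zero first digit,
    // then up to 14 more digits (15 digits in total at most).
    static final Pattern LOOSE = Pattern.compile("^\\+[1-9]\\d{1,14}$");

    // Hypothetical helper: true if the input has an E.164-like shape.
    static boolean looksInternational(String input) {
        return LOOSE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksInternational("+14155552671")); // true
        System.out.println(looksInternational("+0123456789"));  // false, leading zero
    }
}
```

A check like this only confirms the overall shape of the number. It says nothing about whether the country code or the number itself actually exists.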

For standard US phone number validation, we can use the lengthy, fairly robust expression of:

^(\+\d{1,2}\s)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$

The expression has four parts: an optional country code, the area code, and the two halves of the subscriber number (three and four digits). The optional characters in between are there to handle the wide variety of formatting you can see:

123-456-7890
(123) 456-7890
etc...
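To make the parts of the expression easier to see, here is a minimal sketch that rewrites it with named capturing groups. The group names (country, area, prefix, line) and the class and method names are hypothetical and chosen for illustration only. The accepted inputs are the same as for the expression above:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PhoneNumberParts {
    // Same expression as above, split up and annotated with illustrative group names.
    static final Pattern US_PHONE = Pattern.compile(
            "^(?:\\+(?<country>\\d{1,2})\\s)?"    // optional country code, e.g. "+1 "
            + "\\(?(?<area>\\d{3})\\)?[\\s.-]?"   // area code, optionally in parentheses
            + "(?<prefix>\\d{3})[\\s.-]?"         // first three digits of the subscriber number
            + "(?<line>\\d{4})$");                // last four digits

    // Returns the area code, or null if the input doesn't match the pattern.
    static String areaCode(String input) {
        Matcher matcher = US_PHONE.matcher(input);
        return matcher.matches() ? matcher.group("area") : null;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"123-456-7890", "(123) 456-7890", "+1 123.456.7890"}) {
            System.out.println(s + " -> area code " + areaCode(s));
        }
    }
}
```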

You could also build a different expression by imposing a set of rules for the groups (depending on the country) and the formats you may expect coming in.

You could technically go as simple as checking whether there are 10 digits in the string (up to 12 if you count the country code as well). This should be able to validate some phone numbers, but such simplicity would require additional validation on the front-end, or the ability to adapt if an accepted phone number isn't actually valid:

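import java.util.regex.Matcher;
import java.util.regex.Pattern;

// A plus sign followed by 10-12 digits, with no separators allowed
Pattern simplePattern = Pattern.compile("^\\+\\d{10,12}$");
// Optional country code, area code and subscriber number, with optional separators
Pattern robustPattern = Pattern.compile("^(\\+\\d{1,2}\\s)?\\(?\\d{3}\\)?[\\s.-]?\\d{3}[\\s.-]?\\d{4}$");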

String phoneNumber = "+12 345 678 9012";

Matcher simpleMatcher = simplePattern.matcher(phoneNumber);
Matcher robustMatcher = robustPattern.matcher(phoneNumber);

if (simpleMatcher.matches()) {
    System.out.println(String.format("Simple Pattern matched for string: %s", phoneNumber));
}
if(robustMatcher.matches()) {
    System.out.println(String.format("Robust Pattern matched for string: %s", phoneNumber));
}

The simple pattern fails here because of the whitespace in the string, and would match otherwise. The robust pattern doesn't have any issues with this, though:

Robust Pattern matched for string: +12 345 678 9012
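If you'd still like to use the simple pattern, one option is to strip common separators before matching. The sketch below assumes that spaces, dots, hyphens and parentheses are the only separators you expect. The class and helper names are hypothetical:

```java
import java.util.regex.Pattern;

class SimplePhoneCheck {
    // The simple pattern from above: a plus sign followed by 10-12 digits.
    static final Pattern SIMPLE = Pattern.compile("^\\+\\d{10,12}$");

    // Hypothetical helper: removes whitespace, dots, hyphens and parentheses, then matches.
    static boolean matchesAfterNormalizing(String input) {
        String normalized = input.replaceAll("[\\s().-]", "");
        return SIMPLE.matcher(normalized).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAfterNormalizing("+12 345 678 9012")); // true
        System.out.println(matchesAfterNormalizing("345 678 9012"));     // false, no leading +
    }
}
```

The trade-off is that normalizing discards the formatting, so an oddly grouped input such as "+1-2345-678-9012" would pass as well.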

The robust matcher here would be able to match for several formats:

Pattern robustPattern = Pattern.compile("^(\\+\\d{1,2}\\s)?\\(?\\d{3}\\)?[\\s.-]?\\d{3}[\\s.-]?\\d{4}$");

List<String> phoneNumbers = List.of(
        "+12 345 678 9012",
        "+123456789012",
        "345.678.9012",
        "345 678 9012"
);

for (String number : phoneNumbers) {
    Matcher matcher = robustPattern.matcher(number);
    if(matcher.matches()) {
        System.out.println(String.format("Robust Pattern matched for string: %s", number));
    }
}

This would result in:

Robust Pattern matched for string: +12 345 678 9012
Robust Pattern matched for string: 345.678.9012
Robust Pattern matched for string: 345 678 9012

Combining Multiple Regular Expressions

Regular Expressions tend to get messy and long. At a certain point, they become cluttered enough that you can't readily change them or interpret them.

Instead of having a single, universal, robust Regular Expression that encompasses all edge-cases, countries, etc. - you can opt to use several different patterns that will reasonably cover the phone numbers coming in. To make things simpler - you can chain these expressions via the | operator, to check whether the phone number matches any of the patterns:

Pattern robustPattern = Pattern
        .compile(
                // Robust expression from before
                "^(\\+\\d{1,2}\\s)?\\(?\\d{3}\\)?[\\s.-]?\\d{3}[\\s.-]?\\d{4}$"
                // Optional area code, in parentheses or followed by a hyphen,
                // then 3 digits, a hyphen and 4 digits
                + "|((\\(\\d{3}\\) ?)|(\\d{3}-))?\\d{3}-\\d{4}"
                // A plus sign (+) followed by 10-12 digits
                + "|^\\+\\d{10,12}"
);

List<String> phoneNumbers = List.of(
        "+12 345 678 9012",
        "+123456789012",
        "345.678.9012",
        "345 678 9012"
);

for (String number : phoneNumbers) {
    Matcher matcher = robustPattern.matcher(number);
    if(matcher.matches()) {
        System.out.println(String.format("Pattern matched for string: %s", number));
    }
}

This results in:

Pattern matched for string: +12 345 678 9012
Pattern matched for string: +123456789012
Pattern matched for string: 345.678.9012
Pattern matched for string: 345 678 9012
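If a single chained expression gets hard to read, the patterns can also be kept separate and checked one by one. The following is a minimal sketch of that idea, reusing the three expressions from this article. The class and method names are hypothetical, and `List.of()` assumes Java 9 or newer:

```java
import java.util.List;
import java.util.regex.Pattern;

class MultiPatternValidator {
    // Each pattern is compiled once and can be read, tested and changed on its own.
    static final List<Pattern> PATTERNS = List.of(
            // Robust expression from before
            Pattern.compile("^(\\+\\d{1,2}\\s)?\\(?\\d{3}\\)?[\\s.-]?\\d{3}[\\s.-]?\\d{4}$"),
            // Optional area code, then 3 digits, a hyphen and 4 digits
            Pattern.compile("^((\\(\\d{3}\\) ?)|(\\d{3}-))?\\d{3}-\\d{4}$"),
            // A plus sign followed by 10-12 digits
            Pattern.compile("^\\+\\d{10,12}$"));

    // True if the whole input matches at least one of the patterns.
    static boolean isValid(String number) {
        return PATTERNS.stream().anyMatch(p -> p.matcher(number).matches());
    }

    public static void main(String[] args) {
        for (String number : List.of("+12 345 678 9012", "+123456789012", "456-7890", "hello")) {
            System.out.println(number + " -> " + isValid(number));
        }
    }
}
```

Keeping the patterns in a list also makes it possible to report which pattern matched, which a single alternation doesn't show as directly.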

Conclusion

Phone numbers are tricky - that's a fact. Regular Expressions are a really versatile and powerful tool and can address the issue of validating phone numbers, but it is admittedly a messy process.

Certain countries follow different standards, while some countries adopt and use multiple formats at the same time, making it hard to write a general-purpose expression to match phone numbers universally.

In this short article, we've taken a look at how to match phone numbers with Regular Expressions in Java, with a few different expressions.

Last Updated: November 23rd, 2021

Author: David Landup
