Java Regular Expressions - How to Validate Emails

Introduction

Regular Expressions (RegEx) are a powerful tool that helps us match patterns in a flexible, dynamic and efficient way, as well as perform operations based on the results.

In this short guide, we'll take a look at how to validate email addresses in Java with Regular Expressions.

If you'd like to read more about Regular Expressions and the regex package, read our Guide to Regular Expressions in Java!

Validating Email Addresses in Java

Validating email addresses isn't hard - most addresses share the same basic shape, though there are a few ways you can go about it.

Regular Expressions are expressive so you can add more and more constraints based on how you want to validate the emails, just by adding more matching rules.

Typically, you can boil things down to a pretty simple RegEx that will fit most email address patterns.

You can disregard the organization type (.com, .org, .edu), the host (gmail, yahoo, outlook) and other parts of an email address, or enforce specific values for them.

In the following sections, we'll take a look at a few different Regular Expressions, and which email formats they support or reject.

General-Purpose Email Regular Expression

A general purpose email format is:

name@host.organizationtype

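The organizationtype (the top-level domain) is often 3 characters long - edu, org, com, etc. - though 2-character ones such as io and longer ones such as info exist as well. There are quite a few hosts, even custom ones, so really, this could be any sequence of characters - even aaa.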

That being said, for a pretty loose (but still useful) validation, we can check whether the String consists of 4 groups:

  • Any sequence of characters - name
  • The @ symbol
  • Any sequence of characters - host
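  • A dot followed by any 2-3 lowercase letters - organization type (.io, .com, etc).

This nets us a Regular Expression that looks like the following, where \. matches a literal dot (an unescaped . would match any character):

(.*)(@)(.*)(\.[a-z]{2,3})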

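To additionally make sure they don't contain any whitespace at all, we can swap the .* for \S+, which only matches non-whitespace characters, with the dot escaped as \. so that it only matches a literal dot:

(\S+)(@)(\S+)(\.[a-z]{2,3})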
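To see how whitespace handling plays out in practice, here is a minimal sketch that runs the same input through a .* based expression and a \S+ based one. The class and method names are hypothetical, both expressions use an escaped dot as an assumption of this sketch, and it relies on the java.util.regex.Pattern class used throughout the rest of this guide:

```java
import java.util.regex.Pattern;

class WhitespaceCheck {
    // Loose form: .* matches any character, including spaces
    static final Pattern LOOSE =
        Pattern.compile("(.*)(@)(.*)(\\.[a-z]{2,3})");

    // Stricter variant: \S+ only matches non-whitespace characters
    static final Pattern NO_WHITESPACE =
        Pattern.compile("(\\S+)(@)(\\S+)(\\.[a-z]{2,3})");

    static boolean matchesLoose(String email) {
        return LOOSE.matcher(email).matches();
    }

    static boolean matchesNoWhitespace(String email) {
        return NO_WHITESPACE.matcher(email).matches();
    }

    public static void main(String[] args) {
        String spaced = "some one@gmail.com";
        System.out.println(matchesLoose(spaced));                     // true
        System.out.println(matchesNoWhitespace(spaced));              // false
        System.out.println(matchesNoWhitespace("someone@gmail.com")); // true
    }
}
```

Only the \S+ variant rejects the address with a space in the name, since .* is happy to consume whitespace.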

That being said, to validate an email address in Java, we can simply use the Pattern and Matcher classes:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String email = "someone@gmail.com";

// Groups: (1) name, (2) @, (3) host, (4) dot plus a 2-3 letter organization type
Pattern pattern = Pattern.compile("(\\S+)(@)(\\S+)(\\.[a-z]{2,3})");
Matcher matcher = pattern.matcher(email);

// matches() succeeds only if the entire String fits the pattern
if (matcher.matches()) {
    System.out.println("Full email: " + matcher.group(0));
    System.out.println("Username: " + matcher.group(1));
    System.out.println("Hosting Service: " + matcher.group(3));
    System.out.println("TLD: " + matcher.group(4));
}

This results in:

Full email: someone@gmail.com
Username: someone
Hosting Service: gmail
TLD: .com
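Numbered groups are easy to mix up once an expression grows. As a sketch of an alternative, Java's named capturing groups, written as (?<name>...), let you read each part by name instead. The class name, method name and group names below are hypothetical, and the dot is kept outside the last group so that it captures only the letters:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupsExample {
    // Same four-part shape, addressed by name instead of by group number
    static final Pattern EMAIL =
        Pattern.compile("(?<name>\\S+)@(?<host>\\S+)\\.(?<type>[a-z]{2,3})");

    // Returns {name, host, type}, or an empty Optional if the input does not match
    static Optional<String[]> extractParts(String email) {
        Matcher matcher = EMAIL.matcher(email);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            matcher.group("name"), matcher.group("host"), matcher.group("type")
        });
    }

    public static void main(String[] args) {
        extractParts("someone@gmail.com").ifPresent(parts -> {
            System.out.println("Username: " + parts[0]);          // someone
            System.out.println("Hosting Service: " + parts[1]);   // gmail
            System.out.println("Organization type: " + parts[2]); // com
        });
    }
}
```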

Alternatively, you can use the built-in matches() method of the String class (which just uses a Pattern and Matcher anyway):

String email = "someone@gmail.com";

// String.matches() must match the entire String, just like Matcher.matches()
if (email.matches("(\\S+)(@)(\\S+)(\\.[a-z]{2,3})")) {
    System.out.println(String.format("Email '%s' is valid!", email));
}

Which results in:

Email 'someone@gmail.com' is valid!

Awesome! This general-purpose RegEx will take care of most generic input and will check whether an email follows the generic form that most emails follow.
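Since String.matches() compiles the expression again on every call, code that validates many addresses may prefer to compile the Pattern once and reuse it. The following is a minimal sketch of such a helper, not a definitive implementation: the EmailValidator class and its isValid() method are hypothetical names, and the expression is a whitespace-free variant of the general-purpose form:

```java
import java.util.regex.Pattern;

final class EmailValidator {
    // Compiled once and reused; Pattern instances are immutable and thread-safe
    private static final Pattern EMAIL =
        Pattern.compile("(\\S+)(@)(\\S+)(\\.[a-z]{2,3})");

    private EmailValidator() {
        // Utility class, not meant to be instantiated
    }

    static boolean isValid(String email) {
        // Treat null as invalid instead of throwing a NullPointerException
        return email != null && EMAIL.matcher(email).matches();
    }
}
```

With this in place, EmailValidator.isValid("someone@gmail.com") returns true, while a null input or a String without an @ symbol returns false.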

For the most part - this will work quite well, and you won't need much more than this. You won't be able to detect spam emails with this, such as:

[email protected]

However, you will enforce a certain form.

Note: To enforce certain hosts or organization types, simply replace the host group and/or the organization type group with literal values, such as gmail, \.io or \.edu.
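As a rough sketch of that idea, the helper below builds an expression that only accepts one host and one organization type. The HostRestrictedValidator class and forDomain() method are hypothetical names used for illustration. Pattern.quote() is used so that any special characters in the supplied values are treated literally:

```java
import java.util.regex.Pattern;

class HostRestrictedValidator {
    // Builds a pattern that only accepts addresses at the given host and organization type
    static Pattern forDomain(String host, String type) {
        // Pattern.quote() escapes any regex metacharacters in the literal values
        return Pattern.compile(
            "(\\S+)(@)(" + Pattern.quote(host) + ")(\\." + Pattern.quote(type) + ")");
    }

    public static void main(String[] args) {
        Pattern gmailOnly = forDomain("gmail", "com");
        System.out.println(gmailOnly.matcher("someone@gmail.com").matches()); // true
        System.out.println(gmailOnly.matcher("someone@yahoo.com").matches()); // false
    }
}
```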

Robust Email Validation Regex

What does a robust email RegEx look like? Chances are - you won't like it, unless you enjoy looking at Regular Expressions, which isn't a particularly common hobby.

Long story short, this is what it looks like:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
@
(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
|\[(?:(?:(2(5[0-5]|[0-4][0-9])
|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])
|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

This is the RFC5322-compliant Regular Expression that covers 99.99% of input email addresses.*

Explaining it with words is typically off the table, but visualizing it helps a lot:

[Image: visualization of the Regular Expression]

*Image and claim are courtesy of EmailRegex.com.

That being said, to create a more robust email validation check in Java, let's substitute the loose expression with this one:

String email = "someone@gmail.com";

Pattern pattern = Pattern.compile("(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|\"(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21\\x23-\\x5b\\x5d-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])*\")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\\x01-\\x08\\x0b\\x0c\\x0e-\\x1f\\x21-\\x5a\\x53-\\x7f]|\\\\[\\x01-\\x09\\x0b\\x0c\\x0e-\\x7f])+)\\])");
Matcher matcher = pattern.matcher(email);

if (matcher.matches()) {
    System.out.println(String.format("Email '%s' is valid!", matcher.group(0)));
}

Needless to say - this works:

Email 'someone@gmail.com' is valid!
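One thing to keep in mind is that the character classes in this expression only list lowercase a-z, so an address containing uppercase letters won't match as-is. A possible way around that is to compile the expression with the Pattern.CASE_INSENSITIVE flag. The sketch below shows the effect of the flag on a short lowercase-only stand-in expression, which is an assumption made to keep the example readable and not the long expression itself:

```java
import java.util.regex.Pattern;

class CaseInsensitiveExample {
    // Short lowercase-only stand-in for the long expression (an assumption of this sketch)
    static final String LOWERCASE_ONLY = "[a-z0-9._-]+@(?:[a-z0-9-]+\\.)+[a-z0-9-]+";

    // Without the flag, [a-z] only matches lowercase letters
    static final Pattern STRICT = Pattern.compile(LOWERCASE_ONLY);

    // With the flag, the same classes also match uppercase letters
    static final Pattern RELAXED = Pattern.compile(LOWERCASE_ONLY, Pattern.CASE_INSENSITIVE);

    public static void main(String[] args) {
        String email = "Someone@Gmail.com";
        System.out.println(STRICT.matcher(email).matches());  // false
        System.out.println(RELAXED.matcher(email).matches()); // true
    }
}
```

The same flag can be passed as the second argument to Pattern.compile() when compiling the long expression.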

This doesn't check whether the email address actually exists (you can't check that without trying to send an email to it), so that possibility always remains. And of course, even this regex will treat odd email addresses such as:

[email protected]

... are fully valid.

Conclusion

In this short guide, we've taken a look at how to perform email validation in Java with Regular Expressions.

Any sort of validation typically depends on your specific project, but there are some loose, general-purpose forms you can enforce and match for.

We've built a simple general-purpose form which will work most of the time, followed by a much more robust Regular Expression based on RFC5322.

Last Updated: May 12th, 2023

Author: David Landup

Entrepreneur, Software and Machine Learning Engineer, with a deep fascination towards the application of Computation and Deep Learning in Life Sciences (Bioinformatics, Drug Discovery, Genomics), Neuroscience (Computational Neuroscience), robotics and BCIs.

Great passion for accessible education and promotion of reason, science, humanism, and progress.

© 2013-2024 Stack Abuse. All rights reserved.
