Whether you are creating a registration form for your website or cleaning invalid email addresses out of your mailing list, you will need to perform email validation.
You need to check whether an email address is real by verifying that it meets the required form and can receive email messages. That must be performed efficiently and safely.
That is where email-validator comes in. It is an easy-to-use, yet robust, Python library used to validate email addresses.
In this guide, we'll go over the basics of this library, discover when and why you could use it, as well as when not to. We'll go over these with practical examples that will help you understand how to use the library.
What is email-validator?
As we've previously stated, email-validator is a robust Python library that validates email addresses. It performs two types of validation - syntax validation and deliverability validation. Both matter, because an email address must meet the required form and have a resolvable domain name at the same time to be considered valid.
Syntax validation ensures that a string representation of an email address is of the form name@domain, such as example@stackabuse.com.
Deliverability validation ensures that the domain name of a syntactically correct email address (the string after the @ sign - stackabuse.com in this example) can be resolved.
In simple terms, it checks that the validated email address could, in principle, send and receive email messages.
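To make the idea of a resolvable domain more concrete, here is a minimal sketch that uses only Python's standard library. It is not how email-validator performs its check (the library runs its own DNS queries), and the function name domain_resolves and its injectable resolver parameter are assumptions made purely for illustration.

```python
import socket


def domain_resolves(domain, resolver=socket.getaddrinfo):
    """Return True if `domain` resolves to at least one address.

    Hypothetical helper, not part of email-validator. `resolver` is
    injectable so the check can be exercised without network access.
    """
    try:
        # Ask the resolver for any address record of the domain.
        resolver(domain, None)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False
```

Calling domain_resolves("stackabuse.com") with the default resolver performs a live DNS lookup, so the result depends on your network. A resolvable name still says nothing about whether a particular mailbox exists.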
On top of that, email-validator has a small bonus for us. If the email address is valid, email-validator can return its normalized form, so that we can store it in a database in a consistent way. On the other hand, if an email address is invalid, email-validator will give us a clear and human-readable error message to help us understand why the passed email address is not valid.
In its simplest form, the normalization of an email address implies lowercasing the domain of an email address (the sequence after the @ sign), because it is case-insensitive.
In more complex cases of normalization, where the domain part includes some Unicode characters, normalization covers a variety of conversions between Unicode and ASCII characters. The problem lies in the fact that different Unicode strings can look and mean the same to the end-user, so the normalization should ensure that those strings will be recorded in the same way because they actually represent the same domain.
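The two cases above can be sketched with the standard library alone. This is a simplified illustration, not the library's actual implementation: the function name normalize_address_sketch is hypothetical, and it only lowercases the domain and applies Unicode NFC normalization to it.

```python
import unicodedata


def normalize_address_sketch(address):
    """Lowercase and NFC-normalize the domain of `address` (illustrative only)."""
    # Split on the last @ sign; the local part is left untouched.
    local_part, _, domain = address.rpartition("@")
    # NFC composes characters, so visually identical strings compare equal.
    domain = unicodedata.normalize("NFC", domain).lower()
    return f"{local_part}@{domain}"


composed = "user@caf\u00e9.example"     # "é" as a single code point
decomposed = "user@cafe\u0301.example"  # "e" followed by a combining accent

print(normalize_address_sketch("example@STACKABUSE.com"))  # example@stackabuse.com
print(composed == decomposed)                              # False
print(normalize_address_sketch(composed) == normalize_address_sketch(decomposed))  # True
```

The two Unicode strings render identically but differ code point by code point, which is exactly why a normalization step is needed before storing or comparing addresses.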
It is important to mention that this library is not designed to work with a string that doesn't meet the form of name@domain. For example, it won't properly validate the To: line in an email message (for example, To: Example Name <example@stackabuse.com>).
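If an address arrives in such a header-style string, one possible approach (an assumption on our part, not something email-validator does for you) is to extract the bare address with the standard library's email.utils.parseaddr() first, and only then pass it on for validation. The header value below is an illustrative example.

```python
from email.utils import parseaddr

# Illustrative header value, without the leading "To:" field name.
header_value = "Example Name <example@stackabuse.com>"

# parseaddr() splits a header value into (display name, address).
display_name, address = parseaddr(header_value)

print(display_name)  # Example Name
print(address)       # example@stackabuse.com
```

The extracted address string is what you would then hand to the validation step.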
email-validator vs RegEx for Email Validation
We usually use some kind of Regular Expression (RegEx) to validate the correct form of email addresses and it is a great choice if you only need to make sure that some email address meets the required form. It is a well-known technique, easy to write and maintain, and doesn't consume too much computing power to execute.
If you'd like to read more about validating email addresses with RegEx - read our Python: Validate Email Address with Regular Expressions!
On the other hand, email address validation sometimes can be a lot more complex. A string containing an email address may meet the specified form of an email address, but still cannot be considered a proper email address, because the domain doesn't resolve.
For example, example@ssstackabuse.com meets the specified form of an email address, but isn't valid because the domain name (ssstackabuse.com) doesn't exist. It therefore doesn't resolve, and the example email address can't send or receive email messages.
On the other hand, example@stackabuse.com meets both requirements for a valid email address. It meets the desired form and the domain name resolves. Therefore, it can be considered a valid email address.
In that case, email-validator provides a superior solution - it performs both syntax and deliverability validation with one simple function call, so there is no need to write a separate check that the email address can actually send and receive emails. It would be impossible to code both of those verifications using just Regular Expressions.
Note: It's factually impossible to guarantee whether an email will be received, or not, without sending an email and observing the result. You can, however, check if it could receive an email as a categorical possibility.
Those two things make a strong case in favor of email-validator over Regular Expressions. It is easier to use and covers checks that a regular expression alone cannot perform.
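To illustrate the limitation of a RegEx-only check discussed in this section, here is a small sketch. The pattern is deliberately simplified and the helper name looks_like_email is hypothetical; it only checks the rough shape of an address.

```python
import re

# Deliberately simplified pattern, for illustration only.
SIMPLE_EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address):
    """Return True if `address` has the rough shape of an email address."""
    return SIMPLE_EMAIL_PATTERN.fullmatch(address) is not None


print(looks_like_email("example@stackabuse.com"))    # True
print(looks_like_email("example@ssstackabuse.com"))  # True, although the domain doesn't exist
print(looks_like_email("examplestackabuse.com"))     # False
```

The second address passes because a regular expression can only inspect the text of the address. It has no way of knowing whether the domain resolves.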
How to Install email-validator?
The email-validator library is available on PyPI, so the installation is pretty straightforward via pip or pip3:
pip install email-validator
pip3 install email-validator
And now you have email-validator ready to use in a Python script.
Validate Email Address with email-validator
The core of the email-validator library is its validate_email() method. It takes a string representation of an email address as the argument and performs validation on that address. If the passed email address is valid, the validate_email() method will return an object containing a normalized form of the passed email address. In the case of an invalid email address, it will raise the EmailNotValidError with a clear and human-readable error message that will help us understand why the passed email address is not valid.
EmailNotValidError is actually just a base class used to signal that an error occurred during validation; it is not used to represent and describe the actual errors.
For that purpose, the EmailNotValidError class has two subclasses describing the actual errors that occurred. The first one is EmailSyntaxError, which is raised when syntax validation fails, meaning that the passed email doesn't meet the required form of an email address. The second one is EmailUndeliverableError, which is raised when deliverability validation fails, meaning that the domain name of the passed email address doesn't exist.
Now we can finally take a look at how to use the
validate_email() method. Of course, the first step is to import it to our script, and then we are ready to use it:
from email_validator import validate_email

testEmail = "example@stackabuse.com"

emailObject = validate_email(testEmail)
print(emailObject.email)
Since the passed testEmail is a valid email address, the previous code will output the normalized form of the email address stored in emailObject.email:
example@stackabuse.com
Note: In the previous example, the output is the same as the original address from testEmail because it was already normalized. If you pass the unnormalized form of an email to the validate_email() method, the returned email address will be normalized, as expected.
If we change the original testEmail to a form with an uppercase domain, such as "example@STACKABUSE.com", the previous code will still have the same output, because the returned address is normalized.
On the other hand, if we pass an invalid email address to the validate_email() method, the previous code will raise an error with the corresponding message. The following example of testEmail will pass the syntax validation, but fail the deliverability validation because the domain ssstackabuse.com doesn't exist:
testEmail = "example@ssstackabuse.com"
In this case, the previous code will produce a long traceback, which includes:
>> ...
>> raise EmailUndeliverableError("The domain name %s does not exist." % domain_i18n)
email_validator.EmailUndeliverableError: The domain name ssstackabuse.com does not exist.
Based on this output, we can conclude that the passed email is invalid because its domain name does not exist. Corresponding messages are also shown for syntactically invalid emails, so that we can easily conclude that the passed email address doesn't meet the required form of an email address.
You could extract a more user-friendly and human-readable error message from this as well, automatically. To extract just the error message from the previous traceback, we need to rewrite the previous code as follows:
from email_validator import validate_email, EmailNotValidError

testEmail = "example@ssstackabuse.com"

try:
    # Validating the `testEmail`
    emailObject = validate_email(testEmail)

    # If the `testEmail` is valid
    # it is updated with its normalized form
    testEmail = emailObject.email
    print(testEmail)
except EmailNotValidError as errorMsg:
    # If `testEmail` is not valid
    # we print a human readable error message
    print(str(errorMsg))
This code will output just a simple error message extracted from the previous traceback:
The domain name ssstackabuse.com does not exist.
Note: We've taken advantage of the EmailNotValidError class. We've executed the email validation in the try block and ensured that the error will be caught in the except block if the validation fails. There is no need to catch EmailSyntaxError and EmailUndeliverableError individually, because both of them are subclasses of the caught EmailNotValidError class, and the type of error can be easily determined by the printed error message.
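If you do need to react differently to the two kinds of failure, the subclasses can be caught before the base class. The sketch below uses stand-in classes that only mirror the hierarchy described above, so it runs without the library or network access. The fake_validate() and describe() functions and their toy rules are purely hypothetical.

```python
# Stand-ins mirroring the hierarchy described above; in real code these
# names would be imported from email_validator instead.
class EmailNotValidError(ValueError):
    pass


class EmailSyntaxError(EmailNotValidError):
    pass


class EmailUndeliverableError(EmailNotValidError):
    pass


def fake_validate(address):
    """Hypothetical stand-in for validate_email(), with toy rules."""
    if "@" not in address:
        raise EmailSyntaxError("The email address is missing an @ sign.")
    if address.endswith("@ssstackabuse.com"):
        raise EmailUndeliverableError("The domain name ssstackabuse.com does not exist.")
    return address


def describe(address):
    """Report which kind of validation failure occurred, if any."""
    try:
        fake_validate(address)
        return "valid"
    except EmailSyntaxError:
        # Most specific handlers come first.
        return "syntax error"
    except EmailUndeliverableError:
        return "undeliverable"
    except EmailNotValidError:
        # The base class catches anything the handlers above did not.
        return "invalid for another reason"


print(describe("examplestackabuse.com"))     # syntax error
print(describe("example@ssstackabuse.com"))  # undeliverable
print(describe("example@stackabuse.com"))    # valid
```

Because both subclasses inherit from EmailNotValidError, a single except EmailNotValidError clause would catch either of them, which is what the earlier example relies on.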
validate_email() - Optional Arguments
By default, the validate_email() method requires only one argument - the string representation of the email address that needs to be validated, but it can accept a few other keyword arguments:
- allow_smtputf8 - the default value is True; if set to False, validate_email() won't accept internationalized email addresses that require the SMTPUTF8 extension (that is, addresses with non-ASCII characters in the local part).
- check_deliverability - the default value is True; if set to False, no deliverability validation is performed.
- allow_empty_local - the default value is False; if set to True, the empty local part of an email address will be allowed (i.e. @stackabuse.com will be considered a valid email address).
The ValidatedEmail Object
You've probably noticed that we've been accessing the normalized form of an email address by emailObject.email. That is because the validate_email() method returns the ValidatedEmail object (in previous examples, it was stored in the emailObject variable) when a valid email address is passed as the argument.
The ValidatedEmail object contains multiple attributes which describe different parts of the normalized email address.
Generally, we can access any attribute of the ValidatedEmail object by using the . notation - variableName.attributeName, where variableName is the variable used to store the ValidatedEmail object.
For example, let's say that we've validated example@stackabuse.com with the validate_email() method. The resulting ValidatedEmail object will contain some interesting and useful attributes as described in the following table:
|Attribute Name||Example Value||Description|
|email||example@stackabuse.com||Normalized form of an email address.|
|ascii_email||example@stackabuse.com||ASCII only form of email.|
|local_part||example||The string before the @ sign.|
|ascii_local_part||example||If there are no internationalized characters, this attribute is set to ASCII only form of local_part.|
|domain||stackabuse.com||The string after the @ sign.|
|ascii_domain||stackabuse.com||ASCII only form of domain.|
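|smtputf8||False||A boolean value. If the address requires the SMTPUTF8 extension (non-ASCII characters in the local part), it is set to True.|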
Note: ASCII variants of mentioned attributes are generated using the Punycode encoding syntax. It is an encoding syntax used to transform a Unicode string into an ASCII string for use with Internationalized Domain Names in Applications (IDNA).
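As a rough illustration of the Punycode conversion mentioned in the note, the sketch below uses Python's built-in idna codec on an example domain. This is an assumption-laden stand-in: the built-in codec implements an older revision of the IDNA standard, and email-validator may rely on a different implementation internally, so results can differ for some edge cases.

```python
domain = "b\u00fccher.example"  # "bücher.example"

# Python's built-in "idna" codec converts each label to its ASCII form.
ascii_domain = domain.encode("idna").decode("ascii")
print(ascii_domain)  # xn--bcher-kva.example

# Decoding restores the Unicode form.
round_trip = ascii_domain.encode("ascii").decode("idna")
print(round_trip == domain)  # True
```

The xn-- prefix marks a label that has been Punycode-encoded, which is why the ASCII form of an internationalized domain looks so different from what the user typed.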
All in all, email-validator is a great tool for validating email addresses in Python.
In this guide, we've covered all the important aspects of using this library, so that you have a comprehensive view of it. You should be able to understand when and how to use email-validator, as well as when to pick some alternative tool.