Introduction
Whether you are creating a registration form for your website or cleaning invalid addresses out of a mailing list, you will need to validate email addresses.
Validating an email address means checking that it has the required form and that it could receive email messages. This has to be done efficiently and safely.
That is where email-validator comes in. It is an easy-to-use yet robust Python library for validating email addresses.
In this guide, we'll go over the basics of this library and discover when and why you could use it, as well as when not to. We'll go over these with practical examples that will help you understand how to use email-validator.
What is email-validator?
As we've previously stated, email-validator is a robust Python library that validates email addresses. It performs two types of validation - syntax validation and deliverability validation. That is important because an email address must both meet the required form and have a resolvable domain name to be considered valid.
Syntax validation ensures that a string representation of an email address is of the form local-part@domain, such as example@stackabuse.com.
Deliverability validation ensures that the domain name of a syntactically correct email address (the string after the @ sign - stackabuse.com in this example) can be resolved.
In simple terms, it checks that the validated email address could, in principle, send and receive email messages.
On top of that, email-validator has a small bonus for us. If the email address is valid, email-validator can return its normalized form, so that we can store it in a database in a consistent way. On the other hand, if an email address is invalid, email-validator will give us a clear and human-readable error message to help us understand why the passed email address is not valid.
In its simplest form, the normalization of an email address implies lowercasing the domain of an email address (the sequence after the @ sign), because it is case-insensitive.
In more complex cases of normalization, where the domain part includes some Unicode characters, normalization covers a variety of conversions between Unicode and ASCII characters. The problem lies in the fact that different Unicode strings can look and mean the same to the end-user, so the normalization should ensure that those strings will be recorded in the same way because they actually represent the same domain.
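To make the idea of normalization more concrete, here is a minimal sketch that uses only the standard library. The helper name normalize_domain_sketch is hypothetical, and this is not the algorithm email-validator itself uses. It only illustrates why two strings that look identical may need to be brought to a single form before they are stored.

```python
import unicodedata


def normalize_domain_sketch(address: str) -> str:
    """Lowercase and NFC-normalize the domain part of an address.

    Illustration only: a hypothetical helper, not email-validator's algorithm.
    Assumes the address contains an "@" sign.
    """
    # Split on the last "@" so the local part is left untouched.
    local_part, _, domain = address.rpartition("@")
    domain = unicodedata.normalize("NFC", domain).lower()
    return f"{local_part}@{domain}"


# Two visually identical domains: a precomposed "é" versus "e" + combining accent.
composed = "example@caf\u00e9.com"
decomposed = "example@cafe\u0301.com"

print(composed == decomposed)  # False
print(normalize_domain_sketch(composed) == normalize_domain_sketch(decomposed))  # True
print(normalize_domain_sketch("example@StackAbuse.COM"))  # example@stackabuse.com
```

The local part is deliberately left as it is, since only the domain is known to be case-insensitive.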
It is important to mention that this library is not designed to work with a string that isn't a bare email address of the form example@stackabuse.com.
For example, it won't properly validate the To: line in an email message (for example, To: Example Name <example@stackabuse.com>).
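If you do start from a header value like that, one possible approach is to extract the bare address first and validate only that part. The sketch below uses parseaddr() from Python's standard email.utils module; whether this pre-processing step suits your data is an assumption you should check against your own input.

```python
from email.utils import parseaddr

header_value = "Example Name <example@stackabuse.com>"

# parseaddr() returns a (display_name, address) tuple.
display_name, address = parseaddr(header_value)

print(display_name)  # Example Name
print(address)       # example@stackabuse.com
```

The extracted address string is then in the bare form that email-validator expects.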
email-validator vs RegEx for Email Validation
We usually use some kind of Regular Expression (RegEx) to validate the correct form of email addresses and it is a great choice if you only need to make sure that some email address meets the required form. It is a well-known technique, easy to write and maintain, and doesn't consume too much computing power to execute.
If you'd like to read more about validating email addresses with RegEx - read our Python: Validate Email Address with Regular Expressions!
On the other hand, email address validation sometimes can be a lot more complex. A string containing an email address may meet the specified form of an email address, but still cannot be considered a proper email address, because the domain doesn't resolve.
For instance, example@ssstackabuse.com meets the specified form of an email address, but isn't valid because the domain name (ssstackabuse.com) doesn't exist. It therefore doesn't resolve, and the example email address can't send or receive email messages.
On the other hand, example@stackabuse.com meets both requirements for a valid email address. It meets the desired form and the domain name resolves. Therefore, it can be considered a valid email address.
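The limitation of a form-only check can be sketched as follows. The pattern here is a deliberately simple one chosen for illustration (an assumption, not a complete implementation of the email address grammar), and looks_like_email is a hypothetical helper name.

```python
import re

# A deliberately simple pattern, assumed for this sketch only:
# something without "@" or spaces, an "@", and a domain containing a dot.
SIMPLE_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(address: str) -> bool:
    """Return True if the string has the general shape of an email address."""
    return SIMPLE_EMAIL_PATTERN.fullmatch(address) is not None


print(looks_like_email("example@stackabuse.com"))    # True
print(looks_like_email("example@ssstackabuse.com"))  # True
print(looks_like_email("examplestackabuse.com"))     # False
```

Both of the first two addresses pass, because a regular expression only looks at the shape of the string and knows nothing about whether the domain resolves.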
In that case, email-validator provides a superior solution - it performs both syntax and deliverability validation with one simple function call, so there is no need to write a separate check that the domain of the email address actually resolves. It would be impossible to code both of those verifications using just Regular Expressions.
Note: It's factually impossible to guarantee whether an email will be received, or not, without sending an email and observing the result. You can, however, check if it could receive an email as a categorical possibility.
Those two things make a strong case in favor of email-validator over Regular Expressions. It is easier to use and covers checks that a Regular Expression alone cannot perform.
How to Install email-validator?
The email-validator library is available on PyPI, so the installation is pretty straightforward via pip or pip3:
$ pip install email-validator
$ pip3 install email-validator
Now email-validator is ready to use in a Python script.
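If you want to confirm from Python that the installation worked, one option is to ask the standard importlib.metadata module for the installed version. This is a small sketch that assumes Python 3.8 or newer; installed_version is a hypothetical helper name.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("email-validator")
print(version if version else "email-validator is not installed")
```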
Validate Email Address with email-validator?
The core of the email-validator library is its validate_email() function. It takes a string representation of an email address as the argument and performs validation on that address. If the passed email address is valid, validate_email() returns an object containing a normalized form of the passed email address. If the address is invalid, it raises an EmailNotValidError with a clear and human-readable error message that will help us understand why the passed email address is not valid.
EmailNotValidError is a general base class used to detect that some error occurred during validation; it is not meant to describe the specific error.
For that purpose, the EmailNotValidError class has two subclasses describing the actual errors. The first one is EmailSyntaxError, which is raised when syntax validation fails, meaning that the passed email doesn't meet the required form of an email address. The second one is EmailUndeliverableError, which is raised when deliverability validation fails, meaning that the domain name of the passed email address doesn't exist.
Now we can finally take a look at how to use the validate_email() function. Of course, the first step is to import it into our script, and then we are ready to use it:
from email_validator import validate_email
testEmail = "example@stackabuse.com"
emailObject = validate_email(testEmail)
print(emailObject.email)
Since the passed testEmail is a valid email address, the previous code will output the normalized form of the email address stored in the testEmail variable:
example@stackabuse.com
Note: In the previous example, the output is the same as the original address from testEmail because it was already in its normalized form. If you pass an unnormalized form of an email to validate_email(), the returned email address will be normalized, as expected.
If we change the original testEmail to "[email protected]", the previous code will still produce the same output, because the address is normalized before it is returned:
example@stackabuse.com
On the other hand, if we pass an invalid email address to validate_email(), the previous code will fail with the corresponding error message. The following value of testEmail will pass the syntax validation, but fail the deliverability validation because the domain ssstackabuse.com doesn't exist:
testEmail = "example@ssstackabuse.com"
In this case, the previous code will raise an exception with a long traceback, which ends with:
>> ...
>> raise EmailUndeliverableError("The domain name %s does not exist." % domain_i18n)
email_validator.EmailUndeliverableError: The domain name ssstackabuse.com does not exist.
Based on this message, we can conclude that the passed email is invalid because its domain name does not exist. A corresponding message is also produced for syntactically invalid emails, so we can easily tell when the passed address doesn't meet the required form of an email address.
You can also extract just the human-readable error message from the traceback automatically. To do so, we need to rewrite the previous code as follows:
from email_validator import validate_email, EmailNotValidError

testEmail = "example@ssstackabuse.com"

try:
    # Validating the `testEmail`
    emailObject = validate_email(testEmail)
    # If the `testEmail` is valid
    # it is updated with its normalized form
    testEmail = emailObject.email
    print(testEmail)
except EmailNotValidError as errorMsg:
    # If `testEmail` is not valid
    # we print a human readable error message
    print(str(errorMsg))
This code will output just a simple error message extracted from the previous prompt:
The domain name ssstackabuse.com does not exist.
Note: We've taken advantage of the EmailNotValidError class. We've executed the email validation in the try block and ensured that the error will be caught in the except block if the validation fails. There is no need to catch EmailSyntaxError or EmailUndeliverableError individually, because both of them are subclasses of the caught EmailNotValidError class, and the type of error can be easily determined by the printed error message.
validate_email() - Optional Arguments
The validate_email() function requires only one argument - the string representation of the email address that needs to be validated - but it can also accept a few optional keyword arguments:
- allow_smtputf8 - the default value is True; if set to False, validate_email() rejects internationalized addresses that would require the SMTPUTF8 extension (in practice, addresses with non-ASCII characters in the local part).
- check_deliverability - the default value is True; if set to False, no deliverability validation is performed.
- allow_empty_local - the default value is False; if set to True, an empty local part of an email address will be allowed (i.e. @stackabuse.com will be considered a valid email address).
The ValidatedEmail Object
You've probably noticed that we've been accessing the normalized form of an email address through emailObject.email. That is because validate_email() returns a ValidatedEmail object (in previous examples, it was stored in the emailObject variable) when a valid email address is passed as the argument.
The ValidatedEmail object contains multiple attributes which describe different parts of the normalized email address. The email attribute contains the normalized form of the validated email address, so we access it using the dot notation - emailObject.email.
Generally, we can access any attribute of the ValidatedEmail object by using variableName.attributeName (where variableName is the variable used to store the ValidatedEmail object).
For example, let's say that we've validated example@stackabuse.com with validate_email(). The resulting ValidatedEmail object will contain some interesting and useful attributes, as described in the following table:
Attribute Name | Example Value | Description |
---|---|---|
email | example@stackabuse.com | Normalized form of the email address. |
ascii_email | [email protected] | ASCII only form of email attribute. If the local_part contains any kind of internationalized characters, this attribute will be set to None . |
local_part | example | The string before the @ sign in the normalized form of the email address. |
ascii_local_part | example | If there are no internationalized characters, this attribute is set to ASCII only form of local_part attribute. Otherwise, it is set to None . |
domain | stackabuse.com | The string after the @ sign in the normalized form of the email address. If it contains non-ASCII characters, either the mail relay must support SMTPUTF8 or the ascii_domain attribute should be used instead. |
ascii_domain | stackabuse.com | ASCII only form of domain attribute. |
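smtputf8 | False | A boolean value. It is True if the local part contains non-ASCII characters, meaning the SMTPUTF8 extension is required to deliver mail to the address. If the allow_smtputf8=False argument is passed to validate_email(), such addresses are rejected, so this attribute is always False. |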
Note: ASCII variants of mentioned attributes are generated using the Punycode encoding syntax. It is an encoding syntax used to transform a Unicode string into an ASCII string for use with Internationalized Domain Names in Applications (IDNA).
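To get a feel for what a Punycode-encoded domain looks like, you can experiment with the idna codec built into Python. This is only an illustration: the built-in codec implements an older revision of the IDNA standard, so its results may differ from what email-validator produces for some names, and the domain below is a made-up example.

```python
# A made-up internationalized domain name ("bücher" written with an escape).
unicode_domain = "b\u00fccher.example"

# Encode the Unicode domain into its ASCII (Punycode) form.
ascii_domain = unicode_domain.encode("idna").decode("ascii")
print(ascii_domain)  # xn--bcher-kva.example

# Decoding the ASCII form gives back the Unicode domain.
round_trip = ascii_domain.encode("ascii").decode("idna")
print(round_trip == unicode_domain)  # True
```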
Conclusion
All in all, email-validator is a great tool for validating email addresses in Python.
In this guide, we've covered the most important aspects of using this library, so that you have a comprehensive view of it. You should now understand when and how to use email-validator, as well as when to pick an alternative tool.