Introduction
In this tutorial, we'll take a look at how to parse dates and times from strings with the parsedatetime package in Python.
To use the parsedatetime package, we first need to install it using pip:
$ pip install parsedatetime
Should pip install parsedatetime fail, the package is also open-source and available on GitHub.
Convert String to Python's Datetime Object with parsedatetime
The first, and most common, way to use parsedatetime is to parse a string into a datetime object. First, you'll want to import the parsedatetime library and instantiate a Calendar object, which does the actual input, parsing and manipulation of dates:
import parsedatetime
calendar = parsedatetime.Calendar()
Now we can call the parse() method of the calendar instance with a string as an argument. You can put in regular datetime-formatted strings, such as 1-1-2021, or human-readable values such as tomorrow, yesterday, next year, last week, lunch tomorrow, etc. We can also use 'End of Day' structures, such as tomorrow eod.
Let's parse both a human-readable string and a date-formatted string using parsedatetime:
import parsedatetime
from datetime import datetime
calendar = parsedatetime.Calendar()
print(calendar.parse('tomorrow'))
print(calendar.parse('1-1-2021'))
This results in two printed tuples:
(time.struct_time(tm_year=2021, tm_mon=3, tm_mday=19, tm_hour=9, tm_min=0, tm_sec=0, tm_wday=4, tm_yday=78, tm_isdst=-1), 1)
(time.struct_time(tm_year=2021, tm_mon=1, tm_mday=1, tm_hour=18, tm_min=5, tm_sec=14, tm_wday=3, tm_yday=77, tm_isdst=0), 1)
This isn't very human-readable. The returned tuple for each conversion consists of a struct_time object, which contains information like the year, month, day of month, etc., followed by the status code, an integer denoting how the conversion went. 0 means unsuccessful parsing, 1 means successful parsing to a date, 2 means successful parsing to a time and 3 means successful parsing to a datetime.
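To make the status codes easier to work with, here is a minimal sketch that maps them to readable labels. The names STATUS_LABELS, describe_status() and was_parsed() are hypothetical helpers written for this illustration and are not part of parsedatetime. The snippet takes a plain integer, so it runs without the library installed:

```python
# Status codes as described above; the label strings are illustrative only.
STATUS_LABELS = {
    0: "not parsed",
    1: "date",
    2: "time",
    3: "datetime",
}

def describe_status(status):
    """Return a readable label for a parse() status code."""
    return STATUS_LABELS.get(status, "unknown")

def was_parsed(status):
    """Return True when the status code signals a successful parse."""
    return status in (1, 2, 3)

# Typical use, assuming a Calendar instance named `calendar`:
#   time_structure, status = calendar.parse('tomorrow')
#   if was_parsed(status): ...
print(describe_status(1))  # date
print(was_parsed(0))       # False
```

Checking the status before using the struct_time is a cheap way to avoid treating an unparsed string as a valid date.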
Let's read a single field, the day of the month, from each struct_time:
print(calendar.parse('tomorrow')[0].tm_mday)
print(calendar.parse('1-1-2021')[0].tm_mday)
This code results in:
19
1
Then again, we're only getting the day of the month here. Usually, we'd like to output something similar to a YYYY-mm-dd HH:mm:ss format, or any variation of that.
Thankfully, we can easily use the time.struct_time result and generate a regular Python datetime with it:
import parsedatetime
from datetime import datetime
calendar = parsedatetime.Calendar()
time_structure_tomorrow, parse_status_tomorrow = calendar.parse('tomorrow')
time_structure_2021, parse_status_2021 = calendar.parse('1-1-2021')
print(datetime(*time_structure_tomorrow[:6]))
print(datetime(*time_structure_2021[:6]))
The datetime() constructor doesn't need all of the information from the time structure provided by parsedatetime, so we sliced it, keeping only the first six fields: year, month, day, hour, minute and second.
This code results in:
2021-03-19 09:00:00
2021-01-01 18:11:06
Keep in mind that the string 1-1-2021 carries no time of day, so the datetime for the 1st of January took the time of execution into consideration.
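Once we have a datetime, formatting it is a matter of calling strftime(). The sketch below builds a struct_time by hand as a stand-in for the first element of the tuple returned by calendar.parse(), so that it runs without parsedatetime installed. The helper name struct_to_datetime() is made up for this example:

```python
import time
from datetime import datetime

# Stand-in for the first element of calendar.parse('tomorrow'),
# built by hand so the snippet does not depend on parsedatetime.
time_structure = time.struct_time((2021, 3, 19, 9, 0, 0, 4, 78, -1))

def struct_to_datetime(time_structure):
    """Build a datetime from the first six fields of a struct_time."""
    return datetime(*time_structure[:6])

parsed = struct_to_datetime(time_structure)

# %m is the month and %M is the minute in strftime() format codes
formatted = parsed.strftime('%Y-%m-%d %H:%M:%S')
print(formatted)  # 2021-03-19 09:00:00
```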
Handling Timezones
Sometimes, your application might have to take the timezones of your end-users into consideration. For timezone support, we usually use the pytz package, though you can use other packages as well.
Let's install pytz via pip:
$ pip install pytz
Now, we can import the parsedatetime and pytz packages into a script, and create a standard Calendar instance:
import parsedatetime
import pytz
from pytz import timezone
calendar = parsedatetime.Calendar()
Let's take a look at the supported timezones by printing out all_timezones:
print(pytz.all_timezones)
This code will result in a huge list of all available timezones:
['Africa/Abidjan', 'Africa/Accra', 'Africa/Addis_Ababa', 'Africa/Algiers', ...]
Let's choose one of these, such as the first one, and pass it in as the tzinfo argument of Calendar's parseDT() method. Other than that, we'll want to supply a datetimeString argument, which is the actual string we want to parse:
datetime_object, status = calendar.parseDT(datetimeString='tomorrow', tzinfo=timezone('Africa/Abidjan'))
This method returns a tuple of a datetime object and the status code of the conversion, the same integer that parse() returns: 0 means the string could not be parsed, and any non-zero value means it was parsed successfully.
Let's go ahead and print the datetime_object:
print(datetime_object)
This code results in:
2021-03-16 09:00:00+00:00
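Because the result is timezone-aware, it can be converted to another zone with astimezone(). The following sketch uses only the standard library and a hand-built datetime as a stand-in for the value returned by parseDT(). The UTC+09:00 target offset is an arbitrary choice for this illustration, and in a real application you would more likely pass a pytz timezone instead:

```python
from datetime import datetime, timedelta, timezone

# Stand-in for the aware datetime printed above. Africa/Abidjan sits at
# UTC+00:00, so a fixed zero offset reproduces the same value.
parsed = datetime(2021, 3, 16, 9, 0, 0, tzinfo=timezone.utc)

# Hypothetical target zone: a fixed UTC+09:00 offset
target = timezone(timedelta(hours=9))

# astimezone() keeps the same instant and changes only its representation
converted = parsed.astimezone(target)

print(parsed.isoformat())     # 2021-03-16T09:00:00+00:00
print(converted.isoformat())  # 2021-03-16T18:00:00+09:00
print(converted == parsed)    # True
```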
Calendar.parseDate()
While Calendar.parse() is a general-purpose parsing method that returns a tuple with a time.struct_time and the status code, the parseDate() method is dedicated to short-form string dates and simply returns the time structure, without a status code:
import parsedatetime
calendar = parsedatetime.Calendar()
result = calendar.parseDate('5/5/91')
print(result)
The result now contains the calculated struct_time value of the date we've passed in:
(1991, 5, 5, 14, 31, 18, 0, 74, 0)
But, what do we do when we want to parse the 5th of May 2077? We can try to run the following code:
import parsedatetime
calendar = parsedatetime.Calendar()
result = calendar.parseDate('5/5/77')
print(result)
However, this code will result in:
(1977, 5, 5, 14, 36, 21, 0, 74, 0)
Calendar.parseDate() read the two-digit year in the short-form date as the more likely 1977. We can solve this in two ways:
- Simply specify the full year, 2077:
import parsedatetime
calendar = parsedatetime.Calendar()
result = calendar.parseDate('5/5/2077')
print(result)
- Use a BirthdayEpoch:
import parsedatetime
constants = parsedatetime.Constants()
constants.BirthdayEpoch = 80
# Pass our new constants to the Calendar
calendar = parsedatetime.Calendar(constants)
result = calendar.parseDate('5/5/77')
print(result)
This code will result in:
(2077, 5, 5, 14, 39, 47, 0, 74, 0)
You can adjust the settings of the parsedatetime library through the Constants object. Here, we've set the BirthdayEpoch to 80.
BirthdayEpoch controls how the package handles two-digit years, such as 77. If the parsed value is less than the value we've set for the BirthdayEpoch, it'll add the parsed value to 2000. Since we've set the BirthdayEpoch to 80 and parsed 77, it converts it to 2077. Otherwise, it'll add the parsed value to 1900.
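The rule is simple enough to sketch in plain Python. The function below is an illustration of the threshold logic described above, not the library's own code. The name expand_two_digit_year() is hypothetical, and the threshold of 50 in the last call is an assumed value chosen only because it reproduces the earlier 1977 result:

```python
def expand_two_digit_year(year, birthday_epoch):
    """Expand a two-digit year using the threshold rule described above."""
    if year >= 100:
        # Already a full year, nothing to expand
        return year
    if year < birthday_epoch:
        # Below the threshold: treat it as a year in the 2000s
        return 2000 + year
    # At or above the threshold: treat it as a year in the 1900s
    return 1900 + year

print(expand_two_digit_year(77, 80))  # 2077
print(expand_two_digit_year(91, 80))  # 1991
print(expand_two_digit_year(77, 50))  # 1977 (50 is an assumed threshold)
```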
Calendar.parseDateText()
Another alternative to dealing with the issue of mistaken short-form dates is to, well, use long-form dates. For long-form dates, you can use the parseDateText() method:
import parsedatetime
calendar = parsedatetime.Calendar()
result2 = calendar.parseDateText('May 5th, 1991')
print(result2)
This code will result in:
(1991, 5, 5, 14, 31, 46, 0, 74, 0)
Using Locales
Finally, we can use parsedatetime with locale information. The locale information comes from either PyICU or the previously used Constants class.
The Constants class has a lot of attributes besides BirthdayEpoch. Two of these are localeID and usePyICU.
Let's try setting the localeID to Spanish and usePyICU to False, since we won't use it:
import parsedatetime
constants = parsedatetime.Constants(localeID='es', usePyICU=False)
calendar = parsedatetime.Calendar(constants)
result, code = calendar.parse('Marzo 28')
print(result)
This results in:
time.struct_time(tm_year=2021, tm_mon=3, tm_mday=28, tm_hour=15, tm_min=0, tm_sec=5, tm_wday=0, tm_yday=74, tm_isdst=0)
The first element of the returned tuple is a struct_time, so we can easily convert it into a datetime:
from datetime import datetime
print(datetime(*result[:6]))
This results in:
2021-03-28 22:08:40
Conclusion
In this tutorial, we've gone over several ways to parse dates and times using the parsedatetime package in Python.
We went over the conversion between strings and datetime objects through parsedatetime, as well as handling timezones with pytz and locales, using the Constants class of the parsedatetime library.