What the Royal Astronomical Society in 1884 Tells Us About Python Today
pytz is a ubiquitous library in the Python ecosystem for working with localized datetimes. Without pytz or something like it, there’s no way to program around localized datetimes whose offset from GMT varies over time. More concretely, there’s no way to write daylight saving time-aware programs in Python without a third-party library like pytz. (Spoiler alert: you shouldn’t use pytz at all, though; keep reading.)
To support this functionality, pytz constructs tzinfo objects (Python’s weak abstraction for time zones) that load the definitions from the standard IANA time zone database, tzdb. These objects can then be used with the standard library’s datetime object to create localized datetimes. Here’s an example…
from datetime import datetime
import pytz
# the localized datetime for 2019 May 21 12:30pm, NYC/Eastern Time
print(datetime(2019, 5, 21, 12, 30,
tzinfo=pytz.timezone('America/New_York')))
… except that it’s actually a buggy program! The printed datetime is
2019-05-21 12:30:00-04:56
That -04:56
is the GMT offset of the localized time; it says that the
constructed datetime
object represents 12:30pm on 2019 May 21, which is
correct, but that this time is 4 hours and 56 minutes behind GMT, which is
incorrect.
Naturally, we’d expect that offset to be -4:00, i.e., Eastern Daylight Time, which is the offset governing the 'America/New_York' zone at 2019-05-21 12:30. Failing that, we might expect it to be -5:00, i.e., Eastern Standard (Winter) Time. But -04:56? What the heck is that??
Why, that’s the longitudinally derived local time of New York City Hall as reported to the Royal Astronomical Society in 1884!1
The first sentence in the report excerpt above gives the context: the establishment of standard times across the US and Canada for the railway companies to use (though not yet enshrined in US federal policy, that would be in 1918). The newly christened Eastern Time was five hours west of the Greenwich Mean Time that would be agreed upon as the international standard at a conference in DC later in 1884. The local time at New York City Hall was reported to be 3m 58.4s fast (i.e., east) of Eastern Time, and therefore -5h for Eastern Time plus 3m 58.4s to correct for NYC yields a GMT offset for NYC of -4h 56m.
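The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch using the figures quoted above; `format_offset` is a hypothetical helper written for this illustration, not part of any library.

```python
from datetime import timedelta

# Figures as quoted from the 1884 report in the text above.
eastern_time = timedelta(hours=-5)                    # Eastern Time relative to GMT
city_hall_lead = timedelta(minutes=3, seconds=58.4)   # City Hall is "fast" of Eastern Time

# City Hall local time relative to GMT: -5h plus 3m 58.4s.
lmt_offset = eastern_time + city_hall_lead


def format_offset(offset):
    """Render a timedelta as a signed H:MM:SS string, rounded to the nearest second.

    Hypothetical helper for this illustration only.
    """
    total = round(offset.total_seconds())
    sign = "-" if total < 0 else "+"
    hours, rest = divmod(abs(total), 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours}:{minutes:02d}:{seconds:02d}"


print(format_offset(lmt_offset))  # -4:56:02
```

The exact value is -4:56:01.6; rounding to the nearest second gives -4:56:02, and dropping the seconds gives the -4h 56m seen in the printed datetime.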
You can see the assignment of this offset to the America/New_York
time zone
in the IANA tzdb here:
# From Paul Eggert (2014-09-06):
# Monthly Notices of the Royal Astronomical Society 44, 4 (1884-02-08), 208
# says that New York City Hall time was 3 minutes 58.4 seconds fast of
# Eastern time (i.e., -4:56:01.6) just before the 1883 switch. Round to the
# nearest second.
<...>
# Zone NAME GMTOFF RULES FORMAT [UNTIL]
Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58
-5:00 US E%sT 1920
<...>
The Zone line above says that the offset for this zone is -4h 56m 02s, called the “local mean time” (LMT), up until 1883 Nov 18 12:03:58, which is the NYC local time at which US railways adopted GMT-based standard time (though the federal government had not yet officially done so). Note that the local time of the change is 3m 58s after noon.2
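The 12:03:58 figure follows from the two offsets alone. As a rough check, assuming the switch happened at noon Eastern Standard Time and using the rounded LMT offset from the Zone line (the variable names here are made up for the illustration):

```python
from datetime import datetime, timedelta

# Offsets relative to GMT, taken from the Zone definition quoted above.
lmt_offset = timedelta(hours=-4, minutes=-56, seconds=-2)  # New York local mean time
est_offset = timedelta(hours=-5)                           # Eastern Standard Time

# Noon EST on the day of the switch, expressed as a naive GMT instant.
switch_gmt = datetime(1883, 11, 18, 12, 0, 0) - est_offset

# The same instant on a clock still keeping New York local mean time.
switch_local = switch_gmt + lmt_offset

print(switch_gmt)    # 1883-11-18 17:00:00
print(switch_local)  # 1883-11-18 12:03:58
```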
Back to that buggy pytz code.
It’s a fairly common mistake to make. In fact, Google returns 69 results for "pytz" "datetime" "4:56" on Stack Overflow.
The bug is even common enough to warrant mention
on the pytz website as
a “gotcha” to avoid. The source of the bug is that the pytz tzinfo
object passed to the datetime
constructor is using the very first GMT offset defined for the zone in tzdb, which tends to be some longitudinally-derived LMT from before standardized timekeeping.3
What should pytz’s implementation do instead? The same thing it does when you construct a localized datetime
in pytz’s preferred, albeit non-standard, way: use the local datetime as an index into the tzdb zone definition. For example, this program will do what you expect:
from datetime import datetime
import pytz
# the *correct* localized datetime for 2019 May 21 12:30pm, NYC/Eastern Time
print(pytz.timezone('America/New_York').localize(
datetime(2019, 5, 21, 12, 30)))
Now the local datetime indexes into the America/New_York
definition by picking out the correct GMT offset, -4h:4
2019-05-21 12:30:00-04:00
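The difference between the two behaviors can be sketched with a toy tzinfo built on the standard library alone. This is not pytz’s or dateutil’s actual code: the classes, the `TOY_NEW_YORK` table, and its handful of transitions are all hypothetical simplifications, and the table ignores every DST change between 1883 and 2019.

```python
from datetime import datetime, timedelta, tzinfo

# Hypothetical, heavily simplified zone data: (local start, offset, abbreviation).
# Real tzdb data has many more transitions; this table only covers the example.
TOY_NEW_YORK = [
    (datetime.min, timedelta(hours=-4, minutes=-56), "LMT"),
    (datetime(1883, 11, 18, 12, 4), timedelta(hours=-5), "EST"),
    (datetime(2019, 3, 10, 2, 0), timedelta(hours=-4), "EDT"),
    (datetime(2019, 11, 3, 2, 0), timedelta(hours=-5), "EST"),
]


class FirstOffsetZone(tzinfo):
    """Always reports the first offset in the table, whatever the datetime."""

    def __init__(self, table):
        self._table = table

    def utcoffset(self, dt):
        return self._table[0][1]

    def tzname(self, dt):
        return self._table[0][2]

    def dst(self, dt):
        return None  # DST is not modeled separately in this toy


class IndexedZone(FirstOffsetZone):
    """Uses the local datetime as an index into the table."""

    def _entry(self, dt):
        if dt is None:
            return self._table[0]
        naive = dt.replace(tzinfo=None)
        current = self._table[0]
        for entry in self._table:
            if entry[0] <= naive:  # keep the latest entry that has already started
                current = entry
        return current

    def utcoffset(self, dt):
        return self._entry(dt)[1]

    def tzname(self, dt):
        return self._entry(dt)[2]


print(datetime(2019, 5, 21, 12, 30, tzinfo=FirstOffsetZone(TOY_NEW_YORK)))
# 2019-05-21 12:30:00-04:56
print(datetime(2019, 5, 21, 12, 30, tzinfo=IndexedZone(TOY_NEW_YORK)))
# 2019-05-21 12:30:00-04:00
```

The only difference between the two classes is whether `utcoffset` looks at the datetime it is handed; a real implementation also has to deal with ambiguous and nonexistent local times around transitions, which this sketch ignores.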
It’s unclear why pytz’s developers and maintainers have left this buggy behavior
in pytz for, apparently, almost a decade. But one thing is certain:
you should not use pytz
for localized datetimes in Python today! Instead,
check out the much more reasonable, drop-in behavior of time zone objects in the
dateutil
library. It’s also more
actively maintained, now by Paul Ganssle, who has been advising the
Python community
on how to integrate time zones into the Python standard library.
Looking back at the America/New_York
definition from before, we see helpful commentary from the tzdb editor-in-chief. That’s how I derived the historical context above. I’m not the first
person to appreciate
such commentary in the tzdb.
Along those lines, there’s another curious provenance to the LMT that tzdb assigns to a time zone for 19th century dates. If we replace America/New_York in the Python example with Europe/London, we see another peculiar GMT offset:
2019-05-21 12:30:00-00:01
Given that Greenwich Mean Time is defined as the local time on the longitude
that passes through the Royal Observatory in Greenwich, a borough of London,
we would expect the LMT of the Europe/London
time zone to be pretty close,
if not simply approximated as the exact same time. Where does this roughly
-1m offset come from then?
Looking at the commentary in the tzdb, one sees the following note from contributor Peter Ilieve:
On 17 Jan 1994 the Independent, a UK quality newspaper, had a piece about historical vistas along the Thames in west London. There was a photo and a sketch map showing some of the sightlines involved. One paragraph of the text described [an old stone obelisk marking a forgotten terrestrial meridian used as the basis for celestial calculations of local time. …]
I have a one inch to one mile map of London and my estimate of the stone’s position is 51° 28’ 30” N, 0° 18’ 45” W. The longitude should be within about ±2”. […]
[This yields GMTOFF = -0:01:15 for London LMT in the 18th century.]
Your program could evaluate to a number whose provenance is traced back to a hobbyist’s estimation of the precise location of “an old stone obelisk” based on an old newspaper photograph and a handy map. I believe that’s what they call in the formal semantics of memory models literature an out-of-thin-air value.
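For the curious, the step from the estimated longitude to the offset is simple arithmetic: the Earth turns 360° in 24 hours, so each degree of longitude corresponds to 4 minutes of time. A small sketch, where `lmt_offset_from_longitude` is a hypothetical helper written for this illustration:

```python
from datetime import timedelta


def lmt_offset_from_longitude(degrees, arcminutes, arcseconds, west):
    """Approximate the local mean time offset from GMT for a longitude.

    Hypothetical helper: 360 degrees per 24 hours means 4 minutes of time per degree.
    """
    decimal_degrees = degrees + arcminutes / 60 + arcseconds / 3600
    offset = timedelta(minutes=4 * decimal_degrees)
    return -offset if west else offset


# The estimated position of the stone: 0° 18' 45" W.
london_lmt = lmt_offset_from_longitude(0, 18, 45, west=True)
print(london_lmt.total_seconds())  # about -75, i.e. -0:01:15
```

The quoted ±2” uncertainty in longitude amounts to only about ±0.13 seconds of time, so it does not change the rounded result.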
-
Notes on some Points connected with the Progress of Astronomy during the past Year, Monthly Notices of the Royal Astronomical Society, Volume 44, Issue 4, 8 February 1884, p. 208. ↩
-
The switch to the new standard time zones happened throughout the US at noon local standard time. In New York, the switch therefore occurred at noon Eastern Standard Time, which, due to the slight time difference, worked out to be 12:03:58 local time. Or at least that’s the story we tell (to computers, via tzdb) today. ↩
-
Pytz’s tzinfo objects are initialized with the very first GMT offset of the zone here. Then, in its implementation of the utcoffset method, which is called when the datetime is printed, it simply uses that same offset rather than indexing into the stored data with the datetime. ↩
-
Technically, it picks out the Eastern Standard Time offset of -5h, but along with that offset comes the
US
daylight saving time rule. That rule then says that the datetime falls within a DST period and so 1h should be added to the offset, resulting in -4h. ↩