Daylight Savings Time Misconfigurations between Computers Can Lead to Cumulative Data Errors

Symptom: my web app starting losing a day whenever I edited some data records. This app was working fine for a while, then, in March, it started losing time. Because my time was quantized to days, it looked like I was losing a day every time I saved the record.

The environment was a shared web hosting account, with PHP for the app, and a MySQL database behind it, on a separate machine.

According to a new law, starting in 2007, DST starts three weeks earlier than it has in the past.

I suspected that the problem is that one machine has the old style DST timezones, while the other machine has the new DST settings. So I created a test script to simulate the effect of multiple edits on a record.

Here are some results. The test is simple. I start with a date, like '2008-03-10', and convert it to a unix timestamp. Then, convert it back to a date, and insert it into the database. Then, I read the value back from the database, using mysql's unix_timestamp() function to perform the conversion. Then, calculate a difference between the first timestamp, and the second.

Then, I take that new timestamp and run the test on it, again. This is like a double-check, and also will have a cascading effect if the problem exists.

I started runs of this test on a few different dates.

One thing that cropped up, but wasn't related to this problem, was the fact that the database machine's time was set incorrectly. This led to biases in my thinking.

I (mistakenly) thought the test showed that the PHP date() is okay, but if you inserted using the plaintext date, it's converted internally to a timestamp, and then when you request that timestamp, it's off by one hour. This error only crops up in mid-March.

The output reads left to right. Each row shows the data being transformed, put into the database, extracted from the database, and then a difference between timestamps is calculated. You can see the cascading effect, too. (That's there because I wanted to simulate multiple changes to the data.)

2008-02-19
2008-02-19 1203400800 2008-02-19 insert into tt (dt) values ('2008-02-19') ** 1203400800 diff 0
2008-02-19 1203400800 2008-02-19 insert into tt (dt) values ('2008-02-19') ** 1203400800 diff 0
2008-02-19 1203400800 2008-02-19 insert into tt (dt) values ('2008-02-19') ** 1203400800 diff 0

2008-03-17
2008-03-17 1205733600 2008-03-17 insert into tt (dt) values ('2008-03-17') ** 1205730000 diff 3600
2008-03-16 1205647200 2008-03-16 insert into tt (dt) values ('2008-03-16') ** 1205643600 diff 3600
2008-03-15 1205560800 2008-03-15 insert into tt (dt) values ('2008-03-15') ** 1205557200 diff 3600
2008-03-14 1205474400 2008-03-14 insert into tt (dt) values ('2008-03-14') ** 1205470800 diff 3600
2008-03-13 1205388000 2008-03-13 insert into tt (dt) values ('2008-03-13') ** 1205384400 diff 3600
2008-03-12 1205301600 2008-03-12 insert into tt (dt) values ('2008-03-12') ** 1205298000 diff 3600
2008-03-11 1205215200 2008-03-11 insert into tt (dt) values ('2008-03-11') ** 1205211600 diff 3600
2008-03-10 1205128800 2008-03-10 insert into tt (dt) values ('2008-03-10') ** 1205125200 diff 3600
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0
2008-03-09 1205042400 2008-03-09 insert into tt (dt) values ('2008-03-09') ** 1205042400 diff 0

2008-04-19
2008-04-19 1208581200 2008-04-19 insert into tt (dt) values ('2008-04-19') ** 1208581200 diff 0
2008-04-19 1208581200 2008-04-19 insert into tt (dt) values ('2008-04-19') ** 1208581200 diff 0
2008-04-19 1208581200 2008-04-19 insert into tt (dt) values ('2008-04-19') ** 1208581200 diff 0

** select unix_timestamp(dt) as u from tt order by id desc limit 0,1

As noted above, I was biased by the misconfigured time on the database server. It led me to look for problems on the database side.

As it turns out, time conversion on the web server, on the left side, isn't correct. The timezone should change during DST. The time on the right side seems to adjust. If you count the length of days, by comparing the changing values in March, you see that there's a short day. Then the differences end. There's no short day in the left side.

The clock settings on the servers do not matter. What matters are the timezone configuration files. What I learned today is, on Linux and Unix, these files are called the zoneinfo database, or tz database. Each file in the database contains information about the timezone and DST for the timezone.

If the databases between machines don't match, you can be bitten by this gotcha.

The workaround turns out to be pretty simple.

Don't do this: UPDATE table SET d='2008-03-15' ...

Do this instead: UPDATE table set d=FROM_UNIXTIME(1205553600) ...

Use the timestamp in your front end, but, when you do conversions, keep all the conversions on the MySQL side.

Alternatively, you could just as easily do all the conversions on the PHP side. Either way, you have to do all your conversions on the same machine, so that you're using only one zoneinfo database to convert from Unix time to human-readable time, and vice versa.