When does 1 second != 9,192,631,770 vibrations of a Cesium 133 atom?
Google is implementing a system that effectively redefines how we measure a second. This is an effort by them to simplify some of their internal operations by allowing them to navigate around an obscure family of bugs in time calculation code. The cost is that they will be deliberately polluting global time reporting systems with inaccurate data. Given their size and footprint on the Internet, this impacts nearly everyone.
In a single blog post, they have made a bigger change to time calculation than any government in recent memory.
See their blog post announcing the smearing.
Apologies in advance, this is an obscure topic. I’ve done my best to explain what the problem is and to highlight the absurdity of their solution. I use hyperbole to illustrate the effect of their change.
Lastly, I understand that they are not proposing changing the SI definition of a second. However, changing the time on an authoritative Stratum One time server has an indistinguishable effect. Most people don’t have a Cesium clock at home and instead, rely on providers such as NIST or Google for their time.
When does a yard not equal a yard?
Let’s do a thought experiment. Pretend you’re looking at an American football field and someone asks you “how long is it?” You would probably answer something like 120 Yards. That would be a rational answer because that’s one of the key definitions of the playing field. A Yard is a known and accepted unit of length and there is zero ambiguity in its meaning.
Now we’re going to add a simple twist. Let’s pretend that a yard changes from 36 inches to 37. Again, if you ask someone how long a football field is, they will still say a hundred and twenty yards but the total number of inches will be different.
Now let’s add a more complex twist. Let’s say that a yard is normally defined as 36 inches but on December 31, it is defined as 37 inches. Then what happens? If a football field is built to be 4,320 inches long (120 x 36 inches) and you ask someone on December 1, they will say it is 120 yards. But what if you ask them on December 31? The field didn’t grow in length so it is less than 4,440 inches (120 x 37 inches) but the field is still defined by the NFL rules as 120 yards. So which is it?
This sounds like a stupid problem. Clearly, a yard has to be one unchanging length, right?
Now let’s consider that in the context of a second. A unit of length is required to compute speed (distance divided by time), velocity (speed & direction), acceleration (distance divided by time squared), volume (distance cubed) and a large percentage of all of the units of measure you use. You can see that changing a fundamental unit could start to be messy.
The fact is that we don’t have this problem because we don’t keep redefining a yard (or a second) and we certainly don’t define a yard based on a calendar date. That would just be ludicrous. Right?
But Google has proposed doing exactly this. Instead of altering a unit of length, they want to alter a unit of time. Once in a while. Depending on the date and some astronomical observations. Yes, this is as crazy as it sounds. Let me explain.
Their proposal is called “smeared time.” It really isn’t fair to call it a proposal because they are implementing it on all their technology and, given their internet footprint, it will effectively impact every person who is in any way influenced by technology. Unless you live in the woods, hunt all your own food, and don’t interact with civilization, Google’s plan will impact you.
Google’s plan is, on certain days of the year, to intentionally adulterate their definition of a second so that it doesn’t match what someone would measure using the SI unit definition. In other words, if you looked at a clock at Google, it would be running slow compared to a clock operated by the US Government.
Before I dive into the implications of their plan, we need to establish some context.
The second is a fundamental SI unit. That means that it is precisely defined, completely unambiguous, and cannot be decomposed into more fundamental units. For comparison, Speed is not a fundamental unit – it is Distance divided by Time.
Something cool happened in modern history. Humanity developed a sufficient understanding of quantum phenomena that we could use it to set the definition of a base SI unit. Formally, a second is defined as 9.192631770 x 10^9 cycles of radiation produced by the transition between two energy levels of a Cesium 133 atom. In other words, an atom of Cesium 133 emits radiation at 9,192,631,770 Hz. So, if we count the wave peaks, every 9 billion or so peaks another second has passed.
The first clock in humanity was the sun. The second was a pendulum clock. The weight swung once per second and every 3,600 oscillations meant another hour had passed. That was a huge step forward for humanity but it required physical stability; you couldn’t take a grandfather clock on a boat. Next came the spring mechanism. Then quartz. Then atomic phenomena. In each step, the thing that is oscillating is getting smaller. By moving to smaller and smaller references, we are eliminating the effect of physical forces on the frequency stability of the reference oscillator. The results are clocks that are increasingly stable.
Today, the global definition of time is calculated by several observatories across the world. In the United States, the National Institute of Standards and Technology in Colorado and the US Naval Observatory in Washington, D.C. are two of the largest contributors to global time. Laboratories all over the world each have dozens or hundreds of clocks using different technologies. They independently calculate their own estimate of global time and share their results with the rest of the world. The global time is simply an average of a bunch of these laboratories.
Solar vs. Computed Time
Solar time is a measure of time that is calculated using the relative position of the earth and the sun. It is essentially what humanity has been doing since we evolved a brain.
Solar time versus Computed time is an interesting distinction. Above I outlined how humankind can accurately keep count of the passage of time. There is an equally impressive feat of astronomy at play too. It turns out that the earth’s rotation is slowing down and we can measure this effect. And it isn’t even slowing down uniformly.
We can keep precise measurement of time. We can keep precise track of the rotation of the earth. It turns out that when you compare time passed as it is computed versus how it is observed astronomically, we can measure a difference. It is a nearly de minimis amount but it is measurable. Over time, that means that the solar time and computed time will drift out of sync.
For reasons that appear to be nothing more than historic, in 1972 humanity decided that we want to keep our computed time in sync with our solar time. This provides no practical benefit to us but it is an agreed upon convention and it is observed worldwide.
Leap seconds are added when the solar time and computed time drift by about a second. By convention, they are added to the end of either June 30th or December 31st based on UTC passage of months. They are adjusted globally regardless of the local time zone. This shim keeps computed and solar time within one second of each other.
We need to do another thought experiment for this. Let’s say that I wanted to tell my Grandma what time it is. I could look at my clock, write the time on a piece of paper, and mail it to her. If it always took exactly 5 days to get to her, she could simply take my letter, add the appropriate adjustment for propagation delay, and have an accurate measure of time.
Of course, even a casual observer of the US Post Office will notice that there can be a wide range in delivery times. Sometimes it may take 3 days to deliver a letter. Sometimes 6. With that uncertainty in the sending time, it makes snail mailing my Grandma the time impractical. What we need is either a system that has a known and reliable propagation delay (the mail will always take exactly 100 hours, 10 minutes, 15 seconds) or have a method to accommodate these variances.
There are two widely accepted ways to distribute time information: GPS and the Network Time Protocol. (There are more, but these are the two relevant ones for my purposes).
GPS is the same Global Positioning System that your cell phone uses. Without diving into how GPS works, it is sufficient to say that it requires very precise and dependable signal delivery to measure your position on earth. Satellites broadcast the time, your phone compares the stated time to the measured time, and a distance is calculated. (The math is a lot more complex than that in reality.)
Most data centers have one or more dedicated GPS receivers. Clearly, the building isn’t moving, but the receivers provide easy access to a precise source of time information. (GPS is one of the key components in calculating the current global time).
Since not every device has or needs a GPS receiver, the other key technology used is called the Network Time Protocol. It is a reliable way to distribute time across an IP network. In a data center context, a GPS receiver sets the master clock on an NTP server and all machines in the data center sync to the NTP server. It’s worth noting that this isn’t an enterprise-only technology. For most of their life, Windows and Mac OS have had NTP support and all modern computers ship with it enabled and pre-configured.
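The Grandma problem above is what NTP handles by timestamping a round trip instead of a one-way letter. Here is a minimal sketch of the standard offset and delay arithmetic. The function name and the sample timestamps are made up for illustration, and the offset is only exact if the trip takes the same time in each direction.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from four timestamps (seconds).

    t0: client clock when the request left
    t1: server clock when the request arrived
    t2: server clock when the reply left
    t3: client clock when the reply arrived
    """
    # How far the client clock is behind (+) or ahead (-) of the server,
    # assuming the outbound and return trips take equally long.
    offset = ((t1 - t0) + (t2 - t3)) / 2
    # Time spent on the network, excluding the server's processing time.
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Hypothetical exchange: the client is 0.25 s behind the server, each
# network leg takes 0.04 s, and the server spends 0.01 s on the request.
offset, delay = ntp_offset_and_delay(100.00, 100.29, 100.30, 100.09)
```

With these invented numbers the client learns it is about a quarter second behind, even though the letter took a while to arrive in both directions.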
NTP, Google, and Smeared Time
RFC 7164 is a proposal that appears to be the basis for Google Smeared Time. Except that “smeared time” is clearly a better title. I’ll give that to Google.
As I said above, Google will make each second slightly longer for 10 hours before and 10 hours after a leap second. Effectively this will eliminate the need for 23:59:60 UTC (normally it goes 58..59..00). By stretching the seconds before and after a leap second, they can pretend it never happened.
They have done this before. This isn’t the first time this has come up but their developer page only highlights the problem. They have used different methods in the past. Different companies handle it differently. This page appears to be an attempt to placate people but only underscores the absurdity of the situation.
Why is Google doing this?
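To put numbers on the stretch, here is a rough sketch of a linear smear that spreads one extra second evenly over the 20-hour window. The names are mine, not Google’s, and I am assuming the window is 72,000 smeared seconds long, so each smeared second lasts 1 + 1/72,000 SI seconds.

```python
SMEAR_WINDOW_S = 20 * 60 * 60  # 10 hours before + 10 hours after = 72,000 s
CESIUM_HZ = 9_192_631_770      # cycles per SI second


def smear_applied(elapsed_s):
    """Fraction of the leap second absorbed, elapsed_s seconds into the window."""
    clamped = min(max(elapsed_s, 0), SMEAR_WINDOW_S)
    return clamped / SMEAR_WINDOW_S


# How much slower the smeared clock runs, in parts per million.
slowdown_ppm = 1e6 / SMEAR_WINDOW_S

# Cesium cycles that fit in one stretched second (assumption stated above).
cycles_per_smeared_second = CESIUM_HZ * (1 + 1 / SMEAR_WINDOW_S)
```

Under these assumptions the clock runs roughly 13.9 parts per million slow, and a smeared second is about 9,192,759,445 vibrations instead of 9,192,631,770. That is the answer to the question in the title.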
Leap seconds are tricky. They don’t happen regularly. Some leap seconds are 6 months apart and some are 7 years apart. Their insertion is determined by a global body of scientists and astronomers. But even more basic, most software doesn’t account for a 61-second minute. (Think back to the 37-inch yard). When an NTP server gives a time that ends in :60 – some software, notably the Java VM, freaks out and crashes.
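The 61-second minute is easy to demonstrate. As one illustration (in Python rather than the Java VM), the standard datetime type cannot represent a timestamp ending in :60 at all. The helper name is hypothetical, and the date is just an example of a day that ended with a leap second.

```python
from datetime import datetime


def can_represent(year, month, day, hour, minute, second):
    """Return True if datetime accepts these fields, False if it rejects them."""
    try:
        datetime(year, month, day, hour, minute, second)
        return True
    except ValueError:
        # datetime only allows seconds in the range 0..59
        return False


ordinary = can_represent(2016, 12, 31, 23, 59, 59)     # normal last second of the day
leap_second = can_represent(2016, 12, 31, 23, 59, 60)  # the inserted leap second
```

Here ordinary is True and leap_second is False. Any code that feeds a :60 timestamp into a type like this has to decide what to do with the error, and a lot of code never made that decision.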
Computers are faced with something that is difficult to calculate and whose occurrence is almost arbitrary.
Google’s stance, in my words, is that they would rather have a fluid definition of a second and screw up all of humanity than to try and solve bugs in software they can’t control. Maybe that is harsh but it is directionally correct. Instead of coming up with a globally accepted solution or fixing the underlying bugs, they are redefining time.
For them, it is easier to pretend that leap seconds don’t happen than to account for them.
The honest answer is probably no one other than me and a few people on the time-nuts forums.
The fundamental problem is that Google is massive. They have a huge infrastructure that other businesses use for their operations. They have widely used NTP servers that are open to the public. By implementing Smeared Time, they are effectively redefining a second. People count on them for reliable time measurement and they are intentionally adulterating their results simply to make their life easier.
So what is impacted? Directly it is:
- Every Google NTP server
- Every Google API
- Every virtual machine on their Compute Engine (by default)
Indirectly, this touches nearly every facet of the internet. Many companies use Google’s Compute Engine directly. Every client of a Compute Engine customer is touched. Google APIs are used for authentication. Google’s servers are used for direct time calculation via NTP.
Think of it like this – if Google disappeared overnight – who would be in trouble?
Because of Google’s size, their decision will impact a huge swath of the internet. In a single blog post, they made a bigger change to the world computation of time than NIST could. Because of the esoteric nature of this topic, they wield more power than any individual government in this space. Their decisions aren’t informed by scientists. They are simply informed by what is easiest for them and damn the consequences for everyone else.
“Don’t Cross The Streams. It Would Be Bad…”
Or, as Google phrases it, “don’t mix smearing and non-smearing time servers.” The truth is that this stems from Google’s unilateral decision to compute time differently. If everyone used the same method, we would be in a better spot. All time calculations would agree and though we would have a variable definition of a second (in a practical sense, not SI), we wouldn’t have any calculation conflicts.
But if you have a good NTP server and a Smeared NTP server – for 20 hours every leap second, they won’t agree on the time.
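Here is a rough sketch of how large that disagreement gets. I am assuming a linear smear over a 72,000-second window centered on the leap second, and an ordinary server that handles the leap second by counting the last second of the day twice. Both are simplifications, and the names are my own.

```python
WINDOW_S = 20 * 60 * 60    # length of the smear window in seconds
LEAP_AT_S = WINDOW_S // 2  # the leap second falls at the midpoint


def smeared_minus_utc(elapsed_s):
    """Approximate (smeared clock - ordinary clock) in seconds."""
    if elapsed_s <= 0 or elapsed_s >= WINDOW_S:
        return 0.0  # outside the window the two servers agree
    lag = elapsed_s / WINDOW_S  # how much the smeared clock has slowed so far
    if elapsed_s < LEAP_AT_S:
        return -lag  # before the leap second the smeared clock falls behind
    return 1.0 - lag  # the ordinary clock just stepped back a full second


# Worst disagreement at any whole second inside the window.
worst = max(abs(smeared_minus_utc(s)) for s in range(WINDOW_S + 1))
```

In this model the two servers are half a second apart at the worst moment and take ten hours on either side to get there and back.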
I really hope you don’t have anything super time sensitive. I doubt high-frequency traders use public NTP servers but the point still stands. Having one of the largest time providers in the world unilaterally decide to intentionally misreport time is astounding.
This is different
I get how Uber wants to push the laws on who is a contractor versus who is an employee. I get how Amazon wants to push tax law to figure out exactly what is allowed. These are companies pushing the boundaries of arbitrary, local, and manmade laws.
Redefining time is different. That is akin to saying “Pi = 3”. To be clear – It is false that Indiana tried and failed to define Pi in biblical units. But the myth resonates with society because of its absurdity and almost believability. If Pi was set to 3, it would be intuitively obvious that people in Indiana would draw funny looking circles and their research would disagree with everyone else in the world.
But what Google is doing isn’t different than saying Pi=3. They are trying to change the definition of a fundamental unit in nature. A second is an arbitrary length of time. But it is one that humanity agrees on, it is constant, and it is the basis for about half of all compound units. Changing the definition of a second is the same as changing the definition of Pi or a yard. Making it change based on the calendar date is beyond comprehension.
I’ve made my point that I don’t like what Google is doing. But I do have two alternatives.
Software evolves in the Darwinian sense. If the JVM cannot reliably handle leap seconds and catastrophically fails, people notice and Java gets a bad reputation. Over time, either the code evolves to adapt to leap seconds or people select different technologies.
My first alternative is – “live with it.” I hope companies fix their code to handle leap seconds but no one can compel them to. If your mission critical infrastructure crashes because of leap seconds – then either the maker should fix the code or you should migrate away. There are strong financial incentives to have reliable and stable computers. Leap Second bugs may be edge cases but they happen and either we fix them or we live with them.
Think back to the solar time versus computed time distinction. If our days shift by a second, or even a whole minute, what’s the harm? Crops will still get the same amount of sunlight. Nothing will really change. That’s because humanity already uses computed time exclusively.
So why don’t we introduce a “Leap Minute” and just be done with it for 200~400 years. If we aim to keep our solar and computed times within one minute of each other, then we should adjust much less frequently. This doesn’t strictly solve the problem, but it drastically mitigates it.
The idea of replacing a leap second with a leap minute isn’t new. It has been proposed in the past and is considered from time to time.
This is the logical extreme of the above alternative. If there really is no value to keeping solar and computed time in sync – why should we keep inserting adjustments? Why not let them drift and never adjust?
In thousands of years the sun may no longer be at its Zenith at noon but who cares? People in thousands of years will know that as their normal. The sun will still rise and set like it always has and we will still get the same amount of light every day (aside from changes in the earth’s rotational speed).