Network Technology

June 30th Leap Second Could Trigger Unexpected Issues

dkatana writes: On January 31, 2013, approximately 400 milliseconds before the official release of the EIA Natural Gas Report, trading activity exploded in Natural Gas Futures. It is believed this was the result of some fast computer trading systems being programmed to act early, with as much as one second of advance access to the report. On June 30th a leap second will be added to the Network Time Protocol (NTP) to keep it synchronized with the slowly lengthening solar day. In this article, Charles Babcock gives a detailed account of the issues, and some disturbing possibilities: The last time a second needed to be added to the day was on June 30, 2012. For Qantas Airlines in Australia, it was a memorable event. Its systems, including flight reservations, went down for two hours as internal system clocks fell out of sync with external clocks.

The original author of the NTP protocol, Prof. David Mills at the University of Delaware, set a direct and simple way to add the second: Count the last second of June 30 twice, using a special notation on the second count for the record. Google will use a different approach: Over a 20-hour period on June 30, Google will add a couple of milliseconds to each of its NTP servers' updates. By the end of the day, a full second has been added. As the NTP protocol and Google timekeepers enter the first second of July, their methods may differ, but they both agree on the time.
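As a rough illustration of the smear idea described above, the sketch below spreads one extra second evenly over a 20-hour window. The window length comes from the summary; the linear shape, the name `smear_offset`, and the constants are assumptions made for illustration, not Google's actual implementation.

```python
SMEAR_WINDOW_S = 20 * 3600   # 20-hour window, as stated in the summary
LEAP_S = 1.0                 # total amount of time to absorb


def smear_offset(elapsed_s: float) -> float:
    """Seconds of the leap second already absorbed after elapsed_s seconds
    of the smear window (assumed linear, clamped to the window)."""
    fraction = min(max(elapsed_s / SMEAR_WINDOW_S, 0.0), 1.0)
    return LEAP_S * fraction


if __name__ == "__main__":
    for hours in (0, 5, 10, 20):
        print(hours, "h ->", smear_offset(hours * 3600), "s absorbed")
```

Under this linear assumption, each real second is stretched by roughly 14 microseconds (1 s / 72,000 s), so no client ever sees a repeated or 61st second.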

But that could also be problematic. In adding a second to its NTP servers in 2005, Google ran into timekeeping problems on some of its widely distributed systems. The Mills sleight-of-hand was confusing to some of its clusters, as they fell out of synch with NTP time. Does Google's smear approach make more sense to you, or does Mills's idea of counting the last second twice work better? Do you have a better idea of how to handle this?
This discussion has been archived. No new comments can be posted.

  • Doesn't matter (Score:5, Informative)

    by StormShaman ( 603879 ) on Friday June 19, 2015 @12:21PM (#49946661)

    The only problem mentioned is that they fall out of sync with each other. If they're both otherwise fine, just pick one. Sounds like the disadvantages of either one aren't as big as the disadvantage of them not working well together.

  • Google is right (Score:5, Interesting)

    by Rob MacDonald ( 3394145 ) on Friday June 19, 2015 @12:25PM (#49946699)
    Typically when dealing with NTP you do not want big swings. In fact, a system using NTP that's too far out of sync won't sync back up correctly. One that is slightly out of sync will slowly come back in sync over a period of time, hours or days even. Both approaches could work, they really could, but I think adding a few milliseconds here and there is a better way to get this done, as long as the systems don't fall too far behind. I work with Avaya voice equipment and we've been warning people about this for months and months. We've provided instructions on several methods to ensure this doesn't cripple your system, but it all depends on how your NTP is set up. I also foresee issues with just adding an extra second to the day: this is not going to work for a bunch of systems and will actually throw them out of sync compared to Google's approach. One of the solutions we've "provided" is to disable NTP shortly before the time rollover, then enable it once it's July. That's a pain in the butt, but if you can afford the few minutes of service interruption, it solves all of the issues right there: you turn it off when it's synced, turn it back on, and it syncs to the new time. The real issues come in, for my field at least, with logging. This is going to throw a wrench into syslogs if it's not taken care of, and with some of the platforms it will literally cripple the system.
    • Their method has a name in NTP parlance, it is called slew.

      See man page ntpd(8).

    • "Typically when dealing with NTP you do not want big swings."

      A second is not a "big swing" in general computation parlance. People working on near-RT systems already know, or should know, how to cope with leap seconds.

      "In fact, a system using NTP that's too far out of sync, won't sync back up correctly."

      Five to fifteen seconds at least. We are talking a different league, almost a different sport here.

      "I work with Avaya voice equipment and we've been warning people about this for months and months. We've p

      • by 0123456 ( 636235 )

        People working on near-RT systems already know, or should know, how to cope with leap seconds.

        Hey, guess what, those guys they outsourced the software development to in the third world... don't. And they know they'll have moved on to a new job by the time the next leap second happens.

        The rest of us have to deal with their hardware not doing what it's supposed to when the leap second hits.

      • NTP would typically slew a 1-second difference, so Google is not out-of-line to add the second at the beginning of the day and slew their systems over the course of the day. Google uses lots of vector clocks in their distributed systems, they may have calculated that slewing over the course of the day introduces fewer time differences between machines than counting the final second twice (due to drift, which is inevitable on any NTP slave, corrected by "frequency discipline" and error estimates).

    • Typically when dealing with NTP you do not want big swings.

      This is a solved problem, though (sibling points out the reason why: slew). In practice, this is also a known condition, especially with virtual machines (doubly so with VMware-hosted VMs [virtualizationadmin.com]). This is because VMs time-slice the physical CPU, so keeping time on the VM's OS clock is very imperfect anyway.

    • The problem is systems which are poorly designed, and cannot properly handle leap seconds. That includes every POSIX system. Handling a leap second is fundamentally no different than handling a leap year. You have a minute with 61 seconds instead of 60, just like you have a month with 29 days instead of 28. But despite leap seconds existing since long before POSIX, the definers provided no means of enumerating a 61 second minute.

      Counting the same second twice or changing the length of a second, both are do
  • by at10u8 ( 179705 ) on Friday June 19, 2015 @12:27PM (#49946717)
    A problem for sysadmins is that the status quo of the standards requires that we choose which standard we want to violate. We can violate the specification of UTC by not counting 23:59:60 or we can violate POSIX by counting it or we can violate POSIX and the SI second by not actually keeping the system clock on UTC using smeared seconds that are not suitable for tracking projectiles and other real-time applications. This problem is old, 50 years old, as seen in the 3 plots on this web page [ucolick.org].
  • by Anonymous Coward on Friday June 19, 2015 @12:29PM (#49946737)

    I understand the desire to change things, but putting some social media Share link in place of the Read More link goes against the kind of website Slashdot is.

    Please restore the original layout. Thanks.

    • by enigma32 ( 128601 ) on Friday June 19, 2015 @12:37PM (#49946831)

      +1 - Mod parent up.

      • by Art3x ( 973401 ) on Friday June 19, 2015 @12:57PM (#49947019)

        I understand the desire to change things, but putting some social media Share link in place of the Read More link goes against the kind of website Slashdot is.

        Please restore the original layout. Thanks.

        +1 - Mod parent up.

        +2. In a Slashdot comment, we must add links and formatting by typing HTML by hand. You would therefore think we know how to copy and paste a web address from Slashdot to Facebook, if that's what we really want to do. We don't need an icon to do it for us.

        If you're going to add icons, switch the places for Share and Comments. Put the Share link to the right of the heading. Put the Comments link at the bottom. To me it seems more logical that way, it puts the Comments link back where it was.

        • by war4peace ( 1628283 ) on Friday June 19, 2015 @01:22PM (#49947307)

          The way they changed the design is clickbait of sorts.
          People trained their muscle memory to click that area to load more of the story or comments. Now they click and yell in frustration.
          That's a really shitty way of luring people. Shame on you, Dice!

          • by weilawei ( 897823 ) on Friday June 19, 2015 @03:49PM (#49948619)

            I'm willing to accept that layouts change and I'll need to look in a new place--but the new location is actually terrible usability. Here's why:

            First, I read the headline. Then, I read the summary. I'm moving down the page, and I'm scrolling the page, too. So, now I'm at the end of the summary, and the headline for any story with a long summary is now out of the window. Now, I need to scroll back up to see how many comments or to click to view those comments. Extra work, even if the summary isn't long.

            Fitts' Law [wikipedia.org] applies here. They've made the target smaller in diameter, and effectively placed it further away. That means clicking to view comments is noticeably harder.

    • by GoodNewsJimDotCom ( 2244874 ) on Friday June 19, 2015 @01:20PM (#49947293)
      I thought Slashdot was dead. I thought they killed the comments until someone told me where to look.
    • I understand the desire to change things, but putting some social media Share link in place of the Read More link goes against the kind of website Slashdot is.

      Not only that, but even though they've added a new numeric post count inside of a little speech bubble... if you click on that, you don't get taken to the comments! You still get taken to the top of the page, and have to scroll down to get to the comments.

      I realize Taco and the others are long gone, but doesn't anyone on the Slashdot staff even bother to look at the pages after a design change has been made?

    • Ditto. This is an awful change.

    • by caseih ( 160668 )

      Seconded! The share button is something I will never use and the lack of the read more link makes the web page a lot harder to use. Hope you'll do the right thing and stop screwing with things for change's sake. Stop trying to bring the beta site back!

  • Sync (Score:4, Informative)

    by Espectr0 ( 577637 ) on Friday June 19, 2015 @12:30PM (#49946749) Journal

    We have 600 machines in my company's network distributed over 20 cities in our country. The servers are all located on our main branch and are connected through slow WAN frame relay links (up to 4Mbps)

    We have time differences between machines, sometimes up to 3 or 4 minutes, and we don't seem to have issues. I find it strange that a possible 1 second difference could cause so many issues.

    Perhaps the Google method is better because the adjustment will take place during the day and not at the last second.

    • Re:Sync (Score:5, Informative)

      by 0123456 ( 636235 ) on Friday June 19, 2015 @12:37PM (#49946827)

      I find it strange that a possible 1 second difference could cause so many issues.

      It's not the time difference that causes problems per se, it's time going backwards. You presumably missed the fact that many Java servers crashed over the last leap second because of a kernel bug that screwed up their internal timers?

      We had problems last time due to faults reported by external hardware when it saw the time jump backwards. I'll be at my desk when it happens this time to deal with any problems that come up.

      And, given the chaos every leap second causes, hopefully we can finally convince the 'experts' to stop fiddling with time.

    • This.

      With the exception of highly precise equipment, if your systems crash & burn because of 1 second differences, you're doing something wrong.

      • by ceoyoyo ( 59147 )

        If you've got equipment that needs such high precision and you're synchronizing it with NTP rather than an internal standard, you're doing something wrong.

        • Agreed. Slew is acceptable in using NTP. Slew is often more than 1 second.

          • by 0123456 ( 636235 )

            Agreed. Slew is acceptable in using NTP. Slew is often more than 1 second.

            If I remember correctly, NTP has a leap-second flag which indicates that the time should jump by a second at midnight. It doesn't slew, at least on Linux... otherwise different machines would be reporting different times until they'd all slewed back to what they should be.

            • by ceoyoyo ( 59147 )

              Slew can be used in NTP for any clock adjustment, not just leap seconds. Linux does use slew (as opposed to step) to make clock adjustments. In the special case of leap seconds, it uses step, rather than slew.

              • by TheCarp ( 96830 )

                I believe the proper phrase would be CAN use slew. It's actually a command line option on startup of ntp.

                I only know this because a particular piece of software I have had to install requires it and will refuse to install if it's not set.

                • Re:Sync (Score:5, Informative)

                  by ceoyoyo ( 59147 ) on Friday June 19, 2015 @02:36PM (#49948049)

                  I'm not sure exactly what arguments each Linux distribution uses, but this is from the man page on ntpd:

                  -x
                  Normally, the time is slewed if the offset is less than the step threshold, which is 128 ms by default, and stepped if above the threshold. This option forces the time to be slewed in all cases. If the step threshold is set to zero, all offsets are stepped, regardless of value and regardless of the -x option. In general, this is not a good idea, as it bypasses the clock state machine which is designed to cope with large time and frequency errors. Note: Since the slew rate is limited to 0.5 ms/s, each second of adjustment requires an amortization interval of 2000 s. Thus, an adjustment of many seconds can take hours or days to amortize. This option can be used with the -q option.

                  My reading of that is that the normal adjustment uses slew. Step is used only when there's a big discrepancy, and you can use -x to use slew even in that case.
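The arithmetic in the quoted man page text can be checked with a small sketch. The 128 ms threshold and the 0.5 ms/s slew limit are taken from the quote; the function names are made up for illustration, and the special case of a zero step threshold is not modeled.

```python
STEP_THRESHOLD_S = 0.128   # default step threshold from the quoted text (128 ms)
MAX_SLEW_PPM = 500         # 0.5 ms of correction per second of real time


def amortization_seconds(offset_s: float) -> float:
    """Real time needed to slew away offset_s at the maximum slew rate."""
    return abs(offset_s) * 1_000_000 / MAX_SLEW_PPM


def default_action(offset_s: float, force_slew: bool = False) -> str:
    """Return 'slew' or 'step' following the rule in the quoted text.
    force_slew stands in for the -x option."""
    if force_slew or abs(offset_s) < STEP_THRESHOLD_S:
        return "slew"
    return "step"


if __name__ == "__main__":
    print(amortization_seconds(1.0))            # time to slew a full second
    print(default_action(1.0))                  # one second exceeds 128 ms
    print(default_action(1.0, force_slew=True))
```

A full leap second therefore takes 2000 s (a little over 33 minutes) to slew away at the maximum rate, which is why it would be stepped by default unless slewing is forced.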

    • Comment removed based on user account deletion
      • Similarly, a five-minute offset will prohibit logins and group policy updates. If your Windows Time Service configuration is pushed via group policy, you're kinda screwed, and you'll need to have a local admin on site to nudge the clock in the right direction.

          We experience this issue when the motherboard battery dies and resets the computer's date to the year 2000 or such. Since most users aren't admins, the machines can't receive the correct time on their accounts, so we log on with our admin accounts and the time corrects itself.

          But for 3-4 minutes we don't have issues.

        • Comment removed based on user account deletion
        • by TheCarp ( 96830 )

          5 minutes? That is nothing. We had a bug in one of our builds where we forgot to set the hardware clock. Turns out our blade vendor was sending us systems with the clock set years in the past, so after build, the system would boot with hardware time, refuse to sync the clock due to the enormous skew, and then refuse logins because..... our LDAP SSL certificates were not YET valid!

  • by ErikTheRed ( 162431 ) on Friday June 19, 2015 @12:43PM (#49946875) Homepage

    even if it means re-defining the second or decoupling official time measurements from planetary movement. Leap days, leap seconds, etc., are silly hacks that belong in a bygone era.

    • by ceoyoyo ( 59147 )

      We have at least three of them. GPS, LORAN and TAI standards do not include leap seconds. They drift ahead of UTC, but they're designed for applications that need good synchronization without having to worry about things like leap seconds. UTC is designed so that the sun will always be up during the day and down at night.

      If synchronization is what you want, use one of the standards designed for it.

      • UTC is designed so that the sun will always be up during the day and down at night.

        There have been 25 leap seconds since 1972. At that rate, it will take around 6000 years for UTC to be even an hour different from TAI. Leap seconds don't have any appreciable impact on the sun always being up during the day.
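The estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the historical average rate stays constant, and the names are illustrative.

```python
LEAP_SECONDS = 25        # count quoted in the comment above
YEARS = 2015 - 1972      # 43 years of leap-second history


def years_until_offset(target_s: float) -> float:
    """Years needed to accumulate target_s seconds at the average rate."""
    rate_s_per_year = LEAP_SECONDS / YEARS
    return target_s / rate_s_per_year


if __name__ == "__main__":
    print(round(years_until_offset(3600)))   # years to drift by one hour
```

At about 0.58 leap seconds per year, an hour of drift takes roughly 6,200 years, consistent with the "around 6000 years" figure.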

        I don't think anyone really cares about whether we use UTC or some other system, though. The problem is that software/hardware vendors have all been using the wrong time standard -- nobody but astronomers actually has a reason to want UTC. We just need to get developers to use a dif

        • by ceoyoyo ( 59147 )

          *always* be up during the day.

          If you think a leap second is a pain, you should try switching to a new calendar. Some people can actually think past the next quarterly report.

          Personally, I think leap seconds are a great idea because they expose shoddily made software and hardware. If you (think you) need sub-second synchronization and just tossed in an NTP server instead of implementing a proper synchronization mechanism, you likely cut corners somewhere else too.

          • If you think a leap second is a pain, you should try switching to a new calendar. Some people can actually think past the next quarterly report.

            There's a difference between not looking past the next quarterly report, and not worrying about a completely unrealistic scenario -- in this case, that my software will still be running 50,000 years from now when there actually is a disagreement in date between UTC and another standard.

            Personally, I think leap seconds are a great idea because they expose shoddily made software and hardware.

            Introducing pointless complexity to try to "catch" poor software or hardware is a bad idea. Engineering is a hard enough job without purposefully making it harder.

    • by Bengie ( 1121981 )
      Exactly! Once we start colonizing other planets, time will never be in sync with the Sun for all places. Let's just agree on a rate and an epoch. The rate at which time de-syncs with the visible day is so slow, it'll spread over generations and at some point in the future, 12am local time will be "noon", but who cares. That's way off in the future and it will be normal for those people.
  • Oh, excellent. Not only are the trains now running on time, they’re running on metric time. Remember this moment, people, eighty past two on April 47th, it’s the dawn of an enlightened Springfield.
  • Ignore it. How much does it impact humanity if the clock noon drifts a tiny bit from solar noon? We're looking at an impact of shifting noon by about a minute over the course of an average human's lifespan. The impact of ignoring it means that people who rely on sundials are left to solve the sync problem on their own, and that's a whole lot less of an impact than NTP.

    Other systems that synchronize with natural phenomena, such as automated irrigation systems or automated lighting systems, can be adjusted

    • by heypete ( 60671 )

      I recently took a private tour of the time and frequency lab at METAS (the Swiss Federal Institute of Metrology) and got to observe their atomic clocks, ask the people there some questions, etc.

      The scientist in charge of the lab wishes everyone would use TAI for time distribution. TAI has no leap seconds and differs from GPS time by a constant 19 seconds. If TAI was used, computers would never have to worry about leap seconds internally and things would be greatly simplified.

      Computers don't care what time i

      • by mbone ( 558574 )

        I recently took a private tour of the time and frequency lab at METAS (the Swiss Federal Institute of Metrology) and got to observe their atomic clocks, ask the people there some questions, etc.

        The scientist in charge of the lab wishes everyone would use TAI for time distribution. TAI has no leap seconds and differs from GPS time by a constant 19 seconds.

        Yes, because the Air Force people setting up GPS time didn't understand why there was a fundamental difference between UTC and TAI (GPS - UTC was zero when the time scale was established).

    • by Layzej ( 1976930 )

      We're looking at an impact of shifting noon by about a minute over the course of an average human's lifespan.

      Maybe not even that since leap seconds can be both inserted and removed as required depending on climatic and geological events. It could very well be a wash.

    • by mbone ( 558574 )

      Ignore it. How much does it impact humanity if the clock noon drifts a tiny bit from solar noon? We're looking at an impact of shifting noon by about a minute over the course of an average human's lifespan. The impact of ignoring it means that people who rely on sundials are left to solve the sync problem on their own, and that's a whole lot less of an impact than NTP.

      Other systems that synchronize with natural phenomena, such as automated irrigation systems or automated lighting systems, can be adjusted by their owners.

      If some purist insists that we have to fix it, let's agree to fix it once per century, and let the people 100 years from now figure out if it's important enough to them to worry about.

      GPS does not ignore it in the slightest. GPS is a big user of UT1 data and predictions, and has driven a lot of work in the Earth rotation field. However, what GPS does not do is use UTC as a very approximate version of UT1.

      People doing celestial navigation at sea do typically assume that UTC is a very approximate version of UT1, and that's why there are leap seconds at all (to keep the celestial navigation error to the kilometer level). As the use of celestial navigation declines, so does the need for leap s

  • by frovingslosh ( 582462 ) on Friday June 19, 2015 @12:50PM (#49946965)
    At least it is just a second. That sudden extra hour of daylight in the spring is really bad for my rose bushes.
  • by mbone ( 558574 ) on Friday June 19, 2015 @12:50PM (#49946967)

    There is exactly one correct way to do this.

    2015-06-30T23:59:59
    2015-06-30T23:59:60
    2015-07-01T00:00:00

    David Mills's approach is not correct, but will generally work and limits the pain to 1 second.

    Anything else is just stupid. We've only been doing this since 1972. You would think people would get with the program by now.
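To see why the 23:59:60 notation is awkward in practice, here is a minimal check using Python's standard `datetime` type, which only accepts seconds in the range 0 to 59. The helper name `representable` is made up for this illustration; other languages and libraries may behave differently.

```python
from datetime import datetime


def representable(second: int) -> bool:
    """True if datetime accepts this seconds value at 2015-06-30 23:59."""
    try:
        datetime(2015, 6, 30, 23, 59, second)
        return True
    except ValueError:
        # datetime rejects any seconds value outside 0..59
        return False


if __name__ == "__main__":
    print(representable(59))
    print(representable(60))
```

Code built on a type like this has no way to record the leap second itself, so it has to fall back on repeating or smearing a second.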

    • There's another exactly one correct way to do it. Lengthen the nanosecond to be in tune with the Earth's revolution around the sun instead of counting periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the cesium 133 atom.
    • by arcade ( 16638 )

      David Mills's approach is a hack around a broken standard, namely POSIX.1. It's a good hack.

      Your solution is obvious and correct, but isn't possible to implement while being POSIX compliant.

      We all suffer from a broken standard. It's not possible to be both posix compliant and doing this correctly.

  • Anyone who's worked with time zones even a little bit knows that catastrophic failures aren't "unexpected" at all.

  • The whole problem strikes me as one of human preferences, not technical requirements. There's absolutely no reason not to use our atomic clocks and just count number of seconds since some starting point. The desire to have the sun directly overhead at "noon" is a human one, divorced from any technical requirement. All of science, computing, networking, telecommunications, would be much happier if we didn't continually redefine time like this.

    So let watch manufacturers and clock-app manufacturers deal w

    • by mcelrath ( 8027 ) on Friday June 19, 2015 @01:03PM (#49947093) Homepage
      Also this is an awesome graph, and illustrates that the Earth is a horrible clock: https://upload.wikimedia.org/w... [wikimedia.org]
      • On the other hand, if you plot periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the cesium 133 atom versus the Earth cycle around the sun, it will show that cesium 133 is a horrible clock. It's even worse, if you have two cesium 133 atoms that accelerate differently, they will go out of sync! We luckily only have one Earth, so it cannot go out of sync with itself.
    • Exactly.

      File timestamps should be in linear time (GPS, TAI, whatever).

      What gets displayed to you as a human is in your local time - timezone + planetary adjustment - so it matches the time on the wallclock. Do you as a human really care about the LSB in the file time? For those rare times when you do, you'll use linear time.

  • Blame posix for making all the goddamn pthread *_timedlock() calls take an absolute real time instead of a monotonic clock.
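The same distinction exists outside pthreads. As a sketch in Python (not the POSIX API discussed above), a timeout computed from a monotonic clock is unaffected if the wall clock is stepped or repeats a second. The helper name `wait_until` and its polling interval are assumptions for illustration.

```python
import time


def wait_until(predicate, timeout_s: float, poll_s: float = 0.01) -> bool:
    """Poll predicate() until it returns True or timeout_s elapses.
    The deadline uses time.monotonic(), which never goes backwards,
    rather than time.time(), which follows wall-clock adjustments."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(poll_s)
    return predicate()


if __name__ == "__main__":
    print(wait_until(lambda: True, 0.1))
    print(wait_until(lambda: False, 0.05))
```

Had the deadline been computed from `time.time()`, a one-second backwards step during the wait would silently extend the timeout by a second.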

    In any case, I'm not even going to bother doing anything fancy. I'll let the system suddenly be one second off and then correct itself over the next hour. I'm certainly not going to do something stupid like letting the seconds field increment to 60. Having the ntp base time even go through these corrections is already dumb enough. Base time should be some absolute measure and leap s

  • by ei4anb ( 625481 ) on Friday June 19, 2015 @01:13PM (#49947217)
    When you change a forum against the wishes of the users you risk the Digg effect. Please undo the "Share" change.
  • My concern with allowing a second to happen twice is that time-scheduled events that just happen to coincide with that second might execute twice. Depending on the circumstances, the results could vary from unnoticeable to completely bizarre and damaging.
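One possible guard against this, sketched below, is to remember the last wall-clock second a job fired for and refuse to fire again unless the clock has moved strictly forward. The class `OncePerSecondJob` is hypothetical and is not taken from any real scheduler.

```python
class OncePerSecondJob:
    """Runs a callback at most once per wall-clock second value,
    even if the clock reports the same second twice."""

    def __init__(self, callback):
        self._callback = callback
        self._last_fired = None   # last wall-clock second we fired for

    def tick(self, wall_second: int) -> bool:
        """Fire the callback unless this second was already handled."""
        if self._last_fired is not None and wall_second <= self._last_fired:
            return False          # repeated or backwards second: skip
        self._last_fired = wall_second
        self._callback(wall_second)
        return True


if __name__ == "__main__":
    fired = []
    job = OncePerSecondJob(fired.append)
    # 1435708799 is 2015-06-30T23:59:59Z, reported twice here
    for second in (1435708798, 1435708799, 1435708799, 1435708800):
        job.tick(second)
    print(fired)
```

The trade-off is that a job scheduled for the repeated second runs once rather than twice, which is usually what the scheduler's user intended.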

  • Instead of having a special case every few years, how about going ahead and making a millisecond of adjustment every day as needed? The adjustments could start with 0 or 1 milliseconds, and as the oceans slosh us ever slower, we could start making 1 or 2 millisecond adjustments every day at midnight.

    Would also keep the stars better aligned to the official time.

  • by arcade ( 16638 ) on Friday June 19, 2015 @02:55PM (#49948199) Homepage

    First off, the problem with leap seconds and unix is that unix time isn't UTC. Unix time is defined as seconds since epoch, ignoring leap seconds. Unix time is 'lossy' in that the moment a leap second occurs can't be differentiated from the second before it. More information about that here: https://en.wikipedia.org/wiki/... [wikipedia.org]
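The lossiness can be shown with Python's `calendar.timegm`, which applies the usual "86,400 seconds per day" arithmetic without range-checking the seconds field. This is a sketch of the arithmetic only, not of how any particular kernel handles the leap second.

```python
import calendar

# The leap second itself, written as 2015-06-30T23:59:60 UTC
leap = calendar.timegm((2015, 6, 30, 23, 59, 60, 0, 0, 0))

# The first second of the following day, 2015-07-01T00:00:00 UTC
next_day = calendar.timegm((2015, 7, 1, 0, 0, 0, 0, 0, 0))

if __name__ == "__main__":
    # Both UTC labels map to the same count of seconds since the epoch
    print(leap, next_day, leap == next_day)
```

Two distinct UTC instants share one timestamp under this arithmetic, so a system keeping time this way cannot tell them apart.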

    The problem is that POSIX.1 is plain stupid when it comes to leapsecond.

    The correct solution to this problem would be as follows:
    1. Fix POSIX.1 to define unix time as TAI.
    2. Implement conversion routines in gettimeofday and other relevant functions.
    3. Use a handy store for leapseconds.

    Now, number 3 here is a bit tricky. Purists would probably want this in the TZ database or somesuch. This is well and good, but has the problem that the TZ files need to be packaged and updated on all the servers. If I remember correctly (please correct me if I'm wrong) Java is shipped with its own TZ files, and might also need them updated separately. Due to this, I think the most maintainable and portable way to do this across unixes would be to simply have an /etc/leapseconds file which lists the leapseconds since epoch. It does, however, depend on unix time being defined as TAI first.
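A minimal sketch of what a lookup against such a leap-second list might look like follows. The table layout and the function `tai_minus_utc` are hypothetical, only two entries are included for brevity, and the table is keyed by POSIX timestamps for simplicity rather than by TAI as the proposal would require.

```python
import calendar

# (UTC instant from which the offset applies, TAI - UTC in seconds).
# A real list would contain every leap second since 1972.
LEAP_TABLE = [
    (calendar.timegm((2012, 7, 1, 0, 0, 0)), 35),
    (calendar.timegm((2015, 7, 1, 0, 0, 0)), 36),
]


def tai_minus_utc(unix_ts: int) -> int:
    """Return TAI - UTC in seconds for a POSIX timestamp in the table's range."""
    offset = None
    for effective_from, value in LEAP_TABLE:
        if unix_ts >= effective_from:
            offset = value
    if offset is None:
        raise ValueError("timestamp predates the table")
    return offset


if __name__ == "__main__":
    before = calendar.timegm((2015, 6, 30, 12, 0, 0))
    after = calendar.timegm((2015, 7, 1, 0, 0, 0))
    print(tai_minus_utc(before), tai_minus_utc(after))
```

The maintenance burden mentioned above is visible here: the table has to be extended every time a new leap second is announced.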

  • by denbesten ( 63853 ) on Friday June 19, 2015 @03:40PM (#49948559)

    There have been 35 leap seconds in the past 42 years. In very round numbers, we could have....

    1 leap millisecond 3 times per day,
    1 leap second every year or so,
    1 leap minute every 50 years or so,
    1 leap hour every 3000 years or so.

  • Every single time a leap second comes up in the future, we have these panic-stricken articles predicting doom and gloom for some services.

    If you haven't figured out how to deal with leap seconds that have been an issue since the '70s, I say your service DESERVES to crash and burn, and you DESERVE to spend long and stressful hours dealing with the mess.

    Leap seconds aren't a surprise to ANYONE with a functioning brain cell.
