NTP at NIST Boulder Has Lost Power

(lists.nanog.org)

100 points | by lpage 3 hours ago

11 comments

  • themafia 1 hour ago
    > Facility operators anticipated needing to shutdown the heat-exchange infrastructure providing air cooling to many parts of the building, including some internal networking closets. As a result, many of these too were preemptively shutdown with the result that our group lacks much of the monitoring and control capabilities we ordinarily have

    Having a parallel low bandwidth, low power, low waste heat network infrastructure for this suddenly seems useful.

  • Animats 2 hours ago
    NIST campus status: Due to elevated fire risk and a power outage for the Boulder area, the DOC Boulder Labs campus is CLOSED on December 19 for onsite business and no public access is permitted; previously approved accesses are revoked.[1]

    WWV still seems to be up, including voice phone access.

    NIST Boulder has a recorded phone number for site status, and it says that as of December 20, the site is closed with no access.

    NIST's main web site says they put status info on various social media accounts, but there's no announcement about this.

    [1] https://www.nist.gov/campus-status

  • glkindlmann 1 hour ago
    Of the various internet .+P, NTP is one I never learned about as a student, so now I'm looking at its web page [1] by its creator David L. Mills (1938-2024). I've found one video of him giving a retrospective of his extensive internet work; he talks about NTP at 34:51 [2] and later at 56:26 [3].

    [1] https://www.eecis.udel.edu/~mills/ntp.html

    [2] https://youtu.be/08jBmCvxkv4?si=WXJCV_v0qlZQK3m4&t=2092

    [3] https://youtu.be/08jBmCvxkv4?si=K80ThtYZWcOAxUga&t=3386

    • torcete 31 minutes ago
      In [3] he mentions that one can use NTP to observe frequency deviations and use it as an early warning system for fire and AC failure. That really intrigues me. Can you actually? Has this ever been implemented?
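One way the idea in this question could be tried, as a rough sketch only: a quartz oscillator's frequency shifts with temperature, so a sudden change in the frequency correction an NTP daemon applies may hint that the room is warming up. The function name, window size, threshold, and sample data below are all made up for illustration; nothing here is taken from the talk or from any deployed system.

```python
from statistics import median

def frequency_alarm(freq_ppm, window=12, threshold_ppm=0.5):
    """Return indices of samples that depart from the median of the
    preceding `window` samples by more than `threshold_ppm`."""
    alarms = []
    for i in range(window, len(freq_ppm)):
        baseline = median(freq_ppm[i - window:i])
        if abs(freq_ppm[i] - baseline) > threshold_ppm:
            alarms.append(i)
    return alarms

# Hypothetical frequency corrections in ppm: steady near 12.3,
# then a ramp such as a failing air conditioner might cause.
samples = [12.30, 12.31, 12.29, 12.30, 12.32, 12.30,
           12.31, 12.30, 12.29, 12.31, 12.30, 12.30,
           12.31, 12.60, 13.10, 13.90]

print(frequency_alarm(samples))  # indices of the flagged samples
```

A median baseline is used so that a single noisy sample does not move the reference much; a real monitor would presumably also need to separate temperature effects from ordinary network jitter.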
    • ssl-3 1 hour ago
      HN discussion shortly after Dave Mills died, early in 2024: https://news.ycombinator.com/item?id=39051246
  • amelius 16 minutes ago
    This makes me wonder, if you take the average time of all wristwatches on the planet, accounting for timezones and throwing out outliers, how close would you get to NTP time?

    And how many randomly chosen wristwatches would you need to get anything reasonable?
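A toy simulation of the question above, under loudly stated assumptions: each watch error is drawn from a normal distribution (30 s standard deviation by default), a small fraction of watches are wildly wrong, and outliers are dropped with a trimmed mean. None of these numbers come from real data. The point of the sketch is that random error shrinks roughly with the square root of the sample size, while any shared bias (`bias_s`, for example people setting watches a little fast) does not average out at all.

```python
import random
from statistics import mean

def trimmed_mean(values, trim_fraction=0.1):
    """Mean after discarding `trim_fraction` of the values at each end."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return mean(kept)

def simulate_watches(n, rng, sigma_s=30.0, bias_s=0.0, outlier_rate=0.02):
    """Return n hypothetical watch errors in seconds."""
    errors = []
    for _ in range(n):
        if rng.random() < outlier_rate:
            # A watch that is simply wrong, by up to an hour either way.
            errors.append(rng.uniform(-3600.0, 3600.0))
        else:
            errors.append(rng.gauss(bias_s, sigma_s))
    return errors

rng = random.Random(42)
for n in (10, 1000, 100000):
    estimate = trimmed_mean(simulate_watches(n, rng))
    print(f"{n:>6} watches: estimated error {estimate:+.3f} s")
```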

  • arn3n 1 hour ago
    Wind gusts were reaching 125 MPH in Boulder county, if anyone’s curious. A lot of power was shut off preemptively to prevent downed power lines from starting wildfires. Energy providers gave warning to locals in advance. Shame that NIST’s backup generator failed, though.
    • Maxion 55 minutes ago
      Somewhat interesting that they themselves don't have access to the site. You'd think there would have been some disaster plans put in place?
      • ssl-3 13 minutes ago
        Maybe this is the disaster plan: There's not a smouldering hole where NIST's Boulder facility used to be, and it will be operational again soon enough.

        There's no present need for important hard-to-replace sciencey-dudes to go into the shop (which is probably both cold and dark, and may have other problems that make it unsafe: it's deliberately closed) to futz around with the time machines.

        We still have other NTP clocks. Spooky-accurate clocks that the public can get to, even, like just up the road at NIST in Fort Collins (where WWVB lives, and which is currently up), and in Maryland.

        This is just one set.

        And beyond that, we've also got clocks in GPS satellites orbiting, and a whole world of low-stratum NTP servers that distribute that time on the network. (I have one such GPS-backed NTP server on the shelf behind me; there's not much to it.)

        And the orbital GPS clocks are controlled by the US Navy, not NIST.

        So there's redundancy in distribution, and also control, and some of the clocks aren't even on the Earth.

        Some people may be bit by this if their systems rely on only one NTP server, or only on the subset of them that are down.

        And if we're following section 3.2 of RFC 8633 and using multiple diverse NTP sources for our important stuff, then this event (while certainly interesting!) is not presently an issue at all.
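Why several diverse sources help can be sketched with a simplified interval-intersection vote in the spirit of Marzullo's algorithm. This is an illustration only: real NTP implementations use a more elaborate selection and clustering process, and the function name and sample offsets here are hypothetical. Each source reports an offset and an error bound (integer milliseconds, to keep the arithmetic exact), and the code finds the range that the largest number of sources agree on.

```python
def best_agreement(sources):
    """sources: list of (offset_ms, error_bound_ms) pairs.

    Returns (count, lo, hi): the interval [lo, hi] that lies inside
    the largest number of source intervals, and how many contain it.
    """
    edges = []
    for offset, bound in sources:
        edges.append((offset - bound, -1))  # interval opens
        edges.append((offset + bound, +1))  # interval closes
    edges.sort()  # at equal positions, opens sort before closes

    best = count = 0
    best_lo = best_hi = None
    for i, (pos, kind) in enumerate(edges):
        count -= kind  # an open raises the overlap count, a close lowers it
        if count > best:
            best = count
            best_lo = pos
            best_hi = edges[i + 1][0]  # overlap lasts until the next edge
    return best, best_lo, best_hi

# Three sources that roughly agree and one that is 250 ms off.
sources = [(1, 5), (2, 4), (-1, 6), (250, 5)]
print(best_agreement(sources))
```

With only one or two sources there is no way to outvote a bad one, which is the practical argument for configuring several.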

      • TylerE 38 minutes ago
        Step One of most disaster plans is not to create a second emergency.
        • amelius 21 minutes ago
          But can't NTP server downtime cause a disaster?
  • DamonHD 19 minutes ago
    So far I think I'm still seeing one of them in my peers list for my public-ish NTP server:

             remote           refid      st t when poll reach   delay   offset  jitter
        ==============================================================================
        +time-e-b.nist.g .NIST.           1 u  372 1024  377  125.260    1.314   0.280
    • DamonHD 10 minutes ago
      ...and maybe it's gone:

          #time-e-b.nist.g .NIST.           1 u 1071 1024  377  125.260    1.314   0.280
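For readers unfamiliar with this output, here is a small sketch that decodes one peer row. The tally-code meanings are my reading of the ntpq documentation, and the parser is deliberately naive: it assumes the tally character is in column 0 and that `when` is a plain number of seconds, which real output does not always guarantee. It only reports what the row shows and does not decide whether the server is really gone.

```python
# Tally-code meanings as I understand them from the ntpq documentation.
TALLY = {
    "*": "system peer",
    "+": "candidate",
    "#": "selected (a survivor outside the first six)",
    "-": "outlier",
    "x": "falseticker",
    ".": "excess",
    " ": "rejected",
}

def parse_peer(line):
    """Parse one peer row of `ntpq -p` output into a dict."""
    tally, fields = line[0], line[1:].split()
    remote, refid, stratum, _type, when, poll, reach = fields[:7]
    delay, offset, jitter = (float(x) for x in fields[7:10])
    reach_bits = format(int(reach, 8), "08b")  # reach is printed in octal
    return {
        "tally": tally,
        "status": TALLY.get(tally, "unknown"),
        "remote": remote,
        "refid": refid,
        "stratum": int(stratum),
        "reach_bits": reach_bits,                 # one bit per recent poll
        "polls_answered": reach_bits.count("1"),  # out of the last 8
        "overdue": int(when) > int(poll),         # last reply older than poll interval
        "delay_ms": delay,
        "offset_ms": offset,
        "jitter_ms": jitter,
    }

row = "#time-e-b.nist.g .NIST.           1 u 1071 1024  377  125.260    1.314   0.280"
for key, value in parse_peer(row).items():
    print(f"{key:>15}: {value}")
```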
  • cdfuller 2 hours ago
    Can anybody expand on the implications of this?

    Being unfamiliar with it, it's hard to tell if this is a minor blip that happens all the time, or if it's potentially a major issue that could cause cascading errors equal to the hype of Y2K.

    • autarch 1 hour ago
      Time travel is extremely dangerous right now. I highly recommend deferring time travel plans except for extreme temporal emergencies.
      • jeffrallen 27 minutes ago
        Would traveling to the past in order to put in place a preemptive fix for this outage be wise or dangerous?

        Asking for a friend.

      • fuzztester 37 minutes ago
        Same for database transaction roll back and roll forward actions.

        And most enterprises, including banks, use databases.

        So by bad luck, you may get a couple of transactions reversed in order of time, such as a $20 debit incorrectly happening before a $10 credit, when your bank balance was only $10 prior to both those transactions. So your balance temporarily goes negative.

        Now imagine if all those amounts were ten thousand times higher ...
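The scenario in this comment can be worked through in a few lines. This is a sketch with made-up numbers: two servers stamp transactions with their own clocks (integer milliseconds), one clock runs 100 ms behind, and the ledger replays transactions in timestamp order.

```python
def lowest_balance(start, transactions):
    """Replay (timestamp_ms, amount) pairs in timestamp order and
    return (lowest balance seen, final balance)."""
    balance = lowest = start
    for _, amount in sorted(transactions):
        balance += amount
        lowest = min(lowest, balance)
    return lowest, balance

# What really happened: a $10 credit, then a $20 debit 50 ms later.
true_order = [(100_000, +10), (100_050, -20)]

# Same events, but the debit was stamped by a server whose clock is
# 100 ms slow, so it appears to have happened first.
skewed_order = [(100_000, +10), (100_050 - 100, -20)]

print(lowest_balance(10, true_order))    # balance never goes negative
print(lowest_balance(10, skewed_order))  # balance dips below zero
```

The final balance is the same either way; only the intermediate state differs, which is exactly where an overdraft check would trip.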

      • yawpitch 1 hour ago
        Define “extreme”?
    • Animats 1 hour ago
      Google has their own fleet of atomic clocks and time servers. So does AWS. So does Microsoft. So does Ubuntu. They're not going to drift enough for months to cause trouble. So the Internet can ride through this, mostly.

      The main problem will be services that assume at least one of the NIST time servers is up. Somewhere, there's going to be something that won't work right when all the NIST NTP servers are down. But what?

      • guenthert 1 hour ago
        Ubuntu using atomic clocks would surprise me. Sure they could, but it's not obvious to me why they would spend $$$$ on such. More plausible to me seems that they would be using GPSDO as reference clocks (in this context, about as good as your own atomic clock), iff they were running their own time servers. Google finds only that they are using servers from the NTP Pool Project, which will be using a variety of reference clocks.

        If you have information on what they actually are using internally, please share.

        • puzzlingcaptcha 44 minutes ago
          I think people have the wrong idea of what a modern atomic clock looks like. These are readily available commercially; Microchip, for example, will happily sell you hydrogen, cesium or rubidium atomic clocks. Hydrogen masers are rather unwieldy, but you can get a rubidium clock in a 1U format and cesium ones are not much bigger. I think their cesium frequency standards were formerly an HP business that they acquired.

          Example: https://www.microchip.com/en-us/products/clock-and-timing/co...

          • xorcist 4 minutes ago
            It is also important to realize that an atomic clock will only give you a steady pulse. It will count seconds for you, and do so very accurately, but that is not the same as knowing what time it is.

            If you get a rubidium clock for your garage, you can sync it up with GPS to get an accurate-enough clock for your hobby NTP project, but large research institutions and their expensive contraptions are more elaborate to set up.

      • genidoi 1 hour ago
        Atomic clock non-expert here, what does having a fleet of atomic clocks entail and why would the hyperscalers bother?
        • Gabrys1 1 hour ago
          Having clocks synchronized between your servers is extremely useful. For example, having a guarantee that the timestamp of arrival of a packet (measured by the clock on the destination) is ALWAYS bigger than the timestamp recorded by the sender is a huge win, especially for things like database scaling.

          For this though you need to go beyond NTP into PTP which is still usually based on GPS time and atomic clocks

          • riedel 22 minutes ago
            It's actually interesting to think about what UTC really means; there seems to be no absolute source of truth [0]. I guess the worry is not so much about the NTP servers (for which people should configure failovers anyway) but about the clocks themselves.

            [0] https://www.septentrio.com/en/learn-more/insights/how-gps-br...

        • synack 1 hour ago
          Spanner depends on having a time source with bounded error to maintain consistency. Google accomplishes this by having GPS and atomic clocks in several datacenters.

          https://static.googleusercontent.com/media/research.google.c...

          https://static.googleusercontent.com/media/research.google.c...

          • londons_explore 44 minutes ago
            And more importantly, the tighter the time bound, the higher the performance, so more accurate clocks easily pay for themselves in other saved infrastructure costs to service the same number of users.
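The link between a tighter time bound and performance can be sketched with the "commit wait" idea from the Spanner paper: a clock returns an interval rather than a point, and a writer waits until its chosen timestamp is certainly in the past. Everything below is a hypothetical illustration (class and function names included), not Google's API; the point is only that the wait comes out at roughly twice the uncertainty `epsilon_s`.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class TimeInterval:
    earliest: float
    latest: float

class BoundedClock:
    """Clock that admits an uncertainty of +/- epsilon_s seconds."""
    def __init__(self, epsilon_s, source=time.time):
        self.epsilon_s = epsilon_s
        self._source = source

    def now(self):
        t = self._source()
        return TimeInterval(t - self.epsilon_s, t + self.epsilon_s)

def commit(clock, sleep=time.sleep):
    """Choose a commit timestamp, then wait until every correct clock
    would agree it has passed. Returns (timestamp, seconds waited)."""
    timestamp = clock.now().latest
    waited = 0.0
    while True:
        remaining = timestamp - clock.now().earliest
        if remaining < 0:
            return timestamp, waited
        step = max(remaining, 1e-6)  # always make some progress
        sleep(step)
        waited += step

if __name__ == "__main__":
    for eps in (0.004, 0.001):
        _, waited = commit(BoundedClock(eps))
        print(f"epsilon {eps * 1000:.0f} ms -> waited about {waited * 1000:.1f} ms")
```

Halving the uncertainty roughly halves the time each write spends waiting, which is one way to read the claim that better clocks pay for themselves.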
      • adastra22 1 hour ago
        I know this is HN, but the internet is pretty low on the list of things NIST time standards are important for.
        • willis936 1 hour ago
          But pretty high on the list that NIST NTP is important for (since it leaves the building through the internet).
          • adastra22 1 hour ago
            If NIST NTP goes down, the internet doesn’t go down. But atomic clocks drifting does upset many scientific experiments, which would effectively go down for the duration of the outage.
            • willis936 1 hour ago
              This is the reason GP listed out all the alternative robust NTP services that are GPS disciplined, freely available, and used as redundant sources by any responsible timekeeper.

              What atomic clocks are disciplined by NTP anyway? Local GPS disciplining is the standard. If you're using NTP you don't need precision or accuracy in your timekeeping.

        • _zoltan_ 1 hour ago
          could you list 3 things that you think are more important than the internet? (I know the internet is going to be fine; I just want to understand what you think ranks higher globally...)
          • adastra22 1 hour ago
            Mostly scientific stuff like astronomical observations — e.g. did this event observed at one telescope coincide with neutrinos detected at this other observatory.

            Note I didn’t say they are more important than the Internet. That’s a value judgement in any case. I said that NIST level 0 NTP servers are more important to these use cases than they are to the Internet.

            • misnome 24 minutes ago
              All these use at least GPS for timing
          • Izmaki 1 hour ago
            The ability for humankind to communicate across the entire globe at nearly 1/4 of the speed of light has drastically accelerated our technological advancement. There is no doubt that the internet is a HUGE addition to society.

            It's not super important when compared to basic needs like plumbing, food, electricity, medical assistance and other silly things we take for granted but are heavily dependent on. We all saw what happened to hospitals during the early stages of the COVID pandemic; we had plenty of internet and electricity but were struggling on the medical part. That was quite bad... I'm not sure if it's any worse if an entire country/continent lost access to the Internet. Quite a lot of our core infrastructure components in society rely on this. And a fair bit of it relies on a common understanding of what time "now" is.

          • makeitdouble 59 minutes ago
            I think it won't be affected by this, but off the top of my head:

            - GPS

            - industrial complexes that synchronize operations (we could include trains)

            - telecoms in general (so a level higher than the internet)

    • franklyworks 2 hours ago
      Time engineers are very paranoid. I expect large problems can't occur due to a single provider misbehaving.
    • ThrowawayTestr 1 hour ago
      If your computer was using it as your time server and you didn't have alternatives configured, your clock may have drifted a few seconds.
  • lovich 1 hour ago
    This was an NTP 0 server, right? What is the actual fallback mechanism when that level of NTP server fails?

    This is some level of eldritch magic that I am aware of but not familiar with, and I'm interested in learning more.

    • Maxious 30 minutes ago
      There's two other sites for the time.nist.gov service so it'll be okay.

      Probably more interesting is how you get a tier 0 site back in sync - NIST rents out these cyberpunk looking units you can use to get your local frequency standards up to scratch for ~$700/month https://www.nist.gov/programs-projects/frequency-measurement...

      • lovich 13 minutes ago
        What happens in the event all the sites for time.nist.gov go down? Is it included in the spec?

        Also thank you for that link, this is exactly the kind of esoteric knowledge that I enjoy learning about

    • lambdaone 39 minutes ago
      There are lots of Stratum 0 servers out there; basically anything with an atomic clock will do. They all count seconds independently from one another, all slowly diverging over time, with offset intervals being measured by mutual synchronization using a number of means (how this is done is interesting all by itself). Some atomic clocks are more accurate than others, and an ensemble of these is typically regarded as 'the' master clock.

      To quote the ITU: "UTC is based on about 450 atomic clocks, which are maintained in 85 national time laboratories around the world." https://www.itu.int/hub/2023/07/coordinated-universal-time-a...

      Beyond this, as other commenters have said, anyone who is really dependent on having exact time (such as telcos, broadcasters, and those running global synchronized databases) should have their own atomic clock fleets. There are thousands and thousands of them worldwide. Moreover, GPS time, used by many to act as their time reference, is distributed by yet other means.

      Nothing bad will happen, except to those who have deliberately made these specific Stratum 0 clocks their only reference time. Anyone who has either left their computer at its factory settings or has set up their NTP configuration in accordance with recommended settings will be unaffected by this.

  • crazydoggers 1 hour ago
    Status of NIST time servers:

    https://tf.nist.gov/tf-cgi/servers.cgi

  • qmarchi 2 hours ago
    Man, they're having a hell of a time up in Boulder.
  • renewiltord 2 hours ago
    Well, where did NTP at NIST last put it? Did they look there?
    • Y_Y 2 hours ago
      You misunderstand, there's been a coup
      • adastra22 1 hour ago
        Of course there is. Where else would they put the reference standard chickens?
      • renewiltord 2 hours ago
        We have to stop those knaves pushing PTP! NTP must prevail!