On Wednesday, May 16, 2018, 10:27:34 AM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
I use a cheap GPS/PPS card (Sure Electronics, cost $50), which keeps my machine in the
sub-usec range.
On chrony, here is the output of the tracking command:
Reference ID : 50505330 (PPS0)
Stratum : 1
Ref time (UTC) : Wed May 16 02:15:54 2018
System time : 0.000000001 seconds fast of NTP time
Last offset : -0.000000042 seconds
RMS offset : 0.000000167 seconds
Frequency : 4.447 ppm fast
Residual freq : -0.000 ppm
Skew : 0.002 ppm
Root delay : 0.000000001 seconds
Root dispersion : 0.000015283 seconds
Update interval : 16.0 seconds
Leap status : Normal
and the PPS sources line
#* PPS0 0 4 377 23 -348ns[ -375ns] +/- 334ns
I do not have experience with the atomic clocks available.
I know they exist.
One step would be to put an oven-controlled crystal into your machine. That
would bring the tsc drift down substantially. One could even use the thermal
compensation that I think exists in chrony. I have not used it so have no
advice on setting it up.
(It uses one of the motherboard thermometers as a proxy for the crystal
temperature, and fits the drift as a function of temperature assuming a
linear relationship.)
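
A linear fit of that kind can be sketched in a few lines of C. This is only an
illustration of the idea, not chrony's code: the names linear_fit,
fit_drift_vs_temperature and predicted_drift_ppm are made up here, and the
readings in the note below are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Result of fitting drift_ppm = intercept + slope * temperature. */
struct linear_fit {
    double intercept_ppm;
    double slope_ppm_per_degc;
};

/* Ordinary least-squares fit of clock drift (ppm) against temperature (deg C).
 * Needs at least two samples taken at different temperatures. */
struct linear_fit fit_drift_vs_temperature(const double *temp_c,
                                           const double *drift_ppm,
                                           size_t n)
{
    assert(n >= 2);

    double mean_t = 0.0, mean_f = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_t += temp_c[i];
        mean_f += drift_ppm[i];
    }
    mean_t /= (double)n;
    mean_f /= (double)n;

    double sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dt = temp_c[i] - mean_t;
        sxx += dt * dt;
        sxy += dt * (drift_ppm[i] - mean_f);
    }
    assert(sxx > 0.0); /* all temperatures identical: slope is undefined */

    struct linear_fit fit;
    fit.slope_ppm_per_degc = sxy / sxx;
    fit.intercept_ppm = mean_f - fit.slope_ppm_per_degc * mean_t;
    return fit;
}

/* Drift the fitted line predicts at a given temperature. */
double predicted_drift_ppm(const struct linear_fit *fit, double temp_c)
{
    return fit->intercept_ppm + fit->slope_ppm_per_degc * temp_c;
}
```

With readings of 4.0, 4.5 and 5.0 ppm at 30, 40 and 50 degrees C, the fitted
slope is 0.05 ppm per degree and the predicted drift at 45 degrees is 4.75 ppm.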
William G. Unruh __| Canadian Institute for|____ Tel: +1(604)822-3273
Physics&Astronomy _|___ Advanced Research _|____ Fax: +1(604)822-5324
UBC, Vancouver,BC _|_ Program in Cosmology |____ unruh@xxxxxxxxxxxxxx
Canada V6T 1Z1 ____|____ and Gravity ______|_ www.theory.physics.ubc.ca/
On Wed, 16 May 2018, Hei Chan wrote:
> Hi Bill,
>
> Let's say I am willing to spend 1K-2K USD for any hardware that can give accurate time
> (in millisecond without drifting) and that hardware can be installed in a 1U server,
> then I think it could be a good solution. Any tip? Anything installed outside the
> server isn't allowed.
>
> On Tuesday, May 15, 2018, 11:01:44 PM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
>
>
>
> On Tue, 15 May 2018, Hei Chan wrote:
>
> > Hi Bill,
> >
> > I think you are indeed confused. I want accuracy in 100s of ns range. But again I
> > want no jitter/extra latency in my application.
>
> That is really tough. And you are operating with your hands tied behind your
> back.
>
> >
> > In all my measurements from point A to point B, the time span is less than 15
> > microseconds 99.9999% of the time (0.0001% for the undesired jitter). The
> > measurement is taken probably 1.5 billion times a day (or more) on multiple
> > cores (~10?). As you can see, timestamping happens very frequently in my system.
> > That is why I had the weird thought of using an rdtsc-to-clock_gettime() map.
>
> Sure. The designers of the Linux clock had the same idea.
>
> >
> > I have to admit that I don't know how to use chrony/ntp's parameters very well.
> > What parameters would you recommend with an NTP source that is one hop away within
> > the same data center?
>
> And how is that ntp source disciplined? How do you know that the time
> delivered by that source has any accuracy whatsoever? And added to that, there
> are the transmission problems. The hubs and routers between your machine and
> that ntp source introduce jitter and delays. Contention for the ethernet
> introduces jitter. The interrupt handling in your computer introduces jitter.
> Then there is the abysmally slow network (even gigabit cable takes microseconds
> to send a packet down the line), and the behaviour of the ethernet cards,
> which will amass data and only send it when enough has accumulated and it
> feels like sending something. If you want accurate times you HAVE to have
> something like gps/pps, and to get tens of nanosecond precision you need to have a
> pretty sophisticated one.
>
>
> >
> > So what would you suggest me to use to synchronize in a datacenter that PTP isn't
> > available and GPS clock isn't allowed?
>
> Here is one of the world's foremost watches. Now I want to repair it, but you
> must wear boxing gloves while doing so, and you are not allowed to remove them
> for any reason.
>
> >
> > And indeed I have thought about a better solution for quite some time because of
> > the conditions above and temperature effect on TSC. But I can't think of a way to
> > measure from A to B without jitter and latency, and at the same time, I would like
> > to know the approximate epoch time of each "timestamping". (again no jitter/latency
> > is more
>
> approximate? century? year? day, second, millisecond, microsecond nanosecond?
>
>
> > important than accuracy of the epoch time.).
>
> But make sure you never remove those gloves.
>
>
> >
> > If you have a good suggestion, i am all ears.
>
> And at a budget of $50? How much are you willing to spend?
>
> >
> > Thanks!
> >
> >
> > On Tuesday, May 15, 2018, 2:58:52 PM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
> >
> >
> >
> > On Tue, 15 May 2018, Hei Chan wrote:
> >
> > > If I remember correctly, there was a post explaining why it wasn't a bug. The
> > > post mentioned that the value is written to shared memory (of some sort), and
> > > the writer and reader aren't protected by a lock for performance reasons, so the
> > > reader needs to spin (i.e. a while loop) to get the value out as soon as the
> > > writer finishes.
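
The lock-free writer/reader scheme described in that post reads like a sequence
lock (seqlock). A minimal single-writer sketch in C11 follows. It is not the
kernel's vDSO code, and the names seq_time, seq_time_init, seq_time_write and
seq_time_read are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared time value guarded by a sequence counter instead of a lock. */
struct seq_time {
    atomic_uint seq;      /* odd while a write is in progress */
    _Atomic int64_t sec;
    _Atomic int64_t nsec;
};

void seq_time_init(struct seq_time *s)
{
    atomic_init(&s->seq, 0u);
    atomic_init(&s->sec, 0);
    atomic_init(&s->nsec, 0);
}

/* Writer side. Assumes there is exactly one writer. */
void seq_time_write(struct seq_time *s, int64_t sec, int64_t nsec)
{
    unsigned start = atomic_load_explicit(&s->seq, memory_order_relaxed);

    /* Make the counter odd so readers know the data is unstable. */
    atomic_store_explicit(&s->seq, start + 1u, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&s->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&s->nsec, nsec, memory_order_relaxed);

    /* Even again: the new value is published. */
    atomic_store_explicit(&s->seq, start + 2u, memory_order_release);
}

/* Reader side: spins until it sees a consistent snapshot.
 * Returns how many times it had to retry. */
unsigned seq_time_read(struct seq_time *s, int64_t *sec, int64_t *nsec)
{
    unsigned retries = 0;

    for (;;) {
        unsigned before = atomic_load_explicit(&s->seq, memory_order_acquire);
        int64_t a = atomic_load_explicit(&s->sec, memory_order_relaxed);
        int64_t b = atomic_load_explicit(&s->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        unsigned after = atomic_load_explicit(&s->seq, memory_order_relaxed);

        /* Accept only if no write started or finished while we were reading. */
        if (before == after && (before & 1u) == 0u) {
            *sec = a;
            *nsec = b;
            return retries;
        }
        retries++;
    }
}
```

The reader never blocks the writer, but its worst-case latency depends on how
often it collides with an update, which is the occasional extra delay under
discussion.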
> > >
> > > I don't have an exact percentage of occurrence nor the exact delay. I vaguely
> > > remember it was like 200 nano or more.
> >
> > I must say I am confused. You are wanting accuracy in the 10s of ns range, but you
> > are using pool servers to set your clock, which will give you accuracy in the
> > hundreds of usec range (on a good day). Or even a local server, which will
> > give you something like 10s of usec accuracy. There is a disconnect here.
> > If you really want ns accuracy you will have to use a refclock directly
> > connected to the machine. Even GPS has problems as it is only after the fact
> > that you can figure out the sawtooth time error on a really good gps timing
> > receiver and compensate for it.
> > Never mind the temperature changes which make the tsc wander away from its
> > rate. It is really unclear to me what you are trying to do, and why?
> >
> >
> >
> >
> > >
> > > Tho, the comparison between the latency of rdtsc and the latency of
> > > clock_gettime() (~20 nano vs ~50 nano) is widely available online.
> > >
> > > As I mentioned, jitter/latency is more important than accuracy in my case, so I
> > > compromised accuracy a bit (with complexity).
> > >
> > >
> > > On Tuesday, May 15, 2018, 1:16:23 PM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
> > >
> > >
> > >
> > > On Tue, 15 May 2018, Hei Chan wrote:
> > >
> > > > Hi Bill,
> > > >
> > > > Here is the source:
> > > > https://elixir.bootlin.com/linux/v4.9/source/arch/x86/entry/vdso/vclock_gettime.c#L183
> > >
> > > >
> > > >
> > > > As you can see, clock_gettime() is in a while loop because sometimes, it might
> > > > fail...
> > >
> > > Hm, yes. How much of a time delay do you get occasionally due to the while
> > > loop?
> > >
> > > Again that failure sounds like a bug.
> > >
> > >
> > > >
> > > > On Tuesday, May 15, 2018, 11:26:12 AM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
> > > >
> > > >
> > > > On Tue, 15 May 2018, Hei Chan wrote:
> > > >
> > > > > Thanks for your reply.
> > > > >
> > > > > See my comment inline.
> > > > >
> > > > > On Friday, May 11, 2018, 4:26:14 PM GMT+8, Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
> > > > >
> > > > >
> > > > > On Fri, May 11, 2018 at 12:30:30AM +0000, Hei Chan wrote:
> > > > > > Hi Bill,
> > > > > > Sorry that I wasn't clear.
> > > > > > What I tried to do is to call clock_gettime() and rdtsc(p) as soon as chrony
> > > > > > finishes synch so that I can get the best estimate when I try to derive time
> > > > > > from (invariant) tsc.
> > > > >
> > > > > Ok, so the assumption here is that once the system clock is
> > > > > "synchronized" by chronyd there will be a linear function between the
> > > > > tsc and system time? And the goal is to have a clock that can be read
> > > > > in constant time and it doesn't have to be very accurate, but still
> > > > > track the real time?
> > > > >
> > > > > Yes to both :)
> > > > >
> > > > > I'm not sure if that's possible. The tsc is the direct source for the
> > > > > CLOCK_MONOTONIC_RAW clock. Its frequency doesn't change with chronyd's
> > > > > adjustments, i.e. it's sensitive to temperature changes etc. The
> > > > > constants of the linear function would have to be periodically updated
> > > > > and then you would need to deal with locking, which would increase the
> > > > > maximum latency in the reading of the clock.
> > > > >
> > > > > Here is the design I am thinking.
> > > > >
> > > > > I don't have chronyd run in background, and periodically (through cronjob)
> > > > > issue the
> > > >
> > > > That is a terrible way of using chrony. One of the key features of both chrony
> > > > and ntpd is that it disciplines not only the offset but also the rate of
> > > > the clock. And the rate can only be determined over a (lengthy) time period.
> > > > Why would you run it like this?
> > > >
> > > > > command chronyd -q 'pool [some NTP server/switch which is 1 switch away] iburst',
> > > > > then as soon as it returns (the clock is synchronized right?), then I do
> > > > > something like:
> > > >
> > > > No. See above.
> > > >
> > > > > s = cpuid + rdtsc
> > > > > clock_gettime(CLOCK_REALTIME, &t)
> > > > > e = rdtscp + cpuid
> > > >
> > > > >
> > > > > Then, log it.
> > > > >
> > > > > So after 24 hours, I have a map for rdtsc<->absolute epoch time in nano.
> > > >
> > > > You have a very sophisticated program whose whole purpose is to continuously
> > > > set the translation between the tsc and the UTC. And you throw it all away and
> > > > use it in the way that Unix time was disciplined 40 years ago.
> > > >
> > > >
> > > > >
> > > > > Then, I can use the map to estimate the TSC frequency every 2 t's with the
> > > > > assumption that t is correct and TSC will change between two t's.
> > > >
> > > >
> > > > >
> > > > > Then, for everything I track with rdtsc, I can estimate the absolute epoch time
> > > > > in nano.
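
The mapping proposed above amounts to linear interpolation between two logged
(TSC, epoch time) pairs. A minimal sketch in C, assuming the TSC rate is constant
between the two samples (the thread disputes exactly that because of temperature
drift). The names tsc_sample and tsc_to_epoch_ns are hypothetical, and the
arithmetic uses double, which is only exact up to 2^53.

```c
#include <assert.h>
#include <stdint.h>

/* One logged calibration point: a TSC reading and the wall-clock time
 * (nanoseconds since the epoch) believed to correspond to it. */
struct tsc_sample {
    uint64_t tsc;
    int64_t epoch_ns;
};

/* Map a raw TSC value to epoch nanoseconds using the straight line
 * through calibration points a and b (b must be later than a).
 * Values outside [a, b] are extrapolated along the same line. */
int64_t tsc_to_epoch_ns(const struct tsc_sample *a,
                        const struct tsc_sample *b,
                        uint64_t tsc)
{
    assert(b->tsc > a->tsc);

    double span_ticks = (double)(b->tsc - a->tsc);
    double span_ns = (double)(b->epoch_ns - a->epoch_ns);

    /* Signed distance from the first sample, so readings taken
     * before it are handled as well. */
    double offset_ticks = (tsc >= a->tsc)
                              ? (double)(tsc - a->tsc)
                              : -(double)(a->tsc - tsc);

    double offset_ns = offset_ticks * span_ns / span_ticks;

    /* Round to the nearest nanosecond. */
    int64_t rounded = (int64_t)(offset_ns >= 0.0 ? offset_ns + 0.5
                                                 : offset_ns - 0.5);
    return a->epoch_ns + rounded;
}
```

With samples (1000 ticks, 1000000000 ns) and (4000 ticks, 1000001000 ns), a
reading of 2500 ticks maps to 1000000500 ns. Any error in either logged time goes
straight into the slope, so the two samples need to be far apart and individually
trustworthy.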
> > > > >
> > > > > You might question why I don't just have chronyd running in background and call
> > > > > clock_gettime(CLOCK_REALTIME, &t) for all the stamping I do with rdtsc. The main
> > > > > issue is that clock_gettime(CLOCK_REALTIME) is great 99% of the time but
> > > > > sometimes, it just fails internally and loops and then takes a long time to
> > > > > return.
> > > >
> > > > No idea what this is all about. I have never seen this. If it truly does
> > > > this, that is a bug, and needs to be reported.
> > > >
> > > >
> > > > >
> > > > > Any issue you see?
> > > > >
> > > > > P.S. calling chronyd and creating the map file will be done by one dedicated
> > > > > core at C0 (i.e. off OS scheduler to improve accuracy)
> > > > >
> > > > > > Ideally, I have a C application that calls chrony's API (if there is one)
> > > > > > similar to "chronyd -q" to block till it finishes or gets a callback.
> > > > > > Any suggestion?
> > > > >
> > > > > There is no C API for chrony (yet). Instead, you could use adjtimex()
> > > > > and check the frequency and maxerror fields. The maxerror value
> > > > > increases slowly and drops only when chronyd updates the clock. When
> > > > > it drops below a threshold and the frequency didn't change
> > > > > significantly, the system clock could be considered to be
> > > > > synchronized.
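
The adjtimex() check described above might look roughly like this on Linux. It is
a sketch under assumptions: the helper names looks_synchronized and
read_kernel_clock_state are invented, and any thresholds passed in are the
caller's choice, not values recommended in this thread. The kernel reports
maxerror in microseconds and freq in units of 2^-16 ppm.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stdbool.h>
#include <sys/timex.h>

/* Decide whether the clock can be treated as synchronized: maxerror must be
 * below the limit and the frequency must not have moved much since the
 * previous poll. Frequencies are in the kernel's scaled units (2^-16 ppm). */
bool looks_synchronized(long maxerror_us, long freq_scaled,
                        long prev_freq_scaled, long maxerror_limit_us,
                        double freq_change_limit_ppm)
{
    double change_ppm = (double)(freq_scaled - prev_freq_scaled) / 65536.0;
    if (change_ppm < 0.0)
        change_ppm = -change_ppm;

    return maxerror_us < maxerror_limit_us &&
           change_ppm <= freq_change_limit_ppm;
}

/* Read the current maxerror and frequency without modifying the clock.
 * Returns the clock state from adjtimex() (e.g. TIME_OK, TIME_ERROR),
 * or -1 if the call failed. */
int read_kernel_clock_state(long *maxerror_us, long *freq_scaled)
{
    struct timex tx = {0}; /* modes == 0 means "query only" */

    int state = adjtimex(&tx);
    if (state == -1)
        return -1;

    *maxerror_us = tx.maxerror;
    *freq_scaled = tx.freq;
    return state;
}
```

A caller would poll read_kernel_clock_state() periodically, remember the previous
frequency, and take its TSC/clock_gettime() sample only once looks_synchronized()
returns true.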
> > > > >
> > > > > --
> > > > > Miroslav Lichvar
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> >
> >
>
>