If I remember correctly, there was a post explaining why it wasn't a bug. The post
mentioned that the value is written to shared memory (of some sort), and that the writer
and reader aren't protected by a lock for performance reasons, so the reader needs to
spin (i.e. a while loop) to get the value out as soon as the writer finishes.
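
To illustrate the kind of lock-free read described above, here is a minimal sketch of a sequence-counter ("seqlock"-style) reader and writer using C11 atomics. It is not the kernel's vDSO code: the struct layout and the names shared_time, writer_update and reader_get are hypothetical and exist only for this example. The point is that the reader never blocks the writer; it simply retries if it caught an update in progress.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared record; NOT the real vDSO data layout. */
struct shared_time {
    atomic_uint seq;        /* odd while an update is in progress */
    _Atomic int64_t sec;
    _Atomic int64_t nsec;
};

/* Writer: bump seq to odd, publish the new value, bump seq to even. */
void writer_update(struct shared_time *st, int64_t sec, int64_t nsec)
{
    unsigned s = atomic_load(&st->seq);
    atomic_store(&st->seq, s + 1);      /* update begins (odd) */
    atomic_store(&st->sec, sec);
    atomic_store(&st->nsec, nsec);
    atomic_store(&st->seq, s + 2);      /* update complete (even) */
}

/* Reader: retry while an update was in progress or happened mid-read.
 * This retry is the "while loop" a caller occasionally pays for. */
void reader_get(struct shared_time *st, int64_t *sec, int64_t *nsec)
{
    unsigned before, after;
    do {
        before = atomic_load(&st->seq);
        *sec   = atomic_load(&st->sec);
        *nsec  = atomic_load(&st->nsec);
        after  = atomic_load(&st->seq);
    } while ((before & 1u) || before != after);
}
```

In the common case the loop body runs once; the extra latency only shows up when a read overlaps an update.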
I don't have an exact percentage of occurrences or the exact delay. I vaguely remember
it was something like 200 ns or more.
That said, the comparison between the latency of rdtsc and the latency of clock_gettime()
(~20 ns vs ~50 ns) is widely available online.
As I mentioned, jitter/latency is more important than accuracy in my case, so I
compromised on accuracy a bit (at the cost of some complexity).
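
For reference, the TSC-to-epoch mapping discussed further down the thread (two logged pairs of TSC value and CLOCK_REALTIME, with a linear fit between them) could be sketched roughly like this. The struct tsc_sample and the function tsc_to_epoch_ns are hypothetical names for this example, and the sketch assumes the TSC rate is constant between the two samples, which is exactly the assumption being questioned in the replies.

```c
#include <assert.h>
#include <stdint.h>

/* One logged calibration point: a TSC reading and the wall-clock time
 * (nanoseconds since the epoch) taken at about the same moment. */
struct tsc_sample {
    uint64_t tsc;
    int64_t  epoch_ns;
};

/* Linearly interpolate (or extrapolate) the epoch time for a TSC reading,
 * using the slope between calibration points a and b.
 * The fractional nanosecond is truncated. */
int64_t tsc_to_epoch_ns(const struct tsc_sample *a,
                        const struct tsc_sample *b,
                        uint64_t tsc)
{
    assert(b->tsc > a->tsc);            /* samples must be in TSC order */

    double d_ns   = (double)(b->epoch_ns - a->epoch_ns);
    double d_tsc  = (double)(b->tsc - a->tsc);
    /* Signed difference so readings before sample a also work. */
    double ticks  = (double)(int64_t)(tsc - a->tsc);

    return a->epoch_ns + (int64_t)(ticks * d_ns / d_tsc);
}
```

The slope d_ns / d_tsc is the estimated nanoseconds per tick for that interval; it would be re-estimated for each pair of consecutive samples.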
On Tuesday, May 15, 2018, 1:16:23 PM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
On Tue, 15 May 2018, Hei Chan wrote:
> Hi Bill,
>
> Here is the source:
> https://elixir.bootlin.com/linux/v4.9/source/arch/x86/entry/vdso/vclock_gettime.c#L183
>
>
> As you can see, clock_gettime() is in a while loop because sometimes, it might
> fail...
Hm, yes. How much of a time delay do you occasionally get due to the while
loop?
Again that failure sounds like a bug.
>
> On Tuesday, May 15, 2018, 11:26:12 AM GMT+8, Bill Unruh <unruh@xxxxxxxxxxxxxx> wrote:
>
>
> On Tue, 15 May 2018, Hei Chan wrote:
>
> > Thanks for your reply.
> >
> > See my comment inline.
> >
> > On Friday, May 11, 2018, 4:26:14 PM GMT+8, Miroslav Lichvar <mlichvar@xxxxxxxxxx>
> > wrote:
> >
> >
> > On Fri, May 11, 2018 at 12:30:30AM +0000, Hei Chan wrote:
> > > Hi Bill,
> > > Sorry that I wasn't clear.
> > > What I tried to do is to call clock_gettime() and rdtsc(p) as soon as chrony finishes
> > > synch so that I can get the best estimate when I try to derive time from (invariant)
> > > tsc.
> >
> > Ok, so the assumption here is that once the system clock is
> > "synchronized" by chronyd there will be a linear function between the
> > tsc and system time? And the goal is to have a clock that can be read
> > in constant time and it doesn't have to be very accurate, but still
> > track the real time?
> >
> > Yes to both :)
> >
> > I'm not sure if that's possible. The tsc is the direct source for the
> > CLOCK_MONOTONIC_RAW clock. Its frequency doesn't change with chronyd's
> > adjustments, i.e. it's sensitive to temperature changes etc. The
> > constants of the linear function would have to be periodically updated
> > and then you would need to deal with locking, which would increase the
> > maximum latency in the reading of the clock.
> >
> > Here is the design I am thinking.
> >
> > I don't have chronyd running in the background; instead, periodically (through a cron
> > job) I issue the
>
> That is a terrible way of using chrony. One of the key features of both chrony
> and ntpd is that it disciplines not only the offset but also the rate of
> the clock. And the rate can only be determined over a (lengthy) time period.
> Why would you run it like this?
>
> > command chronyd -q 'pool [some NTP server/switch which is 1 switch away] iburst', then
> > as soon as it returns (the clock is synchronized, right?), I do something like:
>
> No. See above.
>
> > s = cpuid + rdtsc
> > clock_gettime(CLOCK_REALTIME, &t)
> > e = rdtscp + cpuid
>
> >
> > Then, log it.
> >
> > So after 24 hours, I have a map for rdtsc<->absolute epoch time in nano.
>
> You have a very sophisticated program whose whole purpose is to continuously
> set the translation between the tsc and the UTC. And you throw it all away and
> use it in the way that Unix time was disciplined 40 years ago.
>
>
> >
> > Then, I can use the map to estimate the TSC frequency every 2 t's with the assumption
> > that t is correct and TSC will change between two t's.
>
>
> >
> > Then, for everything I track with rdtsc, I can estimate the absolute epoch time in
> > nano.
> >
> > You might question why I don't just have chronyd running in the background and call
> > clock_gettime(CLOCK_REALTIME, &t) for all the stamping I do with rdtsc. The main issue
> > is that clock_gettime(CLOCK_REALTIME) is great 99% of the time but sometimes, it just
> > fails internally and loops and then takes a long time to return.
>
> No idea what this is all about. I have never seen this. If it truly does
> this, that is a bug, and needs to be reported.
>
>
> >
> > Any issue you see?
> >
> > P.S. calling chronyd and creating the map file will be done by one dedicated core at
> > C0 (i.e. off OS scheduler to improve accuracy)
> >
> > > Ideally, I have a C application that calls chrony's API (if there is one) similar to
> > > "chronyd -q" to block till it finishes or gets a callback.
> > > Any suggestion?
> >
> > There is no C API for chrony (yet). Instead, you could use adjtimex()
> > and check the frequency and maxerror fields. The maxerror value
> > increases slowly and drops only when chronyd updates the clock. When
> > it drops below a threshold and the frequency didn't change
> > significantly, the system clock could be considered to be
> > synchronized.
> >
> > --
> > Miroslav Lichvar
> >
> >
> >
>
>
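
Miroslav's adjtimex() suggestion above (watch maxerror drop below a threshold while the frequency stays roughly unchanged) could be sketched as follows on Linux. The function names read_clock_state and clock_looks_synchronized, and both threshold parameters, are hypothetical and chosen only for this example; suitable threshold values depend on the setup and are not given in the thread.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <sys/timex.h>

/* Read-only query of the kernel clock: modes = 0 changes nothing.
 * Returns the clock state (e.g. TIME_OK, TIME_ERROR) or -1 on failure. */
int read_clock_state(struct timex *tx)
{
    memset(tx, 0, sizeof *tx);
    tx->modes = 0;
    return adjtimex(tx);
}

/* Compare two consecutive readings.
 * maxerror is in microseconds; freq is in units of 2^-16 ppm.
 * Both thresholds are assumptions to be tuned by the caller. */
bool clock_looks_synchronized(const struct timex *prev,
                              const struct timex *cur,
                              long max_error_us,
                              long max_freq_step)
{
    long freq_step = labs(cur->freq - prev->freq);
    return cur->maxerror < max_error_us && freq_step <= max_freq_step;
}
```

A caller would poll read_clock_state() periodically, keep the previous reading, and treat the system clock as synchronized once clock_looks_synchronized() returns true.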