Re: [chrony-dev] [PATCH] main: imply -x if time can't be set

On Thu, Mar 8, 2018 at 1:49 PM, Miroslav Lichvar <mlichvar@xxxxxxxxxx> wrote:
On Thu, Mar 08, 2018 at 12:21:16PM +0100, Christian Ehrhardt wrote:
> 1. if you are in a container you very likely can't set the time.
>     Installing chrony there would silently not start the chrony service for
> lacking CAP_SYS_TIME.
>     - You now installed chrony got no error/warning, but it does nothing.

The systemctl status command seems to print in bold letters that
"start condition failed".

But this is what users expect in most cases, right? If an NTP
client/server is installed and enabled in a container, it's usually
not intended, e.g. it was installed as a dependency of another
package.

The people on the linked Launchpad bug, for example, want to serve time.
They would like to have good time, but when chrony is deployed in a container for test purposes, they require that it at least serves time as well as it can (not perfect, but better than nothing).
So they install chrony and wonder why nothing works.
Today they have to change the systemd unit (drop the capability check) and set -x.

They have additional constraints that make a "make it as good as possible, but just work" option for containers appealing.

>     - If there are services depending on the chrony service they would not
> start either

Depending on chronyd specifically or the time-sync target? In either
case it probably means they require a synchronized clock. Is it ok to
satisfy that with chronyd -x, which doesn't touch the system clock at
all?

Good thought, Miroslav.
Hmm, if it didn't sound so awkward, I'd actually say you'd want to split the client and server services.
I think we can't, but that way the "client" portion would provide time-sync.target, and things depending on it would not start.
At the same time, another service could run the "server" portion and work; it might be at reduced accuracy, but it would work.

I agree that you'd want to reach time-sync.target only after you sync. Btw, what does systemd-timesyncd do in that case?
I'm not sure what that implies; it feels wrong whichever way I turn it.
 
It would be nice if there was some mechanism to pass this information
to containers.

You mean that, as the container is not able to steer the clock, its own time-sync.target should actually be that of its host?
I agree that would be correct for the container world as of today, but I'm not sure how one would implement that.
 
> 2. if you are in busted Host system (or VM) that grants no CAP_SYS_TIME the
> same as above will silently happen.
>     And on a Metal machine or even a VM you'd really want to know it is
> failing, because you'd expect it to work.
>
> 3. If you are in a container with special privileges that allow
> CAP_SYS_TIME (rare) it is very dangerous to do so.
>     There are no time namespaces for containers to use. Therefore multiple
> containers on that system would start fighting with time adjustments which
> would be the worst IMHO.

That's a good point. Maybe chronyd should print a warning when it
detects something else is messing with the clock. Such check would
probably come at a cost.
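
One cheap way to notice such interference can be sketched as follows. This is only an illustration, not something chronyd is known to do: it samples the difference between CLOCK_REALTIME and CLOCK_MONOTONIC and flags sudden changes, which indicate that something stepped the system clock. The function names and the 0.5 s threshold are assumptions made for this sketch. It only catches steps; slewing done by another process affects both clocks and stays invisible here, and a daemon using this would have to exclude the steps it made itself.

```python
import time


def clock_offset():
    """Return CLOCK_REALTIME minus CLOCK_MONOTONIC in seconds.

    CLOCK_MONOTONIC does not jump when the system time is stepped,
    so this difference changes abruptly whenever someone steps the clock.
    """
    return (time.clock_gettime(time.CLOCK_REALTIME)
            - time.clock_gettime(time.CLOCK_MONOTONIC))


def find_steps(offsets, threshold=0.5):
    """Return the indexes of samples whose offset differs from the
    previous sample by more than `threshold` seconds (assumed value)."""
    return [i for i in range(1, len(offsets))
            if abs(offsets[i] - offsets[i - 1]) > threshold]
```

A caller would collect `clock_offset()` periodically, for example once per polling interval, and log a warning for every index that `find_steps()` reports.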

> For #1 you want to default to -x if you are in a container

I'd say that applies only to a minority of cases when people actually
want to run an NTP server and chrony wasn't just installed as a
dependency.

> For #2 you want to drop the condition on CAP_SYS_TIME
> For #3 you want to default to -x if you are in a container

I'm not sure about this either. If a container does have CAP_SYS_TIME,
it was probably intended to run an NTP client/server.

Just as a heads-up: due to the crazy world of capabilities, some containers will soon expose CAP_SYS_TIME, since it is correct to have the cap within their own namespace.
But when you actually call adjtimex it will fail, because the host's caps are eventually applied to it, and those you don't have.
This comes with the additional WTF (for me) that checking for CAP_SYS_TIME has effectively lost almost all its meaning :-/
(Others might be able to explain this in more detail if you need it.)
This is another reason to at least provide the "fallback if unable" behavior, IMHO. We can bikeshed the default and decide either way, but I'd still like the actual feature.
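
To make the capability check concrete, here is a minimal sketch of what "checking for CAP_SYS_TIME" amounts to when done by hand: reading the effective capability mask from the CapEff line of /proc/self/status and testing bit 25. The function name is made up for this illustration. As described above, a positive result inside a container only says the capability is present in that namespace; the actual clock adjustment may still be refused.

```python
CAP_SYS_TIME = 25  # capability number from <linux/capability.h>


def has_cap_sys_time(status_text):
    """Return True if the CapEff line in the given /proc/<pid>/status
    text has the CAP_SYS_TIME bit set, False otherwise."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # hexadecimal bit mask
            return bool(mask & (1 << CAP_SYS_TIME))
    return False  # no CapEff line found


if __name__ == "__main__":
    with open("/proc/self/status") as f:
        print("CAP_SYS_TIME in effective set:", has_cap_sys_time(f.read()))
```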

Btw - for the reason above (having the cap, but not being able to set the time) is there any way to adjtimex, but

> For #4 you want to log a message if the case is detected
>
> These should be defaults, and admins would be given a way to override this
> behavior.
> This could either be done either:
> - in a Chrony patch to provide this behavior and a cmdline option to
> override.
> - in a wrapper script to the ExecStart of the service file doing the
> checks/messaging and adding -x as needed

It looks complicated and fragile to me.

Would it make sense to simply remove the CAP_SYS_TIME condition from
the unit file and let chronyd fail if it doesn't have it (possibly
with a better error message)?

Actually yes, that might be better (and way less invasive).
So people (or dependencies) installing chrony in a container will have a unit that starts, but fails with an explicit reason for what failed and why.
It could even suggest -x if the reason is running in a container (the most likely case).

It would also fully cover the crazy case where the cap is available but the time is still not adjustable.

An additional default-off option to then, in turn, disable syncing the clock would be my perfect solution.
Can we disable it that late? I thought we detect the inability to set the clock rather late.
That way people could opt in to "instead of the (now better) failure, I want it to run with less accuracy in pure server mode".
 
How about that?
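
The proposed behavior could be sketched roughly like this. It is only an illustration of the decision logic, not chronyd code: `probe_clock_settable`, `choose_mode` and the `allow_fallback` switch are hypothetical names, and the probe shown here (re-setting the clock to its current value) is an assumption that has a side effect, since it steps the clock by the few microseconds the call takes whenever it is permitted.

```python
import time


def probe_clock_settable():
    """Hypothetical probe: try to set CLOCK_REALTIME to its current value.

    Returns False if the kernel refuses with EPERM, True otherwise.
    Side effect: when permitted, the clock is stepped by a tiny amount.
    """
    try:
        time.clock_settime(time.CLOCK_REALTIME,
                           time.clock_gettime(time.CLOCK_REALTIME))
    except PermissionError:
        return False
    return True


def choose_mode(can_set_clock, allow_fallback=False):
    """Pick the operating mode from the probe result.

    "sync"        - normal operation, the clock is adjusted
    "server-only" - the opt-in fallback, equivalent to running with -x
    Without the opt-in, fail with a message that names the reason.
    """
    if can_set_clock:
        return "sync"
    if allow_fallback:
        return "server-only"
    raise PermissionError(
        "cannot adjust the system clock; "
        "if this is a container, consider running with -x")
```

With the default-off option left unset, `choose_mode(probe_clock_settable())` fails loudly with the explicit reason; with `allow_fallback=True` it keeps serving time without touching the clock.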

