Re: [chrony-users] Resume from suspend and default makestep configuration
- To: "chrony-users@xxxxxxxxxxxxxxxxxxxx" <chrony-users@xxxxxxxxxxxxxxxxxxxx>
- Subject: Re: [chrony-users] Resume from suspend and default makestep configuration
- From: FUSTE Emmanuel <emmanuel.fuste@xxxxxxxxxxxxxxx>
- Date: Tue, 19 May 2020 11:10:01 +0000
On 19/05/2020 at 12:29, Pali Rohár wrote:
> On Monday 18 May 2020 13:45:04 FUSTE Emmanuel wrote:
>> On 18/05/2020 at 13:15, Pali Rohár wrote:
>>> On Monday 18 May 2020 10:45:02 FUSTE Emmanuel wrote:
>>>> Hello Pali,
>>>>
>>>> On 18/05/2020 at 12:37, Pali Rohár wrote:
>>>>> The main problem is when the system is put into suspend or hibernate state.
>>>>>
>>>>> In my opinion resuming from suspend / hibernate state should be handled
>>>>> in the same way as (re)starting chronyd. You do not know what may
>>>>> have happened during sleep.
>>>> Yes, and if a workaround is needed, it should be done at the system
>>>> level, not in chrony.
>>>> A job for systemd.
>>> Hello! Sorry for a stupid question, but what has systemd in common with
>>> chronyd? Why should systemd care about chronyd time synchronization?
>> Nothing.
>> But it is up to your "process manager", be it systemd, a sysvinit pile of
>> scripts or whatever, to restart or notify chrony; it has to do
>> housekeeping anyway for other things when you suspend/resume.
> Hm... I remember that in the past it was necessary to blacklist broken daemons,
> software and kernel modules which did not work correctly during S3 or
> hibernate state. It was in some pm scripts utils...
>
> But I thought that those days were already past and software can deal
> with the fact that a machine may be put into suspend or hibernate state.
>
> So what you are suggesting is to put the chronyd daemon into the list of broken
> software (which needs to be stopped prior to suspend / resume)?
>
> It does not make sense to me, as the immediate step after putting
> software or a kernel module into such a "blacklist" was to inform the upstream
> authors of that daemon or kernel module that it is broken / incompatible
> with suspend state and should be fixed.
>
> That "blacklist" was just a workaround for buggy software and not
> a permanent solution.
No, not chrony, but the machine that changes the RTC behind your back: a buggy BIOS.
>
>> Exactly as NetworkManager, ifupdown scripts or systemd-networkd
>> reload/restart some network services when interfaces/tunnels/VPNs are
>> brought up or down.
> This is something totally different. All those mentioned "services" are
> just independent parts of the system which manage network connections.
>
> chronyd is there to manage time synchronization.
It was an analogy for event-driven configuration changes.
In the suspend vs. time case, the event is only known to, and
should be managed by, your init system, not your time daemon.
>
>>>>> And as I pointed out, there are existing problems where UEFI/BIOS firmware
>>>>> changes the RTC clock without good reason, which results in a completely wrong
>>>>> system clock.
>>>>>
>>>> Could well be identified by blacklist at the udev/systemd level for
>>>> applying or not the workaround (restart chrony or launch a chronyc
>>>> command at resume)
>>> Could you describe in detail what you mean by blacklist? Which udev
>>> blacklist do you mean and what should be put into that blacklist? I have
>>> not caught this part.
>> Faulty systems could be identified by DMI/ACPI strings and a quirk applied.
> And what is the faulty system?
Quoting you:
"as I pointed out, there are existing problems where UEFI/BIOS firmware
changes the RTC clock without good reason"
>
> I think this is something general and not related to particular machine.
> I guess under specific conditions it may happen on any system.
>
>> See for example /lib/udev/hwdb.d/60-sensor.hwdb for some laptop sensors.
>> We could add an attribute to the RTC if it matches some vendor/BIOS
>> version/model etc... to put in the hwdb (the blacklist).
>> A udev rule would assign this attribute to the RTC if you are running on
>> a known buggy system.
>> A script could do anything you want at suspend/resume time in
>> /lib/systemd/system-sleep if your RTC has the offending attribute (see
>> systemd-sleep man page).
>> Or better, a unit run at resume time could do anything too.
>> The hwdb abstraction is not needed if it is a local hack; otherwise it should be
>> properly defined with the hwdb/udev/systemd developers.
> This database is for describing hardware differences or issues.
>
> But the above problem with time synchronization is general and hardware
> independent. You can simulate the same issue on your machine.
>
> Just put your computer into hibernation. Then boot some Linux
> distribution from a live USB and change the RTC time. Turn off the live USB and boot your
> hibernated system. And you should be in the same situation as I described.
Yes, but this is like shooting yourself in the foot.
If you want to be robust in this case and all others, then by default
you must restart ANY time sync daemon in the resume callback of your
init system, be it ntpd or chrony, under systemd, sysvinit, upstart or
anything else. But that is problematic, as Miroslav points out, because you
potentially start to trust any anonymous time source more than your own RTC.
The current makestep value is a sane default for the majority of sane
machines with standard use cases.
For a broken machine or a corner use case, I think the right level in the
stack for applying a workaround is the init level: restarting the
time daemon on resume and not messing with the makestep value. Because if you
do that, you will trust any anonymous and potentially bad time
source more than your own RTC not only at boot/resume time, but at all times.
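As a rough illustration of such an init-level workaround, here is a minimal sketch of a resume hook, assuming a systemd machine. The unit name in CHRONY_UNIT and the helper names are hypothetical and may need adjusting; the only behaviour relied on is that systemd-sleep runs the executables in /lib/systemd/system-sleep with "pre" or "post" as the first argument and the sleep action as the second (see the systemd-sleep man page).

```python
#!/usr/bin/env python3
"""Hypothetical systemd-sleep hook: restart the time daemon after resume."""
import subprocess
import sys

# Assumption: the unit is called chronyd.service; some distributions
# ship it as chrony.service instead.
CHRONY_UNIT = "chronyd.service"


def resume_command(phase, unit=CHRONY_UNIT):
    """Return the command to run for this sleep phase, or None to do nothing."""
    if phase == "post":
        # try-restart only restarts the unit if it is already running.
        # Running "chronyc makestep" here instead would be an alternative.
        return ["systemctl", "try-restart", unit]
    return None


def main(argv):
    # argv[1] is "pre" or "post", argv[2] is the sleep action
    # (for example "suspend" or "hibernate").
    if len(argv) < 3:
        return 0
    command = resume_command(argv[1])
    if command is None:
        return 0
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    main(sys.argv)
```

Dropped as an executable file into /lib/systemd/system-sleep, something like this leaves the makestep setting untouched; the restart only makes chronyd go through its normal start-up again, with the trust trade-off mentioned above.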
That's all I can say.
Emmanuel.