[OE-core] [PATCH] openssh: Atomically generate host keys

Wed May 24 13:38:11 UTC 2017

On Wed, May 24, 2017 at 5:03 AM, Peter Kjellerstedt
<peter.kjellerstedt at axis.com> wrote:
>> -----Original Message-----
>> From: openembedded-core-bounces at lists.openembedded.org
>> [mailto:openembedded-core-bounces at lists.openembedded.org] On Behalf Of
>> Joshua Watt
>> Sent: den 23 maj 2017 21:57
>> To: Randy Witt <randy.e.witt at linux.intel.com>
>> Cc: OE-core <openembedded-core at lists.openembedded.org>
>> Subject: Re: [OE-core] [PATCH] openssh: Atomically generate host keys
>>
>> On Tue, May 23, 2017 at 12:56 PM, Joshua Watt <jpewhacker at gmail.com>
>> wrote:
>> > On Tue, May 23, 2017 at 12:23 PM, Randy Witt
>> > <randy.e.witt at linux.intel.com> wrote:
>> >> On 05/23/2017 08:29 AM, Joshua Watt wrote:
>> >>>
>> >>> On Tue, May 23, 2017 at 9:37 AM, Burton, Ross
>> <ross.burton at intel.com>
>> >>> wrote:
>> >>>>
>> >>>>
>> >>>> On 7 May 2017 at 02:33, Joshua Watt <jpewhacker at gmail.com> wrote:
>> >>>>>
>> >>>>>
>> >>>>> +if [ ! -f "$NAME" ]; then
>> >>>>> +    echo "  generating ssh $TYPE key..."
>> >>>>> +    ssh-keygen -q -f "${NAME}.tmp" -N '' -t $TYPE
>> >>>>> +
>> >>>>> +    # Sync to ensure data is written to temp file before
>> renaming
>> >>>>> +    sync
>> >>>>> +
>> >>>>> +    # Move (Atomically rename) files
>> >>>>> +    # Rename the .pub file first, since the check that triggers
>> a
>> >>>>> +    # key generation is based on the private file.
>> >>>>> +    mv -f "${NAME}.tmp.pub" "${NAME}.pub"
>> >>>>> +    sync
>> >>>>> +
>> >>>>> +    mv -f "${NAME}.tmp" "${NAME}"
>> >>>>> +    sync
>> >>>>> +fi
>> >>>>>
>> >>>>
>> >>>> All of these syncs seem quite enthusiastic, are they really
>> needed?
>> >>>> Writing
>> >>>> the file to a temporary name and then mving it to the real name
>> should
>> >>>> result in either no file or a complete file in the event of power
>> loss,
>> >>>> surely?
>> >>>
>> >>>
>> >>> My understanding (and observation) of most journal file systems is
>> >>> that only metadata (i.e. directory entries and such) are journaled
>> in
>> >>> typical usage. The first sync is necessary in this case to ensure
>> that
>> >>> the actual file data gets put on the disk before we rename the
>> files,
>> >>> otherwise in the event of interruption journaled rename might get
>> >>> replayed but have garbage data. The second one is more of a "force
>> >>> operation order" sync to make sure the public file is written
>> before
>> >>> the private one, as a reordering would cause problems. The last
>> sync
>> >>> is the most optional, but I've seen it take minutes for disk to
>> sync
>> >>> contents if no one calls "sync", so it is very possible that all
>> our
>> >>> work of regenerating keys would be for naught if power is
>> interrupted
>> >>> in the meantime.
>> >>>
>> >>> I think some of these syncs can be removed. Namely, the first and
>> >>> third one. The second one needs to be there, but can server double
>> >>> duty of syncing data to disk and enforcing the order between the
>> >>> public and private rename. It does mean we could get a state where
>> the
>> >>> public key exists but is garbage, but this should be OK because the
>> >>> private key won't exist and it would be regenerated.
>> >>>
>> >>> The third sync can be removed and I can put one final sync at the
>> end
>> >>> after all keys are generated to ensure we won't go through all the
>> >>> effort of regenerating the (last) key again in the event of
>> >>> interruption shortly after, unless you would prefer I didn't.
>> >>>
>> >>
>> >> The typical convention for this is,
>> >>
>> >> 1. Update file data.
>> >> 2. sync file
>> >> 3. sync containing directory
>> >> 4. mv file to new location
>> >> 5. sync directory containing new file (although I've seen this step
>> left out
>> >> before)
>> >>
>> >> https://lwn.net/Articles/457672/ is a good example which is linked
>> from
>> >> https://lwn.net/Articles/457667/
>> >>
>> >> But one of the important parts vs your code, is also that the
>> example is
>> >> only calling sync on the files/directory needed, vs calling "sync"
>> with no
>> >> arguments which is going to cause all data in all filesystem caches
>> to be
>> >> flushed.
>> >
>> > Ah, OK. That makes sense, I will update sync to specify the
>> > files/directory explicitly.
>>
>> FWIW, I did some tests on the sync behavior:
>>
>> It appears that older versions of the sync command ignore any
>> arguments and just call sync(). From Ubuntu 14.04:
>> $ strace sync foo
>> ...
>> write(2, "sync: ", 6sync: )                   = 6
>> write(2, "ignoring all arguments", 22)  = 22
>> write(2, "\n", 1)                       = 1
>> sync()                                  = 0
>> ...
>>
>> The same is true for the (default) busybox version of sync on master
>> (again verified with strace), but it doesn't complain nosily about it.
>
> Busybox has a separate fsync applet to do file synchronization, but it
> is not enabled in the default OE-core configuration...

Ok. I think for best compatibility I'll just use "sync", as that will
always be safe, just not always optimally efficient. I think this is
OK, since key generation either only occurs once (r/w rootfs), or once
per boot (r/o rootfs).

>
>> I looked at the coreutils code, and sync was changed to respect
>> arguments in January of 2015
>> (8b2bf5295f353016d4f5e6a2317d55b6a8e7fd00) and only call fsync() on
>> the provided arguments (if any are provided).
>>
>> Either way, adding the arguments to the sync call in my patch won't
>> hurt because it is forward compatible, even though it won't be
>> optimally efficient currently because the sync command currently
>> simply calls sync()
>
> //Peter
>