[OE-core] [PATCH] openssh: Atomically generate host keys

Wed May 24 10:03:53 UTC 2017

> -----Original Message-----
> From: openembedded-core-bounces at lists.openembedded.org
> [mailto:openembedded-core-bounces at lists.openembedded.org] On Behalf Of
> Joshua Watt
> Sent: den 23 maj 2017 21:57
> To: Randy Witt <randy.e.witt at linux.intel.com>
> Cc: OE-core <openembedded-core at lists.openembedded.org>
> Subject: Re: [OE-core] [PATCH] openssh: Atomically generate host keys
> 
> On Tue, May 23, 2017 at 12:56 PM, Joshua Watt <jpewhacker at gmail.com>
> wrote:
> > On Tue, May 23, 2017 at 12:23 PM, Randy Witt
> > <randy.e.witt at linux.intel.com> wrote:
> >> On 05/23/2017 08:29 AM, Joshua Watt wrote:
> >>>
> >>> On Tue, May 23, 2017 at 9:37 AM, Burton, Ross
> <ross.burton at intel.com>
> >>> wrote:
> >>>>
> >>>>
> >>>> On 7 May 2017 at 02:33, Joshua Watt <jpewhacker at gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>> +if [ ! -f "$NAME" ]; then
> >>>>> +    echo "  generating ssh $TYPE key..."
> >>>>> +    ssh-keygen -q -f "${NAME}.tmp" -N '' -t $TYPE
> >>>>> +
> >>>>> +    # Sync to ensure data is written to temp file before
> renaming
> >>>>> +    sync
> >>>>> +
> >>>>> +    # Move (Atomically rename) files
> >>>>> +    # Rename the .pub file first, since the check that triggers
> a
> >>>>> +    # key generation is based on the private file.
> >>>>> +    mv -f "${NAME}.tmp.pub" "${NAME}.pub"
> >>>>> +    sync
> >>>>> +
> >>>>> +    mv -f "${NAME}.tmp" "${NAME}"
> >>>>> +    sync
> >>>>> +fi
> >>>>>
> >>>>
> >>>> All of these syncs seem quite enthusiastic, are they really
> needed?
> >>>> Writing
> >>>> the file to a temporary name and then mving it to the real name
> should
> >>>> result in either no file or a complete file in the event of power
> loss,
> >>>> surely?
> >>>
> >>>
> >>> My understanding (and observation) of most journal file systems is
> >>> that only metadata (i.e. directory entries and such) are journaled
> in
> >>> typical usage. The first sync is necessary in this case to ensure
> that
> >>> the actual file data gets put on the disk before we rename the
> files,
> >>> otherwise in the event of interruption journaled rename might get
> >>> replayed but have garbage data. The second one is more of a "force
> >>> operation order" sync to make sure the public file is written
> before
> >>> the private one, as a reordering would cause problems. The last
> sync
> >>> is the most optional, but I've seen it take minutes for disk to
> sync
> >>> contents if no one calls "sync", so it is very possible that all
> our
> >>> work of regenerating keys would be for naught if power is
> interrupted
> >>> in the meantime.
> >>>
> >>> I think some of these syncs can be removed. Namely, the first and
> >>> third one. The second one needs to be there, but can server double
> >>> duty of syncing data to disk and enforcing the order between the
> >>> public and private rename. It does mean we could get a state where
> the
> >>> public key exists but is garbage, but this should be OK because the
> >>> private key won't exist and it would be regenerated.
> >>>
> >>> The third sync can be removed and I can put one final sync at the
> end
> >>> after all keys are generated to ensure we won't go through all the
> >>> effort of regenerating the (last) key again in the event of
> >>> interruption shortly after, unless you would prefer I didn't.
> >>>
> >>
> >> The typical convention for this is,
> >>
> >> 1. Update file data.
> >> 2. sync file
> >> 3. sync containing directory
> >> 4. mv file to new location
> >> 5. sync directory containing new file (although I've seen this step
> left out
> >> before)
> >>
> >> https://lwn.net/Articles/457672/ is a good example which is linked
> from
> >> https://lwn.net/Articles/457667/
> >>
> >> But one of the important parts vs your code, is also that the
> example is
> >> only calling sync on the files/directory needed, vs calling "sync"
> with no
> >> arguments which is going to cause all data in all filesystem caches
> to be
> >> flushed.
> >
> > Ah, OK. That makes sense, I will update sync to specify the
> > files/directory explicitly.
> 
> FWIW, I did some tests on the sync behavior:
> 
> It appears that older versions of the sync command ignore any
> arguments and just call sync(). From Ubuntu 14.04:
> $ strace sync foo
> ...
> write(2, "sync: ", 6sync: )                   = 6
> write(2, "ignoring all arguments", 22)  = 22
> write(2, "\n", 1)                       = 1
> sync()                                  = 0
> ...
> 
> The same is true for the (default) busybox version of sync on master
> (again verified with strace), but it doesn't complain nosily about it.

Busybox has a separate fsync applet to do file synchronization, but it 
is not enabled in the default OE-core configuration...

> I looked at the coreutils code, and sync was changed to respect
> arguments in January of 2015
> (8b2bf5295f353016d4f5e6a2317d55b6a8e7fd00) and only call fsync() on
> the provided arguments (if any are provided).
> 
> Either way, adding the arguments to the sync call in my patch won't
> hurt because it is forward compatible, even though it won't be
> optimally efficient currently because the sync command currently
> simply calls sync()

//Peter