[OE-core] [PATCH] openssh: Atomically generate host keys

Tue May 23 19:56:54 UTC 2017

On Tue, May 23, 2017 at 12:56 PM, Joshua Watt <jpewhacker at gmail.com> wrote:
> On Tue, May 23, 2017 at 12:23 PM, Randy Witt
> <randy.e.witt at linux.intel.com> wrote:
>> On 05/23/2017 08:29 AM, Joshua Watt wrote:
>>>
>>> On Tue, May 23, 2017 at 9:37 AM, Burton, Ross <ross.burton at intel.com>
>>> wrote:
>>>>
>>>>
>>>> On 7 May 2017 at 02:33, Joshua Watt <jpewhacker at gmail.com> wrote:
>>>>>
>>>>>
>>>>> +if [ ! -f "$NAME" ]; then
>>>>> +    echo "  generating ssh $TYPE key..."
>>>>> +    ssh-keygen -q -f "${NAME}.tmp" -N '' -t $TYPE
>>>>> +
>>>>> +    # Sync to ensure data is written to temp file before renaming
>>>>> +    sync
>>>>> +
>>>>> +    # Move (Atomically rename) files
>>>>> +    # Rename the .pub file first, since the check that triggers a
>>>>> +    # key generation is based on the private file.
>>>>> +    mv -f "${NAME}.tmp.pub" "${NAME}.pub"
>>>>> +    sync
>>>>> +
>>>>> +    mv -f "${NAME}.tmp" "${NAME}"
>>>>> +    sync
>>>>> +fi
>>>>>
>>>>
>>>> All of these syncs seem quite enthusiastic, are they really needed?
>>>> Writing
>>>> the file to a temporary name and then mving it to the real name should
>>>> result in either no file or a complete file in the event of power loss,
>>>> surely?
>>>
>>>
>>> My understanding (and observation) of most journal file systems is
>>> that only metadata (i.e. directory entries and such) are journaled in
>>> typical usage. The first sync is necessary in this case to ensure that
>>> the actual file data gets put on the disk before we rename the files,
>>> otherwise in the event of interruption journaled rename might get
>>> replayed but have garbage data. The second one is more of a "force
>>> operation order" sync to make sure the public file is written before
>>> the private one, as a reordering would cause problems. The last sync
>>> is the most optional, but I've seen it take minutes for disk to sync
>>> contents if no one calls "sync", so it is very possible that all our
>>> work of regenerating keys would be for naught if power is interrupted
>>> in the meantime.
>>>
>>> I think some of these syncs can be removed. Namely, the first and
>>> third one. The second one needs to be there, but can server double
>>> duty of syncing data to disk and enforcing the order between the
>>> public and private rename. It does mean we could get a state where the
>>> public key exists but is garbage, but this should be OK because the
>>> private key won't exist and it would be regenerated.
>>>
>>> The third sync can be removed and I can put one final sync at the end
>>> after all keys are generated to ensure we won't go through all the
>>> effort of regenerating the (last) key again in the event of
>>> interruption shortly after, unless you would prefer I didn't.
>>>
>>
>> The typical convention for this is,
>>
>> 1. Update file data.
>> 2. sync file
>> 3. sync containing directory
>> 4. mv file to new location
>> 5. sync directory containing new file (although I've seen this step left out
>> before)
>>
>> https://lwn.net/Articles/457672/ is a good example which is linked from
>> https://lwn.net/Articles/457667/
>>
>> But one of the important parts vs your code, is also that the example is
>> only calling sync on the files/directory needed, vs calling "sync" with no
>> arguments which is going to cause all data in all filesystem caches to be
>> flushed.
>
> Ah, OK. That makes sense, I will update sync to specify the
> files/directory explicitly.

FWIW, I did some tests on the sync behavior:

It appears that older versions of the sync command ignore any
arguments and just call sync(). From Ubuntu 14.04:
$ strace sync foo
...
write(2, "sync: ", 6sync: )                   = 6
write(2, "ignoring all arguments", 22)  = 22
write(2, "\n", 1)                       = 1
sync()                                  = 0
...

The same is true for the (default) busybox version of sync on master
(again verified with strace), but it doesn't complain nosily about it.

I looked at the coreutils code, and sync was changed to respect
arguments in January of 2015
(8b2bf5295f353016d4f5e6a2317d55b6a8e7fd00) and only call fsync() on
the provided arguments (if any are provided).

Either way, adding the arguments to the sync call in my patch won't
hurt because it is forward compatible, even though it won't be
optimally efficient currently because the sync command currently
simply calls sync()

>
>>
>>
>>>>
>>>>>
>>>>> diff --git
>>>>> a/meta/recipes-connectivity/openssh/openssh/sshdgenkeys.service
>>>>> b/meta/recipes-connectivity/openssh/openssh/sshdgenkeys.service
>>>>> index 148e6ad..af56404 100644
>>>>> --- a/meta/recipes-connectivity/openssh/openssh/sshdgenkeys.service
>>>>> +++ b/meta/recipes-connectivity/openssh/openssh/sshdgenkeys.service
>>>>> @@ -1,22 +1,14 @@
>>>>>   [Unit]
>>>>>   Description=OpenSSH Key Generation
>>>>>   RequiresMountsFor=/var /run
>>>>> -ConditionPathExists=!/var/run/ssh/ssh_host_rsa_key
>>>>> -ConditionPathExists=!/var/run/ssh/ssh_host_dsa_key
>>>>> -ConditionPathExists=!/var/run/ssh/ssh_host_ecdsa_key
>>>>> -ConditionPathExists=!/var/run/ssh/ssh_host_ed25519_key
>>>>> -ConditionPathExists=!/etc/ssh/ssh_host_rsa_key
>>>>> -ConditionPathExists=!/etc/ssh/ssh_host_dsa_key
>>>>> -ConditionPathExists=!/etc/ssh/ssh_host_ecdsa_key
>>>>> -ConditionPathExists=!/etc/ssh/ssh_host_ed25519_key
>>>>
>>>>
>>>>
>>>> Can you not continue to use ConditionPathExists to only run this unit if
>>>> it
>>>> needs to run?  You can prepend the argument with | to make them logical
>>>> OR
>>>> instead of logical AND, if I'm reading this documentation correctly.
>>>
>>>
>>> I will try that. I wasn't aware that was an option since systemd conf
>>> files are somewhat new to me.
>>>
>>>>
>>>> Ross
>>
>>