[OE-core] [PATCH 5/5] useradd-staticids.bbclass: Read passwd/group files before parsing

Mark Hatle mark.hatle at windriver.com
Tue Nov 10 16:07:00 UTC 2015


On 11/10/15 9:54 AM, Peter Kjellerstedt wrote:
>> -----Original Message-----
>> From: Mark Hatle [mailto:mark.hatle at windriver.com]
>> Sent: den 6 november 2015 21:14
>> To: Peter Kjellerstedt
>> Cc: openembedded-core at lists.openembedded.org
>> Subject: Re: [PATCH 5/5] useradd-staticids.bbclass: Read passwd/group
>> files before parsing
>>
>> On 11/6/15 2:09 PM, Peter Kjellerstedt wrote:
>>>> -----Original Message-----
>>>> From: Mark Hatle [mailto:mark.hatle at windriver.com]
>>>> Sent: den 4 november 2015 01:33
>>>> To: Peter Kjellerstedt; openembedded-core at lists.openembedded.org
>>>> Subject: Re: [PATCH 5/5] useradd-staticids.bbclass: Read
>>>> passwd/group files before parsing
>>>>
>>>> On 11/3/15 6:06 PM, Peter Kjellerstedt wrote:
>>>>> Read and merge the passwd/group files before parsing the user and
>>>>> group definitions. This means they will only be read once per
>>>>> recipe. This solves a problem where if a user was definied in
>>>>> multiple files, it could generate group definitions for groups 
>>>>> that should not be created. E.g., if the first passwd file read 
>>>>> defines a user as:
>>>>>
>>>>> foobar::1234::::
>>>>>
>>>>> and the second passwd file defines it as:
>>>>>
>>>>> foobar:::nogroup:The foobar user:/:/bin/sh
>>>>>
>>>>> then a foobar group would be created even if the user will use the
>>>>> nogroup as its primary group.
>>>>
>>>> One minor thing
>>>>
>>>>> @@ -251,7 +269,7 @@ def update_useradd_static_config(d):
>>>>>
>>>>>              newparams.append(newparam)
>>>>>
>>>>> -        return " ;".join(newparams).strip()
>>>>> +        return ";".join(newparams).strip()
>>>>>
>>>>>      # Load and process the users and groups, rewriting the adduser/addgroup params
>>>>>      useradd_packages = d.getVar('USERADD_PACKAGES', True)
>>>>>
>>>>
>>>> The space was required because you could generate a user/group add
>>>> line that ended with a string.  Without the space, you could end up
>>>> merging two sets of arguments causing a failure condition.
>>>>
>>>> So I think that it should be retained unless there is a specific
>>>> reason you believe it should be removed.
>>>
>>> I cannot see how that space can make any difference. Each set of
>>> useradd/grouppadd options added to newparams has the user/group
>>> name at the end of the string. And if that somehow interferes with
>>> the semicolon, then the code in useradd.bbclass which simply does
>>> "cut -d ';'" to split the useradd/groupadd line would break already.
>>
>> The contents when originally parsed my be run as arguments to a shell
>> script or as parameters to these functions.
>>
>> In the shell script world not have a space can confuse the argument
>> parsing into thinking the ; is part of the argument.
> 
> No shell I have heard of (bash, zsh and dash comes to mind) would be 
> affected by the lack of a space before the semicolon. Moreover, this 
> is never actually parsed by a shell (except as part of a variable value). 
> The semicolon is used by useradd.bbclass to split the variables, after 
> which it lets the shell evaluate the part before the semicolon (which 
> will ignore any trailing whitespace).

I've seen broken shells in the past where you would do something like:

/bin/echo foo;/bin/echo bar

and get:  "/bin echo foo;/bin/echo bar" since it treated the middle item as a
single command.  I'm not saying it wasn't a bug in the shell or system -- just
that I've been burned by it in the past and because of this, I try not to rely
on it.. (when adding a space solves the issue.)

>> You don't have that in the python world with the split behavior.
>>
>>> Actually, now that I think about it, I do wonder why
>>> useradd-staticids.bbclass use this advanced variant to split the
>>> useradd/groupadd lines:
>>>
>>>         for param in re.split('''[ \t]*;[ \t]*(?=(?:[^'"]|'[^']*'|"[^"]*")*$)''', params):
>>
>> It is perfectly legal to allow a ';' in the middle of a parameter (that
>> allows it), a parameter that is quoted.
> 
> Sure, and the code above handles some cases, but definitely not all. 
> E.g., this would not be parsed as intended by useradd-staticids.bbclass:
> 
> USERADD_PARAMS_${PN} += "-c Comment\ with\ an\ \'\ in\ it oddcomment; \
>                          -c Other\ odd \'\ comment otheruser"
> 
> but it would be handled correctly by useradd.bbclass...

In this case, we need to emulate a reasonable set of argument processing.  If
there is something built into bitbake/oe then we can use it.  The re.split
though was a good approximation of the common configurations I was seeing at the
time.  (Specifically quotes parameters for spaces and ;)

>> Something like:
>>
>> adduser -c "This user;that user;all users" -d /home/allusers alluser
>>
>> it's odd, but I've certainly seen people put ';' in the comment
>> before.. and it is legal in other palces, like the home dir and 
>> such -- just not advised.
> 
> It may be legal, but it has never been supported by the adduser.bbclass. 
> And thus it has never worked in practice to take an existing passwd 
> file that contains a semicolon in the comment field and expect it to 
> work as input to adduser-staticids.bbclass.

Then that is a bug we should fix.  At one time this was working (perhaps not in
OE, but locally?)   Since I did have a customer who had both spaces and ';' in
their comment field.  (This is how I originally found the problem and needed to
figure out a regex that worked in more cases.)

> Moreover, neither adduser.bbclass nor adduser-staticids.bbclass will 
> handle a semicolon correctly in any other field. And let's not get 
> started on other special characters like ", \, $ and > which could 
> wreak havoc on both classes in a correctly crafted passwd file.

I realize there are may corner cases here.  We need to try to support the
reasonable ones and prevent the others.  If that list should be avoided in
certain cases -- then we should enhance the system catch things "not to do".
But using ';' in the comment is fairly common in some environments.

>>> when this would do the job just as well:
>>>
>>>         for param in params.split(';'):
>>>
>>> given that that is what useradd.bbclass does. It looks as if tries
>>> to support something like --comment "something with a ; in it", but
>>> using that would break in useradd.bbclass anyway...
>>
>> Then the useradd class is broken in this case.  The --comment
>> processing needs to work, it's just rarely used in the normal 
>> case, but very much used in the "lets take a previously generated 
>> passwd file and reuse it" case of the adduser-static.
>>
>>> //Peter
> 
> So, from reading the code in both adduser.bbclass and 
> adduser-staticids.bbclass, having spaces or no spaces before the 
> semicolon in the USERADD_PARAM_${PN} variable cannot make any 
> difference to how the users are created in the end. Thus it should 
> be safe to remove them.

The (initial) generated adduser/addgroup/etc command is dumped into the pre/post
install script section of the packages.  So when the package manager runs it
needs to execute the shell script.  This is where I've seen the problem of the
';' in the past...

Adduser and adduser-static should be synced to use the same delimitation
mechanism.  And for things like a ; in the comment, both should equally support
it.  We've got a mismatch and that is definitely a bug.

--Mark

> //Peter
> 




More information about the Openembedded-core mailing list