[bitbake-devel] [PATCH] prserv: don't wait until exit to sync

Gary Thomas gary at mlbassoc.com
Mon Nov 3 18:27:15 UTC 2014


On 2014-11-03 10:30, Richard Purdie wrote:
> On Mon, 2014-11-03 at 09:47 -0600, Ben Shelton wrote:
>> On 11/02, Burton, Ross wrote:
>>> On 27 October 2014 17:27, Ben Shelton <ben.shelton at ni.com> wrote:
>>>
>>>> In the commit 'prserv: Ensure data is committed', the PR server moved to
>>>> only committing transactions to the database when the PR server is
>>>> stopped.  This improves performance, but it means that if the machine
>>>> running the PR server loses power unexpectedly or if the PR server
>>>> process gets SIGKILL, the uncommitted package revision data is lost.
>>>>
>>>> To fix this issue, sync the database periodically, once per 30 seconds
>>>> by default, if it has been marked as dirty.  To be safe, continue to
>>>> sync the database at exit regardless of its status.
>>>>
>>>
>>> This appears to be causing random problems for me where bitbake will
>>> time out attempting to access the PR database; my hunch is that it's
>>> blocking on disk I/O.  Are there any tricks we can do with sqlite to reduce
>>> the overhead of committing (assuming that sqlite isn't causing a full
>>> filesystem sync)?
>>>
>>> Ross
>>
>> After running a few large nightly builds, we've seen some issues with
>> this as well.  It looks like the issue is in the PR server itself, which
>> logs this error:
>>
>> "OperationalError: cannot start a transaction within a transaction"
>>
>> However, I'm confused as to why this is happening, since the only place
>> new transactions are being created is in the sync() function ("BEGIN
>> EXCLUSIVE TRANSACTION"), and AFAIK that's only called by a single
>> thread.  Any ideas?
>
> Did the commit() fail and therefore there was already a transaction
> open? It leads to another question of why the commit would fail (timeout
> maybe?).
>
>> Would it make sense to revert the patch until we identify/fix the issue?
>
> You have flagged a valid issue that I would like to get to the bottom of
> so perhaps not quite yet.
>
> I'm wondering if we can have some in-memory copy of the table which we
> flush to disk in a separate thread, which wouldn't influence the PR
> service request responses, but it's a horrible idea to work around what
> seems like a fundamental problem in sqlite :/.

I just got this error:
ERROR: Can NOT get PRAUTO from remote PR service
ERROR: Function failed: package_get_auto_pr
ERROR: Logfile of failure stored in: /home/local/rpi-latest_2014-10-30/tmp/work/armv6-vfp-amltd-linux-gnueabi/usbutils/007-r0/temp/log.do_package.13260
ERROR: Task 3204 (/home/local/poky-latest/meta/recipes-bsp/usbutils/usbutils_007.bb, do_package) failed with exit code '1'

Is it the same as what's being discussed above?  Where can I
look for more info on what happened?

n.b. I just restarted my build and it seems happy to carry on
where it left off.
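
For reference, the "cannot start a transaction within a transaction"
error quoted above can be reproduced outside of bitbake. This is only a
minimal sketch, not the prserv code: the table layout, the row contents
and the sync() helper are made up for illustration, and it assumes the
connection runs with isolation_level=None so that transactions are
managed by hand. It shows that a second BEGIN fails if the previous
transaction was never committed, and that committing first avoids it.

```python
import sqlite3

# Assumption: explicit transaction control (no implicit BEGIN from the
# sqlite3 module). The table and row below are illustrative only.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE pr (version TEXT, value INTEGER)")


def sync(conn):
    """Hypothetical sync: commit pending work, then reopen a transaction."""
    if conn.in_transaction:
        conn.execute("COMMIT")
    conn.execute("BEGIN EXCLUSIVE TRANSACTION")


conn.execute("BEGIN EXCLUSIVE TRANSACTION")
conn.execute("INSERT INTO pr VALUES ('usbutils-007', 0)")

# Simulate a sync whose commit never happened: the old transaction is
# still open when the next BEGIN is issued, so SQLite refuses it.
try:
    conn.execute("BEGIN EXCLUSIVE TRANSACTION")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
print(error)

# The guarded helper commits the open transaction before starting a new one.
sync(conn)
rows = conn.execute("SELECT value FROM pr").fetchall()
print(rows)
```

If the real sync() can reach its BEGIN without a successful commit
(for example after a timeout), that would match the failure described
above, but that is a guess and not something confirmed in this thread.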

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------


