[OE-core] Fetch time optimization (svn : gcc/eglibc - git : linux-yocto)

Fri Mar 30 06:44:56 UTC 2012

2012/3/30 Bruce Ashfield <bruce.ashfield at gmail.com>:
> On Thu, Mar 29, 2012 at 6:03 PM, Richard Purdie
> <richard.purdie at linuxfoundation.org> wrote:
>> On Thu, 2012-03-29 at 22:53 +0200, Eric Bénard wrote:
>>> I noticed in from-scratch builds for qemuarm that the longest time is
>>> taken in fetching sources, especially those fetched using git
>>> (linux-yocto for example) & svn (gcc, eglibc & co).
>>
>> Are you timing these as fetches from the source control systems or from
>> the mirror tarballs of the repositories? The tarballs should be
>> faster...
>>
>>> To reduce the fetch time would that make sense to
>>> - fetch gcc/glibc & co from the archive of a stable version and then
>>>   apply patches on top of it (maybe patches stored in an archive
>>>   fetched from oe's website and applied in bulk or patches stored in OE)
>>
>> Unfortunately the patches tend to get unwieldy. The tarballs of the svn
>> repos on the mirror should be about equal in size to the upstream
>> archive in this case.
>>
>>> - do the same thing for the linux-yocto kernel or add a --reference
>>>   option to the git fetcher so that we can provide a local tree as a
>>>   reference ?
>>
>> This is effectively how the repositories in DL_DIR are used. If you
>> place a tree in the right place there, it should reuse references...
>
> Agreed .. they definitely do here.
>
> Richard probably recalls me asking for a --reference option several
> years ago as well .. but in the end, at some point the initial fetch happens
> and then the blobs are re-used. So setting up local mirrors, or pre-fetching
> are options to make sure that the first download is primed and ready to
> go. For most builds I do, any time fetching just happens in the background
> and doesn't get in the way.
>
>>
>>> Do you have other ideas (apart from using a local mirror) to optimize
>>> the fetch time ?
>>
>> I'd be interested firstly to understand if you're using the SCM directly
>> or using the mirror tarballs as that should make a big difference. In
>> the standard configuration it should be using mirror tarballs...
>
> As would I, since there are some ideas, but they either break workflows,
> don't follow best practices or compromise the completeness of the data.
>
> Cheers,
>
> Bruce
>
>>
>> Cheers,
>>
>> Richard
>>
>>
>> _______________________________________________
>> Openembedded-core mailing list
>> Openembedded-core at lists.openembedded.org
>> http://lists.linuxtogo.org/cgi-bin/mailman/listinfo/openembedded-core
>
>
>
> --
> "Thou shalt not follow the NULL pointer, for chaos and madness await
> thee at its end"

Hi,
this might be a bit off-topic, but another idea would be to add a
separate threading mechanism for fetching.

The current threading helps to make the best use of CPU and memory,
but sometimes a task still has to wait for a download to finish.
Instead, there could be a separate set of threads that only download
the sources, so that the available bandwidth is used well too.

This would also allow files to be fetched while the normal threads are busy
configuring/building/packaging recipes.

The downside is that it would require some sort of inter-process
communication.
Or it could be handled with a simple check of whether the download has finished.
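
To make the idea above a bit more concrete, here is a rough sketch in
Python. It is not how BitBake schedules its tasks; the names fetch(),
build() and run() are made up for illustration, and the download is
simulated with a sleep instead of real network access. It assumes
everything runs as threads in one process, so a Future can play the role
of the "is the download finished?" check and no real inter-process
communication is needed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(name, delay):
    """Stand-in for a real download; sleeps instead of using the network."""
    time.sleep(delay)
    return f"{name}.tar.gz"


def build(name, fetch_future, log, lock):
    """Build step: blocks only until this recipe's own download is done."""
    archive = fetch_future.result()  # the "download finished?" check
    with lock:
        log.append((name, archive))


def run(recipes, fetch_threads=4, build_threads=2):
    """Run downloads and builds on two separate thread pools."""
    log, lock = [], threading.Lock()
    with ThreadPoolExecutor(fetch_threads) as fetchers, \
         ThreadPoolExecutor(build_threads) as builders:
        # Queue every download up front so the bandwidth stays busy.
        downloads = {name: fetchers.submit(fetch, name, delay)
                     for name, delay in recipes.items()}
        builds = [builders.submit(build, name, future, log, lock)
                  for name, future in downloads.items()]
        for b in builds:
            b.result()  # re-raise any error from a build worker
    return log


if __name__ == "__main__":
    # Delays are invented numbers, only there to simulate slow fetches.
    print(run({"gcc": 0.3, "eglibc": 0.2, "linux-yocto": 0.5}))
```

The fetch pool is sized independently of the build pool, so the number
of parallel downloads can follow the bandwidth rather than the number of
CPUs. A build worker only waits if its own source has not arrived yet.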

How does this idea sound to you?

-- 
Regards
Samuel