Rust builds systematically time out

Open. Submitted by Ludovic Courtès.
Details
5 participants
  • Ivan Petkov
  • John Soo
  • Ludovic Courtès
  • mikadoZero
  • Pierre Langlois
Owner: unassigned
Severity: important
Ludovic Courtès wrote on 4 Apr 2019 10:59
(address . bug-Guix@gnu.org)
878swqtabb.fsf@gnu.org
Hello,
On berlin, Rust 1.24.1 builds systematically exceed the timeout:
Building stage1 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Compiling arena v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libarena)
Compiling rustc_driver v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_driver)

[...]

Compiling rls-data v0.14.0
Compiling rustc_data_structures v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_data_structures)
Compiling flate2 v1.0.1
Compiling syntax_pos v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libsyntax_pos)
Compiling rustc_errors v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_errors)
Compiling backtrace v0.3.4
guix offload: error: timeout expired while offloading '/gnu/store/61bd22d9mg3xl260jwddisiahh3kmanj-rust-1.24.1.drv'
Strangely, the build lasts ~9000 seconds (2.5 hours) on the front-end node of berlin¹, and the timeout for guix-daemon on berlin is 6h (see guix-maintenance.git) while the max-silent-time is 1h.
The build nodes may be slower than the front-end, but still, it seems unlikely that it would take more than 6h there. (That could happen if the test suite, which lasts 2.1h, were “embarrassingly parallel”, but we’re running tests with ‘-j1’.)
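For context, the two daemon limits mentioned above correspond to the `timeout` and `max-silent-time` fields of Guix's `guix-configuration` record, both expressed in seconds. The fragment below is only a sketch using the values quoted in this message; the name `%daemon-limits` is hypothetical, and the actual berlin configuration in guix-maintenance.git may be structured differently.

```scheme
;; Sketch only: values taken from the message above, not from the
;; real berlin configuration.
(use-modules (gnu services base))

(define %daemon-limits              ; hypothetical name
  (guix-configuration
   (timeout (* 6 3600))             ; 6 hours of total build time
   (max-silent-time 3600)))         ; 1 hour without any build output
```

With these values, a build that takes about 2.5 hours on the front-end would need to be more than twice as slow on a build node before the 6-hour limit applies.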
To summarize, there are two problems:
1. Rust takes too long to build. What can we do about it? Enable parallel builds?
2. Offloaded builds seem to time out prematurely or something.
Thoughts?
Ludo’.
¹ See https://ci.guix.info/log/rkrnm3rr7g6fhr17160vn1mz5rdzh9lv-rust-1.24.1 for timings.
Pierre Langlois wrote on 4 Apr 2019 11:28
(address . bug-guix@gnu.org)(name . Ivan Petkov)(address . ivanppetkov@gmail.com)
87bm1mglus.fsf@gmx.com
Hello!
Ludovic Courtès writes:
> Hello,
>
> On berlin, Rust 1.24.1 builds systematically exceed the timeout:
>
> --8<---------------cut here---------------start------------->8---
> Building stage1 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
> Compiling arena v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libarena)
> Compiling rustc_driver v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_driver)
>
> [...]
>
> Compiling rls-data v0.14.0
> Compiling rustc_data_structures v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_data_structures)
> Compiling flate2 v1.0.1
> Compiling syntax_pos v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libsyntax_pos)
> Compiling rustc_errors v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_errors)
> Compiling backtrace v0.3.4
> guix offload: error: timeout expired while offloading '/gnu/store/61bd22d9mg3xl260jwddisiahh3kmanj-rust-1.24.1.drv'
> --8<---------------cut here---------------end--------------->8---
>
> Strangely, the build lasts ~9000 seconds (2.5 hours) on the front-end
> node of berlin¹, and the timeout for guix-daemon on berlin is 6h (see
> guix-maintenance.git) while the max-silent-time is 1h.
>
> The build nodes may be slower than the front-end, but still, it seems
> unlikely that it would take more than 6h there. (That could happen if
> the test suite, which lasts 2.1h, were “embarrassingly parallel”, but
> we’re running tests with ‘-j1’.)
>
> To summarize, there are two problems:
>
> 1. Rust takes too long to build. What can we do about it? Enable
> parallel builds?
One thing I suggested in the past was to remove the check phase *only* for rust packages used for bootstrapping. This way we still run the tests for the final rust but not at every step in the chain.
Although, I wonder if we're more likely to miss a bug if we do this, I'm not sure.
For reference: https://lists.gnu.org/archive/html/guix-patches/2018-11/msg00453.html
Thanks,
Pierre
Ludovic Courtès wrote on 4 Apr 2019 13:24
control message for bug #35139
(address . control@debbugs.gnu.org)
87y34qrp14.fsf@gnu.org
severity 35139 important
Ivan Petkov wrote on 4 Apr 2019 17:47
Re: bug#35139: Rust builds systematically time out
(address . bug-guix@gnu.org)
101FBDE5-97FA-4449-9076-DD24C56B8715@gmail.com
> On Apr 4, 2019, at 1:59 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>
> The build nodes may be slower than the front-end, but still, it seems
> unlikely that it would take more than 6h there. (That could happen if
> the test suite, which lasts 2.1h, were “embarrassingly parallel”, but
> we’re running tests with ‘-j1’.)
>
> To summarize, there are two problems:
>
> 1. Rust takes too long to build. What can we do about it? Enable
> parallel builds?
Rust tests are designed to run in parallel, as long as you have enough RAM, file descriptors, etc. available on the machine for the amount of concurrency being used. The compiler test suite is largely just compiling files, so the most important resource is probably available RAM/swap.
> On Apr 4, 2019, at 2:28 AM, Pierre Langlois <pierre.langlois@gmx.com> wrote:
>
> One thing I suggested in the past was to remove the check phase *only*
> for rust packages used for bootstrapping. This way we still run the
> tests for the final rust but not at every step in the chain.
>
> Although, I wonder if we're more likely to miss a bug if we do this, I'm
> not sure.
Although that definitely will speed the bootstrap chain, I’m concerned that if a dependency package ever gets updated and breaks things we wouldn’t know without running the test suite.
Maybe if the bootstrapped versions don’t ever change skipping the check phase will be safe, but I think we should try running parallel tests first and see how far that gets us.
—Ivan
Ludovic Courtès wrote on 4 Apr 2019 18:06
(name . Ivan Petkov)(address . ivanppetkov@gmail.com)
87imvtrc0g.fsf@gnu.org
Ivan Petkov <ivanppetkov@gmail.com> skribis:
>> On Apr 4, 2019, at 1:59 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>
>> The build nodes may be slower than the front-end, but still, it seems
>> unlikely that it would take more than 6h there. (That could happen if
>> the test suite, which lasts 2.1h, were “embarrassingly parallel”, but
>> we’re running tests with ‘-j1’.)
>>
>> To summarize, there are two problems:
>>
>> 1. Rust takes too long to build. What can we do about it? Enable
>> parallel builds?
>
> Rust tests are designed to run in parallel, as long as you have enough
> RAM, file descriptors, etc. available on the machine for the amount of
> concurrency being used. The compiler test suite is largely just compiling
> files, so the most important resource is probably available RAM/swap.
Perhaps we could start with:
"-j" (number->string (min (parallel-job-count) 2))
?
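To make the effect of that expression concrete, here is a minimal sketch in plain Guile. In Guix, `parallel-job-count` comes from `(guix build utils)`; the stand-in definition below (fixed at 8 jobs) and the helper name `test-job-flags` are assumptions made only so the example runs outside a Guix build environment.

```scheme
;; Hypothetical stand-in for Guix's `parallel-job-count', fixed at 8
;; so that the example runs in plain Guile.
(define (parallel-job-count) 8)

;; Build the "-j N" arguments, capping N at MAX-JOBS.
(define (test-job-flags max-jobs)
  (list "-j" (number->string (min (parallel-job-count) max-jobs))))

(test-job-flags 2)   ; → ("-j" "2")
(test-job-flags 16)  ; → ("-j" "8")
```

Capping at 2 would keep memory and file-descriptor pressure low while still roughly halving test time in the best case; the cap could be raised later if no resource errors show up.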
> Maybe if the bootstrapped versions don’t ever change skipping the check
> phase will be safe, but I think we should try running parallel tests first
> and see how far that gets us.
Sounds like a good start.
So the only reason we’re running tests sequentially is because of memory usage concerns?
Thanks,
Ludo’.
Ivan Petkov wrote on 4 Apr 2019 19:37
(name . Ludovic Courtès)(address . ludo@gnu.org)
17B412D1-5D9A-40C8-B37E-D8C08F0E9641@gmail.com
Danny’s got a patch for turning on parallel tests in #35126.
Not sure why the previous tests were running sequentially, but there is a comment somewhere saying it’s to avoid EAGAIN errors.
--Ivan
> On Apr 4, 2019, at 9:06 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>
> Ivan Petkov <ivanppetkov@gmail.com> skribis:
>
>>> On Apr 4, 2019, at 1:59 AM, Ludovic Courtès <ludo@gnu.org> wrote:
>>>
>>> The build nodes may be slower than the front-end, but still, it seems
>>> unlikely that it would take more than 6h there. (That could happen if
>>> the test suite, which lasts 2.1h, were “embarrassingly parallel”, but
>>> we’re running tests with ‘-j1’.)
>>>
>>> To summarize, there are two problems:
>>>
>>> 1. Rust takes too long to build. What can we do about it? Enable
>>> parallel builds?
>>
>> Rust tests are designed to run in parallel, as long as you have enough
>> RAM, file descriptors, etc. available on the machine for the amount of
>> concurrency being used. The compiler test suite is largely just compiling
>> files, so the most important resource is probably available RAM/swap.
>
> Perhaps we could start with:
>
> "-j" (number->string (min (parallel-job-count) 2))
>
> ?
>
>> Maybe if the bootstrapped versions don’t ever change skipping the check
>> phase will be safe, but I think we should try running parallel tests first
>> and see how far that gets us.
>
> Sounds like a good start.
>
> So the only reason we’re running tests sequentially is because of memory
> usage concerns?
>
> Thanks,
> Ludo’.
mikadoZero wrote on 5 Apr 2019 23:18
(name . Ludovic Courtès)(address . ludo@gnu.org)
cucpnq040ck.fsf@yandex.com
When I try to install rust I get similar behavior. It does not finish building. The longest I have let it try for was around 12 hours. That was on a machine with 1 GB of RAM and 10 GB of swap.
Ludovic Courtès writes:
> Hello,
>
> On berlin, Rust 1.24.1 builds systematically exceed the timeout:
>
> --8<---------------cut here---------------start------------->8---
> Building stage1 compiler artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
> Compiling arena v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libarena)
> Compiling rustc_driver v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_driver)
>
> [...]
>
> Compiling rls-data v0.14.0
> Compiling rustc_data_structures v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_data_structures)
> Compiling flate2 v1.0.1
> Compiling syntax_pos v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/libsyntax_pos)
> Compiling rustc_errors v0.0.0 (file:///tmp/guix-build-rust-1.24.1.drv-0/rustc-1.24.1-src/src/librustc_errors)
> Compiling backtrace v0.3.4
> guix offload: error: timeout expired while offloading '/gnu/store/61bd22d9mg3xl260jwddisiahh3kmanj-rust-1.24.1.drv'
> --8<---------------cut here---------------end--------------->8---
>
> Strangely, the build lasts ~9000 seconds (2.5 hours) on the front-end
> node of berlin¹, and the timeout for guix-daemon on berlin is 6h (see
> guix-maintenance.git) while the max-silent-time is 1h.
>
> The build nodes may be slower than the front-end, but still, it seems
> unlikely that it would take more than 6h there. (That could happen if
> the test suite, which lasts 2.1h, were “embarrassingly parallel”, but
> we’re running tests with ‘-j1’.)
>
> To summarize, there are two problems:
>
> 1. Rust takes too long to build. What can we do about it? Enable
> parallel builds?
>
> 2. Offloaded builds seem to time out prematurely or something.
>
> Thoughts?
>
> Ludo’.
>
> ¹ See <https://ci.guix.info/log/rkrnm3rr7g6fhr17160vn1mz5rdzh9lv-rust-1.24.1>
> for timings.
John Soo wrote on 30 Mar 07:42 +0200
Rust builds systematically time out
(address . 35139@debbugs.gnu.org)
119E152C-F6C8-4CB1-86C1-F213EB4571C0@asu.edu
Hi everyone,
Is this still happening? It looks like rust-1.24.1 is completing successfully on both ci servers.
- John