[PATCH] Fix guile-fibers resource leak

DoneSubmitted by Caleb Ristvedt.
Details
2 participants
  • Caleb Ristvedt
  • Christopher Baines
Owner
unassigned
Severity
normal
C
C
Caleb Ristvedt wrote on 22 May 03:52 +0200
(address . guix-patches@gnu.org)
87zha0zqft.fsf@cune.org
This adds a patch to guile-fibers to fix a resource leak that causedfile descriptors to be opened and never closed with each invocation of`run-fibers'. This is presumably what was causing the tests to fail, asguile will abort when it gets EMFILE while attempting to create a newthread. I've verified that it builds on my system, but it's only a4-core machine, and the rate at which file descriptors leak scales withthe number of cores, so it's possible it would have built successfullyhere regardless. Could someone with access to a system with more coresverify that it now builds properly there?
Hopefully a bug fix release will show up soon enough and we can get ridof this.
- reepca
From 659fa6b70cb8364187753e240076cdb107320070 Mon Sep 17 00:00:00 2001From: Caleb Ristvedt <caleb.ristvedt@cune.org>Date: Thu, 21 May 2020 20:30:58 -0500Subject: [PATCH] gnu: guile-fibers: Add patch to fix resource leak.
guile-fibers@1.0.0 has a resource leak where run-fibers will only destroy onescheduler, but it creates as many as there are cpu cores by default (seehttps://github.com/wingo/fibers/issues/36). This causes the tests to fail onsystems with many cores, and can cause guile to crash under certaincircumstances. This fixes that resource leak. At present neither git masternor the latest release has fixed this yet.
* gnu/packages/patches/guile-fibers-destroy-peer-schedulers.patch: new patch.* gnu/local.mk: add it to the list of patches.* gnu/packages/guile-xyz.scm (guile-fibers): use it.--- gnu/local.mk | 1 + gnu/packages/guile-xyz.scm | 5 +++- ...guile-fibers-destroy-peer-schedulers.patch | 24 +++++++++++++++++++ 3 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 gnu/packages/patches/guile-fibers-destroy-peer-schedulers.patch
Toggle diff (60 lines)diff --git a/gnu/local.mk b/gnu/local.mkindex 1d9de9a57e..2f24f892b1 100644--- a/gnu/local.mk+++ b/gnu/local.mk@@ -1053,6 +1053,7 @@ dist_patch_DATA = \ %D%/packages/patches/guile-3.0-relocatable.patch \ %D%/packages/patches/guile-linux-syscalls.patch \ %D%/packages/patches/guile-3.0-linux-syscalls.patch \+ %D%/packages/patches/guile-fibers-destroy-peer-schedulers.patch \ %D%/packages/patches/guile-gdbm-ffi-support-gdbm-1.14.patch \ %D%/packages/patches/guile-present-coding.patch \ %D%/packages/patches/guile-rsvg-pkgconfig.patch \diff --git a/gnu/packages/guile-xyz.scm b/gnu/packages/guile-xyz.scmindex 674b1f922b..a1deee32d1 100644--- a/gnu/packages/guile-xyz.scm+++ b/gnu/packages/guile-xyz.scm@@ -523,7 +523,10 @@ Unix-style DSV format and RFC 4180 format.") (("#:use-module \\(fibers\\)") (string-append "#:use-module (fibers)\n" "#:use-module (ice-9 threads)\n")))- #t))))+ #t))+ (patches+ ;; fixes a resource leak that causes crashes in the tests+ (search-patches "guile-fibers-destroy-peer-schedulers.patch")))) (build-system gnu-build-system) (arguments '(;; The code uses 'scm_t_uint64' et al., which are deprecated in 3.0.diff --git a/gnu/packages/patches/guile-fibers-destroy-peer-schedulers.patch b/gnu/packages/patches/guile-fibers-destroy-peer-schedulers.patchnew file mode 100644index 0000000000..8bb7153153--- /dev/null+++ b/gnu/packages/patches/guile-fibers-destroy-peer-schedulers.patch@@ -0,0 +1,24 @@+Fibers 1.0.0 has a bug in run-fibers in which peer schedulers aren't destroyed -+so if you had 4 cores, 1 would be destroyed when run-fibers returned, but the+other 3 would stay around. Each scheduler uses 3 file descriptors, so for+machines with many cores, this resource leak adds up quickly - quickly enough+that the test suite can even fail because of it.++See https://github.com/wingo/fibers/issues/36.++This fixes that. It should be safe to destroy the peer schedulers at the given+point because the threads that could be running them are all either dead or the+current thread.++As of May 21, 2020, this bug still existed in the 1.0.0 (latest) release and in+git master.+--- a/fibers.scm 2020-05-21 18:38:06.890690154 -0500++++ b/fibers.scm 2020-05-21 18:38:56.395686693 -0500+@@ -137,5 +137,6 @@+ (%run-fibers scheduler hz finished? affinity))+ (lambda ()+ (stop-auxiliary-threads scheduler)))))++ (for-each destroy-scheduler (scheduler-remote-peers scheduler))+ (destroy-scheduler scheduler)+ (apply values (atomic-box-ref ret))))))+-- 2.26.2
-----BEGIN PGP SIGNATURE-----
iQEzBAEBCAAdFiEEdNapMPRLm4SepVYGwWaqSV9/GJwFAl7HMGYACgkQwWaqSV9/GJyeoAf/d/jVHgzjj4V/5g2JAfGhzz02VaFoI8xi5hTR/QzGuV1oHtQTqOW52P//L2EXz4U0jA2fKoBn3wrmu2J8X87N/o02G+uai42MFshOVn8wjqrtYOm9TIVLYZPpoQYusoklnR123viu/QEABfzP7GF0u3dizMqF/q1ukdTMb+fNy7sLZ/UrYl7wimS3JyRq2pisaOFJWxSQkg8WROsPHxOQBGE8/2aPp6KdTu2QZpuyDm8pYApd4/rG8ZrYXrx1naejA8SA5GyEmdkAZWbphze3l2Hj0aNTa8PzHfsa7sRuod2wvYgLlVmg3E3g1+SlYaBnTAHCsLyUqx62t29p/harLA===oNOA-----END PGP SIGNATURE-----
C
C
Christopher Baines wrote on 22 May 19:44 +0200
87eerbc19g.fsf@cbaines.net
Caleb Ristvedt <caleb.ristvedt@cune.org> writes:
Toggle quote (10 lines)> This adds a patch to guile-fibers to fix a resource leak that caused> file descriptors to be opened and never closed with each invocation of> `run-fibers'. This is presumably what was causing the tests to fail, as> guile will abort when it gets EMFILE while attempting to create a new> thread. I've verified that it builds on my system, but it's only a> 4-core machine, and the rate at which file descriptors leak scales with> the number of cores, so it's possible it would have built successfully> here regardless. Could someone with access to a system with more cores> verify that it now builds properly there?
I've tried this on bayfront.guix.gnu.org which has 32 cores and I'm veryglad to say it seems to work!
Maybe tweak the capitalisation in the commit message, "New patch", "Addit ...", "Use it", but yeah, I'm all for merging this, with it I'll beable to reconfigure bayfront hopefully. Are you set to tweak the commitmessage and push?
Thanks,
Chris
-----BEGIN PGP SIGNATURE-----
iQKTBAEBCgB9FiEEPonu50WOcg2XVOCyXiijOwuE9XcFAl7ID5tfFIAAAAAALgAoaXNzdWVyLWZwckBub3RhdGlvbnMub3BlbnBncC5maWZ0aGhvcnNlbWFuLm5ldDNFODlFRUU3NDU4RTcyMEQ5NzU0RTBCMjVFMjhBMzNCMEI4NEY1NzcACgkQXiijOwuE9XeMjg/9FjyEOHwUerdwzDNIz8l/KikV3SWCM0W7bFdoohfeMHHhq8MI7SPzWPu3pL750T/kgjHW6cyE/Q5yil7wqnPTItEXP+IL1vIw9IRTjmKCU4iBqvFzXG/wj3N1TKGlxbUGfs8tkinn4JU3zaQL+zwQtfS8krXUoMdzcD7S4cGi8+tDh/JIMvFV9R1kYzduQRr60YTO2q24QY9jfwrlHH9+kDLU5gAs2dRMLcOjdCwyKQid1CGttOge6K8NqUB4Wh/RreGt5OojzjWRtgngN3hlIbUUX7fbT5tiFh6nl1SkT/X73XRKg8CAjzAoXTGexUNZNbwPGd1yqkc/xh3+FOvixDB9M3kRlZOB9MfJM32hbPY2HY7G93hQDvHHjJ97CT+OPtD3YFZeoS9q3K7Jlwg1hZluk/V++y1chDsgSgbDWvPEIryP7AOnmhxWuLzCAly/NkjI/0h+G/7H7qfmcE0PRKg856cVaUh89VyifLQ+XJDXpgbYGSSL2cBjWTbSyMczzVWsLM5YIpGnofJ7svCUNCZwxOoYC9yg4qCc0I3Meqk4z+k0iR58uYe+SuQ1Cr64xad0UnuzFLO/vdlDSR/sjcf0FbXalQKtzB31kCwywjryZwJfuyvUwmm8iS6y893Ywq9FHfw0DLVnlc2LtRdffTP+x7efQMmcHlGRpCbqnGo==oaFa-----END PGP SIGNATURE-----
C
C
Caleb Ristvedt wrote on 22 May 21:38 +0200
(name . Christopher Baines)(address . mail@cbaines.net)(address . 41444@debbugs.gnu.org)
87tv07zrnh.fsf@cune.org
Christopher Baines <mail@cbaines.net> writes:
Toggle quote (3 lines)> I've tried this on bayfront.guix.gnu.org which has 32 cores and I'm very> glad to say it seems to work!
Excellent.
Toggle quote (3 lines)> Maybe tweak the capitalisation in the commit message, "New patch", "Add> it ...", "Use it",
Done.
Toggle quote (2 lines)> Are you set to tweak the commit message and push?
Indeed, tweaked and pushed as 9af90aafdfd8afd5fb7b5377ca5daf2215d38d7a.
- reepca
C
C
Caleb Ristvedt wrote on 22 May 21:39 +0200
done
(address . 41444-done@debbugs.gnu.org)
87pnavzrmj.fsf@cune.org
closed.
Closed
?