[PATCH] gnu: libtorrent-rasterbar: Work around hang in test_ssl.

Status: Open
Participants: Maxim Cournoyer, Tomas Volf
Owner: unassigned
Submitted by: Tomas Volf
Severity: normal
Tomas Volf wrote on 6 Oct 18:11 +0200
(address . guix-patches@gnu.org)(name . Tomas Volf)(address . ~@wolfsden.cz)
b318e317dab97a658b90669a49edc79247716c87.1728231109.git.~@wolfsden.cz
test_ssl sometimes hangs (at least when executed under faketime). The hang is
somewhat unlikely, and (on my machine) it took a build with --rounds=32 to
reproduce it.

The workaround is to set a somewhat lower timeout of 240s (expected test
duration * 5, rounded up to whole minutes) and to retry a few times on failure.
With this change, --rounds=64 finished successfully (on my machine).
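
The timeout rule above can be sketched in Guile as follows. This is only an illustration: the procedure name test-timeout is hypothetical, and the 45-second duration is an assumed figure, since the expected duration of test_ssl is not stated in this message.

```scheme
;; Sketch of the rule "expected test duration * 5, rounded up to whole
;; minutes".  The name test-timeout is hypothetical.
(define (test-timeout expected-seconds)
  "Return EXPECTED-SECONDS * 5 rounded up to whole minutes, in seconds."
  (* 60 (ceiling (/ (* 5 expected-seconds) 60))))

;; 45 seconds is an assumed duration, chosen only for illustration.
(display (test-timeout 45))
(newline)
```

Any expected duration above 36 and up to 48 seconds yields the 240-second value used in the patch.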

At the same time, remove the timeout from the other tests, since it is not
necessary (they do not hang), and one of them runs for ~270s (almost half the
original timeout), so it could pose a problem on a slow or overloaded machine.

* gnu/packages/bittorrent.scm
(libtorrent-rasterbar)[arguments]<#:phases>['check]: Remove test timeout for
most tests. Lower the timeout for test_ssl. Retry test_ssl on failure.

Change-Id: I535c72fec24658a4b2151d2e8794319055c9a278
---
gnu/packages/bittorrent.scm | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gnu/packages/bittorrent.scm b/gnu/packages/bittorrent.scm
index 2b38c7cb65..1a0735d928 100644
--- a/gnu/packages/bittorrent.scm
+++ b/gnu/packages/bittorrent.scm
@@ -452,7 +452,6 @@ (define-public libtorrent-rasterbar
(exclude-regex (string-append "^("
(string-join disabled-tests "|")
")$"))
- (timeout "600")
(jobs (if parallel-tests?
(number->string (parallel-job-count))
"1")))
@@ -460,7 +459,6 @@ (define-public libtorrent-rasterbar
(invoke "ctest"
"-E" exclude-regex
"-j" jobs
- "--timeout" timeout
"--output-on-failure")
;; test_ssl relies on bundled TLS certificates with a fixed
;; expiry date. To ensure succesful builds in the future,
@@ -470,16 +468,16 @@ (define-public libtorrent-rasterbar
;; test_fast_extension, test_privacy and test_resolve_links
;; to hang, even with FAKETIME_ONLY_CMDS. Not sure why. So
;; execute only test_ssl under faketime.
- ;;
- ;; Note: The test_ssl test times out in the ci.
- ;; Temporarily disable it until that is resolved.
- ;; (invoke "faketime" "2022-10-24"
- ;; "ctest"
- ;; "-R" "^test_ssl$"
- ;; "-j" jobs
- ;; "--timeout" timeout
- ;; "--output-on-failure")
- )))))))
+ (invoke "faketime" "2022-10-24"
+ "ctest"
+ "-R" "^test_ssl$"
+ "-j" jobs
+ ;; test_ssl sometimes hangs (at least when run under
+ ;; faketime), therefore set a time limit and retry
+ ;; few times on failure.
+ "--timeout" "240"
+ "--repeat" "until-pass:5"
+ "--output-on-failure"))))))))
(inputs (list boost openssl))
(native-inputs
(list libfaketime
--
2.46.0
Maxim Cournoyer wrote on 17 Dec 02:20 +0100
(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 73664@debbugs.gnu.org)
87r067rtpd.fsf@gmail.com
Hi Tomas,

Tomas Volf <~@wolfsden.cz> writes:

> test_ssl sometimes hangs (at least when executed under faketime). The hang is
> somewhat unlikely, and (on my machine) it took a build with --rounds=32 to
> reproduce it.

It'd be nice if upstream were made aware of this problem. Perhaps they
could come up with a permanent fix.

> The workaround is to set a somewhat lower timeout of 240s (expected test
> duration * 5, rounded up to whole minutes) and to retry a few times on failure.
> With this change, --rounds=64 finished successfully (on my machine).
>
> At the same time, remove the timeout from the other tests, since it is not
> necessary (they do not hang), and one of them runs for ~270s (almost half the
> original timeout), so it could pose a problem on a slow or overloaded machine.

This means the test_ssl run alone may take up to 20 minutes, which is a bit
too much for my taste.
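
For reference, the 20-minute figure follows from the two ctest arguments in the patch, assuming the worst case in which every attempt hangs until the timeout. A minimal Guile sketch of that arithmetic (the variable and procedure names are illustrative, not part of the package definition):

```scheme
;; Worst case for the test_ssl invocation in the patch: every attempt
;; hangs and is only stopped by the timeout.
(define timeout-seconds 240)   ; from "--timeout" "240"
(define max-attempts 5)        ; from "--repeat" "until-pass:5"

(define (worst-case-seconds timeout attempts)
  (* timeout attempts))

(define worst-case (worst-case-seconds timeout-seconds max-attempts))

;; Print the worst case in minutes.
(display (/ worst-case 60))
(newline)
```

This covers only the test_ssl invocation; the time taken by the other tests comes on top of it.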

[...]

> - ;; Note: The test_ssl test times out in the ci.
> - ;; Temporarily disable it until that is resolved.
> - ;; (invoke "faketime" "2022-10-24"
> - ;; "ctest"
> - ;; "-R" "^test_ssl$"
> - ;; "-j" jobs
> - ;; "--timeout" timeout
> - ;; "--output-on-failure")
> - )))))))
> + (invoke "faketime" "2022-10-24"
> + "ctest"
> + "-R" "^test_ssl$"
> + "-j" jobs
> + ;; test_ssl sometimes hangs (at least when run under
> + ;; faketime), therefore set a time limit and retry
> + ;; few times on failure.
> + "--timeout" "240"
> + "--repeat" "until-pass:5"
> + "--output-on-failure"))))))))

I think that a test sometimes hanging is a good reason to leave it
disabled, report it upstream, and reference the issue. The test can
be re-enabled once the issue is resolved and the fix is part of a new release.

So in concrete terms, what I'd rather see here is a report of the
problems (requirement on faketime + propensity to hang) to upstream,
and an updated comment cross-referencing it (with the test kept
commented-out/disabled in the meantime).
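
As a rough sketch only, such a comment in the 'check phase might look like the fragment below. The URL is a placeholder, since no upstream report is referenced in this thread, and the invocation mirrors the one that is currently commented out in gnu/packages/bittorrent.scm.

```scheme
;; test_ssl relies on bundled TLS certificates with a fixed expiry
;; date, so it has to run under faketime, where it sometimes hangs.
;; Reported upstream at <UPSTREAM-ISSUE-URL> (placeholder, not a real
;; link).  Keep the test disabled until a release contains a fix.
;; (invoke "faketime" "2022-10-24"
;;         "ctest"
;;         "-R" "^test_ssl$"
;;         "-j" jobs
;;         "--timeout" timeout
;;         "--output-on-failure")
```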

Does that make sense?

--
Thanks,
Maxim
Maxim Cournoyer wrote on 17 Dec 06:08 +0100
(name . Tomas Volf)(address . ~@wolfsden.cz)(address . 73664@debbugs.gnu.org)
87a5cusxq1.fsf@gmail.com
Hello!

Tomas Volf <~@wolfsden.cz> writes:

> test_ssl sometimes hangs (at least when executed under faketime). The hang is
> somewhat unlikely, and (on my machine) it took a build with --rounds=32 to
> reproduce it.

One more thing on top of my previous reply: when trying this out, I
got a failure:

[...]

MALICIOUS PEER TEST: valid-certificate valid-SNI-hash invalid-bittorrent-hash port: 35161
set_password_callback
use_certificate_file "../ssl/peer_certificate.pem"
use_private_key_file "../ssl/peer_private_key.pem"
use_tmp_dh_file "../ssl/dhparams.pem"
connecting 127.0.0.1:35161
SNI: e300afcc0aa67a459ec14862a4d0bf930060167a
SSL handshake
bittorrent handshake
00:00:00.010: ses1: [log] *** peer SSL handshake done [ ip: 127.0.0.1:44976 ec: certificate verify failed (SSL routines) socket: SSL/TCP ]
read bittorrent handshake
00:00:00.010: ses1: [peer_error] - peer [ 127.0.0.1:44976 client: Unknown ] peer error [ssl_handshake] [asio.ssl]: certificate verify failed (SSL routines)
--- peer_errors: 6 ssl_disconnects: 6
failed to read bittorrent handshake: sslv3 alert bad certificate (SSL routines)


0% tests passed, 1 tests failed out of 1

Total Test time (real) = 405.71 sec

The following tests FAILED:
75 - test_ssl (Failed)
Errors while running CTest
error: in phase 'check': uncaught exception:
%exception #<&invoke-error program: "faketime" arguments: ("2022-10-24" "ctest" "-R" "^test_ssl$" "-j" "1" "--timeout" "240" "--repeat" "until-pass:5" "--output-on-failure") exit-status: 8 term-signal: #f stop-signal: #f>
phase `check' failed after 1154.0 seconds
command "faketime" "2022-10-24" "ctest" "-R" "^test_ssl$" "-j" "1" "--timeout" "240" "--repeat" "until-pass:5" "--output-on-failure" failed with status 8
build process 18 exited with status 256
builder for `/gnu/store/hkji5nzsa32jngg7kii9bg9ch9kdvs84-libtorrent-rasterbar-2.0.10.drv' failed with exit code 1
build of /gnu/store/hkji5nzsa32jngg7kii9bg9ch9kdvs84-libtorrent-rasterbar-2.0.10.drv failed
View build log at '/var/log/guix/drvs/hk/ji5nzsa32jngg7kii9bg9ch9kdvs84-libtorrent-rasterbar-2.0.10.drv'.

--
Thanks,
Maxim

To comment on this conversation send an email to 73664@debbugs.gnu.org
