Cuirass: the 'nr' filter doesn't work when builds have multiple outputs

  • Done
Details
2 participants
  • Clément Lassieur
  • Danny Milosavljevic
Owner
unassigned
Submitted by
Clément Lassieur
Severity
normal
Clément Lassieur wrote on 29 Jul 2018 01:21
Cuirass: the 'nr' filter doesn't when builds have multiple outputs
(address . bug-guix@gnu.org)
874lgjc5g9.fsf@lassieur.org
Hi,

With, say, 'nr' = 4, the GROUP-OUTPUTS procedure in cuirass/database.scm
will transform

0 | out | /gnu/store/...
1 | out | /gnu/store/...
1 | debug | /gnu/store/...
2 | out | /gnu/store/...

into

((#:id . 0) (#:outputs ("out" (#:path . "/gnu/store/..."))))
((#:id . 1) (#:outputs ("out" (#:path . "/gnu/store/..."))
                       ("debug" (#:path . "/gnu/store/..."))))
((#:id . 2) (#:outputs ("out" (#:path . "/gnu/store/..."))))

Thus there are only 3 elements returned by the low-level DB-GET-BUILDS
procedure, while we expect 4.

This bug is visible through the API (latestbuilds, queue) and the web
interface (eval) because they use that DB-GET-BUILDS procedure.
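
The effect can be reproduced without a database. The following is a minimal Guile sketch, assuming the limit is applied to the joined rows before they are grouped, as the example above implies. The row data, a fifth row for build 3, and the helper names LIMIT and BUILD-IDS are hypothetical and only serve the illustration; they are not part of Cuirass.

```scheme
(use-modules (srfi srfi-1))             ; first, take, delete-duplicates

;; Joined rows as (build-id output-name output-path).  Hypothetical data;
;; the store paths are elided as in the report.
(define joined-rows
  '((0 "out"   "/gnu/store/...")
    (1 "out"   "/gnu/store/...")
    (1 "debug" "/gnu/store/...")
    (2 "out"   "/gnu/store/...")
    (3 "out"   "/gnu/store/...")))

(define (limit rows nr)
  "Return at most NR elements of ROWS, like an SQL LIMIT clause."
  (take rows (min nr (length rows))))

(define (build-ids rows)
  "Return the distinct build ids found in ROWS, in order of appearance."
  (delete-duplicates (map first rows)))

;; Limiting the joined rows first, then grouping: build 1 uses up two of
;; the four rows, so fewer than four builds come back.
(define ids-limit-then-group
  (build-ids (limit joined-rows 4)))

;; Grouping first, then limiting: the limit now counts builds.
(define ids-group-then-limit
  (limit (build-ids joined-rows) 4))

(format #t "limit, then group: ~a~%" ids-limit-then-group)
(format #t "group, then limit: ~a~%" ids-group-then-limit)
```

With 'nr' = 4 the first variant yields three build ids and the second yields four, which is the mismatch described above.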

Clément
Clément Lassieur wrote on 29 Jul 2018 01:23
control message for bug #32300
(address . control@debbugs.gnu.org)
8736w3c5cm.fsf@lassieur.org
retitle 32300 Cuirass: the 'nr' filter doesn't work when builds have multiple outputs
Clément Lassieur wrote on 29 Jul 2018 01:26
Re: bug#32300: Cuirass: the 'nr' filter doesn't work when builds have multiple outputs
(address . 32300@debbugs.gnu.org)
871sbnc57b.fsf@lassieur.org
Typo in subject: it doesn't *work.
Clément Lassieur wrote on 4 Aug 2018 18:00
[PATCH] database: Fix the builds limit issue.
(address . 32300@debbugs.gnu.org)
20180804160057.20254-1-clement@lassieur.org
Fixes <https://bugs.gnu.org/32300>.

* src/cuirass/database.scm (filters->order): New procedure.
(db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.
---
src/cuirass/database.scm | 126 ++++++++++++---------------------------
1 file changed, 37 insertions(+), 89 deletions(-)

diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index 4927f2a..b4b1652 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -443,104 +443,33 @@ log file for DRV."
(cons `(,name . ((#:path . ,path)))
outputs))))))
+(define (filters->order filters)
+  (match (assq 'order filters)
+    (('order . 'build-id) "id ASC")
+    (('order . 'decreasing-build-id) "id DESC")
+    (('order . 'finish-time) "stoptime DESC")
+    (('order . 'finish-time+build-id) "stoptime DESC, id DESC")
+    (('order . 'start-time) "starttime DESC")
+    (('order . 'submission-time) "timestamp DESC")
+    ;; With this order, builds in 'running' state (-1) appear
+    ;; before those in 'scheduled' state (-2).
+    (('order . 'status+submission-time) "status DESC, timestamp DESC")
+    (_ "id DESC")))
+
(define (db-get-builds db filters)
"Retrieve all builds in database DB which are matched by given FILTERS.
FILTERS is an assoc list whose possible keys are 'id | 'jobset | 'job |
'system | 'nr | 'order | 'status | 'evaluation."
-
- (define (format-output name path)
- `(,name . ((#:path . ,path))))
-
- (define (cons-output name path rest)
- "If NAME and PATH are both not #f, cons them to REST.
-Otherwise return REST unchanged."
- (if (and (not name) (not path))
- rest
- (cons (format-output name path) rest)))
-
- (define (collect-outputs repeated-builds-id repeated-row outputs rows)
- "Given rows somewhat like
-1 'a 'b 2 'x
-^ 'c 'd 2 'x
-| ^^^^^ ^^^^
-| group ++++- group headers
-| detail
-+------------ group id
-
-return rows somewhat like
-
-1 2 'x '((a b) (c d))
-
-.
-
-As a special case, if the group detail is #f #f, ignore it.
-This is made specifically to support LEFT JOINs.
-
-Assumes that if group id stays the same the group headers stay the same."
- (define (finish-group)
- (match repeated-row
- (#(timestamp starttime stoptime log status derivation job-name system
- nix-name specification)
- `((#:id . ,repeated-builds-id)
- (#:timestamp . ,timestamp)
- (#:starttime . ,starttime)
- (#:stoptime . ,stoptime)
- (#:log . ,log)
- (#:status . ,status)
- (#:derivation . ,derivation)
- (#:job-name . ,job-name)
- (#:system . ,system)
- (#:nix-name . ,nix-name)
- (#:specification . ,specification)
- (#:outputs . ,outputs)))))
-
- (define (same-group? builds-id)
- (= builds-id repeated-builds-id))
-
- (match rows
- (() (list (finish-group)))
- ((#((? same-group? x-builds-id) x-output-name x-output-path other-cells ...) . rest)
- ;; Accumulate group members of current group.
- (let ((outputs (cons-output x-output-name x-output-path outputs)))
- (collect-outputs repeated-builds-id repeated-row outputs rest)))
- ((#(x-builds-id x-output-name x-output-path other-cells ...) . rest)
- (cons (finish-group) ;finish current group
-
- ;; Start new group.
- (let* ((outputs (cons-output x-output-name x-output-path '()))
- (x-repeated-row (list->vector other-cells)))
- (collect-outputs x-builds-id x-repeated-row outputs rest))))))
-
- (define (group-outputs rows)
- (match rows
- (() '())
- ((#(x-builds-id x-output-name x-output-path other-cells ...) . rest)
- (let ((x-repeated-row (list->vector other-cells)))
- (collect-outputs x-builds-id x-repeated-row '() rows)))))
-
- (let* ((order (match (assq 'order filters)
- (('order . 'build-id) "id ASC")
- (('order . 'decreasing-build-id) "id DESC")
- (('order . 'finish-time) "stoptime DESC")
- (('order . 'finish-time+build-id) "stoptime DESC, id DESC")
- (('order . 'start-time) "starttime DESC")
- (('order . 'submission-time) "timestamp DESC")
- (('order . 'status+submission-time)
- ;; With this order, builds in 'running' state (-1) appear
- ;; before those in 'scheduled' state (-2).
- "status DESC, timestamp DESC")
- (_ "id DESC")))
+ (let* ((order (filters->order filters))
(stmt-text (format #f "SELECT * FROM (
-SELECT Builds.id, Outputs.name, Outputs.path, Builds.timestamp,
-Builds.starttime, Builds.stoptime, Builds.log, Builds.status,
-Builds.derivation, Derivations.job_name, Derivations.system,
-Derivations.nix_name,Specifications.name
+SELECT Builds.id, Builds.timestamp, Builds.starttime, Builds.stoptime,
+Builds.log, Builds.status, Builds.derivation, Derivations.job_name,
+Derivations.system, Derivations.nix_name, Specifications.name
FROM Builds
INNER JOIN Derivations ON Builds.derivation = Derivations.derivation
AND Builds.evaluation = Derivations.evaluation
INNER JOIN Evaluations ON Derivations.evaluation = Evaluations.id
INNER JOIN Specifications ON Evaluations.specification = Specifications.name
-LEFT JOIN Outputs ON Outputs.build = Builds.id
WHERE (:id IS NULL OR (:id = Builds.id))
AND (:jobset IS NULL OR (:jobset = Specifications.name))
AND (:job IS NULL OR (:job = Derivations.job_name))
@@ -580,7 +509,26 @@ ORDER BY ~a, id ASC;" order))
(#f -1)
(x x)))
(sqlite-reset stmt)
- (group-outputs (sqlite-fold-right cons '() stmt))))
+      (let loop ((rows (sqlite-fold-right cons '() stmt))
+                 (builds '()))
+        (match rows
+          (() (reverse builds))
+          ((#(id timestamp starttime stoptime log status derivation job-name
+              system nix-name specification) . rest)
+           (loop rest
+                 (cons `((#:id . ,id)
+                         (#:timestamp . ,timestamp)
+                         (#:starttime . ,starttime)
+                         (#:stoptime . ,stoptime)
+                         (#:log . ,log)
+                         (#:status . ,status)
+                         (#:derivation . ,derivation)
+                         (#:job-name . ,job-name)
+                         (#:system . ,system)
+                         (#:nix-name . ,nix-name)
+                         (#:specification . ,specification)
+                         (#:outputs . ,(db-get-outputs db id)))
+                       builds)))))))
(define (db-get-build db id)
"Retrieve a build in database DB which corresponds to ID."
--
2.18.0
Clément Lassieur wrote on 4 Aug 2018 18:10
(address . 32300@debbugs.gnu.org)
87sh3ui038.fsf@lassieur.org
Clément Lassieur <clement@lassieur.org> writes:

>
> * src/cuirass/database.scm (filters->order): New procedure.
> (db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
> FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
> Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.

This may be less efficient because there are more SQL queries (one per
output), but it's way less complicated and less buggy, so I think it's
worth it.
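
The shape of that approach can be sketched in plain Guile, assuming in-memory lists in place of the Builds and Outputs tables. The names BUILDS-TABLE, OUTPUTS-TABLE, GET-OUTPUTS and GET-BUILDS are hypothetical stand-ins and do not mirror the real procedures; in this sketch each build fetched triggers one extra lookup.

```scheme
(use-modules (srfi srfi-1))   ; first, second, third, take, filter-map

;; Hypothetical stand-ins for the Builds and Outputs tables.
(define builds-table '(0 1 2 3 4))
(define outputs-table
  '((0 "out"   "/gnu/store/...")
    (1 "out"   "/gnu/store/...")
    (1 "debug" "/gnu/store/...")
    (2 "out"   "/gnu/store/...")
    (3 "out"   "/gnu/store/...")
    (4 "out"   "/gnu/store/...")))

(define (get-outputs id)
  "Stand-in for a per-build outputs query: return the outputs of build ID
as an alist of (NAME . ((#:path . PATH)))."
  (filter-map (lambda (row)
                (and (= (first row) id)
                     `(,(second row) . ((#:path . ,(third row))))))
              outputs-table))

(define (get-builds nr)
  "Return at most NR builds.  The limit is applied to builds only; the
outputs are attached afterwards, with one lookup per build."
  (map (lambda (id)
         `((#:id . ,id)
           (#:outputs . ,(get-outputs id))))
       (take builds-table (min nr (length builds-table)))))

(define builds (get-builds 4))
(format #t "~a builds returned~%" (length builds))
```

Because the limit never sees the output rows, a build with several outputs still counts once, at the cost of the extra lookups.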
Danny Milosavljevic wrote on 7 Aug 2018 12:46
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 32300@debbugs.gnu.org)
20180807124626.29e6a12a@scratchpost.org
Hi Clément,

On Sat, 04 Aug 2018 18:10:51 +0200
Clément Lassieur <clement@lassieur.org> wrote:

> Clément Lassieur <clement@lassieur.org> writes:
>
> > Fixes <https://bugs.gnu.org/32300>.
> >
> > * src/cuirass/database.scm (filters->order): New procedure.
> > (db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
> > FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
> > Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.
>
> This may be less efficient because there are more SQL queries (one per
> output), but it's way less complicated and less buggy, so I think it's
> worth it.

The more complicated version is a LOT faster - and was added because
the version in this patch was just way too slow (unusably slow).

I think it's better to also remove the call to db-get-outputs (and
the entry #:outputs) entirely. I don't think our overview page even
shows the outputs in the first place, so why fetch them?
Clément Lassieur wrote on 8 Aug 2018 13:44
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 32300@debbugs.gnu.org)
878t5h13rn.fsf@lassieur.org
Hi Danny,

Danny Milosavljevic <dannym@scratchpost.org> writes:

> The more complicated version is a LOT faster - and was added because
> the version in this patch was just way too slow (unusably slow).
>
> I think it's better to also remove the call to db-get-outputs (and
> the entry #:outputs) entirely. I don't think our overview page even
> shows the outputs in the first place, so why fetch them?

As you can see on my test machine with a Berlin database, it's almost
instantaneous. Also, with my commit that merges the Derivations and the
Builds tables, the queries are way lighter: they return about 100 builds
per evaluation, instead of 20000.

Plus, the outputs are used to get the build log.

https://cuirass.lassieur.org:8081/jobset/guix-master

WDYT?
Clément
Clément Lassieur wrote on 8 Aug 2018 17:32
(address . 32300@debbugs.gnu.org)
874lg427t5.fsf@lassieur.org
Clément Lassieur <clement@lassieur.org> writes:

> Hi Danny,
>
> Danny Milosavljevic <dannym@scratchpost.org> writes:
>
>> The more complicated version is a LOT faster - and was added because
>> the version in this patch was just way too slow (unusably slow).
>>
>> I think it's better to also remove the call to db-get-outputs (and
>> the entry #:outputs) entirely. I don't think our overview page even
>> shows the outputs in the first place, so why fetch them?
>
> As you can see on my test machine with a Berlin database, it's almost
> instantaneous. Also, with my commit that merges the Derivations and the
> Builds tables, the queries are way lighter: they return about 100 builds
> per evaluation, instead of 20000.
>
> Plus, the outputs are used to get the build log.
>
> https://cuirass.lassieur.org:8081/jobset/guix-master

I just reverted to my own Cuirass config because my hard drive is too
small (256G...) to build everything.

And I added Tatiana's recent changes.

Clément
Danny Milosavljevic wrote on 9 Aug 2018 07:57
(name . Clément Lassieur)(address . clement@lassieur.org)(address . 32300@debbugs.gnu.org)
20180809075709.44883c92@scratchpost.org
Hi,

On Sat, 04 Aug 2018 18:10:51 +0200
Clément Lassieur <clement@lassieur.org> wrote:

> Clément Lassieur <clement@lassieur.org> writes:
>
> > Fixes <https://bugs.gnu.org/32300>.
> >
> > * src/cuirass/database.scm (filters->order): New procedure.
> > (db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
> > FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
> > Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.
>
> This may be less efficient because there are more SQL queries (one per
> output), but it's way less complicated and less buggy, so I think it's
> worth it.

Yeah, if it's still usable, I agree.

But I think we shouldn't overlook the possibility of not fetching the outputs
in the first place (at all).
Clément Lassieur wrote on 9 Aug 2018 10:02
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 32300@debbugs.gnu.org)
87k1p0otlk.fsf@lassieur.org
Danny Milosavljevic <dannym@scratchpost.org> writes:

> Hi,
>
> On Sat, 04 Aug 2018 18:10:51 +0200
> Clément Lassieur <clement@lassieur.org> wrote:
>
>> Clément Lassieur <clement@lassieur.org> writes:
>>
>> > Fixes <https://bugs.gnu.org/32300>.
>> >
>> > * src/cuirass/database.scm (filters->order): New procedure.
>> > (db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
>> > FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
>> > Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.
>>
>> This may be less efficient because there are more SQL queries (one per
>> output), but it's way less complicated and less buggy, so I think it's
>> worth it.
>
> Yeah, if it's still usable, I agree.
>
> But I think we shouldn't overlook the possibility of not fetching the outputs
> in the first place (at all).

I totally agree! Although it should be another commit, I think.
Clément Lassieur wrote on 16 Aug 2018 22:59
(name . Danny Milosavljevic)(address . dannym@scratchpost.org)(address . 32300-done@debbugs.gnu.org)
87zhxm3u4k.fsf@lassieur.org
> Hi,
>
> On Sat, 04 Aug 2018 18:10:51 +0200
> Clément Lassieur <clement@lassieur.org> wrote:
>
>> Clément Lassieur <clement@lassieur.org> writes:
>>
>> > Fixes <https://bugs.gnu.org/32300>.
>> >
>> > * src/cuirass/database.scm (filters->order): New procedure.
>> > (db-get-builds): Remove FORMAT-OUTPUT, CONS-OUTPUT, COLLECT-OUTPUTS,
>> > FINISH-GROUP, SAME-GROUP?, GROUP-OUTPUTS procedures. Remove the 'LEFT JOIN
>> > Outputs' clause. Use DB-GET-OUTPUTS for each build that was fetched.
>>
>> This may be less efficient because there are more SQL queries (one per
>> output), but it's way less complicated and less buggy, so I think it's
>> worth it.
>
> Yeah, if it's still usable, I agree.
>
> But I think we shouldn't overlook the possibility of not fetching the outputs
> in the first place (at all).

Pushed. Thank you for the review!

Clément
Closed