Cuirass: Performance monitoring

  • Done
Details
6 participants
  • Andreas Enge
  • Bonface M. K.
  • Ludovic Courtès
  • Christopher Baines
  • Mathieu Othacehe
  • zimoun
Owner
unassigned
Submitted by
Ludovic Courtès
Severity
normal
Ludovic Courtès wrote on 28 Aug 2018 00:33
(address . bug-guix@gnu.org)
87pny3783p.fsf@gnu.org
As discussed earlier today on IRC with Clément, we could add performance
monitoring capabilities to Cuirass. Interesting metrics would be:

• time of push to time of evaluation completion;

• time of evaluation completion to time of build completion.

We could visualize that per job over time. Perhaps these are also stats
that ‘guix weather’ could display.
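
As a rough illustration, both metrics boil down to differences between
pairs of timestamps. A minimal Guile sketch, with hypothetical
procedure names and epoch timestamps in seconds (this is not Cuirass
code):

;; Time from a push to the completion of the corresponding evaluation.
(define (push->evaluation-duration push-time evaluation-end-time)
  (- evaluation-end-time push-time))

;; Time from the completion of an evaluation to that of its builds.
(define (evaluation->build-duration evaluation-end-time build-end-time)
  (- build-end-time evaluation-end-time))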

Ludo’.
Mathieu Othacehe wrote on 6 Sep 2020 16:42
(name . Ludovic Courtès)(address . ludo@gnu.org)(address . 32548@debbugs.gnu.org)
87d02zge5e.fsf@gnu.org
Hello,

> As discussed earlier today on IRC with Clément, we could add performance
> monitoring capabilities to Cuirass. Interesting metrics would be:
>
> • time of push to time of evaluation completion;
>
> • time of evaluation completion to time of build completion.

Small update on that one. With Cuirass commit
154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
timestamps:

* Checkout commit time.
* Evaluation creation.
* Evaluation checkouts completion.
* Evaluation completion.

For the first timestamp, I'm using Guile-Git to extract the commit time,
which is not the commit push time. In fact, I think there is no such
thing as "commit push time" in git.
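
For reference, here is roughly how the commit time can be obtained
with Guile-Git from a local checkout; a sketch with error handling
omitted, and 'checkout-commit-time' is a hypothetical name:

(use-modules (git))

;; Return the time of commit SHA in the repository at DIRECTORY,
;; in seconds since the epoch.
(define (checkout-commit-time directory sha)
  (let* ((repository (repository-open directory))
         (commit     (commit-lookup repository (string->oid sha))))
    (commit-time commit)))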

We can still compute the metric 'time of commit to time of evaluation
completion', but it's less relevant than the proposed 'time of push to
time of evaluation completion'.

The other proposed metric, 'time of evaluation completion to time of
build completion', can now be computed.

Regarding the actual computation and reporting of those metrics, I'm
still considering different options. I'd like to have a look at
Guile-prometheus, which was written by Christopher.

Thanks,

Mathieu
Christopher Baines wrote on 6 Sep 2020 20:51
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
87y2lmlox2.fsf@cbaines.net
Mathieu Othacehe <othacehe@gnu.org> writes:

> Hello,
>
>> As discussed earlier today on IRC with Clément, we could add performance
>> monitoring capabilities to Cuirass. Interesting metrics would be:
>>
>> • time of push to time of evaluation completion;
>>
>> • time of evaluation completion to time of build completion.
>
> Small update on that one. With Cuirass commit
> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
> timestamps:
>
> * Checkout commit time.
> * Evaluation creation.
> * Evaluation checkouts completion.
> * Evaluation completion.
>
> For the first timestamp, I'm using Guile-Git to extract the commit time,
> which is not the commit push time. In fact, I think there is no such
> thing as "commit push time" in git.

I had this issue with the Guix Data Service as well; it uses the
timestamp in the email sent by the Savannah git hook, which is the
closest I've got to "commit push time".

> We can still compute the metric 'time of commit to time of evaluation
> completion', but it's less relevant than the proposed 'time of push to
> time of evaluation completion'.

As someone can commit, then potentially push those commits hours later,
assuming no one else has pushed, this data might be a bit noisy. The
time between Cuirass noticing the new commit and the evaluation
completion might be cleaner.

Ludovic Courtès wrote on 7 Sep 2020 10:11
(name . Christopher Baines)(address . mail@cbaines.net)
874koarop0.fsf@gnu.org
Hi,

Christopher Baines <mail@cbaines.net> writes:

> Mathieu Othacehe <othacehe@gnu.org> writes:
>
>> Hello,
>>
>>> As discussed earlier today on IRC with Clément, we could add performance
>>> monitoring capabilities to Cuirass. Interesting metrics would be:
>>>
>>> • time of push to time of evaluation completion;
>>>
>>> • time of evaluation completion to time of build completion.
>>
>> Small update on that one. With Cuirass commit
>> 154232bc767d002f69aa6bb1cdddfd108b98584b, we now have the following
>> timestamps:
>>
>> * Checkout commit time.
>> * Evaluation creation.
>> * Evaluation checkouts completion.
>> * Evaluation completion.
>>
>> For the first timestamp, I'm using Guile-Git to extract the commit time,
>> which is not the commit push time. In fact, I think there is no such
>> thing as "commit push time" in git.
>
> I had this issue with the Guix Data Service as well; it uses the
> timestamp in the email sent by the Savannah git hook, which is the
> closest I've got to "commit push time".

Neat.

>> We can still compute the metric 'time of commit to time of evaluation
>> completion', but it's less relevant than the proposed 'time of push to
>> time of evaluation completion'.
>
> As someone can commit, then potentially push those commits hours later,
> assuming no one else has pushed, this data might be a bit noisy. The
> time between Cuirass noticing the new commit and the evaluation
> completion might be cleaner.

Agreed. We regularly push commits that are weeks or months old
(sometimes years), so there might be too many outliers when looking at
the commit time.

Thanks for pushing this, Mathieu!

Ludo’.
Mathieu Othacehe wrote on 10 Sep 2020 15:26
(name . Ludovic Courtès)(address . ludo@gnu.org)
87k0x14vax.fsf@gnu.org
Hello,

> Agreed. We regularly push commits that are weeks or months old
> (sometimes years), so there might be too many outliers when looking at
> the commit time.

Yes, so I used the checkout time instead of the commit time, as of
af12a80599346968fb9f52edb33b48dd26852788.

I also turned the Evaluation 'in_progress' field into a 'status' field.
This makes it much easier to create metrics on evaluations, and it also
makes it possible to distinguish between 'aborted' and 'failed'
evaluations.
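
For illustration, an integer status can encode all of these states,
whereas the previous boolean could only distinguish two; the values
below are assumptions, not the actual Cuirass encoding:

;; Assumed status values, for illustration only.
(define evaluation-started   -1)
(define evaluation-succeeded  0)
(define evaluation-failed     1)
(define evaluation-aborted    2)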

Thanks,

Mathieu
Mathieu Othacehe wrote on 14 Sep 2020 15:34
(address . 32548@debbugs.gnu.org)
87tuw05vom.fsf@gnu.org
Hello,

I just pushed support for computing and displaying metrics in Cuirass. I
started with two metrics:

* Builds per day
* Average evaluation speed per specification.

Those metrics can now be seen at:

https://ci.guix.gnu.org/metrics

and are updated every hour.
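
As a sketch of such an hourly refresh, one could imagine a plain Guile
thread like the one below; Cuirass may well schedule this differently:

(use-modules (ice-9 threads))

;; Recompute all metrics, then sleep for an hour, forever.
(define (spawn-metrics-updater compute-metrics!)
  (call-with-new-thread
   (lambda ()
     (let loop ()
       (compute-metrics!)
       (sleep 3600)
       (loop)))))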

I plan to add more metrics such as:

- Evaluation completion percentage.
- Evaluation completion speed.
- Failed evaluations percentage.
- Pending builds per day.

Don't hesitate to comment or propose other metrics.

Thanks,

Mathieu
zimoun wrote on 14 Sep 2020 16:10
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
CAJ3okZ3TA-xnm1+nS_qxmjkeat=B+GETHrQ6PoV2YXSH9Q_c-A@mail.gmail.com
Hi Mathieu,

Really cool!

On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:

> * Builds per day
> * Average evaluation speed per specification.

Something interesting could be the min and max (over 100 evaluations).
The standard deviation too, but I am not sure it is easy to interpret
at a quick glance. Instead, the median could be interesting.

For example, consider these 2 evaluations:

https://ci.guix.gnu.org/build/2094496/details
https://ci.guix.gnu.org/build/3035986/details

Well, if there are, say, 99 evaluations of the first "kind" and 1 of
the second kind, the average is:
(99 * 849 + 1_595_796_252) / 100 = 15_958_803.03
which does not really represent the effective workload.
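
To make that concrete, a quick Guile sketch comparing mean and median
on this distribution:

(define (mean lst)
  (exact->inexact (/ (apply + lst) (length lst))))

;; Median of a list of durations: the middle element, or the average
;; of the two middle elements for an even-sized list.
(define (median lst)
  (let* ((sorted (sort lst <))
         (n      (length sorted)))
    (if (odd? n)
        (list-ref sorted (quotient n 2))
        (/ (+ (list-ref sorted (- (quotient n 2) 1))
              (list-ref sorted (quotient n 2)))
           2))))

(mean   (append (make-list 99 849) (list 1595796252)))  ; => 15958803.03
(median (append (make-list 99 849) (list 1595796252)))  ; => 849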

Well, I will try to take a look if I can find a moment. :-)


> Those metrics can now be seen at:
>
> https://ci.guix.gnu.org/metrics

Nice plot!


> I plan to add more metrics such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

Cool!
Maybe the time between the commit time (not the author time) and the
start of the build.


Cheers,
simon
Ludovic Courtès wrote on 14 Sep 2020 21:27
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
87o8m8b1l8.fsf@gnu.org
Hi!

Mathieu Othacehe <othacehe@gnu.org> writes:

> I just pushed support for computing and displaying metrics in Cuirass. I
> started with two metrics:
>
> * Builds per day
> * Average evaluation speed per specification.
>
> Those metrics can now be seen at:
>
> https://ci.guix.gnu.org/metrics
>
> and are updated every hour.

This is very cool, thumbs up!

> I plan to add more metrics such as:
>
> - Evaluation completion percentage.
> - Evaluation completion speed.
> - Failed evaluations percentage.
> - Pending builds per day.

That’d be awesome.

As discussed on IRC, builds per day should be compared to new
derivations per day. For example, if on a day there’s 100 new
derivations and we only manage to build 10 of them, we have a problem.
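
In other words, the interesting quantity is the ratio of the two; a
minimal sketch (hypothetical procedure name, not Cuirass code):

;; A sustained coverage below 1 means the build farm falls behind.
(define (daily-build-coverage builds new-derivations)
  (if (zero? new-derivations)
      1.0
      (exact->inexact (/ builds new-derivations))))

(daily-build-coverage 10 100)   ; => 0.1, i.e. a growing backlog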

BTW, in cuirass.log I noticed this:

2020-09-14T21:16:21 Updating metric average-eval-duration-per-spec (guix-modular-master) to 414.8085106382979.
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (kernel-updates).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (staging-staging).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.0.1).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).

Perhaps it can’t compute an average yet for these jobsets?

Thanks!

Ludo’.
Bonface M. K. wrote on 16 Sep 2020 04:21
(name . zimoun)(address . zimon.toutoune@gmail.com)
867dsu5unm.fsf@gmail.com
Hi all.

zimoun <zimon.toutoune@gmail.com> writes:

> Hi Mathieu,
>
> Really cool!
>
> On Mon, 14 Sep 2020 at 15:35, Mathieu Othacehe <othacehe@gnu.org> wrote:
>
>> * Builds per day
>> * Average evaluation speed per specification.
>
> Something interesting could be the min and max (over 100 evaluations).
> The standard deviation too, but I am not sure it is easy to interpret
> at a quick glance. Instead, the median could be interesting.
>
> For example, consider these 2 evaluations:
>
> https://ci.guix.gnu.org/build/2094496/details
> https://ci.guix.gnu.org/build/3035986/details
>

I'm getting a 504 Gateway Time-out error when visiting the above links
(at the time of sending this email).

> Well, if there are, say, 99 evaluations of the first "kind" and 1 of
> the second kind, the average is:
> (99 * 849 + 1_595_796_252) / 100 = 15_958_803.03
> which does not really represent the effective workload.
>
> Well, I will try to take a look if I can find a moment. :-)
>
>
>> Those metrics can now be seen at:
>>
>> https://ci.guix.gnu.org/metrics
>
> Nice plot!
>
>
>> I plan to add more metrics such as:
>>
>> - Evaluation completion percentage.
>> - Evaluation completion speed.
>> - Failed evaluations percentage.
>> - Pending builds per day.
>
> Cool!
> Maybe the time between the commit time (not the author time) and the
> start of the build.
>
>
> Cheers,
> simon

--
Bonface M. K. (https://www.bonfacemunyoki.com)
Chief Emacs Mchochezi
GPG key = D4F09EB110177E03C28E2FE1F5BBAE1E0392253F

Andreas Enge wrote on 16 Sep 2020 17:56
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
20200916155639.GA16338@jurong
On Mon, Sep 14, 2020 at 03:34:17PM +0200, Mathieu Othacehe wrote:
> I just pushed support for computing and displaying metrics in Cuirass. I
> started with two metrics:
> * Builds per day
> * Average evaluation speed per specification.
> Those metrics can now be seen at:
> https://ci.guix.gnu.org/metrics

Congratulations, that looks like a very useful start already!
(And the number of builds has doubled since yesterday, so someone already
put it to good use!)

How about also adding metrics per build machine? I have the impression,
for instance, that the aarch64 machine in my living room is not used.
If this is confirmed, we could take appropriate action (uncomment it in
/etc/machines.scm :-), compare to other used machines, change the scheduling
in the daemon, or even turn it off to conserve energy should it turn out
that we have too much build power...).

Andreas
Mathieu Othacehe wrote on 17 Sep 2020 09:10
(name . Andreas Enge)(address . andreas@enge.fr)(address . 32548@debbugs.gnu.org)
87een098un.fsf@gnu.org
Hello Andreas,

> Congratulations, that looks like a very useful start already!
> (And the number of builds has doubled since yesterday, so someone already
> put it to good use!)

Thanks for your feedback :)

> How about also adding metrics per build machine? I have the impression,
> for instance, that the aarch64 machine in my living room is not used.
> If this is confirmed, we could take appropriate action (uncomment it in
> /etc/machines.scm :-), compare to other used machines, change the scheduling
> in the daemon, or even turn it off to conserve energy should it turn out
> that we have too much build power...).

Yes, I would really like to have something like:
https://hydra.nixos.org/machines, with a build rate for every machine.

However, it cannot be done without structural changes to how offloading
is handled. For now it's working this way:

Cuirass -> guix-daemon -> guix offload -> build machines

Which means that Cuirass has almost no information about offloaded
builds. We are currently starting discussions about inviting the Guix
Build Coordinator to the party.

That could maybe help us implement what you are proposing, among other
things.

Thanks,

Mathieu
Mathieu Othacehe wrote on 17 Sep 2020 12:07
(name . Ludovic Courtès)(address . ludo@gnu.org)
877dss90ne.fsf@gnu.org
Hey Ludo,

> As discussed on IRC, builds per day should be compared to new
> derivations per day. For example, if on a day there’s 100 new
> derivations and we only manage to build 10 of them, we have a problem.

I added this line, and they sadly do not overlap :(

> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>
> Perhaps it can’t compute an average yet for these jobsets?

Yes, as soon as those evaluations are repaired, we should be able to
compute those metrics. I chose to keep the error messages as a
reminder.

I added various other metrics and updated the "/metrics" page. Once we
have a better view, we should think of adding thresholds on those
metrics.

Closing this one!

Thanks,

Mathieu

Closed
Ludovic Courtès wrote on 17 Sep 2020 22:22
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
87y2l85f2u.fsf@gnu.org
Hi,

Mathieu Othacehe <othacehe@gnu.org> writes:

>> As discussed on IRC, builds per day should be compared to new
>> derivations per day. For example, if on a day there’s 100 new
>> derivations and we only manage to build 10 of them, we have a problem.
>
> I added this line, and they sadly do not overlap :(

It seems less bad than I thought though, and the rendering is pretty.
:-)

>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (version-1.1.0).
>> 2020-09-14T21:16:21 Failed to compute metric average-10-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-100-last-eval-duration-per-spec (wip-desktop).
>> 2020-09-14T21:16:21 Failed to compute metric average-eval-duration-per-spec (wip-desktop).
>>
>> Perhaps it can’t compute an average yet for these jobsets?
>
> Yes, as soon as those evaluations are repaired, we should be able to
> compute those metrics. I chose to keep the error messages as a
> reminder.

Makes sense.

> I added various other metrics and updated the "/metrics" page. Once we
> have a better view, we should think of adding thresholds on those
> metrics.

Excellent.

Thanks a lot for closing this gap!

Ludo’.
Ludovic Courtès wrote on 18 Sep 2020 14:21
(name . Mathieu Othacehe)(address . othacehe@gnu.org)
87mu1nxoku.fsf@gnu.org
Hi Mathieu!

Mathieu Othacehe <othacehe@gnu.org> writes:

>> How about also adding metrics per build machine? I have the impression,
>> for instance, that the aarch64 machine in my living room is not used.
>> If this is confirmed, we could take appropriate action (uncomment it in
>> /etc/machines.scm :-), compare to other used machines, change the scheduling
>> in the daemon, or even turn it off to conserve energy should it turn out
>> that we have too much build power...).
>
> Yes, I would really like to have something like:
> https://hydra.nixos.org/machines, with a build rate for every machine.

+1!

> However, it cannot be done without structural changes to how offloading
> is handled. For now it's working this way:
>
> Cuirass -> guix-daemon -> guix offload -> build machines
>
> Which means that Cuirass has almost no information about offloaded
> builds.

In practice, it could parse the offload events that it gets; a bit of a
hack, but good enough. However…

> We are currently starting discussions about inviting the Guix Build
> Coordinator to the party.

… this sounds like the better option longer-term.

Ludo’.