People need to report failing builds even though we have ci.guix.gnu.org for that

  • Done
Details
10 participants
  • Andreas Enge
  • Dr. Arne Babenhauserheide
  • Andy Tai
  • Giovanni Biscuolo
  • ???
  • Maxim Cournoyer
  • Maxime Devos
  • Bruno Victal
  • Csepp
  • Simon Tournier
Owner
unassigned
Submitted by
Maxime Devos
Severity
normal
Maxime Devos wrote on 20 Aug 2023 01:53
(name . bug-guix)(address . bug-guix@gnu.org)
295ef8c8-574a-4169-98f3-6d9aaeb773f1@telenet.be
For example, naev used to work just fine, yet apparently it doesn't
anymore: https://issues.guix.gnu.org/65390.

Given that Guix has ci.guix.gnu.org, I would expect such new problems to
be detected and resolved early, and it was detected by ci.guix.gnu.org,
yet going by issues.guix.gnu.org it was never even investigated.

(Yes, there is a delay, but that doesn't matter at all, as there's this
dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)

Do people really need to report 33% of all jobs
(https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
are taken seriously, instead of the ‘there don't seem to be that much
more build failures from the core-updates/... merge, let's solve them
later (i.e., never)’ that seems to be status quo?
Best regards,
Maxime Devos
Csepp wrote on 23 Aug 2023 01:45
(name . Maxime Devos)(address . maximedevos@telenet.be)
874jkqeiox.fsf@riseup.net
Maxime Devos <maximedevos@telenet.be> writes:

> [[PGP Signed Part:Undecided]]
> For example, naev used to work just fine, yet apparently it doesn't
> anymore: https://issues.guix.gnu.org/65390.
>
> Given that Guix has ci.guix.gnu.org, I would expect such new problems
> to be detected and resolved early, and it was detected by
> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
> investigated.
>
> (Yes, there is a delay, but that doesn't matter at all, as there's
> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)
>
> Do people really need to report 33% of all jobs
> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
> are taken seriously, instead of the ‘there don't seem to be that much
> more build failures from the core-updates/... merge, let's solve them
> later (i.e., never)’ that seems to be status quo?
>
> Best regards,
> Maxime Devos
>
> [2. OpenPGP public key --- application/pgp-keys; OpenPGP_0x49E3EE22191725EE.asc]...
>
> [[End of PGP Signed Part]]

I tried signing up to the CI mailing list and it immediately became
overwhelming.
Also the CI UI could use some improvements. I'm pretty sure I've
mentioned this before, but there is no easy way to find out which inputs
I need to fix to make a dependency failure disappear. I think everyone
has better things to do than perform a linear search by hand.
So I rely on my own installations for detecting errors, that way I at
least know that I don't get flooded with notifications for packages I
know nothing about.
One possible improvement I have been thinking about is making it easy
for users to filter CI output to the packages in their profile closure,
so for example they would get advance notice of any broken packages
*before* attempting to install them.
Teams could also have their own filters.
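
The filtering idea above could be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the job records and the closure set are invented sample data, the "name"/"status" fields follow the shape used by the jq filter later in this thread, and the convention that a job name is the package name followed by a dot and the system is an assumption, not something confirmed here.

```python
# Invented sample of CI job records, for illustration only.
jobs = [
    {"name": "naev.x86_64-linux", "status": 1},
    {"name": "pandoc.x86_64-linux", "status": 2},
    {"name": "hello.x86_64-linux", "status": 0},
    {"name": "emacs.x86_64-linux", "status": 0},
]

# Assumed status codes: 1 = build failure, 2 = dependency failure
# (both mentioned later in the thread); 0 = success is an assumption.
FAILED_STATUSES = {1, 2}


def package_of(job_name, system="x86_64-linux"):
    """Strip the assumed '.<system>' suffix from a job name."""
    suffix = "." + system
    return job_name[: -len(suffix)] if job_name.endswith(suffix) else job_name


def broken_in_closure(jobs, closure):
    """Return the failing packages that are part of the given closure."""
    return sorted(
        package_of(job["name"])
        for job in jobs
        if job["status"] in FAILED_STATUSES and package_of(job["name"]) in closure
    )


# Hypothetical package names in a user's profile closure.
my_closure = {"naev", "hello", "glibc"}
print(broken_in_closure(jobs, my_closure))
```

A team filter would be the same function called with the set of packages in the team's scope instead of a profile closure.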
Simon Tournier wrote on 24 Aug 2023 11:57
(address . 65391@debbugs.gnu.org)
86zg2gyd7n.fsf@gmail.com
Hi,

On Wed, 23 Aug 2023 at 01:45, Csepp <raingloom@riseup.net> wrote:

> One possible improvement I have been thinking about is making it easy
> for users to filter CI output to the packages in their profile closure,
> so for example they would get advance notice of any broken packages
> *before* attempting to install them.
> Teams could also have their own filters.

Maybe I am missing what you would like, from my understanding, that’s
already possible using time-machine and weather. For example,

guix time-machine -- weather -m manifest.scm

allow to know the status of the last commit. What is missing is a clear
return code for chaining. For instance, see this proposal:

subject: guix weather exit status?
from: Leo Famulari <leo@famulari.name>
date: Thu, 08 Jul 2021 16:35:03 -0400
message-id: id:YOdhd7FfMOvKjTQe@jasmine.lan
https://yhetil.org/guix/YOdhd7FfMOvKjTQe@jasmine.lan

However, I agree that the next step (find the log of the broken package)
for teams is a bit convoluted.

Cheers,
simon
Maxime Devos wrote on 24 Aug 2023 16:52
(name . Csepp)(address . raingloom@riseup.net)
ad986d87-4da7-3df4-0cd5-0fb156d0498c@telenet.be
On 23-08-2023 at 01:45, Csepp wrote:
> Also the CI UI could use some improvements. I'm pretty sure I've
> mentioned this before, but there is no easy way to find out which inputs
> I need to fix to make a dependency failure disappear. I think everyone
> has better things to do than perform a linear search by hand.
Go to the package of a failed build, e.g.
<https://ci.guix.gnu.org/build/1840209/details>. The dependencies you
need to fix are marked with a red cross or a red danger triangle. In
case of a danger triangle, you need to look at the dependencies of the
dependency, which you can visit via the hyperlink.
I don't see any linear search here.
Best regards,
Maxime Devos.
Maxime Devos wrote on 24 Aug 2023 17:02
(name . Csepp)(address . raingloom@riseup.net)(address . 65391@debbugs.gnu.org)
69663b24-0736-df6f-9ee4-95a77ea77f18@telenet.be
On 23-08-2023 at 01:45, Csepp wrote:
> I tried signing up to the CI mailing list and it immediately became
> overwhelming.
If the CI list was split in ‘broken’ and ‘fixed’, such that you have the
option to only subscribe to ‘broken’, would that help? A large fraction
of messages is for fixed packages, which do not need to be acted upon.
> One possible improvement I have been thinking about is making it easy
> for users to filter CI output to the packages in their profile closure,
> so for example they would get advance notice of any broken packages
> *before* attempting to install them.
I assume you meant s/install/update.
How is this an improvement? I mean, how does this make
‘People need to report failing builds even though we have
ci.guix.gnu.org for that.’
less true?
Best regards,
Maxime Devos.
Csepp wrote
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
87h6oobbcm.fsf@riseup.net
Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Wed, 23 Aug 2023 at 01:45, Csepp <raingloom@riseup.net> wrote:
>
>> One possible improvement I have been thinking about is making it easy
>> for users to filter CI output to the packages in their profile closure,
>> so for example they would get advance notice of any broken packages
>> *before* attempting to install them.
>> Teams could also have their own filters.
>
> Maybe I am missing what you would like, from my understanding, that’s
> already possible using time-machine and weather. For example,
>
> guix time-machine -- weather -m manifest.scm
>
> allow to know the status of the last commit. What is missing is a clear
> return code for chaining. For instance, see this proposal:
>
> subject: guix weather exit status?
> from: Leo Famulari <leo@famulari.name>
> date: Thu, 08 Jul 2021 16:35:03 -0400
> message-id: id:YOdhd7FfMOvKjTQe@jasmine.lan
> https://yhetil.org/guix/YOdhd7FfMOvKjTQe@jasmine.lan
>
> However, I agree that the next step (find the log of the broken package)
> for teams is a bit convoluted.
>
> Cheers,
> simon

Thanks, I was not aware of this solution, but it also kind of isn't a
complete solution.
A pull is a quite costly operation, why should I have to perform one on
my netbook when what I'm trying to find out is which commit is actually
worth pulling to?
Csepp wrote
(name . Maxime Devos)(address . maximedevos@telenet.be)
87cyzcbau0.fsf@riseup.net
Maxime Devos <maximedevos@telenet.be> writes:

> [[PGP Signed Part:Undecided]]
>
>
> On 23-08-2023 at 01:45, Csepp wrote:
>> Also the CI UI could use some improvements. I'm pretty sure I've
>> mentioned this before, but there is no easy way to find out which inputs
>> I need to fix to make a dependency failure disappear. I think everyone
>> has better things to do than perform a linear search by hand.
>
> Go to the package of a failed build, e.g.
> <https://ci.guix.gnu.org/build/1840209/details>. The dependencies you
> need to fix are marked with a red cross or a red danger triangle. In
> case of a danger triangle, you need to look at the dependencies of the
> dependency, which you can visit via the hyperlink.
>
> I don't see any linear search here.
>
> Best regards,
> Maxime Devos.
>
> [2. OpenPGP public key --- application/pgp-keys; OpenPGP_0x49E3EE22191725EE.asc]...
>
> [[End of PGP Signed Part]]

That is precisely what the linear search algorithm is. I should not
have to look through the dependency tree to figure out if two package
failures have the same cause, or to know how many (possibly indirect)
dependencies of a package are failing.
As an example, pandoc often fails to build on i686, but when you look at
the CI page, you see that it was caused by several of its inputs
failing, all due to some of *their* dependencies.
Now, you could dig down on one branch of the dependency DAG and find one
failing package, but that doesn't *actually* answer the question: "what
packages do I need to fix to enable this one?", because it could have
multiple failing inputs instead of just one. The only way to tell is to
look at each page, that means having to visually find each failing input
on the page, wait for their CI pages to load, and repeat the whole
process.
If your browser is not particularly fast or you aren't so quick at
navigating a webpage, this can take a while.
But for the CI server, generating this information would take less than
a second.
Maybe some people value their time so little that they are fine with
doing this the manual way, but personally I have better things to do.
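
To make the point concrete, here is a minimal sketch of the kind of traversal a CI server could run to answer "what packages do I need to fix to enable this one?". It is written in Python purely for illustration; the graph, the package names and the status values are hypothetical, and nothing here is meant to describe how the CI software is actually implemented.

```python
# Hypothetical dependency graph: package -> direct inputs.
inputs = {
    "pandoc": ["ghc-a", "ghc-b"],
    "ghc-a": ["ghc-c"],
    "ghc-b": ["ghc-c", "ghc-d"],
    "ghc-c": [],
    "ghc-d": [],
}

# Hypothetical statuses: 1 = own build failed, 2 = a dependency failed.
status = {"pandoc": 2, "ghc-a": 2, "ghc-b": 2, "ghc-c": 1, "ghc-d": 1}


def root_failures(package, inputs, status):
    """Starting from `package`, follow dependency failures downwards and
    collect every package whose own build failed, visiting each node once."""
    roots, seen, stack = set(), set(), [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if status.get(current, 0) == 1:
            roots.add(current)
        elif status.get(current, 0) == 2:
            # A dependency failure: the real cause is further down.
            stack.extend(inputs.get(current, []))
    return sorted(roots)


print(root_failures("pandoc", inputs, status))
```

Because shared inputs are visited only once, two packages that fail for the same underlying reason yield the same root set, which also answers whether two failures have a common cause.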
Csepp wrote
(name . Maxime Devos)(address . maximedevos@telenet.be)
875y54baeh.fsf@riseup.net
Maxime Devos <maximedevos@telenet.be> writes:

> [[PGP Signed Part:Undecided]]
> On 23-08-2023 at 01:45, Csepp wrote:
>> I tried signing up to the CI mailing list and it immediately became
>> overwhelming.
>
> If the CI list was split in ‘broken’ and ‘fixed’, such that you have
> the option to only subscribe to ‘broken’, would that help? A large
> fraction of messages is for fixed packages, which do not need to be
> acted upon.

Yup, that would be an improvement. Or some way to group messages by
package.

>> One possible improvement I have been thinking about is making it easy
>> for users to filter CI output to the packages in their profile closure,
>> so for example they would get advance notice of any broken packages
>> *before* attempting to install them.
>
> I assume you meant s/install/update.
>
> How is this an improvement? I mean, how does this make
>
> ‘People need to report failing builds even though we have
> ci.guix.gnu.org for that.’
>
> less true?
>
> Best regards,
> Maxime Devos.
>
> [2. OpenPGP public key --- application/pgp-keys; OpenPGP_0x49E3EE22191725EE.asc]...
>
> [[End of PGP Signed Part]]

A user is more likely to be able and motivated to fix a package that
they are using. Getting notifications as a stream is a recipe for alert
fatigue. There needs to be a way to at the very least move actionable
alert to the top of the list and to deduplicate alerts.
TLDR: alert fatigue is bad and it should not be the casual contributor's
job to fight it on their own. If its filtering and grouping is expected
to be done on the client side then there should be guides for setting
those filters up.
Personally, it already takes enough time for me to read the bug
discussions.
?
(name . Maxime Devos)(address . maximedevos@telenet.be)
871qfpxp76.fsf@envs.net
Maxime Devos <maximedevos@telenet.be> writes:

> For example, naev used to work just fine, yet apparently it doesn't
> anymore: https://issues.guix.gnu.org/65390.
>
> Given that Guix has ci.guix.gnu.org, I would expect such new problems
> to be detected and resolved early, and it was detected by
> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
> investigated.

Yes, honestly I only look for build failures from bug reports, not from
CI if i'm not doing a "request for merge" from another branch.

>
> (Yes, there is a delay, but that doesn't matter at all, as there's
> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)

I found the dashboard inconvenient to use, it show failures for both
builds and dependencies in the same red color, and can't be searched.
What I usually do is:

1. download the job status json with:
wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'

2. use jq to show package names with build failures:
cat jobs.json | jq '. | map(select(.status == 1)) | .[].name' -r

3. select interested one to investigate (if doing merge, diff the failures from
working branch with master).
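
Steps 2 and 3 can also be scripted together. The following Python sketch is only an illustration: the two JSON strings are invented stand-ins for two downloaded jobs.json files, and the only thing taken from the steps above is that each job has a "name" and a "status" field, with status 1 meaning a build failure.

```python
import json

# Invented sample data standing in for two downloaded jobs.json files.
master_json = (
    '[{"name": "naev", "status": 1}, {"name": "pandoc", "status": 2},'
    ' {"name": "hello", "status": 0}]'
)
branch_json = (
    '[{"name": "naev", "status": 1}, {"name": "pandoc", "status": 2},'
    ' {"name": "hello", "status": 1}, {"name": "gtk", "status": 1}]'
)


def build_failures(jobs_text):
    """Names of jobs with status 1, like the jq filter in step 2."""
    return {job["name"] for job in json.loads(jobs_text) if job["status"] == 1}


def new_failures(branch_text, master_text):
    """Build failures present on the branch but not on master (step 3)."""
    return sorted(build_failures(branch_text) - build_failures(master_text))


print(new_failures(branch_json, master_json))
```

With real data, the two strings would be replaced by the contents of the files fetched in step 1 for each evaluation.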


>
> Do people really need to report 33% of all jobs
> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
> are taken seriously, instead of the ‘there don't seem to be that much
> more build failures from the core-updates/... merge, let's solve them
> later (i.e., never)’ that seems to be status quo?

Maybe we can automatically report the failures as bugs, say every 7
days, and remove a package if it still fail to build in 90 days?
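
The schedule suggested here can be written down as a small decision function. This is a hypothetical sketch in Python: the function name, the returned labels and the idea of tracking the date of the last report are assumptions made for illustration; only the 7-day and 90-day figures come from the suggestion itself.

```python
from datetime import date, timedelta

REPORT_INTERVAL = timedelta(days=7)    # "say every 7 days"
REMOVAL_DEADLINE = timedelta(days=90)  # "still fail to build in 90 days"


def next_action(failing_since, last_reported, today):
    """Decide what to do with a package that is failing today.

    failing_since: date the package started failing.
    last_reported: date of the last automatic report, or None.
    """
    if today - failing_since >= REMOVAL_DEADLINE:
        return "propose-removal"
    if last_reported is None or today - last_reported >= REPORT_INTERVAL:
        return "report"
    return "wait"


print(next_action(date(2023, 8, 1), None, date(2023, 8, 2)))
```

Keeping the two durations as named constants makes it easy to try other values for the interval and the deadline without touching the logic.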

As for now, x86_64 master (eval 668365) has 696 build failures, 604
dependencies failures, 30 unknown (canceld?) failures, total 1330
failures according to the jobs.json data.

Should we open a bug report for each of those 696 build failures?
Maxim Cournoyer wrote on 27 Aug 2023 05:38
(name . ???)(address . iyzsong@envs.net)
87sf85i289.fsf@gmail.com
Hello,

??? <iyzsong@envs.net> writes:

> Maxime Devos <maximedevos@telenet.be> writes:
>
>> For example, naev used to work just fine, yet apparently it doesn't
>> anymore: https://issues.guix.gnu.org/65390.
>>
>> Given that Guix has ci.guix.gnu.org, I would expect such new problems
>> to be detected and resolved early, and it was detected by
>> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
>> investigated.
>
> Yes, honestly I only look for build failures from bug reports, not from
> CI if i'm not doing a "request for merge" from another branch.
>
>>
>> (Yes, there is a delay, but that doesn't matter at all, as there's
>> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)
>
> I found the dashboard inconvenient to use, it show failures for both
> builds and dependencies in the same red color, and can't be searched.
> What I usually do is:
>
> 1. download the job status json with:
> wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'
>
> 2. use jq to show package names with build failures:
> cat jobs.json | jq '. | map(select(.status == 1)) | .[].name' -r
>
> 3. select interested one to investigate (if doing merge, diff the failures from
> working branch with master).

Maybe we should open Cuirass feature requests on our bug tracker to
remember what would be valuable to implement.

>> Do people really need to report 33% of all jobs
>> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
>> are taken seriously, instead of the ‘there don't seem to be that much
>> more build failures from the core-updates/... merge, let's solve them
>> later (i.e., never)’ that seems to be status quo?
>
> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

That sounds reasonable to me.

> As for now, x86_64 master (eval 668365) has 696 build failures, 604
> dependencies failures, 30 unknown (canceld?) failures, total 1330
> failures according to the jobs.json data.
>
> Should we open a bug report for each of those 696 build failures?

I'm not against, though that sounds like a lot of work unless automated.

--
Thanks,
Maxim
Maxim Cournoyer wrote on 27 Aug 2023 05:39
(name . ???)(address . iyzsong@envs.net)
87o7iti25x.fsf@gmail.com
Hi again,

??? <iyzsong@envs.net> writes:

> Maxime Devos <maximedevos@telenet.be> writes:
>
>> For example, naev used to work just fine, yet apparently it doesn't
>> anymore: https://issues.guix.gnu.org/65390.
>>
>> Given that Guix has ci.guix.gnu.org, I would expect such new problems
>> to be detected and resolved early, and it was detected by
>> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
>> investigated.
>
> Yes, honestly I only look for build failures from bug reports, not from
> CI if i'm not doing a "request for merge" from another branch.

Another idea I had was we could add some feature to Cuirass where it'd
notify a team (by email) when a package under their scope has been
broken on the master branch.
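
As a rough illustration of the grouping step such a feature would need, the Python sketch below sorts newly broken packages by team. The team names, their scopes and the package names are hypothetical sample data, and sending the actual email is left out; this is not a description of an existing Cuirass feature.

```python
from collections import defaultdict

# Hypothetical mapping from team name to the packages in its scope.
team_scopes = {
    "python": {"python-numpy", "python-requests"},
    "games": {"naev"},
}


def failures_by_team(newly_broken, team_scopes):
    """Group newly broken packages by the team(s) whose scope contains them.

    Packages that no team covers are collected under the key None.
    """
    grouped = defaultdict(list)
    for package in sorted(newly_broken):
        teams = [team for team, scope in team_scopes.items() if package in scope]
        for team in teams or [None]:
            grouped[team].append(package)
    return dict(grouped)


print(failures_by_team({"naev", "python-numpy", "hello"}, team_scopes))
```

Each key of the result would then correspond to one notification, so a team receives a single message listing all of its broken packages rather than one message per package.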

--
Thanks,
Maxim
Bruno Victal wrote on 27 Aug 2023 06:30
283d99d4-4682-4577-b69c-f064ff5cd179@makinata.eu
On 2023-08-27 02:13, ??? wrote:
> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

I'm not so sure about removing packages, personally if I'm in need of
a package that happens to be broken I find it easier to fix it given
that some work has already been put into writing the package definition
than starting from scratch.


--
Furthermore, I consider that nonfree software must be eradicated.

Cheers,
Bruno.
?
[Cuirass] feature requests for dashboard
(address . bug-guix@gnu.org)
877cpgc336.fsf_-_@envs.net
Hello, I think the current CI dashboard (e.g. https://ci.guix.gnu.org/eval/693369/dashboard)
is a little inconvenient to use, and I'd like it to have:

1. different colors for build failures (status=1) and dependencies
failures (status=2), and other type failures. Maybe yellow for
dependencies failures, and grey for other.

2. more search options in addition to job name, eg:
status:failed
status:failed-dependency
status:canceled
team:python
also a help like in mumi https://issues.guix.gnu.org/help#search for
those options.

3. for a failed build, show a link to its bug report on
issues.guix.gnu.org if one existed.
add a Issue row with link to https://issues.guix.gnu.org/65392
so we can know this build failure is known.




Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>> I found the dashboard inconvenient to use, it show failures for both
>> builds and dependencies in the same red color, and can't be searched.
>> What I usually do is:
>>
>> 1. download the job status json with:
>> wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'
>>
>> 2. use jq to show package names with build failures:
>> cat jobs.json | jq '. | map(select(.status == 1)) | .[].name' -r
>>
>> 3. select interested one to investigate (if doing merge, diff the failures from
>> working branch with master).
>
> Maybe we should open Cuirass feature requests on our bug tracker to
> remember what would be valuable to implement.

Okay, I'll open one here.
Giovanni Biscuolo wrote on 27 Aug 2023 17:07
Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that
87sf84a5ho.fsf@xelera.eu
Bruno Victal <mirai@makinata.eu> writes:

> On 2023-08-27 02:13, ??? wrote:
>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fail to build in 90 days?

maybe preceded by an automated email notification (to guix-bugs) so
that interested people have the chance to step in and fix it?

> I'm not so sure about removing packages, personally if I'm in need of
> a package that happens to be broken I find it easier to fix it given
> that some work has already been put into writing the package definition
> than starting from scratch.

You don't need to start from scratch if you want, you just have to
checkout the right git commit (before the package was deleted) and start
from that, if needed: WDYT?

Happy hacking! Gio'

[...]

--
Giovanni Biscuolo

Xelera IT Infrastructures

Andy Tai wrote on 27 Aug 2023 18:24
Re: bug#65391: People need to report failing builds even
(address . guix-patches@gnu.org)
CAJsg1E-mWem-j-weaEBKpMjTN+GXBotHZBA2k3NCSQgE_p1JkQ@mail.gmail.com
On 2023-08-27 02:13, ??? wrote:
> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

Hi, maybe only build failures on certain platforms, such as (32-bit) x86,
x86-64 and arm-64, should trigger this treatment, so that a build failure
on another platform would not get a package removed if it is not fixed.

The reason is that most people do not have arm32, PowerPC or RISC-V
hardware, so these platforms may be more likely to suffer build failures,
and x86 and 64-bit ARM platforms are what most people use. Build failures
on the less common platforms can be fixed when there are people with the
resources and interest to fix them.
Maxime Devos wrote on 29 Aug 2023 16:03
Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
6a62aced-9138-0496-fb01-d5d8e89ba8d6@telenet.be
(I did not receive the e-mails from Andy Tai and ???, I had to look at
<https://issues.guix.gnu.org/65391>.)
> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?
The first part looks reasonable to me (though I would decrease 7 days to
daily or even hourly, as I don't see a point in the delay), but how does
the second part (removing packages) make sense at all?
I mean, if you do that:

  1. Build failures happen (independent of whether you do that).
  2. Hence, by doing that, the distro shrinks over time.
  3. Leading to frustrated users(*), because the packages they were
     using and which were working well were suddenly removed for no good
     reason (**).
  4. Leading to less people fixing build failures (because of the
     frustration).

which seems rather counter-productive to me.

(I suppose the feedback loop eventually stabilises by ‘less people ->
less changes made to Guix -> less new build failures -> less
frustration’, but that's not really a good thing.)

Instead, what about:

> Maybe we can automatically report the failures as bugs, say every
> hour, and revert the commit(s) causing the new build failures if they
> haven't been fixed in a week.

(3 months seems to have to high a chance of merge conflicts and
decreased motivation to fix the mistakes to me.)

Expanding upon this a bit more:

  * Expecting that people fix build failures of X when updating X seems
    reasonable to me, and I think this is not in dispute.

  * Expecting that people using X fix build failures of X or risk the
    package X being deleted when someone else changed a dependency Y of
    X seems unreasonable to me. More generally, I am categorically
    opposed to:

      ‘If you change something and it breaks something else, you should
      leave fixing the something else to someone (unless you want to
      fix it yourself).’

    (I can think of some situations where this is a good thing, but not
    in general and in particular not in this Guix situation.)

    I mean, I don't know about you, but for me it fails the categorical
    imperative and the so-called Golden Rule.

(*) making no distinction between users and developers here, as the
latter are users too.

(**) I can think of four classes of causes of new build failures, in all
of which removing the package usually makes no sense:

  + Non-determinism. While fixing the non-determinism would be ideal,
    instead of removing the package, you could just retry the build.

  + Time-bombs. These tend to be simple to fix. Often they are in
    tests, which at worst you could simply disable, instead of
    removing the package.

  + Update of dependency that is incompatible with the dependent.
    That should be caught at review time -- if there is anything
    that should be removed, it's the update (i.e., revert it).
    Also, Guix supports having multiple versions of a package,
    you could use that? Or if it is a simple change, you could
    patch things while things haven't diverged much yet (and
    maybe upstream even already has an update to make things
    compatible!)

  + Out-of-memory problems and the like: see non-determinism.

Best regards,
Maxime Devos
Maxim Cournoyer wrote on 29 Aug 2023 16:45
(name . Maxime Devos)(address . maximedevos@telenet.be)
87h6ohc3gk.fsf@gmail.com
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

> (I did not receive the e-mails from Andy Tai and ???, I had to look
> at <https://issues.guix.gnu.org/65391>.)
>
>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fail to build in 90 days?
>
> The first part looks reasonable to me (though I would decrease 7 days
> to daily or even hourly, as I don't see a point in the delay), but how
> does the second part (removing packages) make sense at all?
>
> I mean, if you do that:
>
> 1. Build failures happen (independent of whether you do that).
> 2. Hence, by doing that, the distro shrinks over time.
> 3. Leading to frustrated users(*), because the packages they were
> using and which were working well were suddenly removed for no good
> reason (**).
> 4. Leading to less people fixing build failures (because of the
> frustration).

We could bump the expiry time to 180 days, or even 365 days (a full
year). If nobody opens an issue for a broken package in that amount of
time, it's probably not used much if at all and may not be worth the
maintenance burden. It can always be resurrected from the git history
if someone is motivated to pick it up. Looking for removed packages
from the git history could become a second instinct if this was made
policy. We already have a yasnippet snippet that automates commit
message for package removal: 'remove... TAB', which makes it easy to
search for:

git log --grep='gnu: Remove'

commit 72abf72062f0e813efb633e05b42c99c4bc78cff
Author: Maxim Cournoyer <me>
Date: Fri Aug 11 21:29:54 2023 -0400

gnu: Remove qtquickcontrols2.
* gnu/packages/qt.scm (qtquickcontrols2): Delete variable.
(pyotherside) [inputs]: Remove qtquickcontrols2.
[...]

It's frustrating for users when a package is missing, but it's also
frustrating/inefficient for maintainers to stumble upon broken packages
when checking if an upgrade broke dependent packages (it takes time to
build them just to find out they fail, and researching they already
did), so a balance is needed.

> which seems rather counter-productive to me.
>
> (I suppose the feedback loop eventually stabilises by ‘less people ->
> less changes made to Guix -> less new build failures -> less
> frustration’, but that's not really a good thing.)
>
> Instead, what about:
>
>> Maybe we can automatically report the failures as bugs, say every
>> hour, and revert the commit(s) causing the new build failures if they
>> haven't been fixed in a week.


> (3 months seems to have to high a chance of merge conflicts and
> decreased motivation to fix the mistakes to me.)
>
> Expanding upon this a bit more:
>
> * Expecting that people fix build failures of X when updating X seems
> reasonable to me, and I think this is not in dispute.
>
> * Expecting that people using X fix build failures of X or risk the
> package X being deleted when someone else changed a dependency Y of
> X seems unreasonable to me. More generally, I am categorically
> opposed to:
>
> ‘If you change something and it breaks something else, you should
> leave fixing the something else to someone (unless you want to
> fix it yourself).’
>
> (I can think of some situations where this is a good thing, but not
> in general and in particular not in this Guix situation.)
>
> I mean, I don't know about you, but for me it fails the categorical
> imperative and the so-called Golden Rule.

I think we can all assume contributors are acting in good faith and
are ready to fix any problems resulting from their installed changes;
but they need to be made aware of these failures.

Which to me suggests we (again) need better tooling (that's already
improved much with the QA service, thanks to Christopher's efforts).

It can still be improved; the QA could for example notify contributors
by email when their patch or series have broken something, like the CI
of forges typically do, or other UI improvements to make it easier to
see what has been broken. Cuirass in particular would benefit from a
status:failed-new (freshly broken) query ability. I think the data is
already there, it just needs to be exposed.

I've opened new feature requests for the CI to help with that:
https://issues.guix.gnu.org/65594 ("[feature] [qa] Notify users by email
of problems") and https://issues.guix.gnu.org/65595 ("[feature]
[cuirass] Add ability to filter builds for status:failed-new").

Thanks for weighing in!

--
Thanks,
Maxim
Maxime Devos wrote on 30 Aug 2023 00:44
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
ac627ec6-f21a-fd24-1151-b47d6d2c84b3@telenet.be
>> The first part looks reasonable to me (though I would decrease 7 days
>> to daily or even hourly, as I don't see a point in the delay), but how
>> does the second part (removing packages) make sense at all?
>>
>> I mean, if you do that:
>>
>> 1. Build failures happen (independent of whether you do that).
>> 2. Hence, by doing that, the distro shrinks over time.
>> 3. Leading to frustrated users(*), because the packages they were
>> using and which were working well were suddenly removed for no good
>> reason (**).
>> 4. Leading to less people fixing build failures (because of the
>> frustration).
> We could bump the expiry time to 180 days, or even 365 days (a full
> year). If nobody opens an issue for a broken package in that amount of
> time, it's probably not used much if at all and may not be worth the
> maintenance burden.
Please read the subject line of the original message, subject lines
aren't just fluff.
More to the point, no, it doesn't mean that that the package is not used
much, it could instead mean that the people using the package (or
interested in using the package, if it was already broken when they
discovered it) thought that the existence of ci.guix.gnu.org means that
contributors doing Guix maintenance already know that the package is
broken and assumed that it would be fixed, and that a new bug report
would just be annoying the contributors because they already have a bug
report: the build failure on ci.guix.gnu.org.
> It can always be resurrected from the git history if someone is
> motivated to pick it up. Looking for removed packages from the git
> history could become a second instinct if this was made policy.
> [trimmed yasnippet stuff]
Yes, all this could be done. But how does any of this address my
arguments you quoted at all?
Op 29-08-2023 om 16:45 schreef Maxim Cournoyer:
> It's frustrating for users when a package is missing, but it's also
> frustrating/inefficient for maintainers to stumble upon broken packages
> when checking if an upgrade broke dependent packages (it takes time to
> build them just to find out they fail, and researching they already
> did), so a balance is needed.
This part, OTOH, actually has something to do with what you quoted.
Again, as I wrote previously, maintainers are users too -- if something
is frustrating to users it is frustrating to users because
maintainers⊆users. What remains is the quantity of frustration, which
is a valid point, but how would you even quantify that? I don't know
about you, but I don't know how to do that, so while a valid point, it
doesn't seem a useful point to me because it seems impossible to
determine whether it is a point for or against.
Also, the amount of frustration would be less than what you appear to
believe it to be:
If maintainers check that no new build failures are created, then over
time the total amount of old build failures becomes roughly zero
(roughly, because of occasional mistakes and new timebombs).
Then, the frustration of researching they already did mostly disappears.
(Other sources of inefficiency and frustration remain.)
Also, I believe there shouldn't be a balance, or IOW, the balance should
tilt almost completely towards no new broken packages and no removals (*).
I mean, having reliable non-broken packages (and services, installation
etc.) is the whole point of a distro, and if that inherently results in
frustration for people modifying the distro, IMO that means the
frustration should be minimised (see e.g. better tooling suggestions) or
computers should stop being used, not that Guix should stop being a distro.
(*) Sometimes upstream is really not with the times instead of slightly
out of touch, sometimes the broken package has a good replacement and
often security updates need to be performed before they existed, but the
‘remove packages’ proposal is not limited to such exceptions.
>> [some other part]
>> Expanding upon this a bit more:
>>
>> * Expecting that people fix build failures of X when updating X seems
>> reasonable to me, and I think this is not in dispute.
>>
>> * Expecting that people using X fix build failures of X or risk the
>> package X being deleted when someone else changed a dependency Y of
>> X seems unreasonable to me. More generally, I am categorically
>> opposed to:
>>
>> ‘If you change something and it breaks something else, you should
>> leave fixing the something else to someone (unless you want to
>> fix it yourself).’
>>
>> (I can think of some situations where this is a good thing, but not
>> in general and in particular not in this Guix situation.)
>>
>> I mean, I don't know about you, but for me it fails the categorical
>> imperative and the so-called Golden Rule.
>
> I think we can all assume contributors are acting in good faith and
> are ready to fix any problems resulting from their installed changes;
> but they need to be made aware of these failures. [...]
Again, how does this reply address what you quoted? Like, this is a
valuable reply (and I mostly agree with it, but I would qualify
‘contributors’ as ‘most regular contributors’ (**)) ... but it is not a
good reply to what you quoted.
* if you left out the quote or separated your reply from the quote
(more explicitly, you could e.g. start with ‘On related matters,
...’), it would be fine.
* but if you don't, then you're blatantly ignoring what I wrote, which
is not fine at all.
It's something I have encountered and pointed out (less explicitly) in
the past in other threads as well.
(**) If you want me to, I could send you an example of someone writing a
single message (and no other messages to Guix) in bad faith by PM.
> [tooling / QA improval suggestions]
Agreed.
Best regards,
Maxime Devos.
Maxime Devos wrote on 30 Aug 2023 00:52
Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that
(address . 65391@debbugs.gnu.org)
fdf86517-7aac-a2e3-223b-23e6ef7a90d5@telenet.be
> [Two mails previously]
> Also the CI UI could use some improvements. I'm pretty sure I've
> mentioned this before, but there is no easy way to find out which
> inputs I need to fix to make a dependency failure disappear.
> [...]
> That is precisely what the linear search algorithm is. I should not
> have to look through the dependency tree to figure out if two package
> failures have the same cause, or to know how many (possibly indirect)
> dependencies of a package are failing.
> As an example, pandoc often fails to build on i686, but when you look at
> the CI page, you see that it was caused by several of its inputs
> failing, all due to some of *their* dependencies.
> Now, you could dig down on one branch of the dependency DAG and find one
> failing package, but that doesn't *actually* answer the question: "what
> packages do I need to fix to enable this one?", because it could have
> multiple failing inputs instead of just one. The only way to tell is to
> look at each page, that means having to visually find each failing input
> on the page, wait for their CI pages to load, and repeat the whole
> process.
> If your browser is not particularly fast or you aren't so quick at
> navigating a webpage, this can take a while.
> But for the CI server, generating this information would take less than
> a second.
> Maybe some people value their time so little that they are fine with
> doing this the manual way, but personally I have better things to do.
ci.guix.gnu.org loads fast enough for me in my experience, but I do
agree that more automation is good!
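To make the "which packages do I actually need to fix?" question concrete, here is a minimal sketch of the traversal being asked for. It assumes a hypothetical in-memory map of inputs and a set of failing package names; it is not Cuirass's data model or API, and the `lib-*` names are made up for illustration.

```python
from typing import Dict, List, Set


def root_failures(package: str,
                  inputs: Dict[str, List[str]],
                  failed: Set[str]) -> Set[str]:
    """Return the failing packages in the dependency closure of
    `package` that have no failing inputs of their own, i.e. the
    packages that would need fixing first."""
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        failing_inputs = [dep for dep in inputs.get(current, [])
                          if dep in failed]
        if current in failed and not failing_inputs:
            # Fails on its own: a root cause.
            roots.add(current)
        # Only failing inputs can explain a dependency failure,
        # so only those branches are followed.
        stack.extend(failing_inputs)
    return roots


# Hypothetical data: pandoc fails only because of lib-c and lib-d.
example_inputs = {
    "pandoc": ["lib-a", "lib-b"],
    "lib-a": ["lib-c"],
    "lib-b": ["lib-c", "lib-d"],
}
example_failed = {"pandoc", "lib-a", "lib-b", "lib-c", "lib-d"}
print(sorted(root_failures("pandoc", example_inputs, example_failed)))
```

Following only the failing inputs keeps the walk small, and the `seen` set means a shared dependency is visited once even if several branches lead to it.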
(I usually don't respond to e-mails I agree with except for
superficialities, but I was wondering if such non-replies are actually
interpreted as such, or as disagreements, or neither.)
Best regards,
Maxime Devos.
Maxim Cournoyer wrote on 30 Aug 2023 04:28
Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
(name . Maxime Devos)(address . maximedevos@telenet.be)
87zg296z7y.fsf@gmail.com
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

>>> The first part looks reasonable to me (though I would decrease 7 days
>>> to daily or even hourly, as I don't see a point in the delay), but how
>>> does the second part (removing packages) make sense at all?
>>> I mean, if you do that:
>>> 1. Build failures happen (independent of whether you do that).
>>> 2. Hence, by doing that, the distro shrinks over time.
>>> 3. Leading to frustrated users(*), because the packages they were
>>> using and which were working well were suddenly removed for no good
>>> reason (**).
>>> 4. Leading to less people fixing build failures (because of the
>>> frustration).
>
>> We could bump the expiry time to 180 days, or even 365 days (a full
>> year). If nobody opens an issue for a broken package in that amount of
>> time, it's probably not used much if at all and may not be worth the
>> maintenance burden.
>
> Please read the subject line of the original message, subject lines
> aren't just fluff.

Believe it or not, I actually did! :-) I was replying to the first part
of your message, where you mentioned you were against packages removal.
My reply was giving support to devising policy that would define when
it's acceptable to prune the distribution of broken/unmaintained
packages, which is tangentially related to the topic of reporting broken
packages. These are just ideas and if we decide to turn some of them
into policy we could write it in a way that would favor resolving
problems instead of just making them disappear.

[...]

> If maintainers check that no new build failures are created, then over
> time the total amount of old build failures becomes roughly zero
> (roughly, because of occasional mistakes and new timebombs).

You mean that the building vs failing ratio improves, right? I'm all
for giving a best effort to keep as many packages as we have the
capacity to do, but at some point the Pareto principle kicks in and you
realize there's not that much value in spending 3 days trying to fix a
hardly maintained leaf package that has been failing to build for a year
or two.

[...]

> (*) Sometimes upstream is really not with the times instead of
> slightly out of touch, sometimes the broken package has a good
> replacement and often security updates need to be performed before
> they existed, but the ‘remove packages’ proposal is not limited to
> such exceptions.

This is the kind of considerations that we could mention in a package
removal policy (basically mention it's a last resort thing).

>>> [some other part]
>>> Expanding upon this a bit more:
>>> * Expecting that people fix build failures of X when updating X
>>> seems
>>> reasonable to me, and I think this is not in dispute.
>>> * Expecting that people using X fix build failures of X or risk
>>> the
>>> package X being deleted when someone else changed a dependency Y of
>>> X seems unreasonable to me. More generally, I am categorically
>>> opposed to:
>>> ‘If you change something and it breaks something else, you
>>> should
>>> leave fixing the something else to someone (unless you want to
>>> fix it yourself).’
>>> (I can think of some situations where this is a good thing,
>>> but not
>>> in general and in particular not in this Guix situation.)
>>> I mean, I don't know about you, but for me it fails the
>>> categorical
>>> imperative and the so-called Golden Rule.
>>
>> I think we can all assume contributors are acting in good faith and
>> are ready to fix any problems resulting from their installed changes;
>> but they need to be made aware of these failures. [...]
>
> Again, how does this reply address what you quoted? Like, this is
> a valuable reply (and I mostly agree with it, but I would qualify
> ‘contributors’ as ‘most regular contributors’ (**)) ... but it is not
> a good reply to what you quoted.
>
> * if you left out the quote or separated your reply from the quote
> (more explicitly, you could e.g. start with ‘On related matters,
> ...’), it would be fine.
>
> * but if you don't, then you're blatantly ignoring what I wrote, which
> is not fine at all.

The text of yours I quoted was to provide some context as to what I was
answering to; I replied to the essence of your argument I synthesized
from it, not point by point as I agreed with it and it wouldn't have
added much to do so.

> It's something I have encountered and pointed out (less explicitly) in
> the past in other threads as well.

I think it's a common reaction when faced with a detailed text -- some
people may simply ignore it, feeling overwhelmed, or they may synthesize
the essence of it to keep it high level and the discussion more fluid.
I don't think it should be perceived as mean; a partial reply is still
better than none.

--
Thanks,
Maxim
Maxim Cournoyer wrote on 30 Aug 2023 04:36
Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that
(name . Maxime Devos)(address . maximedevos@telenet.be)
87sf816yu1.fsf@gmail.com
Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

[...]

> (I usually don't respond to e-mails I agree with except for
> superficialities, but I was wondering if such non-replies are actually
> interpreted as such, or as disagreements, or neither.)

I'd say it's safer to assume neither, though perhaps with a slight bias
toward agreement, especially if the person was otherwise actively
participating in the conversation (as I would expect people are most
likely to post a reply when they disagree with something).

--
Thanks,
Maxim
Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
(name . Maxime Devos)(address . maximedevos@telenet.be)
87edjkvn6b.fsf@envs.net
Maxime Devos <maximedevos@telenet.be> writes:

>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fails to build in 90 days?
>
> The first part looks reasonable to me (though I would decrease 7 days
> to daily or even hourly, as I don't see a point in the delay), but how
> does the second part (removing packages) make sense at all?
>

Oh, to be clearer: I didn't mean automatically removing a package, but
notifying guix-devel to consider removing one if its "fail to build"
issue has existed for a long time and no one cares.

> [...]
>
> Instead, what about:
>
>> Maybe we can automatically report the failures as bugs, say every
>> hour, and revert the commit(s) causing the new build failures if they
>> haven't been fixed in a week.

Yes, automatically reporting bugs would be helpful. And I'd leave the
right to revert to committers, since reverting usually needs some
research and may be risky.
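As a rough sketch of the schedule discussed here (report a failure as a bug at a regular interval, and only notify guix-devel to consider removal once the failure is old), assuming the 7-day and 90-day figures from the proposal and hypothetical action names:

```python
from datetime import date, timedelta
from typing import List, Optional

# Figures taken from the proposal in this thread; both are up for debate.
REPORT_INTERVAL = timedelta(days=7)
REVIEW_AFTER = timedelta(days=90)


def actions_due(first_failure: date,
                last_report: Optional[date],
                today: date) -> List[str]:
    """List the (hypothetical) actions due today for one failing package."""
    actions = []
    if last_report is None or today - last_report >= REPORT_INTERVAL:
        actions.append("report")
    if today - first_failure >= REVIEW_AFTER:
        # Nothing is removed automatically; a human decides.
        actions.append("notify-guix-devel")
    return actions
```

Reverting stays out of the automation on purpose: the function only ever produces a report or a notification, and the decision is left to committers.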


> [...]
> Expanding upon this a bit more:
>
> * Expecting that people fix build failures of X when updating X seems
> reasonable to me, and I think this is not in dispute.
>
> * Expecting that people using X fix build failures of X or risk the
> package X being deleted when someone else changed a dependency Y of
> X seems unreasonable to me. More generally, I am categorically
> opposed to:
>
> ‘If you change something and it breaks something else, you should
> leave fixing the something else to someone (unless you want to
> fix it yourself).’
>
> (I can think of some situations where this is a good thing, but not
> in general and in particular not in this Guix situation.)
>
> I mean, I don't know about you, but for me it fails the categorical
> imperative and the so-called Golden Rule.

I agree. Though sometimes, if I overlook a breakage, it's very welcome
for others to give me a hand.


Thanks.
Dr. Arne Babenhauserheide wrote on 30 Aug 2023 12:39
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
871qfkg65s.fsf@web.de
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Believe it or not, I actually did! :-) I was replying to the first part
> of your message, where you mentioned you were against packages removal.
> My reply was giving support to devising policy that would define when
> it's acceptable to prune the distribution of broken/unmaintained
> packages, which is tangentially related to the topic of reporting broken
> packages.

Please don’t remove packages that are broken on the CI. I often had a
case where no substitute was available but the package built just fine
locally. This is not a perfect situation (nicer would be to track why it
doesn’t come from CI — sometimes it’s just a resource problem on the
CI), but if you removed a package I use that would break all updates for
me.

I had that in the past. It’s not a nice situation, because it not only
breaks that one package but also prevents getting security updates until
you find time to inspect what exactly is broken.

And if you depend on that package, stuff stops working. Example: The
changes to the Texlive packages currently break the PDF export of many
pages for me — I have not found the deeper reason yet. And I usually
cannot investigate such problems right-away, because I can’t just drop
everything for hobby automation that should just keep working.

If a change in packages breaks my manifest, that is extremely painful.

Best wishes,
Arne
--
Unpolitisch sein
heißt politisch sein,
ohne es zu merken.
draketo.de
Maxime Devos wrote on 30 Aug 2023 13:50
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
32c84040-e788-b87d-589c-16f13e2e6c93@telenet.be
> [...]
> Maxime Devos<maximedevos@telenet.be> writes:
>
>>>> The first part looks reasonable to me (though I would decrease 7 days
>>>> to daily or even hourly, as I don't see a point in the delay), but how
>>>> does the second part (removing packages) make sense at all?
>>>> I mean, if you do that:
>>>> 1. Build failures happen (independent of whether you do that).
>>>> 2. Hence, by doing that, the distro shrinks over time.
>>>> 3. Leading to frustrated users(*), because the packages they were
>>>> using and which were working well were suddenly removed for no good
>>>> reason (**).
>>>> 4. Leading to less people fixing build failures (because of the
>>>> frustration).
>>> We could bump the expiry time to 180 days, or even 365 days (a full
>>> year). If nobody opens an issue for a broken package in that amount of
>>> time, it's probably not used much if at all and may not be worth the
>>> maintenance burden.
>> Please read the subject line of the original message, subject lines
>> aren't just fluff.
> Believe it or not, I actually did! :-) I was replying to the first part
> of your message, where you mentioned you were against packages removal.
> My reply was giving support to devising policy that would define when
> it's acceptable to prune the distribution of broken/unmaintained
> packages, which is tangentially related to the topic of reporting broken
> packages. These are just ideas and if we decide to turn some of them
> into policy we could write it in a way that would favor resolving
> problems instead of just making them disappear.
OK sounds good.
> [...]
>
>> If maintainers check that no new build failures are created, then over
>> time the total amount of old build failures becomes roughly zero
>> (roughly, because of occasional mistakes and new timebombs).
> You mean that the building vs failing ratio improves, right?
That too, but in relation to what I replied to, I meant what I wrote,
which is a stronger statement.
> I'm all
> for giving a best effort to keep as many packages as we have the
> capacity to do, but at some point the Pareto principle kicks in and you
> realize there's not that much value in spending 3 days trying to fix a
> hardly maintained leaf package that has been failing to build for a year
> or two.
>
> [...]
The point is that this situation wouldn't happen if build failures were
addressed soon after their introduction.
If it is noticed that Guix has exceeded its capacity to maintain its
packages and needs to trim its package set to maintain the remaining
packages effectively, then while that's unfortunate and possibly
frustrating to users, I don't have any better option available, but the
original (^) proposal did not have this ‘if capacity is exceeded’
qualifier attached.
(^) In a new e-mail, ??? has amended it a bit.
(It fails the ‘distro ? reliable packages’ property since packages were
removed, but with this approach, it could be a one-time intervention
with a promise to in the future try to stay within capacity, and future
package removals could have a nuanced deprecation policy that avoids
making the packages unreliable(*).)
(*) I was searching for whatever Debian's package removal policy is (as
an example to base things on), but I only found "apt-get remove" etc..
Actually I don't know if Debian has one, but probably I'm just looking
in the wrong places.
It's important _how_ it is trimmed. In the original proposal by ???,
packages are simply removed for failing to build -- there were no
regards to how difficult it would be to fix the build failure, how
popular the package is (or would be if it built and people knew about
it), how useful it is, etc..
On that matter, I think it would be useful to set up a variant of
something like Debian's popcon, in order to have actual statistics on
what's popular (sure statistics would be flawed, but I'd think it's easy
to do better than ‘package fails to build -> unpopular’). I say
variant, such that it could also count packages that aren't actually
installed because they failed to build. (Maybe have separate ‘desired’
and ‘used’ manifests?)
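A minimal sketch of what such a count could look like, assuming each opt-in submission carries a hypothetical ‘desired’ list and a ‘used’ list of package names (this is not Debian popcon's actual format, and the sample data is invented):

```python
from collections import Counter
from typing import Dict, Iterable, List


def popularity(submissions: Iterable[Dict[str, List[str]]]) -> Dict[str, Dict[str, int]]:
    """Count, per package, how many submissions want it, how many
    actually use it, and how many want it but do not have it."""
    desired: Counter = Counter()
    used: Counter = Counter()
    unmet: Counter = Counter()
    for submission in submissions:
        wanted = set(submission.get("desired", []))
        installed = set(submission.get("used", []))
        desired.update(wanted)
        used.update(installed)
        # Wanted but not installed, e.g. because the build failed.
        unmet.update(wanted - installed)
    names = set(desired) | set(used)
    return {name: {"desired": desired[name],
                   "used": used[name],
                   "unmet": unmet[name]}
            for name in names}


# Invented sample data, for illustration only.
sample = [
    {"desired": ["naev", "hello"], "used": ["hello"]},
    {"desired": ["naev"], "used": []},
    {"desired": ["hello"], "used": ["hello"]},
]
stats = popularity(sample)
```

A package with a high ‘unmet’ count is exactly the case that ‘fails to build -> unpopular’ would get wrong: people want it, they just cannot install it.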
>
>> (*) Sometimes upstream is really not with the times instead of
>> slightly out of touch, sometimes the broken package has a good
>> replacement and often security updates need to be performed before
>> they existed, but the ‘remove packages’ proposal is not limited to
>> such exceptions.
> This is the kind of considerations that we could mention in a package
> removal policy (basically mention it's a last resort thing).
If there is an actual nuanced package removal policy instead of ‘fails
to build -> remove it’, my objection pretty much goes away.
>> [...]
> The text of yours I quoted was to provide some context as to what I was
> answering to; I replied to the essence of your argument I synthesized
> from it, not point by point as I agreed with it and it wouldn't have
> added much to do so.
OK, but I don't share your optimism -- while I would (mostly) agree that
_currently_ most contributors are acting in good faith etc., I would say
that after the proposed change the frequency of such contributors could
easily decrease, because:
* the proposal has no actual ‘acting in good faith etc.’ clause, so
it's quite vulnerable to rules-lawyering. I mean, look at the
difference between how I interpreted the proposal and what
??? actually wrote -- in retrospect I read too much in it and I
didn't even try to rules-lawyer.
* there is (indirectly) an incentive for breaking packages, because
the motivation for changing a package and the motivation for fixing
the consequences of that change are different. (Whether
motivation change <, = or > motivation fixing consequences depends.)
* there is little to no incentive for fixing packages you aren't
personally interested in
Maybe things would work out and people in it for self-interest also are
in it for enlightened self-interest ... (I don't know which way things
would go.)
>> It's something I have encountered and pointed out (less explicitly) in
>> the past in other threads as well.
> I think it's a common reaction when faced with a detailed text -- some
> people may simply ignore it, feeling overwhelmed, or they may synthesize
> the essence of it to keep it high level and the discussion more fluid.
> I don't think it should be perceived as mean; a partial reply is still
> better than none.
k, but I'm ignoring the 'common' part -- common does not imply good.
Best regards,
Maxime Devos
Maxim Cournoyer wrote on 30 Aug 2023 21:12
(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)
87bkeo73av.fsf@gmail.com
Hi Arne,

"Dr. Arne Babenhauserheide" <arne_bab@web.de> writes:


[...]

> Please don’t remove packages that are broken on the CI. I often had a
> case where no substitute was available but the package built just fine
> locally. This is not a perfect situation (nicer would be to track why it
> doesn’t come from CI — sometimes it’s just a resource problem on the
> CI), but if you removed a package I use that would break all updates for
> me.

I agree! It'd be important, if we decide to have such a policy, to add
guards such that packages are only removed as a last resort, after
options have been considered, and when it's been broken for a while with
an issue opened for it, and when it's a real problem with the package,
not with our CI.

--
Thanks,
Maxim
Simon Tournier wrote on 7 Sep 2023 13:32
87tts6kym3.fsf@gmail.com
Hi,

On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

> It's frustrating for users when a package is missing, but it's also
> frustrating/inefficient for maintainers to stumble upon broken packages
> when checking if an upgrade broke dependent packages (it takes time to
> build them just to find out they fail, and researching they already
> did), so a balance is needed.

There is nothing worse as a user than having this experience:

guix search foobar

oh cool, foobar is there, let's try it,

guix shell foobar

… wait …
… stuff are building …
… laptop is burning …
… wait …
Bang!

Keeping broken packages is just an annoyance. Contributors are annoyed,
as said in the paragraph above. And users are annoyed, as described
just above.

I am in favor of setting a policy for removing them.

The question is the way to detect them. QA can do whatever we want but
until people are helping Chris because, IMHO, Chris is already enough
busy to keep stuff running, we probably need to keep our process simple
enough in order to stay actionable and avoid some vacuum of “coulda,
shoulda or woulda”. For what my opinion is worth on that. :-)

Cheers,
simon
Simon Tournier wrote on 7 Sep 2023 13:53
87pm2ukxn5.fsf@gmail.com
Hi,

On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

> Please don’t remove packages that are broken on the CI. I often had a
> case where no substitute was available but the package built just fine
> locally. This is not a perfect situation (nicer would be to track why it
> doesn’t come from CI — sometimes it’s just a resource problem on the
> CI), but if you removed a package I use that would break all updates for
> me.

Well, I do not think that any policy will mark a package for removal on
the first build failure. However, if the same package is still failing
after several X <duration> or attempts, it means something is wrong.
Marking it as a candidate for removal implies:

1. check if the failure is from CI when it builds locally,
2. keep a set of packages that we know they are installable.

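For instance, ocaml4.07-* packages have been failing since more or less
April.

https://data.guix.gnu.org/repository/1/branch/master/package/ocaml4.07-ppxlib/output-history

Does it make sense to keep them? For another example, some perl6-*
packages have been failing since… 2021.

https://data.guix.gnu.org/repository/1/branch/master/package/perl6-xml-writer/output-history

Does it make sense to keep them?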

The usual situation is that CI is able to build the packages. The set
of packages that CI is not able to build is very limited and it is the
exception.

Having a rule to deal with the regular broken packages appears to me a
good thing and very helpful to keep Guix reliable. And that rule cannot
be based on rare exceptional cases.


> If a change in packages breaks my manifest, that is extremely painful.

Yeah, and such a rule for dealing with broken packages will help detect
such changes and so avoid such situations.

Cheers,
simon
Re: People need to report failing builds even though we have ci.guix.gnu.org for that
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
cuc34zl6u64.fsf@riseup.net
(changing the subject back to the intended one. I think the fact that
someone replies to an automated acknowledgement email like once a week
indicates that the emails are not communicating clearly what their
purpose is. anyways, on to the actual issue at hand.)

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>
>> It's frustrating for users when a package is missing, but it's also
>> frustrating/inefficient for maintainers to stumble upon broken packages
>> when checking if an upgrade broke dependent packages (it takes time to
>> build them just to find out they fail, and researching they already
>> did), so a balance is needed.
>
> There is nothing worse as a user than having this experience:
>
> guix search foobar
>
> oh cool, foobar is there, let's try it,
>
> guix shell foobar
>
> … wait …
> … stuff are building …
> … laptop is burning …
> … wait …
> Bang!
>
> Keeping broken packages is just an annoyance. Contributors are annoyed,
> as said in the paragraph above. And users are annoyed, as described
> just above.
>
> I am in favor of setting a policy for removing them.
>
> The question is the way to detect them. QA can do whatever we want but
> until people are helping Chris because, IMHO, Chris is already enough
> busy to keep stuff running, we probably need to keep our process simple
> enough in order to stay actionable and avoid some vacuum of “coulda,
> shoulda or woulda”. For what my opinion is worth on that. :-)
>
> Cheers,
> simon

That is not a package problem but a Guix interface problem. I have been
saying for a while that there needs to be an option to disable all
non-trivial local builds by default when you know your machine can't
handle them.
Alternatively the CI could record some basic resource utilization
information, so users could for example set a limit on RAM. (Although
this gets tricky for parallel builds.)
Simon Tournier wrote on 11 Sep 2023 09:58
(name . Csepp)(address . raingloom@riseup.net)
CAJ3okZ1XsNiTr6V4b0ogzVbrrXwfhfssWxEFS5q3BuD-Y1xX3Q@mail.gmail.com
Hi,

On Mon, 11 Sept 2023 at 09:33, Csepp <raingloom@riseup.net> wrote:

> That is not a package problem but a Guix interface problem. I have been
> saying for a while that there needs to be an option to disable all
> non-trivial local builds by default when you know your machine can't
> handle them.

IMHO, your proposal is orthogonal to the issue at hand: broken
packages. In other words, the issue is: how to deal with the set of
packages that will not build and that we already know about (for weeks,
months or even years for some).

My workstation can handle all the compilations that are required. My
laptop is able to offload to it. The issue about broken packages is not
about the resources. It is about burning resources for nothing.

About the issue you are speaking about, we already had discussions in
this direction -- you are not the only one saying "the fix needs to do
X" for a while but please keep in mind that "talking does not cook the
rice". ;-) Well, maybe you could open a ticket with a concrete
use-case.

Cheers,
simon
Dr. Arne Babenhauserheide wrote on 11 Sep 2023 10:30
Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
874jk183wx.fsf@web.de
Simon Tournier <zimon.toutoune@gmail.com> writes:
> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>> Please don’t remove packages that are broken on the CI. I often had a
>> case where no substitute was available but the package built just fine
>> locally. This is not a perfect situation (nicer would be to track why it
>> doesn’t come from CI — sometimes it’s just a resource problem on the
>> CI), but if you removed a package I use that would break all updates for
>> me.
>
> Well, I do not think that any policy will mark a package for removal on
> the first build failure. However, if the same package is still failing
> after several X <duration> or attempts, it means something is wrong.
> Marking it as a candidate for removal implies:
>
> 1. check if the failure is from CI when it builds locally,
> 2. keep a set of packages that we know they are installable.
>
> For instance, ocaml4.07-* packages are failing since more or less April.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/ocaml4.07-ppxlib/output-history
>
> Does it make sense to keep them? For another example, some perl6-*
> packages are failing since… 2021.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/perl6-xml-writer/output-history
>
> Does it make sense to keep them?

This is a good example, but not for removing broken packages. For
perl6-xml-writer removing the package would keep breakage in Guix.

I just checked the build, and this looks like a Guix packaging error
that breaks the tests due to a change to some unrelated package:
/gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/prove6: /gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/.prove6-real: perl6: bad interpreter: No such file or directory

Disabling the tests makes the package build and work.

So here, removing a package would start at the wrong place: some change
between 2021-02-01 and 2021-04-30 broke the perl6-tap-harness and we did
not detect that.

This is a problem that would get hidden by removing broken packages.

The problem is that we (large inclusive we that stands for all users of
Guix) did not track down this problem that causes the build to fail.

From this I see two distinct cases:

- packages broken upstream
- packages broken by changes in Guix

If a package is broken upstream and not going to get fixed and this
requires regular patching in Guix, I agree that we have to remove it at
some point.

If however a change in Guix breaks packages, that change should get
rolled back / reverted and fixed, so it does not break the packages.

For ocaml4.07-ppxlib, the build fails with:

  8 | ocaml-migrate-parsetree
      ^^^^^^^^^^^^^^^^^^^^^^^
  Error: Library "ocaml-migrate-parsetree" not found.

This likely means that a change in the inherited package removed the
input, but the breakage wasn’t detected.

And that’s actually what happened in
386ad7d8d14dee2103927d3f3609acc63373156a
Fri Jan 13 10:54:36 2023 +0000

This commit broke ocaml4.07-ppxlib by cleaning up the inputs of
ocaml-ppxlib (not naming names, this is not about shaming but about
detecting the deeper problem).

It should have been rejected (somehow) by CI. The change it would have
required is this:

diff --git a/gnu/packages/ocaml.scm b/gnu/packages/ocaml.scm
index 8ff755aea9..042432be9a 100644
--- a/gnu/packages/ocaml.scm
+++ b/gnu/packages/ocaml.scm
@@ -6845,6 +6845,9 @@ (define-public ocaml4.07-ppxlib
          (base32
           "0my9x7sxb329h0lzshppdaawiyfbaw6g5f41yiy7bhl071rnlvbv"))))
      (build-system dune-build-system)
+     (propagated-inputs
+      (modify-inputs (package-propagated-inputs ocaml-ppxlib)
+        (prepend ocaml-migrate-parsetree)))
      (arguments
       `(#:phases
         (modify-phases %standard-phases

So for both the cases you named for removal, such a removal would have
caused us to miss actual problems in our process.

This does not mean that there will never be a case in which a package
has to be removed, but given that both cases you showed are likely
self-induced breakage due to changes that should have been rejected as
breaking seemingly unrelated packages, it rather looks like the
situation where removing the package is the right way forward is the
exceptional case.

The norm is that our CI should have detected a problem in the commit
causing the breakage.
(this is reasoning from only two datapoints, so take it with a grain of
salt …)

Can we automatically rebuild all inheriting packages when a package gets
changed?
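
As a rough illustration of what that question asks for (a sketch in plain Guile, not how Guix, Cuirass or QA actually compute it): walk a dependency table backwards from the changed package and collect everything that inherits from it or uses it, directly or transitively. The table format, the procedure names and the example-* package names are assumptions made up for this example; only the ocaml4.07-ppxlib / ocaml-ppxlib relation comes from the discussion above.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical table: each entry is (PACKAGE DEPENDENCY ...), where a
;; dependency is a package it inherits from or takes as an input.
;; Only the first entry reflects the thread; the example-* names are made up.
(define %depends-on
  '((ocaml4.07-ppxlib ocaml-ppxlib)
    (example-app ocaml4.07-ppxlib)
    (example-unrelated example-lib)))

(define (direct-dependents package table)
  "Return the packages in TABLE that list PACKAGE as a dependency."
  (filter-map (lambda (entry)
                (and (memq package (cdr entry))
                     (car entry)))
              table))

(define (packages-to-rebuild changed table)
  "Return CHANGED followed by everything that depends on it, directly or
transitively, in breadth-first order."
  (let loop ((pending (list changed))
             (seen '()))
    (cond ((null? pending)
           (reverse seen))
          ((memq (car pending) seen)      ;already visited, skip it
           (loop (cdr pending) seen))
          (else
           (loop (append (cdr pending)
                         (direct-dependents (car pending) table))
                 (cons (car pending) seen))))))

(display (packages-to-rebuild 'ocaml-ppxlib %depends-on))
(newline)
```

With this table, a change to ocaml-ppxlib selects ocaml-ppxlib, ocaml4.07-ppxlib and example-app for rebuilding and leaves example-unrelated alone. The `seen` list also keeps the walk from looping if the table ever contains a cycle.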

> The usual situation is that CI is able to build the packages. The set
> of packages that CI is not able to build is very limited and it is the
> exception.
>
> Having a rule to deal with the regular broken packages appears to me a
> good thing and very helpful to keep Guix reliable. And that rule cannot
> be based on rare exceptional cases.

A rule should work with known cases, otherwise it causes known breakage.

Also see above: in the two cases you selected, removing the package
would be the wrong path forward.

> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>> If a change in packages breaks my manifest, that is extremely painful.
>
> Yeah, and such rule for dealing with broken packages will be helpful for
> detecting such change and so avoid such situation.

Since a manifest is strictly dependent on all packages defined in it,
removing a single referenced package means that the manifest is broken:
no update works anymore. No security updates come in anymore — even if
the package in question worked locally. This is a situation we should
not cause.

If we had a way to have placeholder packages (similar to the renamings)
that emit warnings for missing packages but do not break the build, that
would reduce the damage done by removing a package. But I think such a
mechanism must be in place and tested before adding a rule to remove
packages.

And as we’ve seen from the two packages you selected, removal wouldn’t
have been the right decision.

The more important question is (serious question and *not* for assigning
blame, but to see whether we can improve processes): with the time we
already spent in this discussion, we could have fixed a lot of packages.
Why did we not do that?

Best wishes,
Arne
--
Being apolitical
means being political
without noticing it.
draketo.de
Simon Tournier wrote on 11 Sep 2023 16:00
(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)
87fs3kizd9.fsf@gmail.com
Hi Arne,

( I have not re-read all the thread. )

On Mon, 11 Sep 2023 at 10:30, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

>> Well, I do not think that any policy will mark a package for removal on
>> the first build failure. However, if the same package is still failing
>> after several X <duration> or attempts, it means something is wrong.
>> Marking it as a candidate for removal implies:
>>
>> 1. check if the failure is from CI when it builds locally,
>> 2. keep a set of packages that we know they are installable.

> This is a good example, but not for removing broken packages. For
> perl6-xml-writer removing the package would keep breakage in Guix.
>
> I just checked the build, and this looks like a Guix packaging error

This is exactly the effect if we have a policy. :-)

Please, do not read a policy for the removal of broken packages as an
automatic process. Like you, I think an automatic removal process would
be bad for the user experience.

Maybe I misunderstand what a policy is. For me, a policy is a plan that
is used as a basis for making decisions; a policy helps in reaching a
conclusion, which can then lead to some actions.

Somehow this discussion is the implementation of the policy I am
proposing and that would help the maintenance, IMHO. I have manually
marked this package for removal and…

> that breaks the tests due to a change to some unrelated package:

…surprise, surprise, someone has checked. :-)

A removal policy for broken packages would let us know what to do. If
the same package is still failing after several X <duration> or
attempts, it means something is wrong.

Currently, either you hit a broken package when running some Guix
operation, which is a very poor experience, IMHO, or someone has to
open the CI dashboard [1], select some red buttons and investigate. And
we can count on a few fingers the number of people doing that.



> Disabling the tests makes the package build and work.

Here is the point of my proposal to have a policy for removal of broken
packages: automatically check how many times they have failed to build
and automatically tag them when they are considered problematic. If no
one cares and these tagged packages are not fixed, then let us remove them.

It would drastically help in the maintenance. Otherwise, your help is
very welcome in monitoring all the failures. :-)


> So here, removing a package would start at the wrong place: some change
> between 2021-02-01 and 2021-04-30 broke the perl6-tap-harness and we did
> not detect that.

Yes, that’s where QA should help: detecting a change that has a
long-distance impact on seemingly unrelated packages.

Changes to the branching/commit policy
Christopher Baines <mail@cbaines.net>
Thu, 08 Jun 2023 15:24:37 +0100
id:87y1kuyqew.fsf@cbaines.net

[bug#63459] [PATCH] doc: Rewrite the branching strategy.
Christopher Baines <mail@cbaines.net>
Fri, 12 May 2023 08:55:20 +0100
id:f339d15842370b97558b704593848e318462b68d.1683878120.git.mail@cbaines.net



> This does not mean that there will never be a case in which a package
> has to be removed, but given that both cases you showed are likely
> self-induced breakage due to changes that should have been rejected as
> breaking seemingly unrelated packages, it rather looks like the
> situation where removing the package is the right way forward is the
> exceptional case.

We are miscommunicating. Or we have a very different vision about what
should be the reliability of Guix.

As a regular user, I need perl6-tap-harness, so I type “guix install
perl6-tap-harness”, and bang, it fails.

As a regular user, I do not mind if the problem is coming from some
change between 2021-02-01 and 2021-04-30 or if it comes from something
else. What I want is that “guix install perl6-tap-harness” just works.

Having a clear policy for removal – again not an automatic removal
procedure – would help all, IMHO.


> The norm is that our CI should have detected a problem in the commit
> causing the breakage.
>
> Can we automatically rebuild all inheriting packages when a package gets
> changed?

CI builds all the commits pushed to Savannah. Not exactly all but
that’s another story and it does not matter for this discussion.

AFAIK, no one is checking that the commit they are pushing does not
break something. Otherwise they would not push it, I guess. ;-)

Instead, it is QA that builds “pre-commit” (patches). Thanks to
Chris’s tireless work over the years, we have some tools for monitoring the
impact of one change on the whole package set. Somehow, if I have
correctly understood, QA uses the Build Coordinator to list all the
derivations and then build all the new ones generated by the change.

So the answer to your question is yes. :-) Aside, help is welcome for
improving QA.


> Also see above: in the two cases you selected, removing the package
> would be the wrong path forward.

Removing a package that has been broken since 2021 is the right path forward.

If you care about a package that is marked to be removed soon, then
you fix it or raise your concern. Otherwise it means no one cares, so
what is the point of keeping broken packages that no one uses?


>> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>>> If a change in packages breaks my manifest, that is extremely painful.
>>
>> Yeah, and such rule for dealing with broken packages will be helpful for
>> detecting such change and so avoid such situation.
>
> Since a manifest is strictly dependent on all packages defined in it,
> removing a single referenced package means that the manifest is broken:
> no update works anymore. No security updates come in anymore — even if
> the package in question worked locally. This is a situation we should
> not cause.

Again, I am not proposing an automatic removal process but a policy. A
policy could imply some news or some message saying: these packages will
be removed soon because they are broken.

Assume this case: the package fails on CI and passes on your machine.
Let us assume you have not been annoyed enough to report the failure of
the substitutes.

Currently, the situation can stay like that for a long time. It means
that each time something in the dependency graph of that package is
changed, then we burn electricity for re-building it for nothing.

What I am proposing is: if the same package is still failing after
several X <duration> or attempts, then we mark it as ‘broken’ and it
becomes a candidate for a removal. People who care raise their hand.
And we have a better idea about the real status.
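
To make the "still failing after several attempts" rule concrete, here is a minimal sketch in plain Guile. It assumes a build history is available as a list of results, newest first; that format, the procedure names, the sample data and the threshold of 3 (standing in for the X above) are all assumptions for illustration, not something Cuirass or the Guix Data Service provide in this form.

```scheme
(use-modules (srfi srfi-1))

;; Made-up build histories, newest result first.  Neither the package names
;; nor the results describe real CI data.
(define %histories
  '(("example-long-broken" failed failed failed failed)
    ("example-flaky" failed succeeded failed succeeded)
    ("example-healthy" succeeded succeeded)))

(define (failure-streak history)
  "Count the consecutive failed results at the head of HISTORY."
  (length (take-while (lambda (result) (eq? result 'failed))
                      history)))

(define (removal-candidates histories threshold)
  "Return the names whose current failure streak is at least THRESHOLD."
  (filter-map (lambda (entry)
                (and (>= (failure-streak (cdr entry)) threshold)
                     (car entry)))
              histories))

(display (removal-candidates %histories 3))
(newline)
```

With the sample data only example-long-broken is listed: example-flaky has failures, but its current streak is a single build. The procedure only produces a list of candidates for people to look at; nothing is removed automatically.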


> The more important question is (serious question and *not* for assigning
> blame, but to see whether we can improve processes): with the time we
> already spent in this discussion, we could have fixed a lot of packages.

This was exactly what I was going to answer you. :-)

> Why did we not do that?

I speak for myself: for many packages that are broken, my first question
is: is it worth investigating? My estimate starts with a mix of
“do I need them?” and “will the user experience improve enough to justify
the time I spend investigating?”


Cheers,
simon
Re: People need to report failing builds even though we have ci.guix.gnu.org for that
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
cucedj4v0ju.fsf@riseup.net
Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Mon, 11 Sept 2023 at 09:33, Csepp <raingloom@riseup.net> wrote:
>
>> That is not a package problem but a Guix interface problem. I have been
>> saying for a while that there needs to be an option to disable all
>> non-trivial local builds by default when you know your machine can't
>> handle them.
>
> IMHO, your proposal is orthogonal to the issue at hand: broken
> packages. In other words, the issue is: how to deal with the set of
> packages that will not build and that we already know about (for weeks,
> months or even years in some cases).
>
> My workstation can handle all the compilations that are required. My
> laptop is able to offload to it. The issue with broken packages is not
> about the resources. It is about burning resources for nothing.
>
> About the issue you are speaking about, we already had discussions in
> this direction -- you are not the only one saying "the fix needs to do
> X" for a while but please keep in mind that "talking does not cook the
> rice". ;-) Well, maybe you could open a ticket with a concrete
> use-case.
>
> Cheers,
> simon

I was hoping to get some consensus on whether this is actually a
bug/feature that others consider worth tracking, so I kept discussion of
it mostly to guix-devel, but sure, I can make a proper issue for it.
Dr. Arne Babenhauserheide wrote on 12 Sep 2023 01:12
Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
(name . Simon Tournier)(address . zimon.toutoune@gmail.com)
87v8cg7068.fsf@web.de
Hi,


I’m skipping a lot to get only to the most important points (save time
for us all).

Simon Tournier <zimon.toutoune@gmail.com> writes:
> Instead, it is QA that builds “pre-commit” (patches). Thanks to
> Chris’s tireless work over the years, we have some tools for monitoring the
> impact of one change on the whole package set. Somehow, if I have
> correctly understood, QA uses the Build Coordinator to list all the
> derivations and then build all the new ones generated by the change.
>
> So the answer to your question is yes. :-) Aside, help is welcome for
> improving QA.

So something was missing there that let the change to the ocaml package
slip through this January. This should have raised red flags somewhere.

Do we have documentation on the process? (link?)

>> Since a manifest is strictly dependent on all packages defined in it,
>> removing a single referenced package means that the manifest is broken:
>> no update works anymore. No security updates come in anymore — even if
>> the package in question worked locally. This is a situation we should
>> not cause.
> What I am proposing is: if the same package is still failing after
> several X <duration> or attempts, then we mark it as ‘broken’ and it
> becomes a candidate for a removal. People who care raise their hand.
> And we have a better idea about the real status.

This means with the current functionality that the manifest is broken at
that point. Nothing can be updated anymore. I’ve been in that situation
a few times already with broken packages and it caused weeks of not
being able to update because I didn’t have the time to investigate.

That’s why I wrote the following:

> If we had a way to have placeholder packages (similar to the renamings)
> that emit warnings for missing packages but do not break the build, that
> would reduce the damage done by removing a package. But I think such a
> mechanism must be in place and tested before adding a rule to remove
> packages.

This would cause us to collect a slowly growing list of removed packages
that will be ignored (except for the warning) in manifests.

That way we would avoid breaking the setup when removing a package.


(define-public-removed the-package-variable
  (removed-package
    (name "the-package-name")
    (reason-for-removal "upstream stopped working a decade ago")))


The key difference between your scenario "some package is broken and I
cannot install it" and my scenario "I have a package in my manifest that
gets removed, breaking my manifest" is that mine is much more painful
because an update makes it impossible to change a working system.

In my scenario I don’t just see "oh, this doesn’t work, let’s choose
another way", but a way I’ve been using and building on gets broken.

I have also experienced that at least twice already: I had to go and
investigate before I could add a package to my manifest, because the
manifest was broken by a removed package. In at least one instance I had
not been able to update for several weeks before that and didn’t have
time and energy to investigate.

Once I missed that my system had not updated in months, because I
ran reconfigure in a cron job and a removed package had broken
/etc/config.scm.


And we actually select for such breakage, because I cannot see locally
whether a package failed on CI, so while I can see (and have to fix)
packages that fail locally, on-CI-failures are invisible.


So instead of removing a package, I think the first step in a process
should be to warn everyone with that package in the manifest that it’s
broken on CI ⇒ add a warning to that package, like the rename warnings.

If no one takes it up for a few months, replace it with a
removed-package placeholder that warns to clean up the manifest. And
just keep that placeholder in place to avoid breaking manifests.
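
A minimal sketch of how such a placeholder could behave when a manifest is resolved, in plain Guile. Everything here is hypothetical: Guix has no `removed-package` record, the `usable-entries` procedure is invented for the example, and plain strings stand in for real package objects. The point is only that a placeholder yields a warning and is skipped, instead of turning the whole manifest into an error.

```scheme
(use-modules (srfi srfi-1) (srfi srfi-9))

;; Hypothetical record type: Guix does not provide `removed-package'.
(define-record-type <removed-package>
  (make-removed-package name reason)
  removed-package?
  (name removed-package-name)
  (reason removed-package-reason))

(define (usable-entries entries)
  "Return ENTRIES without placeholders, warning once per placeholder."
  (filter (lambda (entry)
            (cond ((removed-package? entry)
                   (format (current-error-port)
                           "warning: '~a' was removed (~a); please drop it from the manifest~%"
                           (removed-package-name entry)
                           (removed-package-reason entry))
                   #f)                    ;skip the placeholder
                  (else #t)))             ;keep everything else
          entries))

;; Package names stand in for real package objects here.
(define %manifest-entries
  (list "example-editor"
        (make-removed-package "the-package-name"
                              "upstream stopped working a decade ago")
        "example-compiler"))

(display (usable-entries %manifest-entries))
(newline)
```

The two remaining entries are still returned, so an update of the rest of the manifest can go ahead, while the warning on the error port tells the user what to clean up.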


Best wishes,
Arne
--
Being apolitical
means being political
without noticing it.
draketo.de
Simon Tournier wrote on 12 Sep 2023 02:39
(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)
86r0n4xm0h.fsf@gmail.com
Hi Arne,

On Tue, 12 Sep 2023 at 01:12, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

> I’m skipping a lot to get only to the most important points (save time
> for us all).

Good initiative, let me do the same. :-)


> That’s why I wrote the following:
>
>> If we had a way to have placeholder packages (similar to the renamings)
>> that emit warnings for missing packages but do not break the build, that
>> would reduce the damage done by removing a package. But I think such a
>> mechanism must be in place and tested before adding a rule to remove
>> packages.
>
> This would cause us to collect a slowly growing list of removed packages
> that will be ignored (except for the warning) in manifests.
>
> That way we would avoid breaking the setup when removing a package.
>
> (define-public-removed the-package-variable
>   (removed-package
>     (name "the-package-name")
>     (reason-for-removal "upstream stopped working a decade ago")))
>

Here you are defining a policy:

1. set a rule for replacing the package by ’removed-package’
2. set a rule for effectively removing this package

In a way, you are discussing having a rule to deal with the broken
packages. A policy, no? :-)

Having a rule to deal with the regular broken packages appears to me a
good thing and very helpful to keep Guix reliable.

Therefore, we agree that making a policy for dealing with broken
packages is worthwhile and would help make Guix better.

It appears to me better to know what I can expect as a user than to
get a surprise after each “guix pull”. I have in mind the sudden
removal of Python 2 packages, for instance. With such a policy, it would
have been smoother, IMHO.

That said, two minor points that do not matter much. :-)

I do not understand your explanation about the manifest because I do not
see the difference between one element of your manifest being broken and
this very same element being removed. In both cases, your manifest is
broken, no? From the point of view of the profile generation, broken or
removed does not change the result, does it? Broken or removed only
changes the process for investigating and trying to fix, no?

The only case where it could matter is if your manifest relies on a
package variant. In that case, if the package becomes broken, the variant
might not be. Well, if that’s the case, I would suggest that you
maintain these packages using a plain copy of the inherited package,
because a perfectly working update could break your variant. I mean, if
your manifest relies on a package variant, then this manifest is highly
dependent on the changes, whatever the status of the package.

In all cases, I share your concerns, and like you, I am bitten from time
to time by stuff that breaks. If I am honest, I barely update my base
system. Before an update, I carefully check a commit using “guix
time-machine” and test that my config works. Somehow I often use the
command-line “guix time-machine -- shell -m”.

On a side note, I am not convinced we will have the resources to change
the package definitions as you are proposing. That’s another story and it
appears to me to be part of the discussion for a policy (strategy) for
removing packages. I guess. :-)

That’s long enough. ;-)

Cheers,
simon
Maxime Devos wrote on 12 Sep 2023 20:43
bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)
bd82096e-a122-95d2-f52d-b5839c85e7d7@telenet.be
On 07-09-2023 at 13:32, Simon Tournier wrote:
> Hi,
>
> On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>
>> It's frustrating for users when a package is missing, but it's also
>> frustrating/inefficient for maintainers to stumble upon broken packages
>> when checking if an upgrade broke dependent packages (it takes time to
>> build them just to find out they fail, and researching they already
>> did), so a balance is needed.
>
> There is nothing worse as an user to have this experience:
>
> guix search foobar
>
> oh cool, foobar is there, let try it,
>
> guix shell foobar
>
> … wait …
> … stuff are building …
> … laptop is burning …
> … wait …
> Bang!
>
> Keeping broken packages is just annoyances. Contributor are annoyed
> because as said by the paragraph above. And user are annoyed as
> described just above.
>
> I am in favor to set a policy for removing then.

You don't need to keep broken packages, they can be fixed instead.
Although given later e-mails, I suppose that this hypothetical policy
for removing them would contain things about fixing them instead.

It's this focus on 'broken -> delete' that bothers me, why is the first
reaction ‘delete them’, not ‘fix them’?

> On 11-09-2023 at 16:00, Simon Tournier wrote:
>> If you care about a package that is marked to be removed soon, then
>> you fix it or raise your concern. Otherwise it means no one cares, so
>> what is the point of keeping broken packages that no one uses?

It doesn't mean that. As I wrote previously:

>> We could bump the expiry time to 180 days, or even 365 days (a full
>> year). If nobody opens an issue for a broken package in that amount of
>> time, it's probably not used much if at all and may not be worth the
>> maintenance burden.
> [...]
> No, it doesn't mean that the package is not used much, it could
> instead mean that the people using the package (or interested in using
> the package, if it was already broken when they discovered it) thought
> that the existence of ci.guix.gnu.org means that contributors doing
> Guix maintenance already know that the package is broken and assumed
> that it would be fixed, and that a new bug report would just be
> annoying the contributors because they already have a bug report: the
> build failure on ci.guix.gnu.org.

---

> The more important question is (serious question and *not* for
> assigning blame, but to see whether we can improve processes): with
> the time we already spent in this discussion, we could have fixed a
> lot of packages. Why did we not do that?

Speaking only for myself:

  * (because I chose to mostly not work on Guix anymore for reasons that
    aren't relevant to this discussion)
  * if I were to fix broken packages, I would like others to avoid
    creating new breakage (and if breakage occurs, then fix it
    early). (Otherwise, not much point to it ...)
    Hence, there needs to be some discussion to ensure that other people
    don't create new breakage in the future.
  * hearing ‘delete it’ as first reaction to ‘broken package’ is rather
    demoralising to people fixing packages. It's so ... defeatist.
    Sure, people with this reaction add a few qualifiers about when it is
    to _not_ be removed, but it sounds rather hollow.

Instead of having a ‘removal policy’ that lays down exceptions that
indicate when the package should instead be kept, I would rather have a
‘fixing policy’ that has exceptions indicating when the package may
instead be removed.

In a sense, those are technically equivalent, but the different framing
makes a difference in motivation.

Best regards,
Maxime Devos.
Andreas Enge wrote on 14 Feb 10:13 +0100
Close
(address . 65391-done@debbugs.gnu.org)
ZcyEPIwweV0TPUJH@jurong
After reading through the first tenth of what seems to be an interesting
discussion and skimming through the remainder, I take the liberty to close
this bug. Such a discussion had better take place on guix-devel; the report
itself does not start with an actionable proposal: "People need to..."
looks more like an infinite task to me that cannot be closed as finished
if taken literally.

I understand that certain concrete proposals coming from the discussion
have been filed as separate issues, and would suggest that people
interested in the topic continue to do so.

Andreas
Closed
This issue is archived.

To comment on this conversation send an email to 65391@debbugs.gnu.org
