People need to report failing builds even though we have ci.guix.gnu.org for that

Done

10 participants

Andreas Enge
Dr. Arne Babenhauserheide
Andy Tai
Giovanni Biscuolo
宋文武
Maxim Cournoyer
Maxime Devos
Bruno Victal
Csepp
Simon Tournier

Owner: unassigned

Submitted by: Maxime Devos

Severity: normal

Maxime Devos wrote 2 years ago

Recipients:(name . bug-guix)(address . bug-guix@gnu.org)

Message-ID:295ef8c8-574a-4169-98f3-6d9aaeb773f1@telenet.be

For example, naev used to work just fine, yet apparently it doesn't 
anymore: https://issues.guix.gnu.org/65390.

Given that Guix has ci.guix.gnu.org, I would expect such new problems to 
be detected and resolved early, and it was detected by ci.guix.gnu.org, 
yet going by issues.guix.gnu.org it was never even investigated.

(Yes, there is a delay, but that doesn't matter at all, as there's this 
dashboard https://ci.guix.gnu.org/eval/668365/dashboard.)

Do people really need to report 33% of all jobs
(https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
are taken seriously, instead of the ‘there don't seem to be that many
more build failures from the core-updates/... merge, let's solve them
later (i.e., never)’ that seems to be the status quo?

Best regards,
Maxime Devos

Csepp wrote 2 years ago

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:874jkqeiox.fsf@riseup.net

Maxime Devos <maximedevos@telenet.be> writes:

> For example, naev used to work just fine, yet apparently it doesn't
> anymore: https://issues.guix.gnu.org/65390.
>
> Given that Guix has ci.guix.gnu.org, I would expect such new problems
> to be detected and resolved early, and it was detected by
> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
> investigated.
>
> (Yes, there is a delay, but that doesn't matter at all, as there's
> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)
>
> Do people really need to report 33% of all jobs
> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
> are taken seriously, instead of the ‘there don't seem to be that much
> more build failures from the core-updates/... merge, let's solve them
> later (i.e., never)’ that seems to be  status quo?
>
> Best regards,
> Maxime Devos

I tried signing up to the CI mailing list and it immediately became
overwhelming.
Also the CI UI could use some improvements.  I'm pretty sure I've
mentioned this before, but there is no easy way to find out which inputs
I need to fix to make a dependency failure disappear.  I think everyone
has better things to do than perform a linear search by hand.
So I rely on my own installations for detecting errors, that way I at
least know that I don't get flooded with notifications for packages I
know nothing about.
One possible improvement I have been thinking about is making it easy
for users to filter CI output to the packages in their profile closure,
so for example they would get advance notice of any broken packages
*before* attempting to install them.
Teams could also have their own filters.

Simon Tournier wrote 2 years ago

Recipients:(address . 65391@debbugs.gnu.org)

Message-ID:86zg2gyd7n.fsf@gmail.com

Hi,

On Wed, 23 Aug 2023 at 01:45, Csepp <raingloom@riseup.net> wrote:

> One possible improvement I have been thinking about is making it easy
> for users to filter CI output to the packages in their profile closure,
> so for example they would get advance notice of any broken packages
> *before* attempting to install them.
> Teams could also have their own filters.

Maybe I am missing what you would like; from my understanding, that’s
already possible using time-machine and weather.  For example,

    guix time-machine -- weather -m manifest.scm

lets you know the status of the last commit.  What is missing is a clear
return code for chaining.  For instance, see this proposal:

        subject: guix weather exit status?
        from: Leo Famulari <leo@famulari.name>
        date: Thu, 08 Jul 2021 16:35:03 -0400
        message-id: id:YOdhd7FfMOvKjTQe@jasmine.lan
        https://yhetil.org/guix/YOdhd7FfMOvKjTQe@jasmine.lan
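
Until such an exit status exists, the output of the command above could be post-processed.  The following is only a sketch: it assumes that guix weather prints a line of the form "NN.N% substitutes available (X out of Y)" per substitute server, and the function names and the 0/1/2 exit codes are hypothetical choices, not part of Guix.

```python
import re
import subprocess

# Assumed output format: "  43.4% substitutes available (2,658 out of 6,128)"
COVERAGE_RE = re.compile(r"(\d+(?:\.\d+)?)% substitutes available")


def coverage_from_weather(output):
    """Return the lowest coverage percentage found in the output, or None."""
    values = [float(match) for match in COVERAGE_RE.findall(output)]
    return min(values) if values else None


def exit_status(output, threshold=100.0):
    """Map weather output to a shell-style status (hypothetical convention).

    0: coverage reaches the threshold, 1: it does not, 2: nothing parsed.
    """
    coverage = coverage_from_weather(output)
    if coverage is None:
        return 2
    return 0 if coverage >= threshold else 1


def run_weather(manifest):
    """Run the command from the message above and return its combined output."""
    result = subprocess.run(
        ["guix", "time-machine", "--", "weather", "-m", manifest],
        capture_output=True, text=True, check=False)
    return result.stdout + result.stderr
```

A wrapper script could then call sys.exit(exit_status(run_weather("manifest.scm"))) so that the check can be chained with && in a shell.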

However, I agree that the next step (find the log of the broken package)
for teams is a bit convoluted.

Cheers,
simon

Maxime Devos wrote 2 years ago

Recipients:(name . Csepp)(address . raingloom@riseup.net)

Message-ID:ad986d87-4da7-3df4-0cd5-0fb156d0498c@telenet.be

On 23-08-2023 at 01:45, Csepp wrote:

> Also the CI UI could use some improvements.  I'm pretty sure I've
> mentioned this before, but there is no easy way to find out which inputs
> I need to fix to make a dependency failure disappear.  I think everyone
> has better things to do than perform a linear search by hand.

Go to the package of a failed build, e.g.
https://ci.guix.gnu.org/build/1840209/details. The dependencies you 
need to fix are marked with a red cross or a red danger triangle. In 
case of a danger triangle, you need to look at the dependencies of the 
dependency, which you can visit via the hyperlink.

I don't see any linear search here.

Best regards,
Maxime Devos.

Maxime Devos wrote 2 years ago

Recipients:(name . Csepp)(address . raingloom@riseup.net)(address . 65391@debbugs.gnu.org)

Message-ID:69663b24-0736-df6f-9ee4-95a77ea77f18@telenet.be

On 23-08-2023 at 01:45, Csepp wrote:

> I tried signing up to the CI mailing list and it immediately became
> overwhelming.

If the CI list were split into ‘broken’ and ‘fixed’, such that you have the
option to only subscribe to ‘broken’, would that help?  A large fraction
of messages is for fixed packages, which do not need to be acted upon.

> One possible improvement I have been thinking about is making it easy
> for users to filter CI output to the packages in their profile closure,
> so for example they would get advance notice of any broken packages
> *before* attempting to install them.

I assume you meant s/install/update.

How is this an improvement?  I mean, how does this make

‘People need to report failing builds even though we have 
ci.guix.gnu.org for that.’

less true?

Best regards,
Maxime Devos.

Csepp wrote 2 years ago

Recipients:(name . Simon Tournier)(address . zimon.toutoune@gmail.com)

Message-ID:87h6oobbcm.fsf@riseup.net

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Wed, 23 Aug 2023 at 01:45, Csepp <raingloom@riseup.net> wrote:
>
>> One possible improvement I have been thinking about is making it easy
>> for users to filter CI output to the packages in their profile closure,
>> so for example they would get advance notice of any broken packages
>> *before* attempting to install them.
>> Teams could also have their own filters.
>
> Maybe I am missing what you would like, from my understanding, that’s
> already possible using time-machine and weather.  For example,
>
>    guix time-machine -- weather -m manifest.scm
>
> allow to know the status of the last commit.  What is missing is a clear
> return code for chaining.  For instance, see this proposal:
>
>         subject: guix weather exit status?
>         from: Leo Famulari <leo@famulari.name>
>         date: Thu, 08 Jul 2021 16:35:03 -0400
>         message-id: id:YOdhd7FfMOvKjTQe@jasmine.lan
>         https://yhetil.org/guix/YOdhd7FfMOvKjTQe@jasmine.lan
>
> However, I agree that the next step (find the log of the broken package)
> for teams is a bit convoluted.
>
> Cheers,
> simon

Thanks, I was not aware of this solution, but it also kind of isn't a
complete solution.
A pull is quite a costly operation; why should I have to perform one on
my netbook when what I'm trying to find out is which commit is actually
worth pulling to?

Csepp wrote 2 years ago

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:87cyzcbau0.fsf@riseup.net

Maxime Devos <maximedevos@telenet.be> writes:

> On 23-08-2023 at 01:45, Csepp wrote:
>> Also the CI UI could use some improvements.  I'm pretty sure I've
>> mentioned this before, but there is no easy way to find out which inputs
>> I need to fix to make a dependency failure disappear.  I think everyone
>> has better things to do than perform a linear search by hand.
>
> Go to the package of a failed build, e.g.
> <https://ci.guix.gnu.org/build/1840209/details>. The dependencies you
> need to fix are marked with a red cross or a red danger triangle. In
> case of a danger triangle, you need to look at the dependencies of the
> dependency, which you can visit via the hyperlink.
>
> I don't see any linear search here.
>
> Best regards,
> Maxime Devos.

That is precisely what the linear search algorithm is.  I should not
have to look through the dependency tree to figure out if two package
failures have the same cause, or to know how many (possibly indirect)
dependencies of a package are failing.
As an example, pandoc often fails to build on i686, but when you look at
the CI page, you see that it was caused by several of its inputs
failing, all due to some of *their* dependencies.
Now, you could dig down on one branch of the dependency DAG and find one
failing package, but that doesn't *actually* answer the question: "what
packages do I need to fix to enable this one?", because it could have
multiple failing inputs instead of just one.  The only way to tell is to
look at each page, that means having to visually find each failing input
on the page, wait for their CI pages to load, and repeat the whole
process.
If your browser is not particularly fast or you aren't so quick at
navigating a webpage, this can take a while.
But for the CI server, generating this information would take less than
a second.
Maybe some people value their time so little that they are fine with
doing this the manual way, but personally I have better things to do.
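
The search described above is mechanical, so it could be computed once instead of being clicked through page by page.  A minimal sketch, assuming the dependency graph and the per-package build states are already available as plain dictionaries; the state names "failed" and "dep-failed", the function name, and the package names in the example are all hypothetical, and this is not how Cuirass stores its data:

```python
def root_failures(package, inputs, status):
    """Return the packages whose own build failed in the closure of `package`.

    inputs: dict mapping a package name to the list of its direct inputs.
    status: dict mapping a package name to "ok", "failed" (its own build
            failed) or "dep-failed" (one of its inputs could not be built).
    """
    roots, seen, stack = set(), set(), [package]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        state = status.get(name, "ok")
        if state == "failed":
            roots.add(name)          # an actual breakage that needs a fix
        elif state == "dep-failed":
            stack.extend(inputs.get(name, []))  # keep digging below it
        # "ok" nodes are pruned: everything below them built fine
    return roots


# Hypothetical example: two failing inputs that share one root cause.
inputs = {
    "pandoc": ["ghc-a", "ghc-b", "ghc-c"],
    "ghc-a": ["ghc-x"],
    "ghc-b": ["ghc-x", "ghc-y"],
}
status = {
    "pandoc": "dep-failed",
    "ghc-a": "dep-failed",
    "ghc-b": "dep-failed",
    "ghc-c": "ok",
    "ghc-x": "failed",
    "ghc-y": "failed",
}
print(sorted(root_failures("pandoc", inputs, status)))
```

Each package is visited at most once, so the answer to "what do I need to fix to enable this one?" costs one graph traversal rather than one page load per input.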

Csepp wrote 2 years ago

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:875y54baeh.fsf@riseup.net

Maxime Devos <maximedevos@telenet.be> writes:

> On 23-08-2023 at 01:45, Csepp wrote:
>> I tried signing up to the CI mailing list and it immediately became
>> overwhelming.
>
> If the CI list was split in ‘broken’ and ‘fixed’, such that you have
> the option to only subscribe to ‘broken’, would that help?  A large
> fraction of messages is for fixed packages, which do not need to be
> acted upon.

Yup, that would be an improvement.  Or some way to group messages by
package.

>> One possible improvement I have been thinking about is making it easy
>> for users to filter CI output to the packages in their profile closure,
>> so for example they would get advance notice of any broken packages
>> *before* attempting to install them.
>
> I assume you meant s/install/update.
>
> How is this an improvement?  I mean, how does this make
>
> ‘People need to report failing builds even though we have
> ci.guix.gnu.org for that.’
>
> less true?
>
> Best regards,
> Maxime Devos.

A user is more likely to be able and motivated to fix a package that
they are using.  Getting notifications as a stream is a recipe for alert
fatigue.  There needs to be a way to at the very least move actionable
alerts to the top of the list and to deduplicate alerts.
TLDR: alert fatigue is bad and it should not be the casual contributor's
job to fight it on their own.  If filtering and grouping are expected
to be done on the client side, then there should be guides for setting
those filters up.
Personally, it already takes enough time for me to read the bug
discussions.
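
One way to read this request is as a small client-side filter.  The sketch below is an illustration only: it assumes that alerts arrive as dictionaries with a "package" key and that the user's profile closure is available as a set of package names; both shapes and the function name are hypothetical.

```python
from collections import Counter


def summarize_alerts(alerts, closure):
    """Deduplicate alerts per package and list actionable ones first.

    Returns (package, count) pairs: packages from the user's closure come
    first, then the remaining packages by decreasing number of alerts.
    """
    counts = Counter(alert["package"] for alert in alerts)
    return sorted(
        counts.items(),
        key=lambda item: (item[0] not in closure, -item[1], item[0]))


# Hypothetical alert stream with a duplicate.
alerts = [
    {"package": "pandoc"},
    {"package": "naev"},
    {"package": "pandoc"},
    {"package": "foo"},
]
print(summarize_alerts(alerts, closure={"naev"}))
```

The sort key keeps the policy in one place, so a team filter would only need to pass a different set in place of the profile closure.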

宋文武 wrote 2 years ago

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:871qfpxp76.fsf@envs.net

Maxime Devos <maximedevos@telenet.be> writes:

> For example, naev used to work just fine, yet apparently it doesn't
> anymore: https://issues.guix.gnu.org/65390.
>
> Given that Guix has ci.guix.gnu.org, I would expect such new problems
> to be detected and resolved early, and it was detected by
> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
> investigated.

Yes, honestly I only look for build failures from bug reports, not from
CI if I'm not doing a "request for merge" from another branch.

> (Yes, there is a delay, but that doesn't matter at all, as there's
> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)

I found the dashboard inconvenient to use: it shows failures for both
builds and dependencies in the same red color, and can't be searched.
What I usually do is:

1. download the job status json with:
  wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'

2. use jq to show package names with build failures:
  cat jobs.json | jq '. | map(select(.status == 1)) | .[].name' -r

3. select the interesting ones to investigate (if doing a merge, diff the
failures from the working branch with master).
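
The same workflow can be scripted.  Below is a rough Python equivalent of steps 2 and 3, assuming jobs.json is a JSON array of objects with "name" and "status" fields (which is what the jq filter above implies); the function names are made up for illustration.

```python
import json

BUILD_FAILED = 1  # the status value selected by the jq filter above


def load_jobs(path):
    """Read a jobs.json file as downloaded in step 1."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failed_names(jobs, status=BUILD_FAILED):
    """Step 2: names of the jobs that have the given status."""
    return {job["name"] for job in jobs if job.get("status") == status}


def new_failures(branch_jobs, master_jobs):
    """Step 3: failures on the working branch that master does not have."""
    return sorted(failed_names(branch_jobs) - failed_names(master_jobs))
```

For a merge, one would download one jobs.json per branch and print new_failures(load_jobs("branch.json"), load_jobs("master.json")); the two file names are placeholders.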

> Do people really need to report 33% of all jobs
> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
> are taken seriously, instead of the ‘there don't seem to be that much
> more build failures from the core-updates/... merge, let's solve them
> later (i.e., never)’ that seems to be status quo?

Maybe we can automatically report the failures as bugs, say every 7
days, and remove a package if it still fails to build after 90 days?

As of now, x86_64 master (eval 668365) has 696 build failures, 604
dependency failures, and 30 unknown (canceled?) failures, for a total of
1330 failures according to the jobs.json data.

Should we open a bug report for each of those 696 build failures?

Maxim Cournoyer wrote 2 years ago

Recipients:(name . 宋文武)(address . iyzsong@envs.net)

Message-ID:87sf85i289.fsf@gmail.com

Hello,

宋文武 <iyzsong@envs.net> writes:

> Maxime Devos <maximedevos@telenet.be> writes:
>
>> For example, naev used to work just fine, yet apparently it doesn't
>> anymore: https://issues.guix.gnu.org/65390.
>>
>> Given that Guix has ci.guix.gnu.org, I would expect such new problems
>> to be detected and resolved early, and it was detected by
>> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
>> investigated.
>
> Yes, honestly I only look for build failures from bug reports, not from
> CI if i'm not doing a "request for merge" from another branch.
>
>>
>> (Yes, there is a delay, but that doesn't matter at all, as there's
>> this dashboard <https://ci.guix.gnu.org/eval/668365/dashboard>.)
>
> I found the dashboard inconvenient to use, it show failures for both
> builds and dependencies in the same red color, and can't be searched.
> What I usually do is:
>
> 1. download the job status json with:
>   wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'
>
> 2. use jq to show package names with build failures:
>   cat jobs.json  | jq '. | map(select(.status == 1)) | .[].name' -r
>
> 3. select interested one to investigate (if doing merge, diff the failures from
> working branch with master).

Maybe we should open Cuirass feature requests on our bug tracker to
remember what would be valuable to implement.

>> Do people really need to report 33% of all jobs
>> (https://ci.guix.gnu.org/eval/668365/dashboard) before those failures
>> are taken seriously, instead of the ‘there don't seem to be that much
>> more build failures from the core-updates/... merge, let's solve them
>> later (i.e., never)’ that seems to be status quo?
>
> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

That sounds reasonable to me.

> As for now, x86_64 master (eval 668365) has 696 build failures, 604
> dependencies failures, 30 unknown (canceld?) failures, total 1330
> failures according to the jobs.json data.
>
> Should we open a bug report for each of those 696 build failures?

I'm not against it, though that sounds like a lot of work unless automated.

Thanks,
Maxim

Maxim Cournoyer wrote 2 years ago

Recipients:(name . 宋文武)(address . iyzsong@envs.net)

Message-ID:87o7iti25x.fsf@gmail.com

Hi again,

宋文武 <iyzsong@envs.net> writes:

> Maxime Devos <maximedevos@telenet.be> writes:
>
>> For example, naev used to work just fine, yet apparently it doesn't
>> anymore: https://issues.guix.gnu.org/65390.
>>
>> Given that Guix has ci.guix.gnu.org, I would expect such new problems
>> to be detected and resolved early, and it was detected by
>> ci.guix.gnu.org, yet going by issues.guix.gnu.org it was never even
>> investigated.
>
> Yes, honestly I only look for build failures from bug reports, not from
> CI if i'm not doing a "request for merge" from another branch.

Another idea I had was we could add some feature to Cuirass where it'd
notify a team (by email) when a package under their scope has been
broken on the master branch.

-- 
Thanks,
Maxim

Bruno Victal wrote 2 years ago

Recipients:

Message-ID:283d99d4-4682-4577-b69c-f064ff5cd179@makinata.eu

On 2023-08-27 02:13, 宋文武 wrote:

> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

I'm not so sure about removing packages; personally, if I'm in need of
a package that happens to be broken, I find it easier to fix it, given
that some work has already been put into writing the package definition,
than to start from scratch.

Furthermore, I consider that nonfree software must be eradicated.

Cheers,
Bruno.

宋文武 wrote 2 years ago

[Cuirass] feature requests for dashboard

Recipients:(address . bug-guix@gnu.org)

Message-ID:877cpgc336.fsf_-_@envs.net

Hello, I think the current CI dashboard (e.g.
https://ci.guix.gnu.org/eval/693369/dashboard) is a little inconvenient
to use, and I'd like it to have:

1. different colors for build failures (status=1), dependency
   failures (status=2), and other types of failure.  Maybe yellow for
   dependency failures, and grey for the others.

2. more search options in addition to job name, e.g.:

       status:failed
       status:failed-dependency
       status:canceled
       team:python

   also a help page like the one in mumi
   (https://issues.guix.gnu.org/help#search) for those options.

3. for a failed build, show a link to its bug report on
   issues.guix.gnu.org if one exists.  E.g., for
   https://ci.guix.gnu.org/build/1170869/details, add an Issue row with a
   link to https://issues.guix.gnu.org/65392, so we can know this build
   failure is known.

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

>> I found the dashboard inconvenient to use, it show failures for both
>> builds and dependencies in the same red color, and can't be searched.
>> What I usually do is:
>>
>> 1. download the job status json with:
>>   wget -O jobs.json 'https://ci.guix.gnu.org/api/jobs?evaluation=692229&system=x86_64-linux'
>>
>> 2. use jq to show package names with build failures:
>>   cat jobs.json  | jq '. | map(select(.status == 1)) | .[].name' -r
>>
>> 3. select interested one to investigate (if doing merge, diff the failures from
>> working branch with master).
>
> Maybe we should open Cuirass feature requests on our bug tracker to
> remember what would be valuable to implement.

Okay, I'll open one here.

Giovanni Biscuolo wrote 2 years ago

Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that

Recipients:

Message-ID:87sf84a5ho.fsf@xelera.eu

Bruno Victal <mirai@makinata.eu> writes:

> On 2023-08-27 02:13, 宋文武 wrote:
>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fail to build in 90 days?

Maybe preceded by an automated email notification (to guix-bugs) so
that interested people have the chance to step in and fix it?

> I'm not so sure about removing packages, personally if I'm in need of
> a package that happens to be broken I find it easier to fix it given
> that some work has already been put into writing the package definition
> than starting from scratch.

You don't need to start from scratch: if you want, you just have to
check out the right git commit (before the package was deleted) and start
from that, if needed.  WDYT?

Happy hacking! Gio'

[...]

Giovanni Biscuolo
Xelera IT Infrastructures

Andy Tai wrote 2 years ago

Re: bug#65391: People need to report failing builds even

Recipients:(address . guix-patches@gnu.org)

Message-ID:CAJsg1E-mWem-j-weaEBKpMjTN+GXBotHZBA2k3NCSQgE_p1JkQ@mail.gmail.com

On 2023-08-27 02:13, 宋文武 wrote:

> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

Hi, maybe only build failures on certain platforms, such as (32-bit) x86,
x86-64 and arm-64, should trigger this treatment, so that an unfixed build
failure on another platform would not get a package removed.

The reason is that most people do not have arm32, PowerPC or RISC-V
hardware, so these platforms may be more likely to suffer build failures,
and for most people x86 and 64-bit arm platforms are what they use.  Build
failures on the less common platforms can be fixed if there are people with
the resources and interest to fix them.

Maxime Devos wrote 2 years ago

Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:

Message-ID:6a62aced-9138-0496-fb01-d5d8e89ba8d6@telenet.be

(I did not receive the e-mails from Andy Tai and 宋文武, I had to look at 
https://issues.guix.gnu.org/65391.)

> Maybe we can automatically report the failures as bugs, say every 7
> days, and remove a package if it still fail to build in 90 days?

The first part looks reasonable to me (though I would decrease 7 days to 
daily or even hourly, as I don't see a point in the delay), but how does 
the second part (removing packages) make sense at all?

I mean, if you do that:

   1. Build failures happen (independent of whether you do that).
   2. Hence, by doing that, the distro shrinks over time.
   3. Leading to frustrated users(*), because the packages they were 
using and which were working well were suddenly removed for no good 
reason (**).
   4. Leading to less people fixing build failures (because of the 
frustration).

which seems rather counter-productive to me.

(I suppose the feedback loop eventually stabilises by ‘less people -> 
less changes made to Guix -> less new build failures -> less 
frustration’, but that's not really a good thing.)

Instead, what about:

 > Maybe we can automatically report the failures as bugs, say every
 > hour, and revert the commit(s) causing the new build failures if they
 > haven't been fixed in a week.

(3 months seems to have too high a chance of merge conflicts and
decreased motivation to fix the mistakes to me.)

Expanding upon this a bit more:

    * Expecting that people fix build failures of X when updating X seems
      reasonable to me, and I think this is not in dispute.

    * Expecting that people using X fix build failures of X or risk the
      package X being deleted when someone else changed a dependency Y of
      X seems unreasonable to me.   More generally, I am categorically
      opposed to:

      ‘If you change something and it breaks something else, you should
      leave fixing the something else to someone (unless you want to
      fix it yourself).’

      (I can think of some situations where this is a good thing, but not
      in general and in particular not in this Guix situation.)

      I mean, I don't know about you, but for me it fails the categorical
      imperative and the so-called Golden Rule.

(*) making no distinction between users and developers here, as the 
latter are users too.

(**) I can think of four classes of causes of new build failures, in all 
of which removing the package usually makes no sense:

     + Non-determinism.  While fixing the non-determinism would be ideal,
       instead of removing the package, you could just retry the build.

     + Time-bombs.  These tend to be simple to fix.  Often they are in
       tests, which at worst you could simply disable, instead of
       removing the package.

     + Update of dependency that is incompatible with the dependent.

       That should be caught at review time -- if there is anything
       that should be removed, it's the update (i.e., revert it).

       Also, Guix supports having multiple versions of a package,
       you could use that?  Or if it is a simple change, you could
       patch things while things haven't diverged much yet (and
       maybe upstream even already has an update to make things
       compatible!)

     + Out-of-memory problems and the like: see non-determinism.

Best regards,
Maxime Devos

Maxim Cournoyer wrote 2 years ago

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:87h6ohc3gk.fsf@gmail.com

Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

> (I did not receive the e-mails from Andy Tai and 宋文武, I had to look
> at <https://issues.guix.gnu.org/65391>.)
>
>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fail to build in 90 days?
>
> The first part looks reasonable to me (though I would decrease 7 days
> to daily or even hourly, as I don't see a point in the delay), but how
> does the second part (removing packages) make sense at all?
>
> I mean, if you do that:
>
>   1. Build failures happen (independent of whether you do that).
>   2. Hence, by doing that, the distro shrinks over time.
>   3. Leading to frustrated users(*), because the packages they were
>   using and which were working well were suddenly removed for no good
>   reason (**).
>   4. Leading to less people fixing build failures (because of the
>   frustration).

We could bump the expiry time to 180 days, or even 365 days (a full
year).  If nobody opens an issue for a broken package in that amount of
time, it's probably not used much if at all and may not be worth the
maintenance burden.  It can always be resurrected from the git history
if someone is motivated to pick it up.  Looking for removed packages
from the git history could become a second instinct if this was made
policy.  We already have a yasnippet snippet that automates commit
message for package removal: 'remove... TAB', which makes it easy to
search for:

    git log --grep='gnu: Remove'
    commit 72abf72062f0e813efb633e05b42c99c4bc78cff
    Author: Maxim Cournoyer <me>
    Date:   Fri Aug 11 21:29:54 2023 -0400

        gnu: Remove qtquickcontrols2.

        * gnu/packages/qt.scm (qtquickcontrols2): Delete variable.
        (pyotherside) [inputs]: Remove qtquickcontrols2.

    [...]

It's frustrating for users when a package is missing, but it's also
frustrating/inefficient for maintainers to stumble upon broken packages
when checking if an upgrade broke dependent packages (it takes time to
build them just to find out they fail, and to research whether they
already did), so a balance is needed.

> which seems rather counter-productive to me.
>
> (I suppose the feedback loop eventually stabilises by ‘less people ->
> less changes made to Guix -> less new build failures -> less
> frustration’, but that's not really a good thing.)
>
> Instead, what about:
>
>> Maybe we can automatically report the failures as bugs, say every
>> hour, and revert the commit(s) causing the new build failures if they
>> haven't been fixed in a week.


> (3 months seems to have to high a chance of merge conflicts and
> decreased motivation to fix the mistakes to me.)
>
> Expanding upon this a bit more:
>
>    * Expecting that people fix build failures of X when updating X seems
>      reasonable to me, and I think this is not in dispute.
>
>    * Expecting that people using X fix build failures of X or risk the
>      package X being deleted when someone else changed a dependency Y of
>      X seems unreasonable to me.   More generally, I am categorically
>      opposed to:
>
>      ‘If you change something and it breaks something else, you should
>      leave fixing the something else to someone (unless you want to
>      fix it yourself).’
>
>      (I can think of some situations where this is a good thing, but not
>      in general and in particular not in this Guix situation.)
>
>      I mean, I don't know about you, but for me it fails the categorical
>      imperative and the so-called Golden Rule.

I think we can all assume contributors are acting in good faith and
are ready to fix any problems resulting from their installed changes;
but they need to be made aware of these failures.

Which to me suggests we (again) need better tooling (that's already
improved much with the QA service, thanks to Christopher's efforts).
It can still be improved; the QA could for example notify contributors
by email when their patch or series has broken something, like the CI
of forges typically does, or gain other UI improvements to make it easier
to see what has been broken.  Cuirass in particular would benefit from a
status:failed-new (freshly broken) query ability.  I think the data is
already there, it just needs to be exposed.

I've opened new feature requests for the CI to help with that:
https://issues.guix.gnu.org/65594 ("[feature] [qa] Notify users by email
of problems") and https://issues.guix.gnu.org/65595 ("[feature]
[cuirass] Add ability to filter builds for status:failed-new").
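
What a status:failed-new query would have to compute can be sketched in a few lines.  This assumes that two consecutive evaluations are available as mappings from job name to status, that status 1 means a build failure (as in the jq filter earlier in the thread), and that 0 means success, which is an assumption here; the function name is hypothetical and this is not Cuirass code.

```python
SUCCEEDED = 0  # assumed status code for a successful build
FAILED = 1     # build failure, as used earlier in the thread


def freshly_broken(previous, current):
    """Return the jobs that built in `previous` but fail in `current`.

    Both arguments map a job name to its status in one evaluation.  Jobs
    that were already failing, or that are new, are left out on purpose.
    """
    return sorted(
        name for name, status in current.items()
        if status == FAILED and previous.get(name) == SUCCEEDED)
```

Restricting the result to jobs that previously succeeded is what separates "freshly broken" from the long-standing failures that make the dashboard noisy.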

Thanks for weighing in!

Thanks,
Maxim

Maxime Devos wrote 2 years ago

Recipients:(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)

Message-ID:ac627ec6-f21a-fd24-1151-b47d6d2c84b3@telenet.be

>> The first part looks reasonable to me (though I would decrease 7 days
>> to daily or even hourly, as I don't see a point in the delay), but how
>> does the second part (removing packages) make sense at all?
>> 
>> I mean, if you do that:
>> 
>>   1. Build failures happen (independent of whether you do that).
>>   2. Hence, by doing that, the distro shrinks over time.
>>   3. Leading to frustrated users(*), because the packages they were
>>   using and which were working well were suddenly removed for no good
>>   reason (**).
>>   4. Leading to less people fixing build failures (because of the
>>   frustration).

> We could bump the expiry time to 180 days, or even 365 days (a full
> year).  If nobody opens an issue for a broken package in that amount of
> time, it's probably not used much if at all and may not be worth the
> maintenance burden.

Please read the subject line of the original message, subject lines 
aren't just fluff.

More to the point, no, it doesn't mean that the package is not used
much, it could instead mean that the people using the package (or 
interested in using the package, if it was already broken when they 
discovered it) thought that the existence of ci.guix.gnu.org means that 
contributors doing Guix maintenance already know that the package is 
broken and assumed that it would be fixed, and that a new bug report 
would just be annoying the contributors because they already have a bug 
report: the build failure on ci.guix.gnu.org.

 > It can always be resurrected from the git history if someone is
 > motivated to pick it up. Looking for removed packages from the git
 > history could become a second instinct if this was made policy.
 > [trimmed yasnippet stuff]

Yes, all this could be done.  But how does any of this address my 
arguments you quoted at all?

On 29-08-2023 at 16:45, Maxim Cournoyer wrote:

> It's frustrating for users when a package is missing, but it's also
> frustrating/inefficient for maintainers to stumble upon broken packages
> when checking if an upgrade broke dependent packages (it takes time to
> build them just to find out they fail, and researching they already
> did), so a balance is needed.

This part, OTOH, actually has something to do with what you quoted.

Again, as I wrote previously, maintainers are users too -- if something 
is frustrating to users it is frustrating to users because 
maintainers⊆users.  What remains is the quantity of frustration, which 
is a valid point, but how would you even quantify that?  I don't know 
about you, but I don't know how to do that, so while a valid point, it 
doesn't seem a useful point to me because it seems impossible to 
determine whether it is a point for or against.

Also, the amount of frustration would be less than what you appear to 
believe it to be:

If maintainers check that no new build failures are created, then over 
time the total amount of old build failures becomes roughly zero 
(roughly, because of occasional mistake and new timebombs).

Then, the frustration of researching they already did mostly disappears. 
(Other sources of inefficiency and frustration remain.)

Also, I believe there shouldn't be a balance, or IOW, the balance should 
tilt almost completely towards no new broken packages and no removals (*).

I mean, having reliable non-broken packages (and services, installation 
etc.) is the whole point of a distro, and if that inherently results in 
frustration for people modifying the distro, IMO that means the 
frustration should be minimised (see e.g. better tooling suggestions) or 
computers should stop being used, not that Guix should stop being a distro.

(*) Sometimes upstream is really not with the times instead of slightly 
out of touch, sometimes the broken package has a good replacement and 
often security updates need to be performed before they existed, but the 
‘remove packages’ proposal is not limited to such exceptions.

 >> [some other part]

>> Expanding upon this a bit more:
>> 
>>    * Expecting that people fix build failures of X when updating X seems
>>      reasonable to me, and I think this is not in dispute.
>> 
>>    * Expecting that people using X fix build failures of X or risk the
>>      package X being deleted when someone else changed a dependency Y of
>>      X seems unreasonable to me.   More generally, I am categorically
>>      opposed to:
>> 
>>      ‘If you change something and it breaks something else, you should
>>      leave fixing the something else to someone (unless you want to
>>      fix it yourself).’
>> 
>>      (I can think of some situations where this is a good thing, but not
>>      in general and in particular not in this Guix situation.)
>> 
>>      I mean, I don't know about you, but for me it fails the categorical
>>      imperative and the so-called Golden Rule.

> I think we can all assume contributors are acting in good faith and
> are ready to fix any problems resulting from their installed changes;
> but they need to be made aware of these failures. [...]

Again, how does this reply address what you quoted?   Like, this is a
valuable reply (and I mostly agree with it, but I would qualify 
‘contributors’ as ‘most regular contributors’ (**)) ... but it is not a 
good reply to what you quoted.

   * if you left out the quote or separated your reply from the quote
     (more explicitly, you could e.g. start with ‘On related matters,
     ...’), it would be fine.

   * but if you don't, then you're blatantly ignoring what I wrote, which
     is not fine at all.

It's something I have encountered and pointed out (less explicitly) in 
the past in other threads as well.

(**) If you want me to, I could send you an example of someone writing a
single message (and no other messages to Guix) in bad faith by PM.

 > [tooling / QA improvement suggestions]

Agreed.

Best regards,
Maxime Devos.

Attachment: OpenPGP_0x49E3EE22191725EE.asc

Attachment: OpenPGP_signature

Maxime Devos wrote 2 years ago

Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that

Recipients:(address . 65391@debbugs.gnu.org)

Message-ID:fdf86517-7aac-a2e3-223b-23e6ef7a90d5@telenet.be

> [Two mails previously]

 > Also the CI UI could use some improvements.  I'm pretty sure I've
 > mentioned this before, but there is no easy way to find out which
 > inputs I need to fix to make a dependency failure disappear.

> [...]
> That is precisely what the linear search algorithm is.  I should not
> have to look through the dependency tree to figure out if two package
> failures have the same cause, or to know how many (possibly indirect)
> dependencies of a package are failing.
> As an example, pandoc often fails to build on i686, but when you look at
> the CI page, you see that it was caused by several of its inputs
> failing, all due to some of *their* dependencies.
> Now, you could dig down on one branch of the dependency DAG and find one
> failing package, but that doesn't *actually* answer the question: "what
> packages do I need to fix to enable this one?", because it could have
> multiple failing inputs instead of just one.  The only way to tell is to
> look at each page, that means having to visually find each failing input
> on the page, wait for their CI pages to load, and repeat the whole
> process.
> If your browser is not particularly fast or you aren't so quick at
> navigating a webpage, this can take a while.
> But for the CI server, generating this information would take less than
> a second.
> Maybe some people value their time so little that they are fine with
> doing this the manual way, but personally I have better things to do.
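The question in the quoted text ("what packages do I need to fix to enable this one?") amounts to a graph traversal. The following is only a sketch under stated assumptions: it is not an existing Cuirass feature, the `root_failures` helper is hypothetical, and it assumes the CI can tell a package whose own build failed apart from one that merely has a failing dependency. The package names other than pandoc are made up.

```python
def root_failures(package, inputs, build_failed):
    """Collect every (possibly indirect) input of `package` whose own build failed.

    `inputs` maps a package name to its direct inputs; `build_failed` is the
    set of packages whose own build failed, as opposed to packages that could
    not be attempted because a dependency failed.  Both are hypothetical
    stand-ins for data the CI already has.
    """
    roots, seen, stack = set(), set(), [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in build_failed:
            # Its build was attempted, so its inputs were available:
            # record it and stop descending along this branch.
            roots.add(current)
            continue
        stack.extend(inputs.get(current, ()))
    return sorted(roots)


# Hypothetical dependency graph: two inputs of pandoc share one broken
# dependency, and one of them has a second broken dependency as well.
inputs = {
    "pandoc": ["ghc-a", "ghc-b"],
    "ghc-a": ["ghc-x"],
    "ghc-b": ["ghc-x", "ghc-y"],
}
build_failed = {"ghc-x", "ghc-y"}

print(root_failures("pandoc", inputs, build_failed))
```

Following only one branch by hand would find `ghc-x` and miss `ghc-y`; the traversal visits each package once and reports both.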

ci.guix.gnu.org loads fast enough for me in my experience, but I do 
agree that more automation is good!

(I usually don't respond to e-mails I agree with except for 
superficialities, but I was wondering if such non-replies are actually 
interpreted as such, or as disagreements, or neither.)

Best regards,
Maxime Devos.

Attachment: OpenPGP_0x49E3EE22191725EE.asc

Attachment: OpenPGP_signature

Maxim Cournoyer wrote 2 years ago

Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:87zg296z7y.fsf@gmail.com

Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

>>> The first part looks reasonable to me (though I would decrease 7 days
>>> to daily or even hourly, as I don't see a point in the delay), but how
>>> does the second part (removing packages) make sense at all?
>>> I mean, if you do that:
>>>   1. Build failures happen (independent of whether you do that).
>>>   2. Hence, by doing that, the distro shrinks over time.
>>>   3. Leading to frustrated users(*), because the packages they were
>>>   using and which were working well were suddenly removed for no good
>>>   reason (**).
>>>   4. Leading to less people fixing build failures (because of the
>>>   frustration).
>
>> We could bump the expiry time to 180 days, or even 365 days (a full
>> year).  If nobody opens an issue for a broken package in that amount of
>> time, it's probably not used much if at all and may not be worth the
>> maintenance burden.
>
> Please read the subject line of the original message, subject lines
> aren't just fluff.

Believe it or not, I actually did! :-) I was replying to the first part
of your message, where you mentioned you were against packages removal.
My reply was giving support to devising policy that would define when
it's acceptable to prune the distribution of broken/unmaintained
packages, which is tangentially related to the topic of reporting broken
packages.  These are just ideas and if we decide to turn some of them
into policy we could write it in a way that would favor resolving
problems instead of just making them disappear.

[...]

> If maintainers check that no new build failures are created, then over
> time the total amount of old build failures becomes roughly zero
> (roughly, because of occasional mistake and new timebombs).

You mean that the building vs failing ratio improves, right? I'm all
for giving a best effort to keep as many packages as we have the
capacity to do, but at some point the Pareto principle kicks in and you
realize there's not that much value in spending 3 days trying to fix a
hardly maintained leaf package that has been failing to build for a year
or two.

[...]

> (*) Sometimes upstream is really not with the times instead of
> slightly out of touch, sometimes the broken package has a good
> replacement and often security updates need to be performed before
> they existed, but the ‘remove packages’ proposal is not limited to
> such exceptions.

This is the kind of considerations that we could mention in a package
removal policy (basically mention it's a last resort thing).

>>> [some other part]
>>> Expanding upon this a bit more:
>>>    * Expecting that people fix build failures of X when updating X
>>> seems
>>>      reasonable to me, and I think this is not in dispute.
>>>    * Expecting that people using X fix build failures of X or risk
>>> the
>>>      package X being deleted when someone else changed a dependency Y of
>>>      X seems unreasonable to me.   More generally, I am categorically
>>>      opposed to:
>>>      ‘If you change something and it breaks something else, you
>>> should
>>>      leave fixing the something else to someone (unless you want to
>>>      fix it yourself).’
>>>      (I can think of some situations where this is a good thing,
>>> but not
>>>      in general and in particular not in this Guix situation.)
>>>      I mean, I don't know about you, but for me it fails the
>>> categorical
>>>      imperative and the so-called Golden Rule.
>>
>> I think we can all assume contributors are acting in good faith and
>> are ready to fix any problems resulting from their installed changes;
>> but they need to be made aware of these failures.  [...]
>
> Again, how does this reply addresses what you quoted?   Like, this is
> a valuable reply (and I mostly agree with it, but I would qualify
> ‘contributors’ as ‘most regular contributors’ (**)) ... but it is not
> a good reply to what you quoted.
>
>   * if you left out the quote or separated your reply from the quote
>     (more explicitly, you could e.g. start with ‘On related matters,
>     ...’), it would be fine.
>
>   * but if you don't, then you're blatantly ignoring what I wrote, which
>     is not fine at all.

The text of yours I quoted was to provide some context as to what I was
answering to; I replied to the essence of your argument I synthesized
from it, not point by point as I agreed with it and it wouldn't have
added much to do so.

> It's something I have encountered and pointed out (less explicitly) in
> the past in other threads as well.

I think it's a common reaction when faced with a detailed text -- some
people may simply ignore it, feeling overwhelmed, or they may synthesize
the essence of it to keep it high level and the discussion more fluid.
I don't think it should be perceived as mean; a partial reply is still
better than none.

-- 
Thanks,
Maxim

Maxim Cournoyer wrote 2 years ago

Re: bug#65391: People need to report failing builds even though we have ci.guix.gnu.org for that

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:87sf816yu1.fsf@gmail.com

Hi Maxime,

Maxime Devos <maximedevos@telenet.be> writes:

[...]

> (I usually don't respond to e-mails I agree with except for
> superficialities, but I was wondering if such non-replies are actually
> interpreted as such, or as disagreements, or neither.)

I'd say it's safer to assume neither, though perhaps with a slight bias
toward agreement, especially if the person was otherwise actively
participating in the conversation (as I would expect people are most
likely to post a reply when they disagree with something).

-- 
Thanks,
Maxim

宋文武 wrote 2 years ago

Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:(name . Maxime Devos)(address . maximedevos@telenet.be)

Message-ID:87edjkvn6b.fsf@envs.net

Maxime Devos <maximedevos@telenet.be> writes:

>> Maybe we can automatically report the failures as bugs, say every 7
>> days, and remove a package if it still fail to build in 90 days?
> The first part looks reasonable to me (though I would decrease 7 days
> to daily or even hourly, as I don't see a point in the delay), but how
> does the second part (removing packages) make sense at all?

Oh, to be clearer, I didn't mean automatically removing a package, but
notifying guix-devel to consider removing one if its "fail to build" issue
has existed for a long time and no one cares.

> [...]
> Instead, what about:
>> Maybe we can automatically report the failures as bugs, say every
>> hour, and revert the commit(s) causing the new build failures if they
>> haven't been fixed in a week.

Yes, automatically reporting bugs would be helpful. And I'll leave the
reverting rights to committers; reverting usually needs some research and
may be risky.

> [...]
> Expanding upon this a bit more:
>
>    * Expecting that people fix build failures of X when updating X seems
>      reasonable to me, and I think this is not in dispute.
>
>    * Expecting that people using X fix build failures of X or risk the
>      package X being deleted when someone else changed a dependency Y of
>      X seems unreasonable to me.   More generally, I am categorically
>      opposed to:
>
>      ‘If you change something and it breaks something else, you should
>      leave fixing the something else to someone (unless you want to
>      fix it yourself).’
>
>      (I can think of some situations where this is a good thing, but not
>      in general and in particular not in this Guix situation.)
>
>      I mean, I don't know about you, but for me it fails the categorical
>      imperative and the so-called Golden Rule.

I agree. Well, sometimes breakages are overlooked by me, and then it's very
welcome for others to give me a hand.

Thanks.

Dr. Arne Babenhauserheide wrote 2 years ago

Recipients:(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)

Message-ID:871qfkg65s.fsf@web.de

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Believe it or not, I actually did! :-) I was replying to the first part
> of your message, where you mentioned you were against packages removal.
> My reply was giving support to devising policy that would define when
> it's acceptable to prune the distribution of broken/unmaintained
> packages, which is tangentially related to the topic of reporting broken
> packages.

Please don’t remove packages that are broken on the CI. I often had a
case where no substitute was available but the package built just fine
locally. This is not a perfect situation (nicer would be to track why it
doesn’t come from CI — sometimes it’s just a resource problem on the
CI), but if you removed a package I use that would break all updates for
me.

I had that in the past. It’s not a nice situation, because it not only
breaks that one package but also prevents getting security updates until
you find time to inspect what exactly is broken.

And if you depend on that package, stuff stops working. Example: The
changes to the Texlive packages currently break the PDF export of many
pages for me — I have not found the deeper reason yet. And I usually
cannot investigate such problems right-away, because I can’t just drop
everything for hobby automation that should just keep working.

If a change in packages breaks my manifest, that is extremely painful.

Best wishes,
Arne
-- 
Being apolitical
means being political
without noticing it.
draketo.de

Maxime Devos wrote 2 years ago

Recipients:(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)

Message-ID:32c84040-e788-b87d-589c-16f13e2e6c93@telenet.be

> [...]
> Maxime Devos<maximedevos@telenet.be>  writes:
> 
>>>> The first part looks reasonable to me (though I would decrease 7 days
>>>> to daily or even hourly, as I don't see a point in the delay), but how
>>>> does the second part (removing packages) make sense at all?
>>>> I mean, if you do that:
>>>>    1. Build failures happen (independent of whether you do that).
>>>>    2. Hence, by doing that, the distro shrinks over time.
>>>>    3. Leading to frustrated users(*), because the packages they were
>>>>    using and which were working well were suddenly removed for no good
>>>>    reason (**).
>>>>    4. Leading to less people fixing build failures (because of the
>>>>    frustration).
>>> We could bump the expiry time to 180 days, or even 365 days (a full
>>> year).  If nobody opens an issue for a broken package in that amount of
>>> time, it's probably not used much if at all and may not be worth the
>>> maintenance burden.
>> Please read the subject line of the original message, subject lines
>> aren't just fluff.
> Believe it or not, I actually did! :-) I was replying to the first part
> of your message, where you mentioned you were against packages removal.
> My reply was giving support to devising policy that would define when
> it's acceptable to prune the distribution of broken/unmaintained
> packages, which is tangentially related to the topic of reporting broken
> packages.  These are just ideas and if we decide to turn some of them
> into policy we could write it in a way that would favor resolving
> problems instead of just making them disappear.

OK sounds good.

> [...]
> 
>> If maintainers check that no new build failures are created, then over
>> time the total amount of old build failures becomes roughly zero
>> (roughly, because of occasional mistake and new timebombs).
> You mean that the building vs failing ratio improves, right?

That too, but in relation to what I replied to, I meant what I wrote, 
which is a stronger statement.

> I'm all
> for giving a best effort to keep as many packages as we have the
> capacity to do, but at some point the Pareto principle kicks in and you
> realize there's not that much value in spending 3 days trying to fix a
> hardly maintained leaf package that has been failing to build for a year
> or two.
> 
> [...]

The point is that this situation wouldn't happen if build failures were 
addressed soon after their introduction.

If it is noticed that Guix has exceeded its capacity to maintain its 
packages and needs to trim its package set to maintain the remaining 
packages effectively, then while that's unfortunate and possibly 
frustrating to users, I don't have any better option available, but the 
original (^) proposal did not have this ‘if capacity is exceeded’ 
qualifier attached.

(^) In a new e-mail, 宋文武 has amended it a bit.

(It fails the ‘distro ≃ reliable packages’ property since packages were 
removed, but with this approach, it could be a one-time intervention 
with a promise to in the future try to stay within capacity, and future 
package removals could have a nuanced deprecation policy that avoids 
making the packages unreliable(*).)

(*) I was searching for whatever Debian's package removal policy is (as 
an example to base things on), but I only found "apt-get remove" etc.. 
Actually I don't know if Debian has one, but probably I'm just looking 
in the wrong places.

It's important _how_ it is trimmed.  In the original proposal by 宋文武, 
packages are simply removed for failing to build -- there were no 
regards to how difficult it would be to fix the build failure, how 
popular the package is (or would be if it built and people knew about 
it), how useful it is, etc..

On that matter, I think it would be useful to set up a variant of 
something like Debian's popcon, in order to have actual statistics on 
what's popular (sure statistics would be flawed, but I'd think it's easy 
to do better than ‘package fails to build -> unpopular’).  I say 
variant, such that it could also count packages that aren't actually 
installed because they failed to build.  (Maybe have separate ‘desired’ 
and ‘used’ manifests?)

>
>> (*) Sometimes upstream is really not with the times instead of
>> slightly out of touch, sometimes the broken package has a good
>> replacement and often security updates need to be performed before
>> they existed, but the ‘remove packages’ proposal is not limited to
>> such exceptions.
> This is the kind of considerations that we could mention in a package
> removal policy (basically mention it's a last resort thing).

If there is an actual nuanced package removal policy instead of ‘fails 
to build -> remove it’, my objection pretty much goes away.

 >> [...]

> The text of yours I quoted was to provide some context as to what I was
> answering to; I replied to the essence of your argument I synthesized
> from it, not point by point as I agreed with it and it wouldn't have
> added much to do so.

OK, but I don't share your optimism -- while I would (mostly) agree that 
_currently_ most contributors are acting in good faith etc., I would say 
that after the proposed change the frequency of such contributors could 
easily decrease, because:

   * the proposal has no actual ‘acting in good faith etc.’ clause, so
     it's quite vulnerable to rules-lawyering.  I mean, look at the
     difference between how I interpreted the proposal and what
     宋文武 actually wrote -- in retrospect I read too much in it and I
     didn't even try to rules-lawyer.

   * there is (indirectly) an incentive for breaking packages, because
     the motivation for changing a package and the motivation for fixing
     the consequences of that change are different.  (Whether
     motivation change <, = or > motivation fixing consequences depends.)

   * there is little to no incentive for fixing packages you aren't
     personally interested in

Maybe things would work out and people in it for self-interest are also
in it for enlightened self-interest ... (I don't know which way things
would go.)

>> It's something I have encountered and pointed out (less explicitly) in
>> the past in other threads as well.
> I think it's a common reaction when faced with a detailed text -- some
> people may simply ignore it, feeling overwhelmed, or they may synthesize
> the essence of it to keep it high level and the discussion more fluid.
> I don't think it should be perceived as mean; a partial reply is still
> better than none.

k, but I'm ignoring the 'common' part -- common does not imply good.

Best regards,
Maxime Devos

Attachment: OpenPGP_0x49E3EE22191725EE.asc

Attachment: OpenPGP_signature

Maxim Cournoyer wrote 2 years ago

Recipients:(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)

Message-ID:87bkeo73av.fsf@gmail.com

Hi Arne,

"Dr. Arne Babenhauserheide" <arne_bab@web.de> writes:

[...]

> Please don’t remove packages that are broken on the CI. I often had a
> case where no substitute was available but the package built just fine
> locally. This is not a perfect situation (nicer would be to track why it
> doesn’t come from CI — sometimes it’s just a resource problem on the
> CI), but if you removed a package I use that would break all updates for
> me.

I agree!  It'd be important, if we decide to have such a policy, to add
guards such that packages are only removed as a last resort, after
options have been considered, and when it's been broken for a while with
an issue opened for it, and when it's a real problem with the package,
not with our CI.

-- 
Thanks,
Maxim

Simon Tournier wrote 2 years ago

Recipients:

Message-ID:87tts6kym3.fsf@gmail.com

Hi,

On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:

> It's frustrating for users when a package is missing, but it's also
> frustrating/inefficient for maintainers to stumble upon broken packages
> when checking if an upgrade broke dependent packages (it takes time to
> build them just to find out they fail, and researching they already
> did), so a balance is needed.

There is nothing worse for a user than this experience:

    guix search foobar

oh cool, foobar is there, let's try it,

    guix shell foobar

    … wait …
    … stuff is building …
    … laptop is burning …
    … wait …
    Bang!

Keeping broken packages is just an annoyance.  Contributors are annoyed
for the reason given in the quoted paragraph above.  And users are annoyed
as described just above.

I am in favor of setting a policy for removing them.

The question is the way to detect them.  QA can do whatever we want, but
until people are helping Chris (because, IMHO, Chris is already busy
enough keeping stuff running), we probably need to keep our process simple
enough in order to stay actionable and avoid some vacuum of “coulda,
shoulda or woulda”.  For what my opinion is worth on that. :-)

Cheers,
simon

Simon Tournier wrote 2 years ago

Recipients:

Message-ID:87pm2ukxn5.fsf@gmail.com

Hi,

On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

> Please don’t remove packages that are broken on the CI. I often had a
> case where no substitute was available but the package built just fine
> locally. This is not a perfect situation (nicer would be to track why it
> doesn’t come from CI — sometimes it’s just a resource problem on the
> CI), but if you removed a package I use that would break all updates for
> me.

Well, I do not think that any policy will mark a package for removal on
the first build failure.  However, if the same package is still failing
after several X <duration> or attempts, it means something is wrong.
Marking it as a candidate for removal implies:

 1. check if the failure is from CI when it builds locally,
 2. keep a set of packages that we know they are installable.
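The two checks above could be sketched as follows. This is only an illustration, not an agreed policy or existing tooling: the `removal_candidates` helper, the 180-day default (one of the durations floated earlier in the thread) and the package names are all assumptions.

```python
from datetime import date, timedelta


def removal_candidates(failing_since, builds_locally, today,
                       threshold=timedelta(days=180)):
    """Return packages failing on CI for at least `threshold` that are not
    known to build locally.

    `failing_since` maps a package name to the date its CI build started
    failing; `builds_locally` maps a package name to True when someone
    confirmed that a local build succeeds.  Both are hypothetical inputs.
    """
    return sorted(
        name
        for name, since in failing_since.items()
        if today - since >= threshold and not builds_locally.get(name, False)
    )


# Hypothetical data: one long-standing failure, one CI-only failure that
# builds fine locally, and one recent failure.
failing_since = {
    "pkg-old": date(2021, 4, 30),
    "pkg-ci-only": date(2023, 1, 1),
    "pkg-recent": date(2023, 8, 1),
}
builds_locally = {"pkg-ci-only": True}

print(removal_candidates(failing_since, builds_locally, today=date(2023, 9, 11)))
```

A package that appears in the result is only a candidate: someone still has to look at why it fails before anything is removed.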

For instance, ocaml4.07-* packages are failing since more or less April.

https://data.guix.gnu.org/repository/1/branch/master/package/ocaml4.07-ppxlib/output-history

Does it make sense to keep them?  For another example, some perl6-*
packages are failing since… 2021.

https://data.guix.gnu.org/repository/1/branch/master/package/perl6-xml-writer/output-history

Does it make sense to keep them?

The usual situation is that CI is able to build the packages.  The set
of packages that CI is not able to build is very limited and it is the
exception.

Having a rule to deal with the regular broken packages appears to me a
good thing and very helpful to keep Guix reliable.  And that rule cannot
be based on rare exceptional cases.

> If a change in packages breaks my manifest, that is extremely painful.

Yeah, and such a rule for dealing with broken packages will be helpful for
detecting such changes and so avoiding such situations.

Cheers,
simon

Csepp wrote 2 years ago

Re: People need to report failing builds even though we have ci.guix.gnu.org for that

Recipients:(name . Simon Tournier)(address . zimon.toutoune@gmail.com)

Message-ID:cuc34zl6u64.fsf@riseup.net

(changing the subject back to the intended one.  I think the fact that
someone replies to an automated acknowledgement email like once a week
indicates that the emails are not communicating clearly what their
purpose is.  anyways, on to the actual issue at hand.)

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>
>> It's frustrating for users when a package is missing, but it's also
>> frustrating/inefficient for maintainers to stumble upon broken packages
>> when checking if an upgrade broke dependent packages (it takes time to
>> build them just to find out they fail, and researching they already
>> did), so a balance is needed.
>
> There is nothing worse for a user than this experience:
>
>     guix search foobar
>
> oh cool, foobar is there, let's try it,
>
>     guix shell foobar
>
>     … wait …
>     … stuff is building …
>     … laptop is burning …
>     … wait …
>     Bang!
>
> Keeping broken packages is just an annoyance.  Contributors are annoyed
> for the reason given in the quoted paragraph above.  And users are annoyed
> as described just above.
>
> I am in favor of setting a policy for removing them.
>
> The question is the way to detect them.  QA can do whatever we want, but
> until people are helping Chris (because, IMHO, Chris is already busy
> enough keeping stuff running), we probably need to keep our process simple
> enough in order to stay actionable and avoid some vacuum of “coulda,
> shoulda or woulda”.  For what my opinion is worth on that. :-)
>
> Cheers,
> simon

That is not a package problem but a Guix interface problem.  I have been
saying for a while that there needs to be an option to disable all
non-trivial local builds by default when you know your machine can't
handle them.
Alternatively the CI could record some basic resource utilization
information, so users could for example set a limit on RAM.  (Although
this gets tricky for parallel builds.)

Simon Tournier wrote 2 years ago

Recipients:(name . Csepp)(address . raingloom@riseup.net)

Message-ID:CAJ3okZ1XsNiTr6V4b0ogzVbrrXwfhfssWxEFS5q3BuD-Y1xX3Q@mail.gmail.com

Hi,

On Mon, 11 Sept 2023 at 09:33, Csepp <raingloom@riseup.net> wrote:

> That is not a package problem but a Guix interface problem. I have been
> saying for a while that there needs to be an option to disable all
> non-trivial local builds by default when you know your machine can't
> handle them.

IMHO, your proposal is orthogonal to the issue at hand: broken
packages.  In other words, the issue is: how to deal with the set of
packages that will not build and we already know it (for weeks,
months or even years for some).

My workstation can handle all the compilations that are required.  My
laptop is able to offload to it.  The issue about broken packages is not
about the resources.  It is about burning resources for nothing.

About the issue you are speaking about, we already had discussions in
this direction -- you are not the only one saying "the fix needs to do
X" for a while but please keep in mind that "talking does not cook the
rice". ;-)  Well, maybe you could open a ticket with a concrete
use-case.

Cheers,
simon

Dr. Arne Babenhauserheide wrote 2 years ago

Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:(name . Simon Tournier)(address . zimon.toutoune@gmail.com)

Message-ID:874jk183wx.fsf@web.de

Simon Tournier <zimon.toutoune@gmail.com> writes:

> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>> Please don’t remove packages that are broken on the CI. I often had a
>> case where no substitute was available but the package built just fine
>> locally. This is not a perfect situation (nicer would be to track why it
>> doesn’t come from CI — sometimes it’s just a resource problem on the
>> CI), but if you removed a package I use that would break all updates for
>> me.
>
> Well, I do not think that any policy will mark a package for removal on
> the first build failure.  However, if the same package is still failing
> after several X <duration> or attempts, it means something is wrong.
> Marking it as a candidate for removal implies:
>
>  1. check if the failure is from CI when it builds locally,
>  2. keep a set of packages that we know they are installable.
>
> For instance, ocaml4.07-* packages are failing since more or less April.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/ocaml4.07-ppxlib/output-history
>
> Does it make sense to keep them?  For another example, some perl6-*
> packages are failing since… 2021.
>
> https://data.guix.gnu.org/repository/1/branch/master/package/perl6-xml-writer/output-history
>
> Does it make sense to keep them?

This is a good example, but not for removing broken packages. For
perl6-xml-writer removing the package would keep breakage in Guix.

I just checked the build, and this looks like a Guix packaging error
that breaks the tests due to a change to some unrelated package:

/gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/prove6: /gnu/store/ap404x14l604wm0gvaj439ga2vjzwnl7-perl6-tap-harness-0.0.7/bin/.prove6-real: perl6: bad interpreter: No such file or directory

Disabling the tests makes the package build and work.

So here, removing a package would start at the wrong place: some change
between 2021-02-01 and 2021-04-30 broke the perl6-tap-harness and we did
not detect that.

This is a problem that would get hidden by removing broken packages.

The problem is that we (large inclusive we that stands for all users of
Guix) did not track down this problem that causes the build to fail.

From this I see two distinct cases:

- packages broken upstream
- packages broken by changes in Guix

If a package is broken upstream and not going to get fixed and this
requires regular patching in Guix, I agree that we have to remove it at
some point.

If however a change in Guix breaks packages, that change should get
rolled back / reverted and fixed, so it does not break the packages.

8 | ocaml-migrate-parsetree
    ^^^^^^^^^^^^^^^^^^^^^^^
Error: Library "ocaml-migrate-parsetree" not found.

This likely means that a change in the inherited package removed the
input, but the breakage wasn’t detected.

And that’s actually what happened in
386ad7d8d14dee2103927d3f3609acc63373156a
Fri Jan 13 10:54:36 2023 +0000

This commit broke ocaml4.07-ppxlib by cleaning up the inputs of
ocaml-ppxlib (not naming names, this is not about shaming but about
detecting the deeper problem).

It should have been rejected (somehow) by CI. The change it would have
required is this:

diff --git a/gnu/packages/ocaml.scm b/gnu/packages/ocaml.scm
index 8ff755aea9..042432be9a 100644
--- a/gnu/packages/ocaml.scm
+++ b/gnu/packages/ocaml.scm
@@ -6845,6 +6845,9 @@ (define-public ocaml4.07-ppxlib
          (base32
           "0my9x7sxb329h0lzshppdaawiyfbaw6g5f41yiy7bhl071rnlvbv"))))
      (build-system dune-build-system)
+     (propagated-inputs
+      (modify-inputs (package-propagated-inputs ocaml-ppxlib)
+        (prepend ocaml-migrate-parsetree)))
      (arguments
       `(#:phases
         (modify-phases %standard-phases

So for both the cases you named for removal, such a removal would have
caused us to miss actual problems in our process.

This does not mean that there will never be a case in which a package
has to be removed, but given that both cases you showed are likely
self-induced breakage due to changes that should have been rejected as
breaking seemingly unrelated packages, it rather looks like the
situation where removing the package is the right way forward is the
exceptional case.

The norm is that our CI should have detected a problem in the commit
causing the breakage.
(this is reasoning from only two datapoints, so take it with a grain of
salt …)

Can we automatically rebuild all inheriting packages when a package gets
changed?
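
To make the question concrete, here is a minimal Python sketch of what "rebuild all inheriting packages" could mean. It is not how Guix, Cuirass, or QA work; the `inherits_from` table and the function name `affected_by_change` are assumptions made up for illustration, and only the ocaml4.07-ppxlib / ocaml-ppxlib relationship comes from this thread.

```python
from collections import defaultdict, deque


def affected_by_change(changed, inherits_from):
    """Return every package that directly or transitively inherits from `changed`.

    `inherits_from` maps a package name to the name of the package it inherits from.
    """
    # Invert the child -> parent table into parent -> {children}.
    children = defaultdict(set)
    for child, parent in inherits_from.items():
        children[parent].add(child)

    # Breadth-first walk over the inheritance relation.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Illustrative data: the first entry is the relationship named in the thread,
# the second is a hypothetical variant added to show the transitive case.
inherits_from = {
    "ocaml4.07-ppxlib": "ocaml-ppxlib",
    "hypothetical-variant": "ocaml4.07-ppxlib",
}

print(sorted(affected_by_change("ocaml-ppxlib", inherits_from)))
```

Every name returned would be a candidate for a rebuild before the change to the parent package is accepted.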

> The usual situation is that CI is able to build the packages.  The set
> of packages that CI is not able to build is very limited and it is the
> exception.
>
> Having a rule to deal with the regular broken packages appears to me a
> good thing and very helpful to keep Guix reliable.  And that rule cannot
> be based on rare exceptional cases.

A rule should work with known cases, otherwise it causes known breakage.

Also see above: in the two cases you selected, removing the package
would be the wrong path forward.

> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>> If a change in packages breaks my manifest, that is extremely painful.
>
> Yeah, and such rule for dealing with broken packages will be helpful for
> detecting such change and so avoid such situation.

Since a manifest is strictly dependent on all packages defined in it,
removing a single referenced package means that the manifest is broken:
no update works anymore. No security updates come in anymore — even if
the package in question worked locally. This is a situation we should
not cause.

If we had a way to have placeholder packages (similar to the renamings)
that emit warnings for missing packages but do not break the build, that
would reduce the damage done by removing a package. But I think such a
mechanism must be in place and tested before adding a rule to remove
packages.

And as we’ve seen from the two packages you selected, removal wouldn’t
have been the right decision.

The more important question is (serious question and *not* for assigning
blame, but to see whether we can improve processes): with the time we
already spent in this discussion, we could have fixed a lot of packages.
Why did we not do that?

Best wishes,
Arne
-- 
Being apolitical
means being political
without noticing it.
draketo.de

Simon Tournier wrote 2 years ago

Recipients:(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)

Message-ID:87fs3kizd9.fsf@gmail.com

Hi Arne,

( I have not re-read all the thread. )

On Mon, 11 Sep 2023 at 10:30, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

>> Well, I do not think that any policy will mark a package for removal on
>> the first build failure. However, if the same package is still failing
>> after several X <duration> or attempts, it means something is wrong.
>> Marking it as a candidate for removal implies:
>>
>>  1. check if the failure is from CI when it builds locally,
>>  2. keep a set of packages that we know they are installable.
>
> This is a good example, but not for removing broken packages. For
> perl6-xml-writer removing the package would keep breakage in Guix.
> I just checked the build, and this looks like a Guix packaging error

This is exactly the effect if we have a policy. :-)

Please, do not read a policy for the removal of broken packages as an
automatic process. Like you, I think an automatic removal process
would be bad for the user experience.

Maybe I misunderstand what a policy is. For me, a policy is a plan that
is used as a basis for making decisions; a policy helps in reaching
conclusions, which can then lead to some actions.

Somehow this discussion is the implementation of the policy I am
proposing and that would help the maintenance, IMHO. I have manually
marked this package for removal and…

> that breaks the tests due to a change to some unrelated package:

…surprise, surprise, someone has checked. :-)

A policy for the removal of broken packages would let us know what
to do. If the same package is still failing after several X <duration>
or attempts, it means something is wrong.

Currently, either you hit a broken package when doing some Guix
operation, which is a very poor experience, IMHO, or you have
to open the dashboard from CI [1], select some red buttons and
investigate. And we can count on a few fingers the number of people
doing that.

1: https://ci.guix.gnu.org/eval/741273/dashboard

> Disabling the tests makes the package build and work.

Here is the point of my proposal to have a policy for removal of broken
packages: automatically check how many times they have failed to build
and automatically tag them when they are considered problematic. If no
one cares and these tagged packages are not fixed, then let's remove them.

It would drastically help in the maintenance. Otherwise, your help is
very welcome in monitoring all the failures. :-)

> So here, removing a package would start at the wrong place: some change
> between 2021-02-01 and 2021-04-30 broke the perl6-tap-harness and we did
> not detect that.

Yes, that’s where QA should help: detect unrelated changes that have a
long-distance impact on unrelated packages.

        Changes to the branching/commit policy
        Christopher Baines <mail@cbaines.net>
        Thu, 08 Jun 2023 15:24:37 +0100
        id:87y1kuyqew.fsf@cbaines.net
        https://yhetil.org/guix/87y1kuyqew.fsf@cbaines.net
        https://lists.gnu.org/archive/html/guix-devel/2023-06

        [bug#63459] [PATCH] doc: Rewrite the branching strategy.
        Christopher Baines <mail@cbaines.net>
        Fri, 12 May 2023 08:55:20 +0100
        id:f339d15842370b97558b704593848e318462b68d.1683878120.git.mail@cbaines.net
        https://yhetil.org/guix/f339d15842370b97558b704593848e318462b68d.1683878120.git.mail@cbaines.net
        https://issues.guix.gnu.org/msgid/f339d15842370b97558b704593848e318462b68d.1683878120.git.mail@cbaines.net

> This does not mean that there will never be a case in which a package
> has to be removed, but given that both cases you showed are likely
> self-induced breakage due to changes that should have been rejected as
> breaking seemingly unrelated packages, it rather looks like the
> situation where removing the package is the right way forward is the
> exceptional case.

We are miscommunicating. Or we have a very different vision of what
the reliability of Guix should be.

As a regular user, I need perl6-tap-harness, so I type “guix install
perl6-tap-harness”, and bang, it fails.

As a regular user, I do not care whether the problem comes from some
change between 2021-02-01 and 2021-04-30 or from something
else. What I want is that “guix install perl6-tap-harness” just works.

Having a clear policy for removal – again not an automatic removal
procedure – would help all, IMHO.

> The norm is that our CI should have detected a problem in the commit
> causing the breakage.
>
> Can we automatically rebuild all inheriting packages when a package gets
> changed?

CI builds all the commits pushed to Savannah. Not exactly all, but
that’s another story and it does not matter for this discussion.

AFAIK, no one is checking that the commit they are pushing does not
break something. Else they would not push it, I guess. ;-)

Instead, it is QA that builds “pre-commit” (patches). Thanks to
Chris’s tireless work over the years, we have some tools for monitoring
the impact of one change on the whole package set. Somehow, if I have
correctly understood, QA uses the Build Coordinator to list all the
derivations and then build all the new ones generated by the change.

So the answer to your question is yes. :-) As an aside, help is welcome
for improving QA.

> Also see above: in the two cases you selected, removing the package
> would be the wrong path forward.

Removing a package that has been broken since 2021 is the right path
forward. If you care about a package that is marked to be removed soon,
then you fix it or raise your concern. Otherwise it means no one cares,
and so what is the point of keeping broken packages that no one uses?

>> On Wed, 30 Aug 2023 at 12:39, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:
>>> If a change in packages breaks my manifest, that is extremely painful.
>>
>> Yeah, and such rule for dealing with broken packages will be helpful for
>> detecting such change and so avoid such situation.
>
> Since a manifest is strictly dependent on all packages defined in it,
> removing a single referenced package means that the manifest is broken:
> no update works anymore. No security updates come in anymore — even if
> the package in question worked locally. This is a situation we should
> not cause.

Again, I am not proposing an automatic removal process but a policy. A
policy could imply some news or some message saying: these packages will
be removed soon because they are broken.

Assume this case: the package fails on CI and passes on your machine.
Let us assume you have not been annoyed enough to report the failure of
the substitutes.

Currently, the situation can stay like that for a long time. It means
that each time something in the dependency graph of that package is
changed, we burn electricity re-building it for nothing.

What I am proposing is: if the same package is still failing after
several X <duration> or attempts, then we mark it as ‘broken’ and it
becomes a candidate for removal. People who care raise their hand.
And we have a better idea about the real status.
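
The rule "still failing after several X <duration> or attempts" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the build-history format, the function name `removal_candidates`, and the default thresholds (3 attempts, 180 days) are hypothetical, and the sample histories are invented rather than taken from ci.guix.gnu.org.

```python
from datetime import date, timedelta


def removal_candidates(history, today, max_failures=3, max_age=timedelta(days=180)):
    """Return the names of packages to tag as candidates for removal.

    `history` maps a package name to a list of (build_date, succeeded) pairs,
    sorted oldest first.  A package is tagged when its current run of failures
    is at least `max_failures` long OR has lasted at least `max_age`.
    """
    candidates = []
    for name, builds in history.items():
        # Collect the trailing run of failures, i.e. everything since the last success.
        streak = []
        for built_on, succeeded in reversed(builds):
            if succeeded:
                break
            streak.append(built_on)
        if not streak:
            continue  # the latest build succeeded, nothing to tag
        failing_since = streak[-1]  # oldest failure of the current run
        if len(streak) >= max_failures or today - failing_since >= max_age:
            candidates.append(name)
    return sorted(candidates)


# Invented sample data, for illustration only.
history = {
    "long-broken": [(date(2021, 1, 15), True),
                    (date(2021, 5, 1), False),
                    (date(2023, 9, 1), False)],
    "recently-broken": [(date(2023, 8, 1), True),
                        (date(2023, 8, 19), False)],
    "healthy": [(date(2023, 9, 1), True)],
}
today = date(2023, 9, 11)

print(removal_candidates(history, today))  # prints ['long-broken']
```

Tagging only produces a list for people to look at; whether a tagged package is then fixed or removed stays a human decision, which matches the "policy, not automatic removal" distinction made above.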

> The more important question is (serious question and *not* for assigning
> blame, but to see whether we can improve processes): with the time we
> already spent in this discussion, we could have fixed a lot of packages.

This was exactly what I was going to answer you. :-)

> Why did we not do that?

I speak for myself: for many packages that are broken, my first question
is: is it worth investigating? My estimate starts with a mix of
“do I need them?” and “will the user experience be better compared to
the time I spend investigating?”

Cheers,
simon

Csepp wrote 1 year ago

Re: People need to report failing builds even though we have ci.guix.gnu.org for that

Recipients:(name . Simon Tournier)(address . zimon.toutoune@gmail.com)

Message-ID:cucedj4v0ju.fsf@riseup.net

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Hi,
>
> On Mon, 11 Sept 2023 at 09:33, Csepp <raingloom@riseup.net> wrote:
>
>> That is not a package problem but a Guix interface problem.  I have been
>> saying for a while that there needs to be an option to disable all
>> non-trivial local builds by default when you know your machine can't
>> handle them.
>
> IMHO, your proposal is orthogonal with the issue at hand: broken
> packages.  Other said, the issue is: how to deal with the set of
> packages that will not build and we already know it (since weeks,
> months or even years for some).
>
> My workstation can handle all the compilations that are required.  My
> laptop is able offload to it.  The issue about broken packages is not
> about the resources.  It is about burning resources for nothing.
>
> About the issue you are speaking about, we already had discussions in
> this direction -- you are not the only one saying "the fix needs to do
> X" for a while but please keep in mind that "talking does not cook the
> rice". ;-)  Well, maybe you could open a ticket with a concrete
> use-case.
>
> Cheers,
> simon

I was hoping to get some consensus on whether this is actually a
bug/feature that others consider worth tracking, so I kept discussion of
it mostly to guix-devel, but sure, I can make a proper issue for it.

Dr. Arne Babenhauserheide wrote 1 year ago

Re: bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:(name . Simon Tournier)(address . zimon.toutoune@gmail.com)

Message-ID:87v8cg7068.fsf@web.de

Hi,

I’m skipping a lot to get only to the most important points (save time
for us all).

Simon Tournier <zimon.toutoune@gmail.com> writes:

> Instead, it is QA that builds “pre-commit“ (patches). Thanks to
> tireless Chris’s work since years, we have some tools for monitoring the
> impact of one change on the whole package set. Somehow, if I have
> correctly understood, QA uses the Build Coordinator to list all the
> derivations and then build all the new ones generated by the change.
>
> So the answer to your question is yes. :-) Aside, help is welcome for
> improving QA.

So something was missing there that let the change to the ocaml package
slip through this January. This should have raised red flags somewhere.
Do we have documentation on the process? (link?)

>> Since a manifest is strictly dependent on all packages defined in it,
>> removing a single referenced package means that the manifest is broken:
>> no update works anymore. No security updates come in anymore — even if
>> the package in question worked locally. This is a situation we should
>> not cause.

…

> What I am proposing is: if the same package is still failing after
> several X <duration> or attempts, then we mark it as ‘broken’ and it
> becomes a candidate for a removal. People who care raise their hand.
> And we have a better idea about the real status.

This means with the current functionality that the manifest is broken at
that point. Nothing can be updated anymore. I’ve been in that situation
a few times already with broken packages and it caused weeks of not
being able to update because I didn’t have the time to investigate.

That’s why I wrote the following:

> If we had a way to have placeholder packages (similar to the renamings)
> that emit warnings for missing packages but do not break the build, that
> would reduce the damage done by removing a package. But I think such a
> mechanism must be in place and tested before adding a rule to remove
> packages.

This would cause us to collect a slowly growing list of removed packages
that will be ignored (except for the warning) in manifests.

That way we would avoid breaking the setup when removing a package.

(define-public-removed the-package-variable
  (removed-package
    (name "the-package-name")
    (reason-for-removal "upstream stopped working a decade ago")))

The key difference between your scenario "some package is broken and I
cannot install it" and my scenario "I have a package in my manifest that
gets removed, breaking my manifest" is that mine is much more painful
because an update breaks changing a working system.

In my scenario I don’t just see "oh, this doesn’t work, let’s choose
another way", but a way I’ve been using and building on gets broken.

Also I experienced that at least twice already. That I had to go and
investigate before I could add a package to my manifest, because the
manifest was broken by a removed package. In at least one instance I had
not been able to update for several weeks before that and didn’t have
time and energy to investigate.

Once I had missed that my system had not updated in months, because I
did reconfigure in a cron job and a removed package had broken
/etc/config.scm

And we actually select for such breakage, because I cannot see locally
whether a package failed on CI, so while I can see (and have to fix)
packages that fail locally, on-CI-failures are invisible.

So instead of removing a package, I think the first step in a process
should be to warn everyone with that package in the manifest that it’s
broken on CI ⇒ add a warning to that package, like the rename warnings.

If no one takes it up for a few months, replace it with a
removed-package placeholder that warns to clean up the manifest. And
just keep that placeholder in place to avoid breaking manifests.
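
From the point of view of a manifest, the two-step process described above (warn first, then swap in a placeholder) could behave roughly like the following. This is a sketch in Python, not Guix code: `AVAILABLE`, `BROKEN_ON_CI`, `REMOVED`, and `resolve_manifest` are hypothetical names, and the package sets are invented stand-ins for the proposed warnings and `removed-package` placeholders.

```python
import warnings

# Hypothetical stand-ins for the package collection and the proposed markers.
AVAILABLE = {"hello", "flaky-tool"}
BROKEN_ON_CI = {"flaky-tool"}  # step 1: still packaged, but flagged as failing on CI
REMOVED = {"the-package-name": "upstream stopped working a decade ago"}  # step 2


def resolve_manifest(names):
    """Return the packages to install, warning instead of failing where possible."""
    resolved = []
    for name in names:
        if name in REMOVED:
            # Placeholder case: warn and skip, so the rest of the manifest still updates.
            warnings.warn(f"package '{name}' was removed: {REMOVED[name]}")
            continue
        if name not in AVAILABLE:
            # A name that was never known is still a hard error.
            raise LookupError(f"unknown package: {name}")
        if name in BROKEN_ON_CI:
            # Warning case: keep the package, but tell the user it fails on CI.
            warnings.warn(f"package '{name}' is currently failing to build on CI")
        resolved.append(name)
    return resolved


print(resolve_manifest(["hello", "flaky-tool", "the-package-name"]))
```

The design point is that a removed package degrades to a warning while an unknown name still fails, so typos in a manifest are not silently ignored.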

Best wishes,
Arne

Being apolitical
means being political
without noticing it.
draketo.de

Simon Tournier wrote 1 year ago

Recipients:(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)

Message-ID:86r0n4xm0h.fsf@gmail.com

Hi Arne,

On Tue, 12 Sep 2023 at 01:12, "Dr. Arne Babenhauserheide" <arne_bab@web.de> wrote:

> I’m skipping a lot to get only to the most important points (save time
> for us all).

Good initiative, let me do the same. :-)

> That’s why I wrote the following:
>
>> If we had a way to have placeholder packages (similar to the renamings)
>> that emit warnings for missing packages but do not break the build, that
>> would reduce the damage done by removing a package. But I think such a
>> mechanism must be in place and tested before adding a rule to remove
>> packages.
>
> This would cause us to collect a slowly growing list of removed packages
> that will be ignored (except for the warning) in manifests.
>
> That way we would avoid breaking the setup when removing a package.
>
> (define-public-removed the-package-variable
>   (removed-package
>     (name "the-package-name")
>     (reason-for-removal "upstream stopped working a decade ago")))
>

Here you are defining a policy:

 1. set a rule for replacing the package by ’removed-package’
 2. set a rule for effectively removing this package

Somehow you are discussing having a rule to deal with broken
packages. A policy, no? :-)

Having a rule to deal with the regular broken packages appears to me a
good thing and very helpful to keep Guix reliable.

Therefore, we agree that making a policy for dealing with broken
packages is worthwhile and would help to have a better Guix.

It appears to me better to know what I can expect as a user than to
have some surprise after each “guix pull”. I have in mind the sudden
removal of Python 2 packages, for instance. With such a policy, it would
have been smoother, IMHO.

That said, two minor points that do not matter much. :-)

I do not understand your explanation about the manifest because I do not
see the difference between one element of your manifest being broken and
this very same element being removed. In both cases, your manifest is
broken, no? From the point of view of profile generation, broken or
removed does not change the result, does it? Broken or removed only
changes the process for investigating and trying to fix, no?

The only case where it could matter is if your manifest relies on a
package variant. In that case, if the package becomes broken, the
variant might not be. Well, if that’s the case, I would suggest that you
maintain these packages using a plain copy of the inherited package,
because a perfectly working update could break your variant. I mean, if
your manifest relies on a package variant, then this manifest is highly
dependent on the changes, whatever the status of the package.

In all cases, I share your concerns and, like you, I am from time to
time bitten by stuff that breaks. If I am honest, I barely update my
base system. Before an update, I carefully check a commit using “guix
time-machine” and test that my config works. Somehow I often use the
command-line “guix time-machine -- shell -m”.

On a side note, I am not convinced we will have the resources to change
the package definition as you are proposing. That’s another story and it
appears to me part of the discussion for a policy (strategy) for
removing packages. I guess. :-)

That’s long enough. ;-)

Cheers,
simon

Maxime Devos wrote 1 year ago

bug#65391: Acknowledgement (People need to report failing builds even though we have ci.guix.gnu.org for that)

Recipients:

Message-ID:bd82096e-a122-95d2-f52d-b5839c85e7d7@telenet.be

On 07-09-2023 at 13:32, Simon Tournier wrote:

> Hi,
> 
> On Tue, 29 Aug 2023 at 10:45, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
> 
>> It's frustrating for users when a package is missing, but it's also
>> frustrating/inefficient for maintainers to stumble upon broken packages
>> when checking if an upgrade broke dependent packages (it takes time to
>> build them just to find out they fail, and researching they already
>> did), so a balance is needed.
> 
> There is nothing worse as an user to have this experience:
> 
>      guix search foobar
> 
> oh cool, foobar is there, let try it,
> 
>      guix shell foobar
> 
>      … wait …
>      … stuff are building …
>      … laptop is burning …
>      … wait …
>      Bang!
> 
> Keeping broken packages is just annoyances.  Contributor are annoyed
> because as said by the paragraph above.  And user are annoyed as
> described just above.

> I am in favor to set a policy for removing then.

You don't need to keep broken packages, they can be fixed instead. 
Although given later e-mails, I suppose that this hypothetical policy 
for removing them would contain things about fixing them instead.

It's this focus on 'broken -> delete' that bothers me, why is the first 
reaction ‘delete them’, not ‘fix them’?

> On 11-09-2023 at 16:00, Simon Tournier wrote:
>> If you care about one package that is marked to be removed soon, then
>> you fix it or raise your concern. Else it means no one care and so what
>> is the point to keep broken packages that no one uses?

It doesn't mean that.  As I wrote previously:

>> We could bump the expiry time to 180 days, or even 365 days (a full
>> year).  If nobody opens an issue for a broken package in that amount of
>> time, it's probably not used much if at all and may not be worth the
>> maintenance burden.
> [...] 
> No, it doesn't mean that the package is not used much, it could
> instead mean that the people using the package (or interested in using
> the package, if it was already broken when they discovered it) thought
> that the existence of ci.guix.gnu.org means that contributors doing
> Guix maintenance already know that the package is broken and assumed
> that it would be fixed, and that a new bug report would just be
> annoying the contributors because they already have a bug report: the
> build failure on ci.guix.gnu.org.

---

 > The more important question is (serious question and *not* for
 > assigning blame, but to see whether we can improve processes): with
 > the time we already spent in this discussion, we could have fixed a
 > lot of packages.  Why did we not do that?

Speaking only for myself:

    * (because I chose to mostly not work on Guix anymore for reasons that
      aren't relevant to this discussion)

    * if I were to fix broken packages, I would like others to avoid
      creating new breakage (and if breakage occurs, then fix it
      early).  (Otherwise, not much point to it ...)

      Hence, there needs to be some discussion to ensure that other
      people don't create new breakage in the future.

    * hearing ‘delete it’ as first reaction to ‘broken package’ is rather
      demoralising to people fixing packages.  It's so ... defeatist.
      Sure people with this reaction add a few qualifiers to when it is
      to _not_ be removed, but it sounds rather hollow.

Instead of having a ‘removal policy’ that lays down exceptions that 
indicate when the package should instead be kept, I would rather have a 
‘fixing policy’ that has exceptions indicating when the package may 
instead be removed.

In a sense, those are technically equivalent, but the different framing 
makes a difference in motivation.

Best regards,
Maxime Devos.

Attachment: OpenPGP_0x49E3EE22191725EE.asc

Attachment: OpenPGP_signature

Andreas Enge wrote 1 year ago

Recipients:(address . 65391-done@debbugs.gnu.org)

Message-ID:ZcyEPIwweV0TPUJH@jurong

After reading through the first tenth of what seems to be an interesting
discussion and skimming through the remainder, I take the liberty to close
this bug. Such a discussion had better take place on guix-devel; the report
itself does not start with an actionable proposal: "People need to..."
looks more like an infinite task to me that cannot be closed as finished
if taken literally.

I understand that certain concrete proposals coming from the discussion
have been filed as separate issues, and would suggest that people
interested in the topic continue to do so.

Andreas

Closed

This issue is archived.

To comment on this conversation send an email to 65391@debbugs.gnu.org
