Java packages do not appear to keep a reference to their inputs

  • Open
Details
6 participants
  • Julien Lepiller
  • Liliana Marie Prikler
  • Maxim Cournoyer
  • Maxime Devos
  • Tobias Geerinckx-Rice
  • Mark H Weaver
Owner
unassigned
Submitted by
Maxim Cournoyer
Severity
normal
Maxim Cournoyer wrote on 17 Oct 2022 23:04
(name . bug-guix)(address . bug-guix@gnu.org)
87v8oixits.fsf@gmail.com
Hello,

I'm not a Java expert, but this appears to me problematic:

$ guix build java-commons-dbcp
/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0

$ guix gc -R /gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0
/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0

Digging a bit more, peeking into the .jar file, which is a ZIP archive:

$ unzip /gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/\
share/java/java-commons-dbcp.jar -d /tmp/java-commons-dbcp.jar

$ grep -rin CLASSPATH /tmp/java-commons-dbcp.jar
$ grep -rin /gnu/store /tmp/java-commons-dbcp.jar
/tmp/java-commons-dbcp.jar/META-INF/INDEX.LIST:3:/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/share/java/java-commons-dbcp.jar

$ cat /tmp/java-commons-dbcp.jar/META-INF/INDEX.LIST
JarIndex-Version: 1.0

/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/share/java/java-commons-dbcp.jar
org
org/apache
org/apache/commons
org/apache/commons/dbcp2
org/apache/commons/dbcp2/cpdsadapter
org/apache/commons/dbcp2/datasources
org/apache/commons/dbcp2/managed

Still, no traces of the other libraries such as 'java-commons-pool'
which should be referenced.
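
The check above can be approximated in a few lines. This is only a sketch of the idea behind reference scanning: it assumes that a reference is any byte sequence shaped like `/gnu/store/<32-character hash>-<name>`, which is a simplification and not the daemon's actual scanner. The names `STORE_REF`, `store_hashes` and `demo` are made up for the illustration.

```python
import re
import tempfile
from pathlib import Path

# Assumed shape of a store file name: /gnu/store/<32-char hash>-<name>.
# The character class is the Nix-style base32 alphabet (no e, o, t, u).
STORE_REF = re.compile(rb"/gnu/store/([0-9a-df-np-sv-z]{32})-[-+._0-9A-Za-z]+")


def store_hashes(root):
    """Return the set of store hashes found in any regular file below ROOT."""
    found = set()
    for path in Path(root).rglob("*"):
        if path.is_file():
            for match in STORE_REF.finditer(path.read_bytes()):
                found.add(match.group(1).decode("ascii"))
    return found


def demo():
    """Recreate a shortened INDEX.LIST like the one above and scan it."""
    own = "/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0"
    with tempfile.TemporaryDirectory() as tmp:
        index = Path(tmp) / "META-INF" / "INDEX.LIST"
        index.parent.mkdir()
        index.write_text(
            "JarIndex-Version: 1.0\n\n"
            f"{own}/share/java/java-commons-dbcp.jar\norg\norg/apache\n"
        )
        return store_hashes(tmp)


print(sorted(demo()))
```

Only the package's own hash is reported, which matches the `guix gc -R` output: nothing in the unpacked tree names another store item.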

I assume this means grafting doesn't currently work for Java libraries.

--
Thanks,
Maxim
Julien Lepiller wrote on 18 Oct 2022 00:03
Re: bug#58591: Java packages do not appear to keep a reference to their inputs
025A8B84-E6C6-43EC-AAF6-CC93DC2F2BAC@lepiller.eu
You're right, Java packages don't retain references to their inputs; that's why we propagate required dependencies (hmm… sometimes). I don't know how they could reference dependencies directly.

On 17 October 2022 23:04:47 GMT+02:00, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>Hello,
>
>I'm not a Java expert, but this appears to me problematic:
>
>--8<---------------cut here---------------start------------->8---
>$ guix build java-commons-dbcp
>/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0
>
>$ guix gc -R /gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0
>/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0
>--8<---------------cut here---------------end--------------->8---
>
>Digging a bit more, peeking into the .jar file, which is a ZIP archive:
>
>--8<---------------cut here---------------start------------->8---
>$ unzip /gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/\
>share/java/java-commons-dbcp.jar -d /tmp/java-commons-dbcp.jar
>
>$ grep -rin CLASSPATH /tmp/java-commons-dbcp.jar
>$ grep -rin /gnu/store /tmp/java-commons-dbcp.jar
>/tmp/java-commons-dbcp.jar/META-INF/INDEX.LIST:3:/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/share/java/java-commons-dbcp.jar
>
>$ cat /tmp/java-commons-dbcp.jar/META-INF/INDEX.LIST
>JarIndex-Version: 1.0
>
>/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0/share/java/java-commons-dbcp.jar
>org
>org/apache
>org/apache/commons
>org/apache/commons/dbcp2
>org/apache/commons/dbcp2/cpdsadapter
>org/apache/commons/dbcp2/datasources
>org/apache/commons/dbcp2/managed
>--8<---------------cut here---------------end--------------->8---
>
>Still, no traces of the other libraries such as 'java-commons-pool'
>which should be referenced.
>
>I assume this means grafting doesn't currently work for Java libraries.
>
>--
>Thanks,
>Maxim
>
>
>
Mark H Weaver wrote on 18 Oct 2022 03:43
Re: bug#58591: Java packages do not appear to keep a reference to their inputs
87pmepsy78.fsf@netris.org
Julien Lepiller <julien@lepiller.eu> writes:

> You're right, Java packages don't retain references to their inputs;
> that's why we propagate required dependencies (hmm… sometimes). I don't
> know how they could reference dependencies directly.

A better workaround would be to add a phase that installs file(s) in the
output(s) that contain references to the required store items. They
could simply be text files with one line per reference. That would at
least protect the dependencies from the garbage collector.
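
A minimal sketch of that idea, written in Python only to illustrate it (a real phase would live in the Scheme build system). The file name `guix-references`, its location under `share/java`, and the store path used in the demo are all placeholders, not anything Guix defines today.

```python
import tempfile
from pathlib import Path


def write_reference_file(output_dir, input_paths, name="guix-references"):
    """Write one store path per line to OUTPUT_DIR/share/java/NAME.

    Both the file name and its location are hypothetical; the only point
    is that the paths end up as plain text inside the output.
    """
    target = Path(output_dir) / "share" / "java" / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text("".join(f"{path}\n" for path in sorted(input_paths)))
    return target


def demo():
    # Placeholder hash: the real store path of java-commons-pool is not
    # given in this thread.
    pool = "/gnu/store/0123456789abcdfghijklmnpqrsvwxyz-java-commons-pool"
    with tempfile.TemporaryDirectory() as out:
        return write_reference_file(out, [pool]).read_text()


print(demo(), end="")
```

Because the paths are stored as plain, unbroken text, a byte-wise scan of the output would find them.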

The remaining unsolved problem is, of course, grafting.

Mark

--
Disinformation flourishes because many people care deeply about injustice
but very few check the facts. Ask me about https://stallmansupport.org.
Maxim Cournoyer wrote on 18 Oct 2022 04:45
(name . Julien Lepiller)(address . julien@lepiller.eu)(address . 58591@debbugs.gnu.org)
87edv5yhlp.fsf@gmail.com
Hi Julien,

Julien Lepiller <julien@lepiller.eu> writes:

> You're right, Java packages don't retain references to their inputs;
> that's why we propagate required dependencies (hmm… sometimes). I don't
> know how they could reference dependencies directly.

Could we, along with installing Java classes as directories instead of
.jar archive files [0] at a more specific prefix, define a search path
specification that'd set CLASSPATH? Currently I don't see anything
setting CLASSPATH outside of the build systems, so even if we propagate
Java things, I don't see how it'd find them in a profile.

[0] Not exactly sure how that's done yet, but it's mentioned here: https://docs.oracle.com/javase/8/docs/technotes/tools/windows/classpath.html

--
Thanks,
Maxim
Maxime Devos wrote on 18 Oct 2022 09:01
(address . 58591@debbugs.gnu.org)
f23e32f8-aba2-a9b7-db69-886dfb945dcb@telenet.be
On 18-10-2022 04:45, Maxim Cournoyer wrote:
> [...] setting CLASSPATH outside of the build systems, so even if we propagate
> Java things, I don't see how it'd find them in a profile.
FWIW, when I used java things in Guix, I manually did
CLASSPATH=$GUIX_ENVIRONMENT/... or the CLI equivalent (some option
argument of 'java').
Some more automatisation, e.g. in the form of search paths as you
propose, would be nice though.
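
As a rough illustration of what such automation would compute, here is a sketch that builds a CLASSPATH value from the jars found under a profile's `share/java` directory (the layout seen in the build output earlier in this thread). The function name and the demo profile are hypothetical; an actual search path specification would be declared in the package definition, not in Python.

```python
import os
import tempfile
from pathlib import Path


def classpath_from_profile(profile):
    """Join every share/java/*.jar under PROFILE into a CLASSPATH string."""
    jars = sorted(Path(profile, "share", "java").glob("*.jar"))
    return os.pathsep.join(str(jar) for jar in jars)


def demo():
    # Build a throwaway directory that mimics a profile with two jars.
    with tempfile.TemporaryDirectory() as profile:
        java_dir = Path(profile, "share", "java")
        java_dir.mkdir(parents=True)
        for name in ("java-commons-pool.jar", "java-commons-dbcp.jar"):
            (java_dir / name).touch()
        return classpath_from_profile(profile)


print(demo())
```

With `$GUIX_ENVIRONMENT` as the profile, this would be the automated counterpart of setting CLASSPATH by hand.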
Greetings,
Maxime.
Liliana Marie Prikler wrote on 18 Oct 2022 09:36
(address . 58591@debbugs.gnu.org)
0e0a5d5dd55ae78f2eda4e390517d6b5e0325b83.camel@ist.tugraz.at
On Monday, 17.10.2022 at 22:45 -0400, Maxim Cournoyer wrote:
> Hi Julien,
>
> Julien Lepiller <julien@lepiller.eu> writes:
>
> > You're right, Java packages don't retain references to their inputs;
> > that's why we propagate required dependencies (hmm… sometimes). I
> > don't know how they could reference dependencies directly.
>
> Could we, along with installing Java classes as directories instead
> of .jar archive files [0] at a more specific prefix, define a search
> path specification that'd set CLASSPATH?  Currently I don't see
> anything setting CLASSPATH outside of the build systems, so even if
> we propagate Java things, I don't see how it'd find them in a
> profile.
I'd recommend writing an xml file like

<path id="${java-package-name}.classpath">
<pathelement location="${output-jar}" />
<pathelement path="${input1.classpath}" />
...
<pathelement path="${inputn.classpath}" /> 
</path>

to a well-known location. Then we could reuse those files in
ant-build-system.
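
A sketch of how such a file could be generated, following the template above. The function name and the example values are assumptions made for the illustration; the `${...}` strings are passed through untouched, exactly as they appear in the template.

```python
import xml.etree.ElementTree as ET


def classpath_xml(package_name, output_jar, input_classpaths):
    """Return a <path> element, serialized, shaped like the template above."""
    path = ET.Element("path", {"id": f"{package_name}.classpath"})
    # The package's own jar comes first ...
    ET.SubElement(path, "pathelement", {"location": output_jar})
    # ... followed by one entry per input classpath.
    for classpath in input_classpaths:
        ET.SubElement(path, "pathelement", {"path": classpath})
    return ET.tostring(path, encoding="unicode")


print(
    classpath_xml(
        "java-commons-dbcp",
        "/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0"
        "/share/java/java-commons-dbcp.jar",
        ["${java-commons-pool.classpath}"],
    )
)
```

Since the output jar location is written as plain text in an uncompressed file, it would also count as a visible store reference.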

Cheers
Maxim Cournoyer wrote on 18 Oct 2022 15:14
(name . Liliana Marie Prikler)(address . liliana.prikler@ist.tugraz.at)
87y1tdw9yc.fsf@gmail.com
Hello,

Liliana Marie Prikler <liliana.prikler@ist.tugraz.at> writes:

> On Monday, 17.10.2022 at 22:45 -0400, Maxim Cournoyer wrote:
>> Hi Julien,
>>
>> Julien Lepiller <julien@lepiller.eu> writes:
>>
>> > You're right, Java packages don't retain references to their inputs;
>> > that's why we propagate required dependencies (hmm… sometimes). I
>> > don't know how they could reference dependencies directly.
>>
>> Could we, along with installing Java classes as directories instead
>> of .jar archive files [0] at a more specific prefix, define a search
>> path specification that'd set CLASSPATH?  Currently I don't see
>> anything setting CLASSPATH outside of the build systems, so even if
>> we propagate Java things, I don't see how it'd find them in a
>> profile.
> I'd recommend writing an xml file like
>
> <path id="${java-package-name}.classpath">
> <pathelement location="${output-jar}" />
> <pathelement path="${input1.classpath}" />
> ...
> <pathelement path="${inputn.classpath}" /> 
> </path>
>
> to a well-known location. Then we could reuse those files in
> ant-build-system.

A nice read is [0], which mentions the existence of a 'Class-Path' main
attribute that can go in the manifest file. If using unpacked jars
works the same as .jars (which are just zip files) for Java, then we
could not only have dependencies correctly referenced and loaded via
'Class-Path', but also the grafting mechanism would work, since the
paths would appear in clear (not obfuscated due to zip compression).

Our current usage of JarIndex doesn't fit the purpose it was intended
for: it is a performance trick to index all the .jars of a .jar pack,
so it'll only list its dependencies if they are packed in the same jar,
which is not what we do or want as a distribution.
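
Whether a path inside an archive "appears in clear" depends on how the member was written. The sketch below, using Python's standard `zipfile` module, writes the same text once stored and once deflated, then searches the raw archive bytes for the store path. The member name and the helper `jar_bytes` are chosen for the illustration only.

```python
import io
import zipfile

STORE_PATH = (
    b"/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0"
    b"/share/java/java-commons-dbcp.jar"
)


def jar_bytes(member_text, compression):
    """Build an in-memory archive with a single member; return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as jar:
        jar.writestr("META-INF/INDEX.LIST", member_text)
    return buffer.getvalue()


# Repeat the line a few times so that deflate has something to compress.
text = (STORE_PATH + b"\n") * 4

stored = jar_bytes(text, zipfile.ZIP_STORED)
deflated = jar_bytes(text, zipfile.ZIP_DEFLATED)

print("visible when stored:  ", STORE_PATH in stored)
print("visible when deflated:", STORE_PATH in deflated)
```

A stored member keeps its bytes verbatim inside the archive, so a byte-wise scan can still see the path; a deflated member does not.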


--
Thanks,
Maxim
Maxim Cournoyer wrote on 18 Oct 2022 15:29
(name . Liliana Marie Prikler)(address . liliana.prikler@ist.tugraz.at)
87r0z5w983.fsf@gmail.com
Hi,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

[...]

> A nice read is [0], which mentions the existence of a 'Class-Path' main
> attribute that can go in the manifest file. If using unpacked jars
> works the same as .jars (which are just zip files) for Java, then we
> could not only have dependencies correctly referenced and loaded via
> 'Class-Path', but also the grafting mechanism would work, since the
> paths would appear in clear (not obfuscated due to zip compression).

Ugh, Class-Path only accepts relative paths, not absolute paths:

The location of the JAR file or directory represented by this entry
is contained within the containing directory of the context JAR. Use
of "../" to navigate to the parent directory is not permitted, except
for the case when the context JAR is loaded from the file system.

Perhaps we could patch Java so that its loader is more adapted for our
use case, or extend its manifest with a Guix-specific Guix-Class-Path
section that'd allow for absolute paths.
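
For what it's worth, a relative entry between two store items is easy to compute. The sketch below assumes both jars live under `share/java` of their store item; the hash of java-commons-pool is a placeholder, and `relative_class_path` is a made-up helper name. Whether the loader accepts the resulting `../` entries is governed by the text quoted above.

```python
import posixpath


def relative_class_path(context_jar, dependency_jar):
    """Return DEPENDENCY_JAR relative to the directory holding CONTEXT_JAR."""
    return posixpath.relpath(dependency_jar, start=posixpath.dirname(context_jar))


context = (
    "/gnu/store/jghsa6fmh9vjcsmj7wwilk3w6iblvh32-java-commons-dbcp-2.6.0"
    "/share/java/java-commons-dbcp.jar"
)
# Placeholder hash: the real store path is not given in this thread.
dependency = (
    "/gnu/store/0123456789abcdfghijklmnpqrsvwxyz-java-commons-pool"
    "/share/java/java-commons-pool.jar"
)

print(relative_class_path(context, dependency))
```

The result climbs three directories (`java`, `share`, the store item) before descending into the other store item, so the dependency's hash still appears in the entry.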

--
Thanks,
Maxim
Tobias Geerinckx-Rice wrote on 18 Oct 2022 15:21
(name . Maxim Cournoyer)(address . maxim.cournoyer@gmail.com)
87k04xcksv.fsf@nckx
Hi Maxim,

Maxim Cournoyer writes:
> not obfuscated due to zip compression

Groan. Which package(s) compress .jars?

(I found a few in -checkouts, which is its own potential thing,
but that aside.)

Kind regards,

T G-R
Tobias Geerinckx-Rice wrote on 18 Oct 2022 16:17
87fsflcj27.fsf@nckx
Tobias Geerinckx-Rice via Bug reports for GNU Guix writes:
> Groan. Which package(s) compress .jars?

OK, found one: openjdk@16.0.1's /lib/jrt-fs.jar.

Kind regards,

T G-R
Maxim Cournoyer wrote on 18 Oct 2022 16:53
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87ilkhw5cu.fsf@gmail.com
Hi Tobias!

Tobias Geerinckx-Rice <me@tobias.gr> writes:

> Hi Maxim,
>
> Maxim Cournoyer writes:
>> not obfuscated due to zip compression
>
> Groan. Which package(s) compress .jars?

Oh, aren't they all? I hadn't realized .jar compression was optional.
I believe our ant-build-system produces compressed jars; in fact it uses
'zip' directly to pack them.

--
Thanks,
Maxim
Maxim Cournoyer wrote on 18 Oct 2022 16:56
(name . Tobias Geerinckx-Rice)(address . me@tobias.gr)
87edv5w58e.fsf@gmail.com
Hello,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Tobias Geerinckx-Rice <me@tobias.gr> writes:

[...]

>> Groan. Which package(s) compress .jars?
>
> Oh, aren't they all? I hadn't realized .jar compression was optional.

Actually, reading [0] again, it seems a JAR *is* a zip archive, so
cannot be either compressed or uncompressed.

[0] https://docs.oracle.com/en/java/javase/19/docs/specs/jar/jar.html

--
Thanks,
Maxim
Julien Lepiller wrote on 18 Oct 2022 17:32
Re: bug#58591: Java packages do not appear to keep a reference to their inputs
08A37CE8-730E-4FD8-96B5-64CC874BFA9B@lepiller.eu
Hi, replying to a few emails at once.

The ant-build-system uses zip -0 to produce an uncompressed archive. By default, jar produces a compressed one, so there's a repack phase for that:
http://git.savannah.nongnu.org/cgit/guix.git/tree/guix/build/ant-build-system.scm#n226

Embedding the classpath in the manifest is possible but would not have the expected effect. That's because a line in the manifest cannot exceed 72 bytes (see "line length" in https://docs.oracle.com/javase/8/docs/technotes/guides/jar/jar.html#Notes_on_Manifest_and_Signature_Files), so the classpath will look like:

Class-Path: ../../../1234567891011
1213141516/share/java/foo.jar

Although java would read that fine, the grafter will not see it, nor be able to graft foo in a meaningful manner: java would still use the ungrafted version even if another file references foo.
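
The effect can be sketched as follows. The wrapping function is a simplified stand-in that breaks after 72 bytes and starts each continuation line with a single space; real tools may place the breaks at slightly different offsets. Both hashes and the second dependency name are placeholders.

```python
def wrap_manifest_line(line, limit=72):
    """Break LINE into physical lines of at most LIMIT bytes.

    Continuation lines start with one space, which counts towards LIMIT.
    Simplification: the input is assumed to be ASCII, so slicing bytes
    never cuts a multi-byte character.
    """
    data = line.encode("utf-8")
    chunks = [data[:limit]]
    rest = data[limit:]
    while rest:
        chunks.append(b" " + rest[: limit - 1])
        rest = rest[limit - 1 :]
    return b"\r\n".join(chunks) + b"\r\n"


# Placeholder hashes, not real store items.
HASH_1 = "0123456789abcdfghijklmnpqrsvwxyz"
HASH_2 = "zyxwvsrqpnmlkjihgfdcba9876543210"

entries = [
    f"../../../{HASH_1}-java-commons-pool/share/java/java-commons-pool.jar",
    f"../../../{HASH_2}-java-commons-logging/share/java/java-commons-logging.jar",
]
wrapped = wrap_manifest_line("Class-Path: " + " ".join(entries))

print(wrapped.decode("ascii"))
for name, value in (("HASH_1", HASH_1), ("HASH_2", HASH_2)):
    print(name, "found verbatim:", value.encode("ascii") in wrapped)
```

In this example the first hash happens to fit on the first physical line, while the second one straddles a line break and can no longer be found by a byte-wise search.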

On 18 October 2022 16:56:01 GMT+02:00, Maxim Cournoyer <maxim.cournoyer@gmail.com> wrote:
>Hello,
>
>Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
>
>> Tobias Geerinckx-Rice <me@tobias.gr> writes:
>
>[...]
>
>>> Groan. Which package(s) compress .jars?
>>
>> Oh, aren't they all? I hadn't realized .jar compression was optional.
>
>Actually, reading [0] again, it seems a JAR *is* a zip archive, so
>cannot be either compressed or uncompressed.
>
>[0] https://docs.oracle.com/en/java/javase/19/docs/specs/jar/jar.html
>
>--
>Thanks,
>Maxim
Maxim Cournoyer wrote on 19 Oct 2022 01:25
Re: bug#58591: Java packages do not appear to keep a reference to their inputs
(name . Julien Lepiller)(address . julien@lepiller.eu)
87sfjkvhmi.fsf@gmail.com
Hello,

Julien Lepiller <julien@lepiller.eu> writes:

> Hi, replying to a few emails at once.
>
> The ant-build-system uses zip -0 to produce an uncompressed
> archive. By default, jar produces a compressed one, so there's a
> repack phase for that:
> http://git.savannah.nongnu.org/cgit/guix.git/tree/guix/build/ant-build-system.scm#n226

Ah, I had missed the -0 == uncompressed part. Thank you.

> Embedding the classpath in the manifest is possible but would not have
> the expected effect. That's because a line in the manifest cannot
> exceed 72 bytes (see "line length" in
> https://docs.oracle.com/javase/8/docs/technotes/guides/jar/jar.html#Notes_on_Manifest_and_Signature_Files),
> so the classpath will look like:
>
> Class-Path: ../../../1234567891011
> 1213141516/share/java/foo.jar

Although it looks like the 72 bytes line width limitation may have to do
with binary data:

Binary data of any form is represented as base64. Continuations are
required for binary data which causes line length to exceed 72
bytes. Examples of binary data are digests and signatures.

Worth a try in my opinion (I'm giving it a shot as I write this).

Thanks for the explanations!

Maxim