texlive: kpathsea doesn't use ls-R database

Status: open
Participants: vicvbcun, Nicolas Goaziou
Owner: unassigned
Submitted by: vicvbcun
Severity: normal

vicvbcun wrote 1 month ago
(address . bug-guix@gnu.org)
Z5dfkh14ZyWISprI@lambda
Hello Guix!

Consider the following example LaTeX document:

\documentclass{article}
\usepackage{mathtools}

\begin{document}
hello world
\end{document}

Compiling it with LuaLaTeX under strace, in a shell with
texlive-scheme-basic, texlive-collection-luatex and
texlive-collection-latexextra, shows that most of the time is spent
recursively searching for input files:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 27.70    0.080138           2     30174           getdents64
 21.99    0.063605           4     15455       259 openat
 17.44    0.050460           3     16179        32 newfstatat
 14.37    0.041583           3     10440     10296 access
  8.42    0.024348           1     15196           close
  7.76    0.022456           1     15201           fstat
  0.79    0.002278           1      1868           write

The picture is similar for pdflatex.

As an extreme example, consider

\documentclass{tudapub}

\begin{document}
hello world
\end{document}

compiled with

texlive-scheme-basic
texlive-collection-luatex
texlive-collection-latexextra
texlive-roboto texlive-urcls
texlive-xcharter
texlive-tuda-ci

This takes over 14 seconds (compared to about 2.7 seconds for lualatex
from Arch Linux), and strace shows:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.60    5.926537           3   1801518           getdents64
 26.46    4.809462           5    900841       284 openat
 20.90    3.799744           4    896057    895349 access
 10.19    1.851520           2    900557           close
  9.49    1.724891           1    900575           fstat
  0.28    0.050743           2     17680       229 newfstatat
  0.04    0.007077           1      6073           read

The cause seems to be that kpathsea doesn't treat the ls-R database as
authoritative: the database is opened, but kpathsea still falls back to
searching the disk recursively.
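
As a rough cross-check of the strace summary for the tudapub run, the
sketch below totals the listed calls and computes the share of `access`
calls that failed. It only re-reads the numbers already given in this
report. The field handling (six fields when the errors column is filled
in, five otherwise) is an assumption about how the table was pasted.

```shell
#!/bin/sh
# Summarise the strace -c table from the tudapub run in this report.
# Assumption: a row has 6 fields when the "errors" column is filled in
# and 5 fields when it is empty; the syscall name is the last field.
summary='32.60 5.926537 3 1801518 getdents64
26.46 4.809462 5 900841 284 openat
20.90 3.799744 4 896057 895349 access
10.19 1.851520 2 900557 close
9.49 1.724891 1 900575 fstat
0.28 0.050743 2 17680 229 newfstatat
0.04 0.007077 1 6073 read'

report=$(printf '%s\n' "$summary" | awk '
    { calls[$NF] = $4; errors[$NF] = (NF == 6 ? $5 : 0); total += $4 }
    END {
        printf "total calls: %d\n", total
        printf "failed access calls: %.2f%%\n",
               100 * errors["access"] / calls["access"]
    }')
printf '%s\n' "$report"
```

Nearly every `access` call fails, which is what one would expect if each
candidate directory is probed on disk instead of being resolved through
the database.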

In the package definition for texlive-libkpathsea, texmf.cnf is modified
such that the TEXMF variable is set without !! in front of
$TEXMFSYSCONFIG, $TEXMFSYSVAR and $TEXMFDIST.
If I override $TEXMF via --cnf-line like

lualatex \
--cnf-line='TEXMF =
{$TEXMFCONFIG,$TEXMFVAR,$TEXMFHOME,!!$TEXMFSYSCONFIG,!!$TEXMFSYSVAR,!!$TEXMFDIST}' \
example.ltx

compilation time for the extreme example above falls to about 2.5
seconds, without excessive searching.
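
The difference between the two TEXMF values is only the "!!" prefix on
the three system trees. A minimal sketch of that transformation follows.
The starting value is an assumption: it is the override from this
message with the "!!" prefixes removed, not a value read from Guix's
texmf.cnf.

```shell
#!/bin/sh
# Add "!!" to the three system trees of a TEXMF brace list.
# Assumption: the starting value is the override from this report with
# the "!!" prefixes stripped; it was not read from an actual texmf.cnf.
texmf='{$TEXMFCONFIG,$TEXMFVAR,$TEXMFHOME,$TEXMFSYSCONFIG,$TEXMFSYSVAR,$TEXMFDIST}'

# Only the listed variable names are prefixed; the per-user trees
# ($TEXMFCONFIG, $TEXMFVAR, $TEXMFHOME) are left searchable on disk.
authoritative=$(printf '%s\n' "$texmf" |
    sed -E 's/\$(TEXMFSYSCONFIG|TEXMFSYSVAR|TEXMFDIST)/!!$\1/g')

printf 'TEXMF = %s\n' "$authoritative"
```

The printed line has the same shape as the value passed to --cnf-line in
this message.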

The comment above the substitution says that the !! construct wouldn't
work for texlive-build-system or when building profiles. I don't know
whether that limitation can be worked around in general, but perhaps it
could be avoided when the packages are installed in a profile (or
environment)?

vicvbcun
Nicolas Goaziou wrote 1 month ago
(address . bug-guix@gnu.org)
8734h18p87.fsf@nicolasgoaziou.fr
Hello,

vicvbcun <guix@ikherbers.com> writes:

> [...]

Thank you for the report. I confirm the issue, unfortunately.

> The cause for this seems to be kpathsea doesn't treat the ls-R database
> as authoritative. It is opened but kpathsea falls back to recursive
> searching.

AFAIU, this should not happen. According to "The TeX Live Guide 2024":

    If a file is not found in the database, by default Kpathsea goes ahead
    and searches the disk. If a particular path element begins with ‘!!’,
    however, only the database will be searched for that element, never
    the disk.

IOW, even if the "!!" prefix is not there, Kpathsea should first look
for files in ls-R, and then on the disk. As you point out, it doesn’t
happen like this, and I don’t know why.
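
To make the documented order concrete, here is a toy model of it. This
is not Kpathsea's implementation: `lookup` is a hypothetical helper, and
the sketch only assumes that ls-R has the layout of `ls -R` output (a
"dir:" header line followed by the names in that directory).

```shell
#!/bin/sh
# Toy model of the lookup order described in the quoted guide text.
# This is NOT kpathsea's code; `lookup` is a hypothetical helper.
#
# lookup ELEMENT FILE
#   ELEMENT is a directory holding an ls-R file, optionally prefixed
#   with "!!" to mark the database as authoritative.
lookup() {
    element=$1
    file=$2
    case $element in
        '!!'*) root=${element#'!!'}; db_only=yes ;;
        *)     root=$element;        db_only=no  ;;
    esac

    # Step 1: consult the database. Remember the current "dir:" header
    # and stop at the first entry whose name equals FILE.
    hit=$(awk -v f="$file" '
        /:$/    { dir = substr($0, 1, length($0) - 1); next }
        $0 == f { print dir "/" f; exit }
    ' "$root/ls-R" 2>/dev/null)
    if [ -n "$hit" ]; then
        echo "database: $root/$hit"
        return 0
    fi

    # Step 2: search the disk, unless the element was marked with "!!".
    if [ "$db_only" = no ]; then
        hit=$(find "$root" -type f -name "$file" | head -n 1)
        if [ -n "$hit" ]; then
            echo "disk: $hit"
            return 0
        fi
    fi
    return 1
}
```

In this model a file listed in ls-R is answered from the database
whether or not "!!" is present; the prefix only matters for files that
are missing from the database. The strace output in the report suggests
the real behaviour differs from this.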

> In the package definition for texlive-libkpathsea, texmf.cnf is modified
> such that the TEXMF variable is set without !! in front of
> $TEXMFSYSCONFIG, $TEXMFSYSVAR and $TEXMFDIST.
> If I override $TEXMF via --cnf-line like
>
> --8<---------------cut here---------------start------------->8---
> lualatex \
> --cnf-line='TEXMF =
> {$TEXMFCONFIG,$TEXMFVAR,$TEXMFHOME,!!$TEXMFSYSCONFIG,!!$TEXMFSYSVAR,!!$TEXMFDIST}' \
> example.ltx
> --8<---------------cut here---------------end--------------->8---
>
> compilation time for the extreme example above falls to about 2.5
> seconds, without excessive searching.

At least it proves our ls-R file is valid, at the expected location.

> The comment above the substitution says that the !! construct wouldn't
> work for texlive-build-system or when building profiles. I don't know
> if it would be possible to work around this but perhaps it could be
> possible to work around this if installed in profile (or environment)?

I don’t understand what you want to install in a profile. The ls-R file
is already built during profile generation. See "guix/profiles.scm".

Maybe we could keep the "!!" prefix and create an ls-R file each time
`texlive-build-system' builds a package and every time
`texlive-updmap.cfg' is an input used to build documentation. In this
case, I'm not sure what should be done for packages propagating TeX
Live libraries without actually using them.

In any case, this would require some experimentation. And it still is
a workaround for a problem we don’t understand yet.

Regards,
--
Nicolas Goaziou
vicvbcun wrote 1 month ago
Re: bug#75893: texlive: kpathsea doesn't use ls-R database
(name . Nicolas Goaziou)(address . mail@nicolasgoaziou.fr)
Z5v80Rea4QfC60x4@lambda
Attachment: file
vicvbcun wrote 1 month ago
Z5wDPOZ_ZAEpk4CP@lambda
On 2025-01-30T23:27:29+0100, vicvbcun wrote
> [...]
>>>The comment above the substitution says that the !! construct wouldn't
>>>work for texlive-build-system or when building profiles. I don't know
>>>if it would be possible to work around this but perhaps it could be
>>>possible to work around this if installed in profile (or environment)?
>>
>>I don’t understand what you want to install in a profile. The ls-R file
>>is already built during profile generation. See "guix/profiles.scm".
>What I meant was that we could maybe use a horrible hack like somehow
>overwriting texmf.cnf or wrapping the engines — anything to avoid
>rebuilding the world. But on a second thought, LaTeX should mostly be
>a build time dependency so that grafting with a version capable of
>handling both the build environment and being installed should work
>well, right? At least until the next TeX Live release.
Actually, on a third thought, the following cursed approach might work:
Create a variant `texlive-libkpathsea/ls-R-authoritative' of
`texlive-libkpathsea' with the only difference being !! in front of
$TEXMFDIST in texmf.cnf and register it as a replacement for
`texlive-libkpathsea'. That way packages are built with the original,
ungrafted version but when a user installs TeX Live packages they get
the version for which the ls-R database is authoritative.

An issue with this would be that ungexp'ing a texlive-* package
referencing `texlive-libkpathsea' should yield the grafted version so
the profile hook would probably need to be changed.

vicvbcun
Nicolas Goaziou wrote 1 month ago
(name . vicvbcun)(address . guix@ikherbers.com)
87tt92doze.fsf@nicolasgoaziou.fr
Hello,

vicvbcun <guix@ikherbers.com> writes:

> On 2025-01-30T23:27:29+0100, vicvbcun wrote
>> [...]
>>>> The comment above the substitution says that the !! construct
>>>> wouldn't work for texlive-build-system or when building profiles.
>>>> I don't know if it would be possible to work around this but
>>>> perhaps it could be possible to work around this if installed in
>>>> profile (or environment)?
>>>
>>>I don’t understand what you want to install in a profile. The ls-R file
>>>is already built during profile generation. See "guix/profiles.scm".
>> What I meant was that we could maybe use a horrible hack like
>> somehow overwriting texmf.cnf or wrapping the engines — anything to
>> avoid rebuilding the world. But on a second thought, LaTeX should
>> mostly be a build time dependency so that grafting with a version
>> capable of handling both the build environment and being installed
>> should work well, right? At least until the next TeX Live release.
> Actually, on a third thought, the following cursed approach might
> work: Create a variant `texlive-libkpathsea/ls-R-authoritative' of
> `texlive-libkpathsea' with the only difference being !! in front of
> $TEXMFDIST in texmf.cnf and register it as a replacement for
> `texlive-libkpathsea'. That way packages are built with the original,
> ungrafted version but when a user installs TeX Live packages they get
> the version for which the ls-R database is authoritative.
>
> An issue with this would be that ungexp'ing a texlive-* package
> referencing `texlive-libkpathsea' should yield the grafted version so
> the profile hook would probably need to be changed.

I pushed a tentative patch in "tex-team" branch. I’m in the process of
testing it but it could take a while as texlive-collection-latexextra
contains more than 1k packages.

Feedback welcome.

Regards,
--
Nicolas Goaziou
Nicolas Goaziou wrote 4 weeks ago
(name . vicvbcun)(address . guix@ikherbers.com)
87pljpebt8.fsf@nicolasgoaziou.fr
Nicolas Goaziou via Bug reports for GNU Guix <bug-guix@gnu.org> writes:

> I pushed a tentative patch in "tex-team" branch. I’m in the process of
> testing it but it could take a while as texlive-collection-latexextra
> contains more than 1k packages.

It seems to be better. On the second run, the "extreme" example from
your original post takes around 6.5 seconds on my machine (that’s still
about 2.5 times longer than your result, but my laptop is old). The
first run takes slightly longer because it needs to populate the font
cache.

I’m going to ask for inclusion in the master branch, but it will not
happen quickly considering the pending queue of merge requests.