ISO installer image is broken on i686

Done

Details

6 participants

Gábor Boskovits
Brice Waegeneire
Ludovic Courtès
pelzflorian (Florian Pelz)
Thomas Schmitt
swedebugia

Owner: unassigned

Submitted by: Ludovic Courtès

Severity: serious

Merged with

35136

Ludovic Courtès wrote on 6 Dec 2018 01:02

Recipients:(name . Bug Guix)(address . bug-guix@gnu.org)

Message-ID:87d0qfwmih.fsf@gnu.org

Hello,

The ISO installer image as produced on commit

4a0b87f0ec5b6c2dcf82b372dd20ca7ea6acdd9c by

  guix system disk-image --file-system-type=iso9660 \
    -s i686-linux gnu/system/install.scm

contains unreadable file(s), at least /var/guix/db/db.sqlite.

The build at https://hydra.gnu.org/build/3151513 (2018-11-12,

64461ba20a07a0cf3197de3e97cb44e0f66cebdc) seems to be the only occurrence

of the problem I could find on the build farms: while running the

installation off the ISO image, it fails like this:

+ guix --version
guix (GNU Guix) 0.15.0-6.f9a8fce
Copyright (C) 2018 the Guix authors
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
+ export GUIX_BUILD_OPTIONS=--no-grafts
+ GUIX_BUILD_OPTIONS=--no-grafts
+ guix build isc-dhcp
[   95.076694] attempt to access beyond end of device
[   95.080672] sr0: rw=524288, want=2118580, limit=2115840
[   95.082317] attempt to access beyond end of device
[   95.083730] sr0: rw=0, want=2118332, limit=2115840
[   95.097050] attempt to access beyond end of device
[   95.098175] sr0: rw=0, want=2118332, limit=2115840
guix build: error: build failed: cannot open Nix database `/var/guix/db/db.sqlite'

Indeed, if you spawn the image and run “cat /var/guix/db/db.sqlite”, it

fails with EIO and “attempt to access beyond end of device.” I suspect

the bugs Mark reported at https://issues.guix.info/issue/33362 and

https://issues.guix.info/issue/33555 are related.

My guess is that the bug has always existed on ‘core-updates’ since

https://berlin.guixsd.org/build/662745 (‘master’, 2018-11-30, i.e.,

just before ‘core-updates’ was merged) shows a successful installation.

I tried running the ISO image in qemu-system-{x86_64,i386}, with and

without KVM, and the I/O errors are always there, including with a

pre-core-updates QEMU.

I tried reverting xorriso to 1.4.8 to no avail (which is not surprising

since xorriso was upgraded on 2018-09-18 and the successful installation

above dates from 2018-11-30.)

At this point I can only suspect a toolchain issue, probably binutils or

libc since gcc didn’t change.

Thoughts?

This is holding the 0.16.0 release and I’m unavailable to do it next

week and with little time over the next few days. Thus I’m considering

exceptionally releasing without the i686 GuixSD install image; thoughts?

The rest is all fine and ready to ship.

Thanks,

Ludo’.

Ludovic Courtès wrote on 6 Dec 2018 08:15

control message for bug #33639

Recipients:(address . control@debbugs.gnu.org)

Message-ID:87in07m8h2.fsf@gnu.org

severity 33639 serious

Ludovic Courtès wrote on 6 Dec 2018 08:19

Re: bug#33639: ISO installer image is broken on i686

Recipients:(address . 33639@debbugs.gnu.org)

Message-ID:87efavm8b3.fsf@gnu.org

Ludovic Courtès <ludo@gnu.org> skribis:

> The ISO installer image as produced on commit

> 4a0b87f0ec5b6c2dcf82b372dd20ca7ea6acdd9c by

> guix system disk-image --file-system-type=iso9660 \

> -s i686-linux gnu/system/install.scm

> contains unreadable file(s), at least /var/guix/db/db.sqlite.

I can reproduce the I/O error by mounting the image:

ludo@ribbon ~/src/guix$ sudo losetup /dev/loop0 /gnu/store/1yanxg3cz5wi6vhpvhipxvmjwm201fbm-image.iso
ludo@ribbon ~/src/guix$ sudo mount -t iso9660 /dev/loop /mnt/disk/
mount: /mnt/disk: WARNING: device write-protected, mounted read-only.
ludo@ribbon ~/src/guix$ cat < /mnt/disk/var/guix/db/db.sqlite > /dev/null
cat: -: Input/output error
ludo@ribbon ~/src/guix$ dmesg |tail
[   41.186408] shepherd[1]: Service guix-daemon has been started.
[   45.725418] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[   45.933911] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[   49.496112] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   49.496165] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s31f6: link becomes ready
[  203.358136] ISO 9660 Extensions: RRIP_1991A
[  215.199352] attempt to access beyond end of device
[  215.199357] loop0: rw=524288, want=1903876, limit=1899264
[  215.199362] attempt to access beyond end of device
[  215.199363] loop0: rw=0, want=1903532, limit=1899264

So the problem lies with the VM that creates the image.

Ludo’.

swedebugia wrote on 6 Dec 2018 10:35

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:8047bf42762c6c4f8106689097afa32d@riseup.net

On 2018-12-06 01:02, Ludovic Courtès wrote:

snip

> Indeed, if you spawn the image and run “cat /var/guix/db/db.sqlite”, it
> fails with EIO and “attempt to access beyond end of device.”  I suspect
> the bugs Mark reported at <https://issues.guix.info/issue/33362> and
> <https://issues.guix.info/issue/33555> are related.
> 
> My guess is that the bug has always existed on ‘core-updates’ since
> <https://berlin.guixsd.org/build/662745> (‘master’, 2018-11-30, i.e.,
> just before ‘core-updates’ was merged) shows a successful installation.
> 
> I tried running the ISO image in qemu-system-{x86_64,i386}, with and
> without KVM, and the I/O errors are always there, including with a
> pre-core-updates QEMU.
> 
> I tried reverting xorriso to 1.4.8 to no avail (which is not surprising
> since xorriso was upgraded on 2018-09-18 and the successful installation
> above dates from 2018-11-30.)
> 
> At this point I can only suspect a toolchain issue, probably binutils or
> libc since gcc didn’t change.
> 
> Thoughts?
> 
> This is holding the 0.16.0 release and I’m unavailable to do it next
> week and with little time over the next few days.  Thus I’m considering
> exceptionally releasing without the i686 GuixSD install image; thoughts?

Ok, I see.

Has anybody tested that guix pull from 0.15 -> 0.16 works on an install

ISO? (I don't know if we want/agreed to support this at all but 1 bug

suggests problems related to https: )

I say go for release and note it on the download page and provide

0.15-i686 image for now.

I'm using i686 GuixSD on my dev laptop.

Cheers

Swedebugia

Ludovic Courtès wrote on 6 Dec 2018 11:34

Recipients:

Message-ID:874lbrkkog.fsf@gnu.org

Dear Xorriso hackers,

While building an ISO for i686, running Xorriso 1.5.0 built for i686

(actually ‘grub-mkrescue’, but that’s just a wrapper around Xorriso) in

qemu-system-i386, we end up with an ISO image containing files that lead

to I/O errors (“attempt to access beyond end of device”):

ludo@ribbon ~/src/guix$ sudo losetup /dev/loop0 /gnu/store/1yanxg3cz5wi6vhpvhipxvmjwm201fbm-image.iso
ludo@ribbon ~/src/guix$ sudo mount -t iso9660 /dev/loop /mnt/disk/
mount: /mnt/disk: WARNING: device write-protected, mounted read-only.
ludo@ribbon ~/src/guix$ cat < /mnt/disk/var/guix/db/db.sqlite > /dev/null
cat: -: Input/output error
ludo@ribbon ~/src/guix$ dmesg |tail
[   41.186408] shepherd[1]: Service guix-daemon has been started.
[   45.725418] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[   45.933911] IPv6: ADDRCONF(NETDEV_UP): enp0s31f6: link is not ready
[   49.496112] e1000e: enp0s31f6 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
[   49.496165] IPv6: ADDRCONF(NETDEV_CHANGE): enp0s31f6: link becomes ready
[  203.358136] ISO 9660 Extensions: RRIP_1991A
[  215.199352] attempt to access beyond end of device
[  215.199357] loop0: rw=524288, want=1903876, limit=1899264
[  215.199362] attempt to access beyond end of device
[  215.199363] loop0: rw=0, want=1903532, limit=1899264

The output of Xorriso and the kernel when it builds the image looks

good.

(More info at https://issues.guix.info/issue/33639.)

Using the exact same build process for x86_64 leads to valid ISO images.

Does that ring a bell or would you have advice to further debug it?

Thanks,

Ludo’.

Thomas Schmitt wrote on 6 Dec 2018 15:08

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:22800682362436954162@scdbackup.webframe.org

Hi,

> [ 215.199357] loop0: rw=524288, want=1903876, limit=1899264

This looks much like a truncated ISO image. (For whatever reason.)

There are at least 4612 blocks = ~ 9 MiB missing.

In the original message of https://issues.guix.info/issue/33639 the

minimum missing size is about 5 MiB.

Please consider local reasons for truncated ISO images.

In the following i will concentrate on a potential program bug.

> [...] running Xorriso 1.5.0 built for i686 [...] I/O errors [...]

> Using the exact same build process for x86_64 leads to valid ISO images.

Well, this would explain why 1.5.0 passed a regression test on my 64 bit

system with repacking about 200 ISOs, mounting them, and comparing them

with the mounted original ISOs.

I currently lack the opportunity to build 32-bit xorriso.

Is there such a damaged ISO available for download ?

How much effort would it be to create a Guix installation for building

xorriso, running your ISO production, and possibly running xorriso under

gdb ?

(Something for a run like

  qemu-system-i386 \
    -enable-kvm \
    -nographic \
    -m 512 \
    -net nic \
    -net user,hostfwd=tcp::5555-:22 \
    -hda guix_on_qemu.img

with the opportunity to login from the host machine via SSH.

)

What do you get from this xorriso inspection run on a damaged ISO ?

(I tested it with the ISO from https://www.gnu.org/software/guix/download/):

  xorriso -indev guixsd-install-0.15.0.i686-linux.iso \
    -find / -sort_lba -exec report_lba -- \
    >/tmp/xorriso_indev_find.txt 2>&1

In a preliminary test with

guixsd-install-0.15.0.i686-linux.iso

i get in /tmp/xorriso_indev_find.txt :

...

Media summary: 1 session, 454094 data blocks, 887m data, 384g free

...

Report layout: xt , Startlba , Blocks , Filesize , ISO image path

File data lba: 0 , 8527 , 1440 , 2949120 , '/efi.img'

... many other files ...

File data lba: 0 , 453781 , 122 , 249856 , '/var/guix/db/db.sqlite'

The ISO image file size is 929984512 bytes = 454094 blocks.

The image by its inner size counter also claims 454094 blocks.

The data file with the highest storage address ends before block

453781 + 122 = 453903.

That's 191 blocks before the image end. Padding and GPT backup follow.

(The data block size is 2048 bytes.)
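
The block arithmetic above can be re-checked with a minimal Python sketch. The numbers are the ones reported for the 0.15.0 image; the variable names are only illustrative and do not come from xorriso.

```python
# Layout figures reported by xorriso for guixsd-install-0.15.0.i686-linux.iso.
BLOCK_SIZE = 2048              # ISO 9660 data block size in bytes
image_bytes = 929984512        # size of the ISO image file
last_start_lba = 453781        # Startlba of /var/guix/db/db.sqlite
last_blocks = 122              # Blocks of /var/guix/db/db.sqlite

image_blocks = image_bytes // BLOCK_SIZE
payload_end = last_start_lba + last_blocks   # first block after the last file
tail_blocks = image_blocks - payload_end     # room left for padding and GPT backup

print(image_blocks, payload_end, tail_blocks)   # 454094 453903 191
```

A positive tail means every file ends inside the image, which is what a healthy ISO should show.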

So this image looks ok. Let's read all its files:

# mount guixsd-install-0.15.0.i686-linux.iso /mnt/iso

mount: /dev/loop0 is write-protected, mounting read-only

$ tar cf - /mnt/iso | wc

tar: Removing leading `/' from member names

7116387 35887498 1042391040

No i/o error.

Unrelated observation:

xorriso command -pvd_info reports that the ISO was made with xorriso-1.4.8

with

Creation Time: 1970010119010649

This means "1 Jan 1970 19:01:06". Something seems to be wrong with the

system clock of the producer machine.
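
For reference, a small Python sketch of how such a stamp can be decoded, assuming the 16-digit YYYYMMDDhhmmsscc form printed by -pvd_info (the helper name parse_pvd_time is made up for this example):

```python
from datetime import datetime

def parse_pvd_time(stamp: str) -> tuple[datetime, int]:
    """Split a 16-digit YYYYMMDDhhmmsscc stamp into a datetime and hundredths."""
    if len(stamp) != 16 or not stamp.isdigit():
        raise ValueError("expected 16 decimal digits")
    # The first 14 digits are date and time, the last two are 1/100 seconds.
    return datetime.strptime(stamp[:14], "%Y%m%d%H%M%S"), int(stamp[14:])

when, hundredths = parse_pvd_time("1970010119010649")
print(when.isoformat(sep=" "), hundredths)   # 1970-01-01 19:01:06 49
```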

Have a nice day :)

Thomas

Ludovic Courtès wrote on 6 Dec 2018 16:34

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87va46is9h.fsf@gnu.org

Hi Thomas,

Thanks for the quick and insightful reply!

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

>> [ 215.199357] loop0: rw=524288, want=1903876, limit=1899264

> This looks much like a truncated ISO image. (For whatever reason.)

> There are at least 4612 blocks = ~ 9 MiB missing.

> In the original message of https://issues.guix.info/issue/33639 the

> minimum missing size is about 5 MiB.

OK.

> Please consider local reasons for truncated ISO images.

I’ve thought about this but that seems highly unlikely at this point.

> Is there such a damaged ISO available for download ?

No.

> How much effort would it be to create a Guix installation for building
> xorriso, running your ISO production, and possibly running xorriso under
> gdb ?
> (Something for a run like
>
>   qemu-system-i386 \
>      -enable-kvm \
>      -nographic \
>      -m 512 \
>      -net nic \
>      -net user,hostfwd=tcp::5555-:22 \
>      -hda guix_on_qemu.img

You could install Guix on top of your distro following the instructions at

https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html.

Then you would need to run “guix pull” to get a current Guix (0.15.0

itself didn’t have this bug.) And finally, run:

  guix system disk-image --file-system-type=iso9660 \
    -s i686-linux \
    ~/.config/guix/current/share/guile/site/2.2/gnu/system/install.scm

(This command works on an x86_64 machine.)

The result will be an ISO that’s corrupt.

> What do you get from this xorriso inspection run on a damaged ISO ?

> (I tested it with the ISO from https://www.gnu.org/software/guix/download/):

> xorriso -indev guixsd-install-0.15.0.i686-linux.iso \

> -find / -sort_lba -exec report_lba -- \

> >/tmp/xorriso_indev_find.txt 2>&1

I get:

GNU xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.

libisoburn: WARNING : ISO image size 475636s larger than readable size 473456s
xorriso : NOTE : Loading ISO image tree from LBA 0
libburn : SORRY : Read start address 475635s larger than number of readable blocks 473456
xorriso : UPDATE :   46803 nodes read in 1 seconds
xorriso : NOTE : Detected El-Torito boot information which currently is set to be discarded
Drive current: -indev '/gnu/store/v13bryy1mrgrs694drsrknryf204q30j-image.iso'
Media current: stdio file, overwriteable
Media status : is written , is appendable
Boot record  : El Torito , MBR protective-msdos-label grub2-mbr cyl-align-off GPT APM
Media summary: 1 session, 473456 data blocks,  925m data, 45.6g free
Volume id    : 'GUIXSD_IMAGE'
xorriso : NOTE : Tolerated problem event of severity 'SORRY'
Report layout: xt , Startlba ,   Blocks , Filesize , ISO image path
File data lba:  0 ,     8612 ,      720 ,  1474560 , '/efi.img'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/1zzgag2ca7xzklss2j6phh4580cgkbl2-flac-1.3.2/share/doc/flac-1.3.2/FLAC.tag'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/55m1dng1zw7fq7ni73nm2v7b84wghpka-libx11-1.6.6/share/X11/locale/am_ET.UTF-8/XI18N_OBJS'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/55m1dng1zw7fq7ni73nm2v7b84wghpka-libx11-1.6.6/share/X11/locale/cs_CZ.UTF-8/XI18N_OBJS'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/55m1dng1zw7fq7ni73nm2v7b84wghpka-libx11-1.6.6/share/X11/locale/el_GR.UTF-8/XI18N_OBJS'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/55m1dng1zw7fq7ni73nm2v7b84wghpka-libx11-1.6.6/share/X11/locale/fi_FI.UTF-8/XI18N_OBJS'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/746645dl4fmz9h12x247nyznalswqyzp-groff-minimal-1.22.3/share/groff/1.22.3/tmac/mm/locale'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/746645dl4fmz9h12x247nyznalswqyzp-groff-minimal-1.22.3/share/groff/1.22.3/tmac/mm/se_locale'
File data lba:  0 ,    25032 ,        0 ,        0 , '/gnu/store/a1vpwa7wkxbxw18sz70rmp3cdfnf3jdj-libvorbis-1.3.6/share/doc/libvorbis-1.3.6/doxygen-build.stamp'
File data lba:  0 ,    25032 ,        0 ,        0 , '/mach_kernel'
File data lba:  0 ,    25034 ,     1173 ,  2400500 , '/boot/grub/fonts/unicode.pf2'
File data lba:  0 ,    26207 ,        1 ,     1520 , '/boot/grub/grub.cfg'
File data lba:  0 ,    26207 ,        1 ,     1520 , '/gnu/store/3zq39lvf12a87zcfrg87xgkllgfsyw3b-grub.cfg'
File data lba:  0 ,    26208 ,        5 ,     9928 , '/boot/grub/i386-efi/acpi.mod'

[…]

File data lba:  0 ,   475300 ,        1 ,     1651 , '/gnu/store/zrg4c2d0lvyw8z9xgh0darzglbxrm6b7-iptables-1.6.2/share/man/man8/iptables-restore.8.gz'
File data lba:  0 ,   475301 ,        1 ,     1137 , '/gnu/store/zrg4c2d0lvyw8z9xgh0darzglbxrm6b7-iptables-1.6.2/share/man/man8/iptables-save.8.gz'
File data lba:  0 ,   475302 ,        4 ,     7837 , '/gnu/store/zrg4c2d0lvyw8z9xgh0darzglbxrm6b7-iptables-1.6.2/share/man/man8/iptables.8.gz'
File data lba:  0 ,   475306 ,       47 ,    96256 , '/System/Library/CoreServices/boot.efi'
File data lba:  0 ,   475353 ,        1 ,      236 , '/System/Library/CoreServices/SystemVersion.plist'
File data lba:  0 ,   475354 ,        1 ,     1399 , '/System/Library/CoreServices/.disk_label'
File data lba:  0 ,   475355 ,        1 ,       10 , '/System/Library/CoreServices/.disk_label.contentDetails'
File data lba:  0 ,   475356 ,       88 ,   180224 , '/var/guix/db/db.sqlite'
xorriso : NOTE : -return_with SORRY 32 triggered by problem severity SORRY

Something’s fishy, and Xorriso is sorry. :-)

Let me know if I can provide more info.

In the meantime I’ll see if I can build the image from x86_64 instead.

> Unrelated observation:

> xorriso command -pvd_info reports that the ISO was made with xorriso-1.4.8

> with

> Creation Time: 1970010119010649

> This means "1 Jan 1970 19:01:06". Something seems to be wrong with the

> system clock of the producer machine.

For reproducibility purposes we set timestamps and related things to the

Epoch. This pseudo-UUID/timestamp is actually derived from the config

of the operating system in the image. It’s expected. :-)

Thank you!

Ludo’.

Ludovic Courtès wrote on 6 Dec 2018 17:28

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87k1kmipqk.fsf@gnu.org

Hi again,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

>> [ 215.199357] loop0: rw=524288, want=1903876, limit=1899264

> This looks much like a truncated ISO image. (For whatever reason.)

> There are at least 4612 blocks = ~ 9 MiB missing.

> In the original message of https://issues.guix.info/issue/33639 the

> minimum missing size is about 5 MiB.

Based on this and on a suggestion Ricardo made on IRC, I passed

“-padding 10m” and that solved the problem. \o/

I suppose you’ll have a scientific explanation, but I’m happy this

simple hack works (and indeed, the documentation of “-padding” suggests

that this kind of problem is not uncommon.)

Thanks to both of you!

Ludo’.

Thomas Schmitt wrote on 6 Dec 2018 17:59

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:13661682393159200289@scdbackup.webframe.org

Hi,

i see that probably the kernel log talks of blocks of 512 bytes.

So the minimum missing size shrinks to 2.3 and 1.4 MiB, respectively.
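
A rough sketch of that recalculation in Python, assuming that "want" and "limit" in the kernel messages count 512-byte sectors. It only measures the distance from "limit" to the highest "want" in each log, so the results are lower bounds in the same range as the figures above.

```python
SECTOR = 512          # assumed unit of "want" and "limit" in the kernel messages
MIB = 1024 * 1024

# (device, highest "want", "limit") as they appear in the two kernel logs
logs = [("loop0", 1903876, 1899264), ("sr0", 2118580, 2115840)]

def shortfall(want: int, limit: int) -> tuple[int, int]:
    """Return the distance from limit to want in sectors and in bytes."""
    sectors = want - limit
    return sectors, sectors * SECTOR

for dev, want, limit in logs:
    sectors, nbytes = shortfall(want, limit)
    print(f"{dev}: {sectors} sectors = {nbytes} bytes = {nbytes / MIB:.2f} MiB")
```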

I wrote:

> > Please consider local reasons for truncated ISO images.

Ludovic Courtès wrote:

> I’ve thought about this but that seems highly unlikely at this point.

It still looks like writing of the ISO image aborted prematurely.

Do you have the xorriso messages from the grub-mkrescue run ?

(If there are none, add the following three arguments to the grub-mkrescue

run:

-- -- -report_about update

The second "--" shall work around an intermediate version of grub-mkrescue

which ate the first "--" instead of forwarding it to xorriso.

)

Reasoning:

> libisoburn: WARNING : ISO image size 475636s larger than readable size 473456s

> File data lba: 0 , 475356 , 88 , 180224 , '/var/guix/db/db.sqlite'

When the ISO is assessed by libisoburn, its nominal block count is
192 blocks higher than the end of the last file. Insofar ok. But the
ISO image file is smaller than that.

After the warning, libisoburn corrects the displayed size to the readable
size. So the number in this subsequent message is rather insignificant:

> Media summary: 1 session, 473456 data blocks, 925m data, 45.6g free

(Good that you also showed the warning message above.)

The nominal count is recorded in the Primary Volume Descriptor, the

equivalent of a superblock. (Byte offset in the ISO file is 32768+80,

first as 4 byte little-endian, then again as 4 byte big-endian.)

The readable size is based on the byte size of the ISO file.
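
The comparison of nominal and readable size can also be made without xorriso. The following Python sketch reads the two copies of the block count at the offset given above and compares them with the file size; the function name and the error handling are assumptions of this example, not part of libisoburn.

```python
import os
import struct

BLOCK = 2048
PVD_OFFSET = 16 * BLOCK        # 32768: start of the Primary Volume Descriptor
SIZE_FIELD = PVD_OFFSET + 80   # 4 bytes little-endian, then 4 bytes big-endian

def nominal_and_readable(path: str) -> tuple[int, int]:
    """Return (block count recorded in the PVD, whole blocks present in the file)."""
    with open(path, "rb") as iso:
        iso.seek(SIZE_FIELD)
        raw = iso.read(8)
    if len(raw) != 8:
        raise ValueError("file too short to hold a volume descriptor")
    (le,) = struct.unpack("<I", raw[:4])
    (be,) = struct.unpack(">I", raw[4:])
    if le != be:
        raise ValueError("the two copies of the block count disagree")
    return le, os.path.getsize(path) // BLOCK
```

If the first number is larger than the second, the image file is shorter than the filesystem claims, which is the situation the libisoburn warning describes.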

At ISO production time, the nominal block count is determined by libisofs

in a first dry-run. In the subsequent real production run, libisofs sticks

to the determined file sizes of the first run, even if some file changed

size in between. It would truncate or pad the copied file bytes to the

planned size. Directory data are written as assessed in the first run.

So from normal operation of libisofs it is guaranteed that the written

amount of data is the same as the nominal amount.

-----------------------------------------------------------------------

Possible glitches would be that libisofs skips to write some scheduled

data blocks, or that libburn drops blocks which were submitted by libisofs.

Both scenarios do not give me an idea how the difference between 32 bit

and 64 bit systems could be involved.

The theory of intermediately missing data blocks could be verified or

refuted by checking the content of the last file which sits in the

readable area. If it bears the expected content, then no blocks were

skipped or dropped in between.

So please look in the file listing for the last file which begins before

block 473456 and does not step over that limit by adding its "Blocks"

count (exact hit on the limit is ok).

If the filesystem refuses to obtain it, then use

dd bs=2048 skip=$Startlba count=$Blocks

to cut it out from the ISO and then truncate it to the reported "Filesize".

In any case compare its content with the original.
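
The same cut-and-compare step, sketched in Python for convenience. It reads "Filesize" bytes directly at "Startlba", which should be equivalent to the dd run followed by truncation; the function names are invented for this example.

```python
BLOCK = 2048

def cut_file(iso_path: str, start_lba: int, filesize: int) -> bytes:
    """Read filesize bytes starting at block start_lba of the ISO image."""
    with open(iso_path, "rb") as iso:
        iso.seek(start_lba * BLOCK)
        data = iso.read(filesize)
    if len(data) != filesize:
        raise IOError("image ends before the end of the file content")
    return data

def same_content(iso_path: str, start_lba: int, filesize: int, original: str) -> bool:
    """Compare the content cut out of the ISO with the original file."""
    with open(original, "rb") as f:
        return f.read() == cut_file(iso_path, start_lba, filesize)
```

The values for start_lba and filesize would come from the "Startlba" and "Filesize" columns of the report_lba listing.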

If the contents match, then we have a flat premature end of file.

In this case there should be error messages from xorriso or its libraries.

(In the case of GNU xorriso, the libraries are statically compiled in from source.)

Have a nice day :)

Thomas

Thomas Schmitt wrote on 6 Dec 2018 18:29

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:12559682391379993357@scdbackup.webframe.org

Hi,

> Based on this and on a suggestion Ricardo made on IRC, I passed

> “-padding 10m” and that solved the problem. \o/

Ouchers. Do all files bear their expected content ?

Especially the last one: /var/guix/db/db.sqlite

If so, then something truncates the output stream of libisofs via libburn.

The only component that comes to my mind is the fifo between them.

The default fifo size is 4 MiB. Quite suspicious.

Try to reduce its size to the minimum by adding these grub-mkrescue

arguments:

-- -- -fs 64k -padding 64k

If the fifo is to blame, then a padding of 64k should suffice to protect

the valuable blocks from a premature end.

--------------------------------------------------------------------

A bit off topic:

> the documentation of “-padding” suggests

> that this kind of problem is not uncommon.

Its normal purpose is to work around a traditional Linux kernel bug:

CDs written with write type Track-At-Once bear two unreadable blocks at

the end. Most CD drives report these blocks as part of the data range.

When Linux shall read a single block for isofs, it reads a larger chunk.

The chunk is not large enough to reach over the nominal end of the data

range, but it can reach the unreadable end blocks by mistake.

In this case Linux does not only miss the end blocks but also valid

payload blocks which are part of the filesystem. This yields I/O error.

The developer of cdrecord and the kernel people mistake this problem

for a "fuzziness" of a CD end by at most 2 seconds of audio play time.

This is wrong from reading the specs and from making experiments.

However, cdrecord introduced the tradition to add 150 blocks of padding

which would be 2 seconds of sound.

As long as the read chunk of Linux is smaller than that, the padding

protects the operating system from touching the lead-out blocks of the

TAO track.

This cannot happen on hard disk or any optical media type other than CD.

If you write the CD by Session-At-Once it cannot happen. If you have one

of the rare CD drives which do not count the lead-out blocks to the

readable size of the CD, it cannot happen. (Currently 1 of my 7 drives

tells the truth.)

But who am i to stand against all others ?

So xorriso, too, adds 300 KiB of padding by default.
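
How the 150 blocks, the 2 seconds and the 300 KiB relate can be shown in a few lines of Python. The rate of 75 sectors per second of CD play time is not stated in this thread; it is the standard CD figure and is treated here as an assumption.

```python
SECTORS_PER_SECOND = 75   # assumption: standard CD rate, one sector per 1/75 s
BLOCK = 2048              # payload bytes per data block

padding_blocks = 2 * SECTORS_PER_SECOND          # two seconds of play time
padding_kib = padding_blocks * BLOCK // 1024     # the same amount as data payload

print(padding_blocks, padding_kib)   # 150 300
```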

Have a nice day :)

Thomas

Ludovic Courtès wrote on 7 Dec 2018 23:51

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87k1klar3e.fsf@gnu.org

Hello!

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

>> Based on this and on a suggestion Ricardo made on IRC, I passed

>> “-padding 10m” and that solved the problem. \o/

> Ouchers. Do all files bear their expected content ?

> Especially the last one: /var/guix/db/db.sqlite

It looks good, and there are no I/O errors left (I mounted it and ran

“tar” over it.)

Note that the image is now available here:

https://alpha.gnu.org/gnu/guix/guixsd-install-0.16.0.i686-linux.iso.xz

(I haven’t tried smaller padding.)

> If so, then something truncates the output stream of libisofs via libburn.
> The only component that comes to my mind is the fifo between them.
> The default fifo size is 4 MiB. Quite suspicious.
>
> Try to reduce its size to the minimum by adding these grub-mkrescue
> arguments:
>
>   -- -- -fs 64k -padding 64k
>
> If the fifo is to blame, then a padding of 64k should suffice to protect
> the valuable blocks from a premature end.

OK, I’ll try to test this, but note that I’ll be largely unavailable for

a week.

>> the documentation of “-padding” suggests
>> that this kind of problem is not uncommon.
>
> Its normal purpose is to work around a traditional Linux kernel bug:
>
> CDs written with write type Track-At-Once bear two unreadable blocks at
> the end. Most CD drives report these blocks as part of the data range.
> When Linux shall read a single block for isofs, it reads a larger chunk.
> The chunk is not large enough to reach over the nominal end of the data
> range, but it can reach the unreadable end blocks by mistake.
> In this case Linux does not only miss the end blocks but also valid
> payload blocks which are part of the filesystem. This yields I/O error.
>
> The developer of cdrecord and the kernel people mistake this problem
> for a "fuzziness" of a CD end by at most 2 seconds of audio play time.
> This is wrong from reading the specs and from making experiments.
> However, cdrecord introduced the tradition to add 150 blocks of padding
> which would be 2 seconds of sound.
> As long as the read chunk of Linux is smaller than that, the padding
> protects the operating system from touching the lead-out blocks of the
> TAO track.
>
> This cannot happen on hard disk or any optical media type other than CD.
> If you write the CD by Session-At-Once it cannot happen. If you have one
> of the rare CD drives which do not count the lead-out blocks to the
> readable size of the CD, it cannot happen. (Currently 1 of my 7 drives
> tells the truth.)
>
> But who am i to stand against all others ?
> So xorriso, too, adds 300 KiB of padding by default.

I see, thanks for explaining!

Ludo’.

Thomas Schmitt wrote on 8 Dec 2018 13:42

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:14249682530673393275@scdbackup.webframe.org

Hi,

> https://alpha.gnu.org/gnu/guix/guixsd-install-0.16.0.i686-linux.iso.xz

> (I haven’t tried smaller padding.)

I downloaded it and get on "xorriso -indev":

libisoburn: WARNING : ISO image size 481129s larger than readable size 479184s

So the lack of 2k blocks is 1945 = nearly 4 MiB.

This is suspiciously near to the default fifo size.
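
To put the two size warnings seen so far next to the fifo size, here is a minimal Python sketch. The 4 MiB fifo size is the default mentioned earlier in this thread; the labels and the helper name are only illustrative.

```python
BLOCK = 2048
FIFO_BYTES = 4 * 1024 * 1024   # default fifo size as mentioned in this thread

# (nominal size, readable size) in 2 KiB blocks, from the libisoburn warnings
images = {
    "corrupt image inspected on 6 Dec": (475636, 473456),
    "0.16.0 image from alpha.gnu.org": (481129, 479184),
}

def missing(nominal: int, readable: int) -> tuple[int, float]:
    """Return the missing blocks and their size as a fraction of the fifo size."""
    blocks = nominal - readable
    return blocks, blocks * BLOCK / FIFO_BYTES

for name, (nominal, readable) in images.items():
    blocks, ratio = missing(nominal, readable)
    print(f"{name}: {blocks} blocks missing, {ratio:.2f} x fifo size")
```

Both shortfalls land close to one fifo size, which is what makes the fifo a plausible suspect but does not prove it.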

The content of cleartext files near the payload end looks plausible:

/System/Library/CoreServices/.disk_label

/System/Library/CoreServices/SystemVersion.plist

Whether the last file's content is as expected can only be told by

its reader program, i guess:

/var/guix/db/db.sqlite

So for now it indeed looks like plain truncation and not like a hiccup

somewhere in the middle of ISO writing.

Several distros use xorriso to build their 32 bit ISOs. No complaints.

So i asked on debian-cd and debian-live mailing lists whether the ISOs

for 32-bit systems are indeed made on 32-bit systems. The answer is

"All our images have been made on amd64 for years now."

So i need a 32-bit GNU/Linux VM for regression tests.

Being an untalented sysadmin, this can last a while. (First searching

for old cheat sheets and then stepping into any possible puddle ...)

I would still appreciate a test with a minimally sized fifo. Its outcome would

be a strong indication whether the Guix problem is related to the fifo

at all. The result can be checked by executing

xorriso -indev ...path.to.iso...

and looking for message

libisoburn: WARNING : ISO image size ...s larger than readable size ...s

If the difference is in the range of only 32s, then the fifo stays

the main suspect.

Also, the xorriso messages of a run with grub-mkrescue add-on arguments

-- -- -report_about all

would be very welcome.

--------------------------------------------------------------------------

(Be invited to stop reading here. Only code musings follow.)

I reviewed the fifo code in libisofs and found no obvious opportunity for

a bug that would drop the final fifo content rather than offering it to

libburn:

https://dev.lovelyhq.com/libburnia/libisofs/raw/master/libisofs/buffer.c

(iso_ring_buffer_read() is exposed to libburn via libisofs/ecma119.c

function bs_read() which serves as struct burn_source member (*read)()

as defined in libburn/libburn.h.)

The condition for end of reading is a combination of

- no data are available in the ring buffer

- the writer has set the flag for having ended its work

  while (buf->size == 0) {
      ...
      if (buf->wend) {

The member buf->size is of type size_t. I.e. good for at least 4 GiB - 1

before it rolls over. Neither the fifo size nor the transaction size come

near to that number.

buf->wend is unsigned int :2 with defined values

0 not finished, 1 finished ok, 2 finish with error

Have a nice day :)

Thomas

Thomas Schmitt wrote on 15 Dec 2018 19:40

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:23902682647998386729@scdbackup.webframe.org

Hi,

just to report that i did not forget this problem:

I now have a qemu-system-i386 VM with Debian GNU/Linux from

debian-9.6.0-i386-netinst.iso without desktop environment and reachable

via SSH. Very minimal. (I only did "apt-get install build-essential" to

feel not lonely without C compiler and friends.)

Then i followed the instructions of

https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html

with

https://alpha.gnu.org/gnu/guix/guix-binary-0.16.0.i686-linux.tar.xz

https://alpha.gnu.org/gnu/guix/guix-binary-0.16.0.i686-linux.tar.xz.sig

up to step 7 ("guix archive --authorize ...").

Then i made the mistake to do the proposed

guix package -i hello

It downloads and builds and blows away the free space on the virtual 8 GB

disk ... /gnu is growing steadily and /tmp breathes between 50 MB and 2 GB.

I abort this after 100 minutes before the virtual disk gets too full and

my CPU melts.

"guix pull" happily begins to build that gcc-5.5.0 which is too much for my

feeble VM.

Back to step 0 ("rm -r /gnu /var/guix") and again to step 7.

(A small fight starts between me and systemd, to get guix-daemon running.

"start" did not help. It had to be "restart".)

Then

# guix system disk-image --file-system-type=iso9660 \
>   -s i686-linux \
>   ~/.config/guix/current/share/guile/site/2.2/gnu/system/install.scm

and the activities to build the world start again. Extra verbose.

This time i abort after 30 minutes.

Everything i do ends up in enormous production of gcc-5.5.0 related

software.

-------------------------------------------------------------------------

So for xorriso and a 32-bit system:

# apt-get install xorriso

...

# xorriso -version

xorriso 1.4.6 : RockRidge filesystem manipulator, libburnia project.

...

I try what happens if i pack up the /gnu tree:

# xorriso -as mkisofs -o /tmp/test.iso -J /gnu

...

ISO image produced: 643046 sectors

Written to medium : 643046 sectors at LBA 0

Writing to 'stdio:/tmp/test.iso' completed successfully.

Inspection shows that the size ideas of xorriso match the image file size:

# xorriso -indev /tmp/test.iso

... no warning about size mismatch ...

Media summary: 1 session, 643046 data blocks, 1256m data, 3234m free

# ls -l /tmp/test.iso

-rw-r--r-- 1 root root 1316958208 Dec 15 19:17 /tmp/test.iso

# expr 1316958208 / 2048

643046
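
The manual check above (xorriso's sector count against the file size divided by 2048) could be wrapped up roughly as follows. This is only a sketch for a POSIX shell; the function name iso_size_matches is invented for illustration and is not part of xorriso or Guix.

```sh
#!/bin/sh
# Compare the sector count reported by xorriso with the size of the image file.
# ISO 9660 sectors are 2048 bytes. (Hypothetical helper, not part of any tool.)
iso_size_matches() {
    sectors=$1                      # N from "ISO image produced: N sectors"
    image=$2                        # path of the image file
    bytes=$(wc -c < "$image")       # file size in bytes
    bytes=$((bytes + 0))            # normalize whitespace that some wc variants print
    if [ $((bytes % 2048)) -eq 0 ] && [ $((bytes / 2048)) -eq "$sectors" ]; then
        echo "ok: $sectors sectors"
    else
        echo "mismatch: expected $sectors sectors, file has $bytes bytes"
        return 1
    fi
}
```

With the run above this would be called as: iso_size_matches 643046 /tmp/test.iso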

Now with GNU xorriso 1.5.0:

$ wget https://www.gnu.org/software/xorriso/xorriso-1.5.0.tar.gz

...

$ tar xzf xorriso-1.5.0.tar.gz

$ cd xorriso-1.5.0

$ ./configure && make

...

$ xorriso/xorriso -version

GNU xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.

...

# rm /tmp/test.iso

# xorriso/xorriso -as mkisofs -o /tmp/test.iso -J /gnu

GNU xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.

...

ISO image produced: 643046 sectors

Written to medium : 643046 sectors at LBA 0

...

Inspection yields the same result. No truncation.

-------------------------------------------------------------------------

If i shall try again with "guix system disk-image", then i need more

guidance. E.g. about the required disk size and ways to curb the build

effort.

Have a nice day :)

Thomas

Thomas Schmitt wrote on 15 Dec 2018 20:24

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:16569682655711134021@scdbackup.webframe.org

Hi,

it comes to me that i can get nearer to the Guix ISO production:

# apt-get install grub-pc grub-efi-amd64-bin grub-efi-ia32-bin mtools

...

# grub-mkrescue -o /tmp/test.iso /gnu

xorriso 1.4.6 : RockRidge filesystem manipulator, libburnia project.

...

ISO image produced: 652920 sectors

Written to medium : 652920 sectors at LBA 0

# ls -l /tmp/test.iso

-rw-r--r-- 1 root root 1337180160 Dec 15 20:09 /tmp/test.iso

# expr 1337180160 / 2048

652920

# xorriso -indev /tmp/test.iso

... no complaints ...

And with GNU xorriso 1.5.0 :

# rm /tmp/test.iso

# grub-mkrescue --xorriso=/home/thomas/xorriso-1.5.0/xorriso/xorriso \

> -o /tmp/test.iso /gnu

GNU xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.

...

ISO image produced: 652920 sectors

Written to medium : 652920 sectors at LBA 0

# ls -l /tmp/test.iso

-rw-r--r-- 1 root root 1337180160 Dec 15 20:15 /tmp/test.iso

# xorriso -indev /tmp/test.iso

... no complaints ...

All looks well.

Have a nice day :)

Thomas

Ludovic Courtès wrote on 16 Dec 2018 16:52

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87ftuxtqn9.fsf@gnu.org

Hi Thomas,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

> I have now a qemu-system-i386 VM with Debian GNU/Linux from

> debian-9.6.0-i386-netinst.iso without desktop environment and reachable

> via SSH. Very minimal. (I only did "apt-get install build-essential" to

> feel not lonely without C compiler and friends.)

If you’re testing in a VM you might just as well download the GuixSD VM

image from https://www.gnu.org/software/guix/download/. It should be

simpler than installing Debian and then installing Guix on top of

Debian.

> Then i followed the instructions of
>   https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html
> with
>   https://alpha.gnu.org/gnu/guix/guix-binary-0.16.0.i686-linux.tar.xz
>   https://alpha.gnu.org/gnu/guix/guix-binary-0.16.0.i686-linux.tar.xz.sig
> up to step 7 ("guix archive --authorize ...").
>
> Then i made the mistake to do the proposed
>
>   guix package -i hello
>
> It downloads and builds and blows away the free space on the virtual 8 GB
> disk ... /gnu is growing steadily and /tmp breathes between 50 MB and 2 GB.
> I abort this after 100 minutes before the virtual disk gets too full and
> my CPU melts.

Did you actually run “guix archive --authorize < …/ci.guix.info.pub”?

https://www.gnu.org/software/guix/manual/en/html_node/Substitute-Server-Authorization.html

If you didn’t, then you are not getting pre-built binaries and thus you

end up building the world.

HTH,

Ludo’.

Thomas Schmitt wrote on 16 Dec 2018 17:52

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:2198768286307958861@scdbackup.webframe.org

Hi,

Ludovic Courtès wrote:

> If you’re testing in a VM you might just as well download the GuixSD VM

> image from <https://www.gnu.org/software/guix/download/>.

There i see only "x86_64" for QEMU, not "i686" like with ISO or Binary.

> Did you actually run “guix archive --authorize < …/ci.guix.info.pub”?

I did step 7 of Binary-Installation.html:

guix archive --authorize < \

~root/.config/guix/current/share/guix/hydra.gnu.org.pub

The text "ci.guix.info.pub" does not appear in

https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html

Looking at the existing state:

# ls -l ~root/.config/guix/current/share/guix/

total 12

-r--r--r-- 1 root root 118 Jan 1 1970 berlin.guixsd.org.pub

-r--r--r-- 1 root root 118 Jan 1 1970 ci.guix.info.pub

-r--r--r-- 1 root root 1083 Jan 1 1970 hydra.gnu.org.pub

Shall i authorize the others too ?

If so: Is there need for clean-up actions after the aborted build runs ?

(If you find a bit of time, please run grub-mkrescue with some arbitrary

input tree of about the size of the Guix ISO and check whether it gets

truncated. If so, the messages from xorriso would be very interesting.)

Have a nice day :)

Thomas

Ludovic Courtès wrote on 18 Dec 2018 12:16

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:877eg7hypx.fsf@gnu.org

Hi Thomas,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

> Ludovic Courtès wrote:

>> If you’re testing in a VM you might just as well download the GuixSD VM

>> image from <https://www.gnu.org/software/guix/download/>.

> There i see only "x86_64" for QEMU, not "i686" like with ISO or Binary.

You’re right, my bad.

>> Did you actually run “guix archive --authorize < …/ci.guix.info.pub”?
>
> I did step 7 of Binary-Installation.html:
>
>    guix archive --authorize < \
>      ~root/.config/guix/current/share/guix/hydra.gnu.org.pub
>
> The text "ci.guix.info.pub" does not appear in
>   https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html

Oops, that was an omission that I’ve just fixed.

So yes, please authorize “ci.guix.info.pub” since https://ci.guix.info

is now the default substitute server.

HTH!

Ludo’.

Thomas Schmitt wrote on 18 Dec 2018 22:45

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:10322683426128283104@scdbackup.webframe.org

Hi,

> Oops, that was an omission that I’ve just fixed.

Sometimes you need a clueless test user to clean the pipes.

I now succeeded in running the ISO production command, but the truncation

problem is not reproducible here.

Please re-consider local reasons ... yada yada ... my main suspect would

be the immediate end of VM after the xorriso run. Maybe some buffers don't

get flushed down to the real disk ?

------------------------------------------------------------------------

What i did in detail:

I removed /gnu and /var/guix to get to a halfways clean state for

repeating steps 2 and 7 of

https://www.gnu.org/software/guix/manual/en/html_node/Binary-Installation.html

I.e. i unpacked the tarball, moved the trees to /gnu and /var/guix,

and authorized ci.guix.info.pub.

Then i did step 8

guix package -i glibc-locales

This lasted 12 minutes (mainly with building 7 packages).

Now the proposed command to "confirm that Guix is working":

guix package -i hello

lasted only about 30 seconds.

Scrolling back in my mailbox to

Date: Thu, 06 Dec 2018 16:34:02 +0100

Message-ID: <87va46is9h.fsf@gnu.org>

> Then you would need to run “guix pull” to get a current Guix (0.15.0

> itself didn’t have this bug.)

Do i still need this ? My tarball was already "0.16.0":

guix-binary-0.16.0.i686-linux.tar.xz

I bet on omitting this step and go on with:

> guix system disk-image --file-system-type=iso9660 \

> -s i686-linux \

> ~/.config/guix/current/share/guile/site/2.2/gnu/system/install.scm

After 5 minutes i see boot messages of a Linux kernel.

Oh. Qemu running on qemu. (The local power plant shifts one gear up.)

12 minutes elapsed and xorriso has started. Sloowly adding files:

registering 302 items

GNU xorriso 1.5.0 : RockRidge filesystem manipulator, libburnia project.

...

45981 files added in 94 seconds

...

xorriso : UPDATE : Thank you for being patient. Working since 265 seconds.

ISO image produced: 500069 sectors

Written to medium : 500069 sectors at LBA 0

Writing to 'stdio:/xchg/guixsd.iso' completed successfully.

So far the xorriso run looks ok.

...

/gnu/store/a8wwjfihb161maww0c8x4r797prdn8rr-image.iso

So this is where the ISO ended up.

# ls -l /gnu/store/a8wwjfihb161maww0c8x4r797prdn8rr-image.iso

-r--r--r-- 2 root root 1024141312 Jan 1 1970 /gnu/store/a8wwjfihb161maww0c8x4r797prdn8rr-image.iso

# expr 1024141312 / 2048

500069

# xorriso -indev /gnu/store/a8wwjfihb161maww0c8x4r797prdn8rr-image.iso

... no complaints about size mismatch ...

Media summary: 1 session, 500069 data blocks, 977m data, 3052m free

Well, then with

guix pull

and then again

guix system disk-image ...

lasts 30 minutes,

# time guix system disk-image --file-system-type=iso9660 \

-s i686-linux \

~/.config/guix/current/share/guile/site/2.2/gnu/system/install.scm

...

GUILEC gnu/packages/emacs.go

GC Warning: Failed to expand heap by 8388608 bytes

...

GC Warning: Out of Memory! Heap size: 943 MiB. Returning NULL!

...

guix system: error: build failed: build of `/gnu/store/vr5mhnh430qabrrc1a82pv954b89axws-guix-0.16.0-4.60b0402.drv' failed

real 21m55.875s

user 0m5.816s

sys 0m1.384s

Looks like my VM needs more memory for that stunt.

So again with 2 GiB.

... it seems that "guix pull" brought back the addiction to world building.

I abort after 50 minutes while it is doing some qemu tests.

------------------------------------------------------------------------

Have a nice day :)

Thomas

Ludovic Courtès wrote on 19 Dec 2018 15:05

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87tvj9wr1v.fsf@gnu.org

Hello,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

>> Oops, that was an omission that I’ve just fixed.

> Sometimes you need a clueless test user to clean the pipes.

> I now succeeded in running the ISO production command, but the truncation

> problem is not reproducible here.

It’s not reproducible because I “fixed” it:

https://git.savannah.gnu.org/cgit/guix.git/commit/?id=178be030c0e4fdeac5e1c968b5c99d84bb4691db

You should be able to reproduce it by running Guix from the parent

commit:

guix pull --commit=676c3adc14f63df0f7a549e518ac87481c0f3e37

‘guix pull’ populates ~/.config/guix/current/bin/guix so you’ll have to

make sure this is the one you’re running when you try to reproduce the

issue.

Thanks for your help and perseverance!

Ludo’.

Thomas Schmitt wrote on 19 Dec 2018 15:51

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:17182683634737195681@scdbackup.webframe.org

Hi,

> It’s not reproducible because I “fixed” it:

> https://git.savannah.gnu.org/cgit/guix.git/commit/?id=178be030c0e4fdeac5e1c968b5c99d84bb4691db

(This adds "-padding 10m" to the run of xorriso.)

No. The padding only moves the missing end piece into a region of the

image file where it does not matter for the filesystem payload files.

The ISO filesystem's meta data and the partition tables would still claim

the missing bytes of the image file, if the problem occurred.

E.g. xorriso notices the mismatch in the ISO to which you pointed

me for download and which was most probably produced with -padding 10m:

$ xorriso -indev guixsd-install-0.16.0.i686-linux.iso

...

libisoburn: WARNING : ISO image size 481129s larger than readable size 479184s

...

libburn : SORRY : Read start address 481128s larger than number of readable blocks 479184

...

The GPT in the ISO says that its backup header is at 512-byte block

1,924,515 = block 481,128.75 in units of 2048 bytes.

Highest file block is 475879 + 87 = 475966

File data lba: 0 , 475879 , 88 , 180224 , '/var/guix/db/db.sqlite'

which is a bit more than 10 MiB before the expected image file end.

Given the lack of 1945 blocks at the image file end, the payload file end

is still more than 6 MB away from the escarpment.
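
The block arithmetic above can be replayed in a shell. A small sketch, with all numbers taken from the xorriso messages quoted in this mail and all quantities in 2048-byte blocks; the function name padding_margin is only an illustration.

```sh
#!/bin/sh
# How far is the end of the last payload file from the truncated image end?
# (Hypothetical helper; arguments are counts of 2048-byte blocks.)
padding_margin() {
    claimed=$1       # image size according to the ISO metadata
    readable=$2      # blocks really present in the image file
    file_start=$3    # start LBA of the last payload file
    file_blocks=$4   # number of blocks of that file
    file_end=$((file_start + file_blocks))   # first block after the file
    missing=$((claimed - readable))          # blocks lost at the image end
    margin=$((readable - file_end))          # intact blocks after the file
    echo "missing=$missing margin_blocks=$margin margin_bytes=$((margin * 2048))"
}

padding_margin 481129 479184 475879 88
# prints: missing=1945 margin_blocks=3217 margin_bytes=6588416
```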

-----------------------------------------------------------------------

But the ISO which i produced myself is healthy in that aspect.

The used software version is obviously before the 10 MiB padding.

The ISO contains as many bytes

-r--r--r-- 2 root root 1024141312 Jan 1 1970 before_guix_pull.iso

as the ISO filesystem believes to cover, including the padding:

Media summary: 1 session, 500069 data blocks, 977m data, 2187m free

Highest data file block is 499788 + 87 = 499875 :

File data lba: 0 , 499788 , 88 , 180224 , '/var/guix/db/db.sqlite'

which means that at most 194 blocks are expected to follow the end of

this file, not 10 MiB.

The GPT in the image says that its backup header block is at 512-byte

address 2,000,275 which is 500,068.75 in blocks of 2048 bytes.

So the inner size counters and image file size do match exactly.

This was done with guix from

guix-binary-0.16.0.i686-linux.tar.xz

and with authorized ci.guix.info.pub.

> guix pull --commit=676c3adc14f63df0f7a549e518ac87481c0f3e37

After "guix pull" the ISO production command indulges in building and

testing endlessly.

You will have to give me instructions how to get back to the ~ 12 minutes

of ISO production time which i had before trying "guix pull".

Have a nice day :)

Thomas

Thomas Schmitt wrote on 20 Dec 2018 14:38

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:30813683585630400731@scdbackup.webframe.org

Hi,

aside from my problems with the building and testing after "guix pull"

i also stand puzzled in front of the 8 files named "/gnu/.../build/vm.scm"

which all start grub-mkrescue.

If i'd succeed in reproducing the ISO image file truncation:

Which vm.scm file would i have to modify in order to report the size of

the freshly emerged ISO image in the filesystem of the upper VM ?

(I would suspect that this size is still untruncated and that the file

in the underlying VM's filesystem is then truncated.)

And how to say "ls -l $target" in Guile ?

Have a nice day :)

Thomas

Ludovic Courtès wrote on 21 Dec 2018 21:44

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87pntumwy8.fsf@gnu.org

Hi,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

> aside from my problems with the building and testing after "guix pull"

> i also stand puzzled in front of the 8 files named "/gnu/.../build/vm.scm"

> which all start grub-mkrescue.

> If i'd succeed in reproducing the ISO image file truncation:

> Which vm.scm file would i have to modify in order to report the size of

> the freshly emerged ISO image in the filesystem of the upper VM ?

None of those under /gnu/store. /gnu/store is explicitly read-only.

The actual source code you’d edit is a checkout of Guix. See

https://www.gnu.org/software/guix/manual/en/html_node/Building-from-Git.html.

> And how to say "ls -l $target" in Guile ?

In Scheme? You could use ‘scandir’:

scheme@(guile-user)> ,use (ice-9 ftw)

scheme@(guile-user)> (scandir "/")

$2 = ("." ".." "bin" "boot" "data" "dev" "etc" "gnu" "home" "lost+found" "mnt" "proc" "root" "run" "sys" "tmp" "var")

and also ‘lstat’, etc., but that’s not quite a “shell”.

HTH,

Ludo’.

Thomas Schmitt wrote on 21 Dec 2018 22:42

Recipients:(address . bug-xorriso@gnu.org)(address . 33639@debbugs.gnu.org)

Message-ID:25824683177226565276@scdbackup.webframe.org

Hi,

> ‘lstat’

Probably this.

> but that’s not quite a “shell”.

If i could reproduce the problem then i would want a long time visible

message about how large the ISO image file is after grub-mkrescue has

ended successfully.

This would give an opportunity to compare the size as produced in the VM

with the size later perceived on the host machine (which is a VM, too,

in my case).

If the sizes differ, then the VM contraption is to blame.

If the size is too small already in the VM that ran grub-mkrescue, then

xorriso or the VM operating system are to blame.
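
The decision rule above could be written down as a small shell function. This is a sketch only: the name diagnose_iso_size and the message texts are invented here, and the three sizes would have to be collected by hand from the xorriso output, from inside the VM, and from the host.

```sh
#!/bin/sh
# Decide where a truncation happened, following the reasoning above.
# (Hypothetical helper, not part of Guix or xorriso.)
diagnose_iso_size() {
    sectors=$1       # N from "ISO image produced: N sectors"
    guest_bytes=$2   # image size seen in the VM after grub-mkrescue ended
    host_bytes=$3    # image size seen later on the host
    expected=$((sectors * 2048))
    if [ "$guest_bytes" -ne "$expected" ]; then
        echo "short in guest: suspect xorriso or the guest operating system"
    elif [ "$host_bytes" -ne "$guest_bytes" ]; then
        echo "guest ok, host differs by $((guest_bytes - host_bytes)) bytes: suspect the VM-to-host transfer"
    else
        echo "sizes agree: no truncation"
    fi
}
```

For the healthy image from 18 Dec this would be: diagnose_iso_size 500069 1024141312 1024141312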

Since i am not yet able to reproduce the problem, i propose that you add

the necessary code to the end of make-iso9660-image in gnu/build/vm.scm.

Such a report message cannot harm, given the existing verbosity of the

ISO build command.

Next time you make an ISO, retrieve the last size messages of xorriso:

ISO image produced: 500069 sectors

Written to medium : 500069 sectors at LBA 0

the new message about the ISO image file size in bytes, and the size of

the ISO image file when it is finally ready for exposure in the web.

(I have to stress that the problem is not fixed but only got a band aid

of which it is not known whether its size will always be large enough.)

Have a nice day :)

Thomas

pelzflorian (Florian Pelz) wrote on 7 Apr 2019 22:18

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:20190407201849.74qtwvazknbsaklg@pelzflorian.localdomain

I have what may be the same problem on my x86_64 machine building for

x86_64 when creating an ISO install image by running

guix system disk-image --file-system-type=iso9660 gnu/system/install.scm

Since commit 45c0d1d790f01ebc020fc4b2787a6abcdaa3f383 increased the

RAM for the VM that builds the iso image from 256 to 512, iso files

consistently were corrupt, until I added an lstat call, see below. On

a second and third attempt to build with lstat I got a corrupt image

again. Guix install iso files I tested from before that commit were

fine.

florian@florianmacbook ~$ fdisk /gnu/store/4nrwajlpab4s8pdph4d77ww7716sa3ir-image.iso

[…]

GPT PMBR size mismatch (3231107 != 3200391) will be corrected by write.

xorriso is sorry exactly like in Ludo’s message from December 06. The

numbers reported and file sizes are not consistent between corrupt

rebuilds.

On Fri, Dec 21, 2018 at 10:42:14PM +0100, Thomas Schmitt wrote:

> […]

> Next time you make an ISO, retrieve the last size messages of xorriso:

> ISO image produced: 500069 sectors

> Written to medium : 500069 sectors at LBA 0

For the corrupt iso with lstat call:

ISO image produced: 807777 sectors

Written to medium : 807777 sectors at LBA 0

> the new message about the ISO image file size in bytes,

Within the VM lstat consistently reports 1654327296 for non-corrupt

and corrupt images alike.

> and the size of

> the ISO image file size when it is finally ready for exposure in the web.

ls -l on the result reports 1638600704.

On the non-corrupt image after adding the lstat call, both lstat

within the VM and ls -l outside the VM print the same size: 1654327296

in this case, i.e. the same as lstat reported on the corrupt images

within the VM.

(To be precise, for lstat I added the following local git commit to my

copy of the Guix repo at the end of the G-expression executed by the

VM:

diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index db9b1707d7..18ccb8970e 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -309,7 +309,8 @@ INPUTS is a list of inputs (as for packages)."
                              #:closures graphs
                              #:volume-id #$file-system-label
                              #:volume-uuid #$(and=> file-system-uuid
-                                                    uuid-bytevector))))))
+                                                    uuid-bytevector))
+          (error (lstat "/xchg/guixsd.iso"))))))
    #:system system
 
    ;; Keep a local file system for /tmp so that we can populate it directly as

and then reconfigured the system after customizing the guix package to

use said commit and disabling tests on the guix package. This

reported an lstat Scheme object as an error. Note that the error

procedure does not cause a failed build.)

Regards,

Florian

Thomas Schmitt wrote on 7 Apr 2019 23:35

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:2660367208964033194@scdbackup.webframe.org

Hi,

Florian Pelz wrote:

> fdisk /gnu/store/4nrwajlpab4s8pdph4d77ww7716sa3ir-image.iso

> [...]

> GPT PMBR size mismatch (3231107 != 3200391) will be corrected by write.

> For the corrupt iso with lstat call:

> and corrupt images alike.

The GPT Protective MBR counts with block size 512 up to the GPT backup
header block, not counting itself at block 0. So in blocks of 2048, the
expected size is
  3231108 / 4 = 807777 ISO 9660 blocks
But the perceived size is
  3200392 / 4 = 800098 ISO 9660 blocks
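
The same conversion as a shell one-liner, in case more fdisk complaints need translating. A sketch under the rule stated above (the PMBR counts 512-byte blocks and does not count itself); the function name pmbr_to_iso_blocks is made up for illustration.

```sh
#!/bin/sh
# Convert a size from fdisk's "GPT PMBR size mismatch (A != B)" message
# into 2048-byte ISO 9660 blocks. (Hypothetical helper.)
pmbr_to_iso_blocks() {
    last_lba=$1                   # number printed by fdisk, in 512-byte blocks
    total_512=$((last_lba + 1))   # add block 0, which the PMBR does not count
    echo $((total_512 / 4))       # four 512-byte blocks make one ISO block
}

pmbr_to_iso_blocks 3231107   # expected size, prints 807777
pmbr_to_iso_blocks 3200391   # perceived size, prints 800098
```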

I wrote:

> > retrieve the last size messages of xorriso:

> For the corrupt iso with lstat call:

> ISO image produced: 807777 sectors

> Written to medium : 807777 sectors at LBA 0

> Within the VM lstat consistently reports 1654327296 for non-corrupt

> and corrupt images alike.

1654327296 / 2048 = 807777

So from the view of the VM the ISO is as large as xorriso believes to have

written and as the GPT announces as position of the backup header block.

> > and the size of

> > the ISO image file size when it is finally ready for exposure in the web.

> ls -l on the result reports 1638600704.

1638600704 / 2048 = 800098

This matches the perceived size from the fdisk complaint.

> On the non-corrupt image after adding the lstat call, both lstat

> within the VM and ls -l outside the VM print the same size: 1654327296

The fact that the VM always sees the expected size but the host sees varying

sizes supports the suspicion that at the end of the VM its i/o buffers or

virtual disk are not always properly flushed to the i/o system of the host.

The varying success smells like a race condition.

Have a nice day :)

Thomas

Ludovic Courtès wrote on 8 Apr 2019 10:50

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87h8b8284q.fsf@gnu.org

Hello,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

> The fact that the VM always sees the expected size but the host sees varying

> sizes supports the suspicion that at the end of the VM its i/o buffers or

> virtual disk are not always properly flushed to the i/o system of the host.

> The varying success smells like a race condition.

Indeed, that rings a bell: I fixed a similar issue in commit

0dc7d298a33f83d5f02a962b5f1bd24ee0e8ef07.

Florian: could you check whether the patch below solves the problem for

you?

Thanks,

Ludo’.

diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index db9b1707d7..3ee03c84a0 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -240,7 +240,11 @@ made available under the /xchg CIFS share."
                                   #:target-arm32? #$(target-arm32?)
                                   #:disk-image-format #$disk-image-format
                                   #:disk-image-size size
-                                  #:references-graphs graphs))))))
+                                  #:references-graphs graphs)
+
+                ;; Make sure I/O buffers get flushed.  This is particularly
+                ;; important when MAKE-DISK-IMAGE? is true.
+                (sync))))))
 
     (gexp->derivation name builder
                       ;; TODO: Require the "kvm" feature.
@@ -530,10 +534,7 @@ should set REGISTER-CLOSURES? to #f."
                  #$os
                  #:compressor '(#+(file-append gzip "/bin/gzip") "-9n")
                  #:creation-time (make-time time-utc 0 1)
-                 #:transformations `((,root-directory -> "")))
-
-                ;; Make sure the tarball is fully written before rebooting.
-                (sync))))))
+                 #:transformations `((,root-directory -> ""))))))))
     (expression->derivation-in-linux-vm
      name build
      #:make-disk-image? #f

pelzflorian (Florian Pelz) wrote on 10 Apr 2019 00:13

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20190409221313.b3uzvcj5bluoygp5@pelzflorian.localdomain

On Mon, Apr 08, 2019 at 10:50:29AM +0200, Ludovic Courtès wrote:

> Hello,
> 
> "Thomas Schmitt" <scdbackup@gmx.net> skribis:
> 
> > The fact that the VM always sees the expected size but the host sees varying
> > sizes supports the suspicion that at the end of the VM its i/o buffers or
> > virtual disk are not always properly flushed to the i/o system of the host.
> > The varying success smells like a race condition.
> 
> Indeed, that rings a bell: I fixed a similar issue in commit
> 0dc7d298a33f83d5f02a962b5f1bd24ee0e8ef07.
> 
> Florian: could you check whether the patch below solves the problem for
> you?
> 
> Thanks,
> Ludo’.
> 

No, sadly not. I reconfigured to a commit with the Guix package

changed to use your patch and I again got this:

GPT PMBR size mismatch (3231103 != 3187775) will be corrected by write.

libburn : SORRY : Read start address 807775s larger than number of readable blocks 796944

Thomas Schmitt wrote on 10 Apr 2019 13:17

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:16217671677318139528@scdbackup.webframe.org

Hi,

Ludovic Courtès wrote:

> > Florian: could you check whether the patch below solves the problem for

> > you?

Florian Pelz wrote:

> No, sadly not.

Given the smell of a race condition, i would next try to let the VM

wait 10 or 15 seconds after xorriso is finished and before it shuts down.

Not as a final remedy but just as proof that the VM end is really the

culprit. (It could also be an i/o problem between VM and host which

is unrelated to the VM end.)

Have a nice day :)

Thomas

pelzflorian (Florian Pelz) wrote on 10 Apr 2019 23:23

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:20190410212310.iv2t72rblhupcmkt@pelzflorian.localdomain

On Wed, Apr 10, 2019 at 01:17:14PM +0200, Thomas Schmitt wrote:

> Given the smell of a race condition, i would next try to let the VM

> wait 10 or 15 seconds after xorriso is finished and before it shuts down.

I added a (sleep 15) after ludo’s (sync). The first image worked but

now I got

libburn : SORRY : Read start address 807777s larger than number of readable blocks 798640

again.

Ludovic Courtès wrote on 12 Apr 2019 23:26

Recipients:

Message-ID:87o95alxtn.fsf@gnu.org

Hello Florian & Thomas,

I was able to reproduce the issue: ‘guix system disk-image

--file-system-type=iso9660’ would create partly unreadable images.

Since this was pretty much like the issue I had encountered with ‘guix

system docker-image’, which would produce truncated tarballs, and since

calling ‘sync’ wasn’t enough, I looked at our file system mount options…

The attached patch fixes the problem for me. In hindsight, it’s not

surprising that “cache=loose” on the /xchg mount point (used to exchange

data between the host and the guest) would have this effect.

Florian, it would be great if you could confirm. Just apply it on

‘master’, and then run:

./pre-inst-env guix system disk-image --file-system-type=iso9660 \

gnu/system/install.scm

Thanks, and apologies for blaming Xorriso, which presumably never had

anything to do with it!

Ludo’.

diff --git a/gnu/system/vm.scm b/gnu/system/vm.scm
index db9b1707d7..22e3fcc522 100644
--- a/gnu/system/vm.scm
+++ b/gnu/system/vm.scm
@@ -94,6 +94,12 @@
 (define %linux-vm-file-systems
   ;; File systems mounted for 'derivation-in-linux-vm'.  These are shared with
   ;; the host over 9p.
+  ;;
+  ;; The 9p documentation says that cache=loose is "intended for exclusive,
+  ;; read-only mounts", without additional details.  It's much faster than the
+  ;; default cache=none, especially when copying and registering store items.
+  ;; Thus, use cache=loose, except for /xchg where we want to ensure
+  ;; consistency.
   (list (file-system
           (mount-point (%store-prefix))
           (device "store")
@@ -102,18 +108,12 @@
           (flags '(read-only))
           (options "trans=virtio,cache=loose")
           (check? #f))
-
-        ;; The 9p documentation says that cache=loose is "intended for
-        ;; exclusive, read-only mounts", without additional details.  In
-        ;; practice it seems to work well for these, and it's much faster than
-        ;; the default cache=none, especially when copying and registering
-        ;; store items.
         (file-system
           (mount-point "/xchg")
           (device "xchg")
           (type "9p")
           (needed-for-boot? #t)
-          (options "trans=virtio,cache=loose")
+          (options "trans=virtio")
           (check? #f))
         (file-system
           (mount-point "/tmp")
@@ -530,10 +530,7 @@ should set REGISTER-CLOSURES? to #f."
                  #$os
                  #:compressor '(#+(file-append gzip "/bin/gzip") "-9n")
                  #:creation-time (make-time time-utc 0 1)
-                 #:transformations `((,root-directory -> "")))
-
-                ;; Make sure the tarball is fully written before rebooting.
-                (sync))))))
+                 #:transformations `((,root-directory -> ""))))))))
     (expression->derivation-in-linux-vm
      name build
      #:make-disk-image? #f

Thomas Schmitt wrote on 13 Apr 2019 08:37

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:1173672442521511321@scdbackup.webframe.org

Hi,

> apologies for blaming Xorriso, which presumably never had

> anything to do with it!

I will not complain that this time it was not my fault.

Have a nice day :)

Thomas

pelzflorian (Florian Pelz) wrote on 13 Apr 2019 15:46

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20190413134609.kwmx53hyawgtaaza@pelzflorian.localdomain

On Fri, Apr 12, 2019 at 11:26:28PM +0200, Ludovic Courtès wrote:

> Florian, it would be great if you could confirm. Just apply it on

> ‘master’, and then run:

> ./pre-inst-env guix system disk-image --file-system-type=iso9660 \

> gnu/system/install.scm

Yes, it seems fixed, I can confirm.  Four rebuilds seem fine and are
bootable in QEMU.  They have the same size and `xorriso -indev` is
happy.  The content is different at the beginning of the ISO image
(maybe padding or timestamps in the file system) and in the EFI
partition at the very end of the ISO, but this seems insignificant.

Regards,
Florian

Thomas Schmitt wrote on 13 Apr 2019 18:20

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:3867672606037906126@scdbackup.webframe.org

Hi,

Florian Pelz wrote:

> Yes, it seems fixed, I can confirm.

Way back in December, Ludovic Courtès wrote:

>...> Based on this and on a suggestion Ricardo made on IRC, I passed

>...> -padding 10m and that solved the problem. \o/

Please do not forget to remove this -padding command.

Florian Pelz wrote:

> The content is different at the beginning of the ISO image

> (maybe padding or timestamps in the file system)

That is to be expected if the environment variable SOURCE_DATE_EPOCH is
not set and exported.

SOURCE_DATE_EPOCH belongs to the specs of reproducible-builds.org. It
is supposed to be either undefined or to contain a decimal number which
tells the seconds since January 1st, 1970. If it contains a number, then
it is used for all timestamps and as seed of pseudo-random numbers like
MBR id or GPT UUIDs.

If all files and directories have the same names and the same content,
then xorriso runs with the same arguments and the same SOURCE_DATE_EPOCH
value are supposed to create byte-identical result ISOs.
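
The SOURCE_DATE_EPOCH rule described above can be sketched in Python. This is only an illustration, not xorriso's code: the helper names `read_source_date_epoch` and `to_iso_timestring` are made up for this sketch, and the 16-digit `YYYYMMDDhhmmsscc` form is assumed from the timestamps quoted in this thread (for example 1970010100000000).

```python
import os
from datetime import datetime, timezone


def read_source_date_epoch(environ=os.environ):
    """Return SOURCE_DATE_EPOCH as an int, or None if it is undefined."""
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        return None
    # The variable must hold a plain decimal number of seconds.
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError(f"SOURCE_DATE_EPOCH is not a decimal number: {raw!r}")
    return int(raw)


def to_iso_timestring(seconds):
    """Format seconds since 1970-01-01 UTC as YYYYMMDDhhmmsscc.

    The trailing "cc" (hundredths of a second) is always "00" here.
    """
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S") + "00"
```

A build script could call `read_source_date_epoch()` once and fall back to the current time only when it returns `None`, which is the case where two runs are expected to differ.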

In December, I wrote:

>...> > Creation Time: 1970010119010649

Ludovic Courtès wrote:

>...> For reproducibility purposes we set timestamps and related things

>...> to the Epoch.

Is this independent of SOURCE_DATE_EPOCH ?

Have a nice day :)

Thomas

Ludovic Courtès wrote on 14 Apr 2019 17:03

control message for bug #33639

Recipients:(address . control@debbugs.gnu.org)

Message-ID:874l70ljdn.fsf@gnu.org

merge 33639 35136

Ludovic Courtès wrote on 14 Apr 2019 17:47

Re: bug#33639: ISO installer image is broken on i686

Recipients:(name . pelzflorian (Florian Pelz))(address . pelzflorian@pelzflorian.de)

Message-ID:87tvf0io7a.fsf@gnu.org

Hello,

"pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de> skribis:

> On Fri, Apr 12, 2019 at 11:26:28PM +0200, Ludovic Courtès wrote:
>> Florian, it would be great if you could confirm.  Just apply it on
>> ‘master’, and then run:
>> 
>>   ./pre-inst-env guix system disk-image --file-system-format=iso9660 \
>>      gnu/system/install.scm
>> 
>
> Yes, it seems fixed, I can confirm.  Four rebuilds seem fine and are
> bootable in QEMU.

This is a happy end. :-)

Committed as 66ec389580d4f1e4b81e1c72afe2749a547a0e7c.

Thank you!

Ludo’.

Closed

Ludovic Courtès wrote on 14 Apr 2019 23:43

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:87h8b0i7ol.fsf@gnu.org

Hi Thomas,

"Thomas Schmitt" <scdbackup@gmx.net> skribis:

> Florian Pelz wrote:

>> Yes, it seems fixed, I can confirm.

> Way back in december, Ludovic Courtès wrote:

>>...> Based on this and on a suggestion Ricardo made on IRC, I passed

>>...> -padding 10m and that solved the problem. \o/

> Please do not forget to remove this -padding command.

Done in f6e3f0f9b1287eca120517a0161e3d0b1ed6ed44.

> If all files and directories have the same names and the same content,

> then xorriso runs with the same arguments and the same SOURCE_DATE_EPOCH

> value are supposed to create byte-identical result ISOs.

I’ve tried setting it but that doesn’t make any difference.

How did you visualize differences, Florian? Diffoscope fails for me

here (missing tools and scalability issue.)

> In december, i wrote:

>>...> > Creation Time: 1970010119010649

> Ludovic Courtès wrote:

>>...> For reproducibility purposes we set timestamps and related things

>>...> to the Epoch.

> Is this independent of SOURCE_DATE_EPOCH ?

Yes.

Thanks,

Ludo’.

pelzflorian (Florian Pelz) wrote on 15 Apr 2019 08:07

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20190415060737.aw2msuviarkrd66a@pelzflorian.localdomain

On Sun, Apr 14, 2019 at 11:43:54PM +0200, Ludovic Courtès wrote:

> How did you visualize differences, Florian? Diffoscope fails for me

> here (missing tools and scalability issue.)

For me diffoscope failed too. I used cmp as described here:

https://superuser.com/questions/125376/how-do-i-compare-binary-files-in-linux

and then looked at the addresses in ghex. It is not a nice method.

Sorry. It works though.
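
For readers who want the same information without a hex editor, a byte-wise comparison can be sketched in Python as follows. This is not the cmp tool itself, and the function name `differing_offsets` is invented for this sketch. Unlike `cmp -l`, it reports 0-based offsets and only compares the part both files have in common.

```python
def differing_offsets(path_a, path_b, limit=20, chunk_size=1 << 20):
    """Return up to `limit` (offset, byte_a, byte_b) tuples where the files differ.

    Offsets are 0-based. Only the common prefix of the two files is compared.
    """
    found = []
    base = 0  # offset of the current chunk within the files
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while len(found) < limit:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            common = min(len(chunk_a), len(chunk_b))
            if common == 0:
                break  # at least one file is exhausted
            # Cheap whole-chunk check first; scan byte by byte only on mismatch.
            if chunk_a[:common] != chunk_b[:common]:
                for index in range(common):
                    if chunk_a[index] != chunk_b[index]:
                        found.append((base + index, chunk_a[index], chunk_b[index]))
                        if len(found) == limit:
                            break
            base += common
    return found
```

Printing each offset with `hex()` gives addresses that can be looked up directly in a hex viewer such as ghex.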

Regards,

Florian

Thomas Schmitt wrote on 15 Apr 2019 10:16

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:3082867220863987596@scdbackup.webframe.org

Hi,

I wrote:

> > If all files and directories have the same names and the same content,

> > then xorriso runs with the same arguments and the same SOURCE_DATE_EPOCH

> > value are supposed to create byte-identical result ISOs.

Ludovic Courtès wrote:

> I’ve tried setting it but that doesn’t make any difference.

We should investigate this ...

... yes, there is some problem. But not always.

Timestamps of the root directory differ after mapping to an address

that is not the ISO root directory (here: /x):

xorriso -outdev test.iso -map x /x

xorriso -outdev test2.iso -map x /x

but not after mapping to the root directory:

xorriso -outdev test.iso -map x /

xorriso -outdev test2.iso -map x /

This would explain why my tests for Debian ISOs do not show this problem.

Do I get it right that gnu/build/vm.scm maps no files to "/" but all to
deeper paths:

"etc=/tmp/root/etc"

"var=/tmp/root/var"

"run=/tmp/root/run"

I am unsure about

"-path-list" "-"

I will now dig into the source to find the reason and maybe a preliminary

remedy.

> How did you visualize differences, Florian?

(I'm aware that I am not Florian.)

I made myself a little program "hxd" for combined hex-cleartext-decimal dump,

positional diff, and (not to be focused too much) CD-Text decoding.

===========================================================================

$ export SOURCE_DATE_EPOCH=$(date +%s)

$ xorriso -outdev test.iso -map x /x

...

xorriso : NOTE : Environment variable SOURCE_DATE_EPOCH encountered with value 1555311212

...

$ xorriso -outdev test2.iso -map x /x

...

xorriso : NOTE : Environment variable SOURCE_DATE_EPOCH encountered with value 1555311212

...

$ hxd -diff test.iso test2.iso

32944 : 15 7 38 43 0 2 0 0 1 0 0 1 1 0 32 32

& +

000080b0 : 0f 07 26 2b 00 02 00 00 01 00 00 01 01 00 20 20

###

000080b0 : 0f 07 26 36 00 02 00 00 01 00 00 01 01 00 20 20

& 6

32944 : 15 7 38 54 0 2 0 0 1 0 0 1 1 0 32 32

... more differences ...

===========================================================================

It looks like the root directory got the current timestamp. The other

differences are with the ".." directory entries of the directories in

the first level under "/".
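
The differing bytes at 0x80b0 in the dump above read as day 15, 07:38:43 in one image and 07:38:54 in the other. A small Python sketch can decode that field directly. It assumes the usual ISO 9660 layout: the Primary Volume Descriptor at sector 16, the root directory record at byte 156 of that descriptor, and a 7-byte recording date at byte 18 of the record. The function name `root_directory_time` is made up for this illustration and has nothing to do with "hxd".

```python
from datetime import datetime, timedelta, timezone

SECTOR_SIZE = 2048
PVD_OFFSET = 16 * SECTOR_SIZE   # assumed: Primary Volume Descriptor at sector 16
ROOT_RECORD_OFFSET = 156        # root directory record inside the descriptor
RECORD_DATE_OFFSET = 18         # 7-byte recording date inside a directory record


def root_directory_time(image_bytes):
    """Decode the recording time of the root directory record."""
    pvd = image_bytes[PVD_OFFSET:PVD_OFFSET + SECTOR_SIZE]
    if len(pvd) < SECTOR_SIZE or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no Primary Volume Descriptor at sector 16")
    start = ROOT_RECORD_OFFSET + RECORD_DATE_OFFSET
    # Fields: years since 1900, month, day, hour, minute, second, zone.
    year, month, day, hour, minute, second, zone = pvd[start:start + 7]
    if zone > 127:
        zone -= 256             # the zone byte is signed, in 15-minute steps
    tz = timezone(timedelta(minutes=15 * zone))
    return datetime(1900 + year, month, day, hour, minute, second, tzinfo=tz)
```

Reading the first 17 sectors of each image (`open(path, "rb").read(17 * 2048)`) is enough input, so the check stays cheap even for large ISOs.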

The source of "hxd" is pure C, no special dependencies, 8141 bytes.

Shall I upload it somewhere?

Have a nice day :)

Thomas

Thomas Schmitt wrote on 15 Apr 2019 10:35

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:3171667222963526138@scdbackup.webframe.org

Hi,

It seems to help if you explicitly set the timestamps of the "/" directory:

export SOURCE_DATE_EPOCH=1555311212

xorriso -outdev test.iso -map x /x \

-alter_date b-c 1970010100000000 / -- \

-alter_date c 1970010100000000 / --

ISOs made with these xorriso commands match perfectly.

A bit more elegant than 1970 would be to use the seconds value from

SOURCE_DATE_EPOCH (prefix "=" announces date +%s format):

-alter_date b-c =$SOURCE_DATE_EPOCH / -- \

-alter_date c =$SOURCE_DATE_EPOCH / --

The -alter_date commands should be performed after all -map commands,

just to make sure that the timestamps do not get changed again.

I still need to find out where the current time sneaks in.

But this workaround should not do harm after the bug was corrected.
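
The ordering rule above (all -map commands first, the -alter_date commands last) can be sketched as a small Python helper that assembles the argument list. It uses only the xorriso commands quoted in this message. The function name `xorriso_args` and its parameters are hypothetical, and this is not how gnu/build/vm.scm invokes xorriso.

```python
def xorriso_args(outdev, mappings, epoch_seconds):
    """Build a xorriso argument list with the root timestamps pinned last.

    `mappings` is a sequence of (disk_path, iso_path) pairs.
    """
    args = ["xorriso", "-outdev", outdev]
    for disk_path, iso_path in mappings:
        args += ["-map", disk_path, iso_path]
    # A leading "=" announces seconds since 1970, as printed by date +%s.
    stamp = f"={epoch_seconds}"
    # Both -alter_date commands come after every -map, so that no later
    # mapping can change the timestamps of "/" again.
    args += ["-alter_date", "b-c", stamp, "/", "--"]
    args += ["-alter_date", "c", stamp, "/", "--"]
    return args
```

The resulting list can be handed to `subprocess.run`, with SOURCE_DATE_EPOCH exported in the environment to the same value.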

Have a nice day :)

Thomas

pelzflorian (Florian Pelz) wrote on 15 Apr 2019 18:54

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20190415165451.dpzngealeisbibc7@pelzflorian.localdomain

On Sat, Apr 13, 2019 at 03:46:09PM +0200, pelzflorian (Florian Pelz) wrote:

> Yes, it seems fixed, I can confirm.

Well this is strange. I got fine ISO images each time (fine with no

complaints from xorriso or fdisk and bootable in QEMU without errors),

but after dd’ing them to different USB flash drives each time I get

kernel output when inserting the flash drive:

[ 10.025223] GPT:Primary header thinks Alt. header is not at the end of the disk.

[ 10.026735] GPT:3220583 != 7831551

[ 10.028235] GPT:Alternate GPT header not at the end of the disk.

[ 10.029764] GPT:3220583 != 7831551

[ 10.031290] GPT: Use GNU Parted to correct GPT errors.

Having such a USB flash drive inside my computer makes UEFI get stuck

on some computers but not on others.

Why is this? Are all my USB drives bad? I presume this is a

different bug, or is it?

Regards,

Florian

Thomas Schmitt wrote on 15 Apr 2019 19:55

Recipients:(address . bug-xorriso@gnu.org)

Message-ID:1582867226375139246@scdbackup.webframe.org

Hi,

Florian Pelz wrote:

> Well this is strange. I got fine ISO images each time (fine with no

> complaints from xorriso or fdisk and bootable in QEMU without errors),

> but after dd’ing them to different USB flash drives each time I get

> kernel output when inserting the flash drive:

> [ 10.025223] GPT:Primary header thinks Alt. header is not at the end of

> the disk.

The alternative/backup header is a property of GPT which makes it

rather unsuitable for disk images. xorriso puts it correctly into the

last 512-byte block of the image. But when copied to a storage device,

it should move up to the last block of the device.

Even worse, the main GPT header at 512-byte LBA 1 needs to learn the

new address.
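
The mismatch the kernel reports (3220583 != 7831551) can be reproduced with a short Python sketch. It assumes 512-byte logical sectors and the standard GPT header layout, with the "EFI PART" signature at the start of LBA 1 and the 64-bit little-endian backup-header address at byte 32 of that header. The function name `gpt_backup_mismatch` is invented for this illustration; the sketch only reads, it does not repair anything.

```python
import struct

SECTOR = 512  # assumed logical sector size


def gpt_backup_mismatch(first_sectors, device_bytes):
    """Compare the backup-header LBA stored in the primary GPT header
    with the last LBA of the target device.

    `first_sectors` must hold at least the first two 512-byte sectors.
    Returns (recorded_lba, last_device_lba).
    """
    header = first_sectors[SECTOR:2 * SECTOR]   # primary header lives at LBA 1
    if header[:8] != b"EFI PART":
        raise ValueError("no GPT header at LBA 1")
    # Address of the alternate (backup) header as recorded by the image writer.
    (recorded_lba,) = struct.unpack_from("<Q", header, 32)
    last_device_lba = device_bytes // SECTOR - 1
    return recorded_lba, last_device_lba
```

When the two returned numbers differ, the image was written to a device larger than itself and the backup header is no longer in the last block.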

So I would rather advise using an MBR partition table. Wonderfully dumb
and open ended.

I see from

http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/build/vm.scm#n462

that program grub-mkrescue is in control of xorrisofs boot options.

Vladimir Serbinenko decided for GPT with no mountable ISO partition.

The libisoburn repo and tarball have a wrapper script by which other

boot layouts can be derived from the options which grub-mkrescue hands

over to xorrisofs:

https://dev.lovelyhq.com/libburnia/libisoburn/raw/master/frontend/grub-mkrescue-sed.sh

To get MBR instead of GPT do:

export MKRESCUE_SED_MODE=mbr_only

export MKRESCUE_SED_PROTECTIVE=""

and maybe

export MKRESCUE_SED_XORRISO=/...path/to/the/xorriso/binary/if/exotic...

Then start grub-mkrescue with the wrapper in the role of "xorriso":

grub-mkrescue --xorriso=...path/to/grub-mkrescue-sed.sh \

-partition_offset 16 \

-iso_mbr_part_type 0x83 \

...all.other.usual.arguments...

The mode "mbr_only" will move the EFI partition image out of the ISO

filesystem and rather append it after the ISO's end.

The option

-partition_offset 16

costs the space of a second superblock and directory tree. But it brings

as benefits:

- More normal partition layout with partition 1 starting at block 64
  rather than at block 0.
- Nevertheless the partition 1 is mountable and shows the ISO content.
- The base device is mountable as the same ISO too.
  (The ISO superblock of the base device also serves on CD or DVD.)
- The base device superblock claims not only the ISO in partition 1 but
  also the EFI partition 2. So "/sbin/isosize" will tell the size of the
  image file, not only of the ISO filesystem.

Option

-iso_mbr_part_type 0x83

chooses for partition 1 the MBR partition type "Linux". (This is
purely ornamental. Nobody cares. But it looks good in partition editors.)

The partition layout of the ISO produced by the wrapper run above will look like:

$ /sbin/fdisk -l output.iso

Disk output.iso: 16.5 MiB, 17338368 bytes, 33864 sectors

Units: sectors of 1 * 512 = 512 bytes

Sector size (logical/physical): 512 bytes / 512 bytes

I/O size (minimum/optimal): 512 bytes / 512 bytes

Disklabel type: dos

Disk identifier: 0x00000000

Device       Boot Start   End Sectors  Size Id Type
output.iso1  *       64 28103   28040 13.7M 83 Linux
output.iso2       28104 33863    5760  2.8M ef EFI (FAT-12/16/32)

$ expr $(/sbin/isosize output.iso) / 512

33864

Have a nice day :)

Thomas

Gábor Boskovits wrote on 16 Apr 2019 11:57

Recipients:(name . Thomas Schmitt)(address . scdbackup@gmx.net)

Message-ID:CAE4v=phJmiS77k_YZ25ObxQ14J3f1y+H65+AjJ9om42OCUs=5g@mail.gmail.com

Hello people,

Thomas Schmitt <scdbackup@gmx.net> wrote (on Mon, 15 Apr 2019, 19:54):

>
> Hi,
>
> Florian Pelz wrote:
> > Well this is strange.  I got fine ISO images each time (fine with no
> > complaints from xorriso or fdisk and bootable in QEMU without errors),
> > but after dd’ing them to different USB flash drives each time I get
> > kernel output when inserting the flash drive:
> > [   10.025223] GPT:Primary header thinks Alt. header is not at the end of
> > the disk.
>
> The alternative/backup header is a property of GPT which makes it
> rather unsuitable for disk images. xorriso puts it correctly into the
> last 512-byte block of the image. But when copied to a storage device,
> it should move up to the last block of the device.
> Even worse, the main GPT header at 512-byte LBA 1 needs to learn the
> new address.
>

Yes, this is a really painful point.

Could we create a simple tool to write the disk images to a disk

correcting this problem?

Does not look too hard?

I am also forwarding this to guix devel. I removed the xorriso bug

list, as I feel this does not belong there.

Best regards,

g_bor

Ludovic Courtès wrote on 16 Apr 2019 23:01

Recipients:(name . pelzflorian (Florian Pelz))(address . pelzflorian@pelzflorian.de)

Message-ID:87zhopbr5y.fsf@gnu.org

Hi Florian,

"pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de> skribis:

> On Sat, Apr 13, 2019 at 03:46:09PM +0200, pelzflorian (Florian Pelz) wrote:
>> Yes, it seems fixed, I can confirm.
>
> Well this is strange.  I got fine ISO images each time (fine with no
> complaints from xorriso or fdisk and bootable in QEMU without errors),
> but after dd’ing them to different USB flash drives each time I get
> kernel output when inserting the flash drive:
>
> [   10.025223] GPT:Primary header thinks Alt. header is not at the end of the disk.
> [   10.026735] GPT:3220583 != 7831551
> [   10.028235] GPT:Alternate GPT header not at the end of the disk.
> [   10.029764] GPT:3220583 != 7831551
> [   10.031290] GPT: Use GNU Parted to correct GPT errors.

Could it be simply due to the incorrect location of the GPT backup as

Thomas explained?

> Having such a USB flash drive inside my computer makes UEFI get stuck

> on some computers but not on others.

So you cannot boot from these USB drives at all?

Thanks,

Ludo’.

pelzflorian (Florian Pelz) wrote on 17 Apr 2019 11:03

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20190417090358.6l6g5xuzpyjs5q7v@pelzflorian.localdomain

On Tue, Apr 16, 2019 at 11:01:45PM +0200, Ludovic Courtès wrote:

> "pelzflorian (Florian Pelz)" <pelzflorian@pelzflorian.de> skribis:

> > Having such a USB flash drive inside my computer makes UEFI get stuck

> > on some computers but not on others.

> So you cannot boot from these USB drives at all?

No, I cannot boot from them on this Macbook. I wonder how I installed

Guix System here; it may have been on a Debian ISO.

Regards,

Florian

Brice Waegeneire wrote on 11 Dec 2019 18:19

(no subject)

Recipients:(address . control@debbugs.gnu.org)

Message-ID:f62a2457e39b00adc81ae457baa0a950@waegenei.re

unarchive 33639

Brice Waegeneire wrote on 11 Dec 2019 18:21

Fixing the GPT errors from an installer on a USB stick

Recipients:

Message-ID:ed8be43c383b4c8291e1b456ff47ee1e@waegenei.re

I have the same issue as pelzflorian about the GPT errors.

To fix the GPT mismatch you just need to execute the following command,
where device is something like "/dev/sdc":
sudo fdisk "$device" <<EOF
w
EOF

This issue is archived.

To comment on this conversation send an email to 33639@debbugs.gnu.org
