mumi: Add context in search results when querying for (subject:<something>)

Status: Open
Participants: Arun Isaac, Felix Lechner, Giovanni Biscuolo
Owner: Somebody
Submitted by: Giovanni Biscuolo
Severity: wishlist

Giovanni Biscuolo wrote on 7 Sep 2023 18:52
[mumi] [wishlist] Allow searching subject prefix
(address . bug-guix@gnu.org)(name . Arun Isaac)(address . arunisaac@systemreboot.net)
871qfaexil.fsf@xelera.eu
Hello,

IMO it is useful to be able to search for "subject:foo"; it's a different
search than searching for foo in the body.

In file mumi/xapian.scm I read:

;; Index subject and body without prefixes for general
;; search.
(index-text! term-generator subjects)
(increase-termpos! term-generator)
(index-text! term-generator text)


Is it possible to add such a feature please?

Thanks! Gio'


P.S.: I did not Cc: Ricardo Wurmus since AFAIU he prefers not to
continue developing this

--
Giovanni Biscuolo

Xelera IT Infrastructures

Arun Isaac wrote on 10 Sep 2023 06:47
87y1hezlaq.fsf@systemreboot.net
Hi Gio,

Thanks for this feature request! It's always gratifying to know that
someone is using mumi, especially its more advanced features! :-)

> IMO is useful to be able to search for "subject:foo", it's a different
> search than searching for foo in the body

It looks like we implement this already. See
https://git.savannah.gnu.org/cgit/guix/mumi.git/tree/mumi/xapian.scm#n141
A search for "subject:foo" should work already.

Cheers!
Arun
Giovanni Biscuolo wrote on 10 Sep 2023 16:16
mumi: Add msg number and subject in search results when searching for subject:
(address . 65809@debbugs.gnu.org)
8734zmdsfq.fsf@xelera.eu
Hi!

(I'll also try to retitle this bug, after submitting this update)

Arun Isaac <arunisaac@systemreboot.net> writes:

> Thanks for this feature request! It's always gratifying to know that
> someone is using mumi, especially its more advanced features! :-)

mumi's advanced features could be **very** useful for many contribution
activities, if improved a little and better understood.

>> IMO is useful to be able to search for "subject:foo", it's a different
>> search than searching for foo in the body
>
> It looks like we implement this already. See
> https://git.savannah.gnu.org/cgit/guix/mumi.git/tree/mumi/xapian.scm#n141
> A search for "subject:foo" should work already.

Uh I missed that code:

(index-text! term-generator subjects #:prefix "S")


otherwise I could have realized that I was misinterpreting mumi search
output.
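
To make the role of that "S" prefix concrete, here is a minimal sketch in Python (not mumi's Guile Scheme and not the Xapian API). It assumes subjects are indexed twice, once without a prefix and once with an "S" prefix, while the body is indexed without a prefix only. The `tokenize`, `index_issue` and `search` helpers and the sample issues are all hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on non-alphanumerics (a stand-in for a real term generator)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def index_issue(index, issue_id, subjects, body):
    """Index subjects both unprefixed and with an "S" prefix; body unprefixed only."""
    for subject in subjects:
        for term in tokenize(subject):
            index[term].add(issue_id)        # found by a general search
            index["S" + term].add(issue_id)  # found by a subject: search
    for term in tokenize(body):
        index[term].add(issue_id)

def search(index, query):
    """Map "subject:foo" to the term "Sfoo"; anything else is an unprefixed term."""
    if query.startswith("subject:"):
        term = "S" + query[len("subject:"):].lower()
    else:
        term = query.lower()
    return sorted(index.get(term, set()))

index = defaultdict(set)
index_issue(index, 1, ["gnu: icu4c: Patch zoneinfo directory."], "Some body text.")
index_issue(index, 2, ["gnu: foo: Update to 1.0."], "This mentions zoneinfo only in the body.")

print(search(index, "zoneinfo"))          # matches subject and body terms
print(search(index, "subject:zoneinfo"))  # matches subject terms only
```

In this toy model the plain query finds both issues, while the "subject:" query finds only the issue whose subject contains the word.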

If, for example, I search for 'subject:zoneinfo', I get these results:

giovanni@roquette [genv]\: mumi search subject:zoneinfo
#31484 [PATCH] gnu: icu4c: Patch zoneinfo directory.
opened on 17 mag 2018 14:58 by Christopher Baines
#57448 ? [PATCH 0/5] gnu: exa: Update to 0.10.1.
opened on 27 ago 2022 12:09 by ???
#58614 ? [PATCH 000/187] Remove unused crates
opened on 18 ott 2022 22:18 by Efraim Flashner

(the mumi CLI output is colored and the separation of each bug is
clearer)


Since 2 of the 3 patch "titles" are missing 'zoneinfo', I thought the
search was done in the subject and in the body, not just the subject.

For example, bug #58614 actually contains a message with this subject:
[PATCH 185/187] gnu: Remove rust-zoneinfo-compiled-0.4.

...my misunderstanding was due to the fact that each tracked bug (issue)
has a /Title/ given by the Subject mail header of the original report
or by a retitle [1]; thus the Title _is_ different from the "Subject"s of
the rest of the messages in the /thread/ of the tracked bug. When we
consider bugs tracking patches, each and every patch submission (not
the reviews people send as replies) has a different subject.

When searching for "subject:" it would be useful to have a specific
message number and subject along with (or in place of) the bug title.

Using the search example above, a more useful result would be:

giovanni@roquette [genv]\: mumi search subject:zoneinfo
#31484 [PATCH] gnu: icu4c: Patch zoneinfo directory.
opened on 17 mag 2018 14:58 by Christopher Baines

#57448 ? [PATCH 0/5] gnu: exa: Update to 0.10.1.
opened on 27 ago 2022 12:09 by ???
[PATCH 4/5] gnu: rust-zoneinfo-compiled: Update to 0.5.1.
sent on 27 Aug 2022 12:10 by gyara

#58614 ? [PATCH 000/187] Remove unused crates
opened on 18 ott 2022 22:18 by Efraim Flashner
[PATCH 185/187] gnu: Remove rust-zoneinfo-compiled-0.4.
sent on 18 Oct 2022 22:20 by Efraim Flashner


I'd also add a blank line between results so that they are clear on "BW"
consoles as well.

A similar addition would also be useful in the web interface, also
having a link to the message corresponding to the searched subject,
like:

[PATCH 185/187] gnu: Remove rust-zoneinfo-compiled-0.4. -> https://issues.guix.gnu.org/58614#184

Is it doable?

WDYT?

Thanks! Gio'



--
Giovanni Biscuolo

Xelera IT Infrastructures

Giovanni Biscuolo wrote on 10 Sep 2023 16:23
bug 65809
(address . control@debbugs.gnu.org)
87zg1ucdjn.fsf@xelera.eu
retitle 65809 mumi: Add msg number and subject in search results when searching for subject:
severity wishlist
Arun Isaac wrote on 17 Sep 2023 12:48
Re: bug#65809: mumi: Add msg number and subject in search results when searching for subject:
87ttrtt6rz.fsf@systemreboot.net
Hi Gio,

Sorry for my late reply. I have been travelling the last week and am
just catching up on all my email.

> When searching for "subject:" it would be useful to have a specific
> message number and subject along with (or in place of) the bug title.

This is actually difficult to do because of the way we index issues as
Xapian "documents".

First, a quick Xapian primer. Xapian has a bunch of documents each
associated with a set of terms. When a search query comes in, Xapian
decomposes the query into a list of terms and retrieves documents that
match those terms.

In our case, we index entire issues as Xapian documents; we don't index
each individual email message as its own Xapian document. This means
that an issue is the smallest unit we can address. We cannot address
each individual email message. So, localizing a subject to a specific
email message is difficult.
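
The consequence of that design can be sketched with a toy inverted index in Python (again not Xapian's API; the data layout and helper names are assumptions made for illustration). When a whole issue is one document, a match only says which issue matched. Indexing each message as its own document would also say which message matched.

```python
import re
from collections import defaultdict

# Hypothetical data: one issue with the subjects of two of its messages.
issues = {
    58614: {
        "title": "[PATCH 000/187] Remove unused crates",
        "subjects": [
            "[PATCH 000/187] Remove unused crates",
            "[PATCH 185/187] gnu: Remove rust-zoneinfo-compiled-0.4.",
        ],
    },
}

def terms(text):
    """Split text into a set of lowercase alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def index_by_issue(issues):
    """One document per issue: postings hold issue numbers only."""
    index = defaultdict(set)
    for number, issue in issues.items():
        for subject in issue["subjects"]:
            for term in terms(subject):
                index[term].add(number)
    return index

def index_by_message(issues):
    """One document per message: postings hold (issue, message position) pairs."""
    index = defaultdict(set)
    for number, issue in issues.items():
        for position, subject in enumerate(issue["subjects"]):
            for term in terms(subject):
                index[term].add((number, position))
    return index

by_issue = index_by_issue(issues)
by_message = index_by_message(issues)

print(sorted(by_issue["zoneinfo"]))    # the issue matched, but not which message
print(sorted(by_message["zoneinfo"]))  # the matching message can be located
```

The per-message variant answers the original request directly, at the cost of many more documents and of grouping hits back into issues at query time.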

Maybe what you are looking for is some context in the search results to
know why that particular search result was produced. This can be done by
displaying a snippet of text from the issue with the search terms
highlighted. For a working demo of what I mean, see for example,
https://issues.genenetwork.org/search?query=database&type=all .
Notice how the search term "database" is highlighted in the search
results. This is relatively easy to do with Xapian, and indeed I do plan
to implement this at some point.
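
As a rough illustration of what such a context snippet could look like, here is a small Python sketch that cuts a window of text around the first match and marks the matched term. It is a toy stand-in, not the mechanism Xapian provides; the function name, the window size and the `*...*` markers are arbitrary choices.

```python
import re

def snippet(text, term, width=30):
    """Return a window of text around the first match of term, with the match marked."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(text), match.end() + width)
    # Add ellipses only where the window cuts the text.
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    window = (text[start:match.start()]
              + "*" + match.group(0) + "*"
              + text[match.end():end])
    return prefix + window + suffix

text = "[PATCH 185/187] gnu: Remove rust-zoneinfo-compiled-0.4."
print(snippet(text, "zoneinfo", width=10))
```

Shown next to the issue title, a snippet like this would explain why an issue titled "Remove unused crates" appears in the results for 'subject:zoneinfo'.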

WDYT? Would this meet your needs?

Regards,
Arun
Giovanni Biscuolo wrote on 18 Sep 2023 17:36
871qev5w8k.fsf@xelera.eu
Hi Arun,

Arun Isaac <arunisaac@systemreboot.net> writes:

> Sorry for my late reply.

No problem: async! :-D

[...]

> In our case, we index entire issues as Xapian documents; we don't index
> each individual email message as its own Xapian document.

Ooh, I understand now!

> This means that an issue is the smallest unit we can address. We
> cannot address each individual email message. So, localizing a subject
> to a specific email message is difficult.

Yes, I see.

> Maybe what you are looking for is some context in the search results to
> know why that particular search result was produced. This can be done by
> displaying a snippet of text from the issue with the search terms
> highlighted. For a working demo of what I mean, see for example,
> https://issues.genenetwork.org/search?query=database&type=all .

Oooh: a live example is worth more than a thousand words: thanks!
(I'm following Tissue but somehow I missed that feature)

> Notice how the search term "database" is highlighted in the search
> results. This is relatively easy to do with Xapian, and indeed I do
> plan to implement this at some point.

OK: can we consider this bug report (wishlist) as the "official" one for
that feature? :-)

Can I retitle it to better reflect the upcoming implementation and
assign it to you?

Actually I don't know if there is some written or unwritten convention
in Guix or GNU about bug assignment; I don't want to put pressure on
you!

> WDYT? Would this meet your needs?

Yes, absolutely yes: thank you!

Happy hacking! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures

Giovanni Biscuolo wrote on 18 Sep 2023 17:42
(address . control@debbugs.gnu.org)
87y1h34hfb.fsf@xelera.eu
retitle 65809 mumi: Add context in search results when querying for (subject:<something>)
quit

As explained by Arun Isaac, my feature request was not implementable; he
proposed a slightly different feature that gives similar results.

Thanks!

--
Giovanni Biscuolo

Xelera IT Infrastructures

Arun Isaac wrote on 18 Sep 2023 21:56
87fs3btfue.fsf@systemreboot.net
>> Notice how the search term "database" is highlighted in the search
>> results. This is relatively easy to do with Xapian, and indeed I do
>> plan to implement this at some point.
>
> OK: can we consider this bug report (wishlist) as the "official" one for
> that feature? :-)

Sure!

> Can I retitle it to better reflect the upcoming implementation and
> assign it to you?

Yes, please!

> Actually I don't know if there is some written or unwritten convention
> in Guix or GNU about bug assignment, I don't want to put pressure on
> you!

No problem! I do intend to implement this feature at some point
anyway. Even if I don't do it, it's good to list this as an issue so
that someone interested can try and hack on it.
Giovanni Biscuolo wrote on 19 Sep 2023 08:37
control commands
(address . control@debbugs.gnu.org)
87sf7a4qij.fsf@xelera.eu
owner 65809 Arun Isaac <arunisaac@systemreboot.net>
quit

Arun does intend to implement this feature at some point.

Thanks! Gio'

--
Giovanni Biscuolo

Xelera IT Infrastructures
Felix Lechner wrote on 8 Feb 19:09 +0100
(no subject)
(address . control@debbugs.gnu.org)
87h6ii6d58.fsf@lease-up.com
reassign 65858 mumi
reassign 65809 mumi
reassign 49295 mumi
reassign 64833 mumi
reassign 54204 mumi
reassign 68920 mumi
reassign 55765 mumi
reassign 65923 mumi
reassign 63507 mumi
reassign 60226 mumi
reassign 63936 mumi
reassign 60951 mumi
reassign 66027 mumi
thanks