From: zimoun
Date: Sun, 26 Apr 2020 19:01:41 +0200
Subject: Re: benchmark search: default vs v2 vs v3
To: Ludovic Courtès
Cc: Arun Isaac, Pierre Neidhardt, 39258@debbugs.gnu.org
In-Reply-To: <87lfmi2qod.fsf@gnu.org>

Hi Ludo,

On Sun, 26 Apr 2020 at 17:49, Ludovic Courtès wrote:

> It does seem like Arun’s v3 (or maybe even v2) would work nicely.

The v3 is more interesting because it does not change the relevance
scoring and does not add another dependency.  However, v2 is interesting
as an easy way to test BM25, which is another relevance scoring...
work in progress. :-)

> > The question is the tradeoff between: the slowdown of pull vs the
> > speedup of search.  What is acceptable?
>
> That’s only one criterion among others.  I hear the argument that 25s is
> “nothing” compared to the rest, but it’s really a tradeoff.  Like, if I
> spent a day optimizing ‘guix pull’ and managed to save 25s, I would find
> it nice.  :-)

And I expect that the middle-term roadmap would further decrease the
computation of derivations. ;-)

> > $ time guix pull -C ~/.config/guix/default-channels.scm
>
> It also depends on what’s in that file, of course.

Contains only one line: %default-channels

See my wishlist ;-)
https://lists.gnu.org/archive/html/guix-devel/2020-04/msg00393.html

  me:  2m13.693s
  you: 0m57.916s

As we already discussed elsewhere, it is hard to "test" 'guix pull'.

Does it make sense to measure "guix pull"?  As Chris (Marusich) did for CDN.

> > Well, let remove the profiles and garbage collect the index files:
> >
> >   rm /tmp/default /tmp/v{2,3}*
> >   guix gc -D \
> >     /gnu/store/g5c08vqsv31nkn2r0hr32dbrkhf3cvd8-guix-package-cache \
> >     /gnu/store/8xbzhn81hmshagbgazmnr7xfps1cdsa3-guix-package-search-index \
> >     /gnu/store/8j78b5c4ddic21gcx7wpbq2akjn7x7mr-guix-package-metadata-cache
>
> Could you do, for v2 and v3:
>
>   time guix build /gnu/store/…-package-metadata-cache.drv --check

Newbie me! :-)

Two points:

 1. It may not be reproducible... I am checking.
 2. The time seems similar (v2=26s and v3=29s) considering the time to
    start Guile and so on.

--8<---------------cut here---------------start------------->8---
guix gc --list-live | grep metadata
time /tmp/v3/bin/guix build /gnu/store/jxs0abica8kjz1ppym95df97jk0qa9by-guix-package-metadata-cache.drv --check
The following profile hook will be built:
   /gnu/store/jxs0abica8kjz1ppym95df97jk0qa9by-guix-package-metadata-cache.drv
building package cache...
(repl-version 0 1 1)
Generating package metadata cache for '/gnu/store/95mi525syinh08jmcd3q7a7a8mr1sykb-profile'...
(values (value "/gnu/store/zhp7wv87vr6iis0fa3ff925i5r04i08q-guix-package-metadata-cache/lib/guix/package-metadata.cache"))
guix build: error: derivation `/gnu/store/jxs0abica8kjz1ppym95df97jk0qa9by-guix-package-metadata-cache.drv' may not be deterministic: output `/gnu/store/zhp7wv87vr6iis0fa3ff925i5r04i08q-guix-package-metadata-cache' differs

real    0m29.788s
user    0m0.535s
sys     0m0.025s
--8<---------------cut here---------------end--------------->8---

> That we’ll give us the exact cost of that part.
> It’ll be interesting
> especially in the Xapian case, which we expected to be higher.

--8<---------------cut here---------------start------------->8---
time /tmp/v2/bin/guix build /gnu/store/w0dhl2n3ngi4v2ld8lprkqjl1g1q2m4p-guix-package-search-index.drv --check
The following profile hook will be built:
   /gnu/store/w0dhl2n3ngi4v2ld8lprkqjl1g1q2m4p-guix-package-search-index.drv
running profile hook of type 'package-search-index'...
(repl-version 0 1 1)
Generating package search index for '/gnu/store/wiinj9nrb45wlf2cgbgkjl9chxz9cb9b-profile'...
(values (value "/gnu/store/8xbzhn81hmshagbgazmnr7xfps1cdsa3-guix-package-search-index/lib/guix/package-search.index"))
guix build: error: derivation `/gnu/store/w0dhl2n3ngi4v2ld8lprkqjl1g1q2m4p-guix-package-search-index.drv' may not be deterministic: output `/gnu/store/8xbzhn81hmshagbgazmnr7xfps1cdsa3-guix-package-search-index' differs

real    0m26.552s
user    0m0.626s
sys     0m0.046s
--8<---------------cut here---------------end--------------->8---

It is not higher.  Why should it be?

Setting aside the issue of reproducibility -- which is one! -- should it
be possible to download the index file like any other substitute?

Cheers,
simon
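
PS: to compare the "real" values reported by `time` in the two runs above, they can be converted to plain seconds.  This is only a minimal sketch in Python: `to_seconds` is a hypothetical helper name, not part of Guix, and the labels are just my shorthand for the two derivations that were rebuilt with --check.

```python
import re


def to_seconds(report: str) -> float:
    """Convert a shell `time` value such as '2m13.693s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", report.strip())
    if match is None:
        raise ValueError(f"unrecognized time value: {report!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" (wall-clock) values copied from the two `guix build --check` runs.
# The labels are shorthand, not names used by Guix.
real = {
    "v2 (search index)": "0m26.552s",
    "v3 (metadata cache)": "0m29.788s",
}

for label, value in real.items():
    print(f"{label}: {to_seconds(value):.3f} s")

# Difference between the two wall-clock times, in seconds.
gap = to_seconds(real["v3 (metadata cache)"]) - to_seconds(real["v2 (search index)"])
print(f"v3 - v2: {gap:.3f} s")
```

Since "user" is well under one second in both runs, almost all of the "real" time is spent outside the client process, so a single run per variant only gives a rough comparison.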