From: zimoun
Date: Mon, 9 Mar 2020 19:19:27 +0100
Subject: Re: [PATCH v2 2/3] gnu: Generate Xapian package search index.
To: Arun Isaac
Cc: Ludovic Courtès, Pierre Neidhardt, 39258@debbugs.gnu.org

On Sat, 7 Mar 2020 at 14:31, Arun Isaac wrote:

> diff --git a/gnu/packages.scm b/gnu/packages.scm
> index d22c992bb1..c8e221de68 100644
> --- a/gnu/packages.scm
> +++ b/gnu/packages.scm

[...]

> @@ -426,6 +429,43 @@ reducing the memory footprint."
>                                #:opts '(#:to-file? #t)))))
>    cache-file)
>
> +(define %package-search-index
> +  ;; Location of the package search-index
> +  "/lib/guix/package-search.index")
> +
> +(define (generate-package-search-index directory)
> +  "Generate under DIRECTORY a Xapian index of all the available packages."
> +  (define db-path
> +    (string-append directory %package-search-index))
> +
> +  (mkdir-p (dirname db-path))
> +  (call-with-writable-database db-path
> +    (lambda (db)
> +      (fold-packages (lambda (package _)
> +                       (let* ((idterm (string-append "Q" (package-name package)))
> +                              (doc (make-document #:data (string-trim-right
> +                                                          (call-with-output-string
> +                                                            (cut package->recutils package <>))
> +                                                          #\newline)
> +                                                  #:terms `((,idterm . 0))))
> +                              (term-generator (make-term-generator #:stem (make-stem "en")
> +                                                                   #:document doc)))
> +                         (for-each (match-lambda
> +                                     ((field . weight)
> +                                      (match (field package)
> +                                        ((? string? str)
> +                                         (index-text! term-generator str
> +                                                      #:wdf-increment weight))
> +                                        ((lst ...)
> +                                         (for-each (cut index-text! term-generator <>
> +                                                        #:wdf-increment weight)
> +                                                   lst)))
> +                                      (replace-document! db idterm doc)))
> +                                   %package-metrics)))
> +                     #f)))
> +
> +  db-path)

If I understand correctly, each field is indexed with a weight coming
from '%package-metrics', right?

I am not convinced this is the right approach, but I have not tried it
myself yet. :-)
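
To make the question concrete: as far as I understand, '#:wdf-increment'
adds the given weight to a term's within-document frequency (wdf) each
time the term occurs in the indexed text. Below is a minimal sketch of
that accumulation in plain Guile, without Xapian. The names
'example-package', 'example-metrics', 'tokenize' and 'term-wdf' are made
up for illustration, and the weights are not the actual values of
'%package-metrics'.

```scheme
;; Sketch only: no Xapian involved.  It mimics how per-field weights
;; accumulate into a per-term within-document frequency (wdf).
(use-modules (ice-9 match))

;; Hypothetical stand-in for a package: an alist of field name -> text.
(define example-package
  '((name . "xapian")
    (synopsis . "search engine library")
    (description . "xapian is a search engine library")))

;; Hypothetical weights, playing the role of %package-metrics.
(define example-metrics
  '((name . 4)
    (synopsis . 3)
    (description . 2)))

(define (tokenize text)
  ;; Naive whitespace tokenizer; a real term generator would also stem.
  (filter (lambda (s) (not (string-null? s)))
          (string-split text #\space)))

(define (term-wdf package metrics)
  "Return an alist mapping each term to its accumulated wdf, sorted by term."
  (let ((table (make-hash-table)))
    (for-each (match-lambda
                ((field . weight)
                 ;; Every occurrence of a term adds WEIGHT to its wdf.
                 (for-each (lambda (term)
                             (hash-set! table term
                                        (+ weight (hash-ref table term 0))))
                           (tokenize (assq-ref package field)))))
              metrics)
    (sort (hash-map->list cons table)
          (lambda (a b) (string<? (car a) (car b))))))

(define wdf (term-wdf example-package example-metrics))
```

With these made-up weights, "xapian" ends up with a wdf of 6 (4 from the
name plus 2 from the description), "search" with 5 (3 from the synopsis
plus 2 from the description), and "is" with 2. So the field weights are
baked into the index at generation time, and changing them means
regenerating the index.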