From: Arun Isaac
To: guix-patches@gnu.org, Ricardo Wurmus
Cc: Arun Isaac
Subject: [PATCH 0/7] mumi: Boolean prefixes in xapian indexing and others
Date: Thu, 29 Dec 2022 20:18:09 +0000
Message-Id: <20221229201809.27997-1-arunisaac@systemreboot.net>

Hi Ricardo,

This is a patchset that has been sleeping for some time in my local
git repo. So, I thought it was about time to send it over!

The main change is that some xapian prefixes should be indexed as
boolean prefixes. This makes the use of an implicit AND operator
unnecessary and lets xapian do the natural thing of ordering results
by relevance. I believe this improves the search significantly. Also,
since we retrieve search results by relevance, we can offload limiting
of search results to xapian. Thus, we improve performance as well.

For this patchset to be useful, mumi's xapian index will have to be
rebuilt. In general, it is good to periodically rebuild the xapian
index from scratch.

Regards,
Arun

Arun Isaac (7):
  xapian: Index several terms as boolean and without positions.
  xapian: Declare some prefixes as boolean.
  xapian: Do not override the default OR implicit query operator.
  messages: Remove unused set intersection feature in search-bugs.
  messages: Offload limiting search results to xapian.
  cache: Specify that cache! returns the cached value.
  xapian: Preserve order of search results.

 mumi/cache.scm    |   3 +-
 mumi/messages.scm |  29 ++++--------
 mumi/xapian.scm   | 109 +++++++++++++++++++++++++++++++---------------
 3 files changed, 86 insertions(+), 55 deletions(-)

-- 
2.38.1
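
As a rough illustration of the reasoning in the cover letter, here is
a toy sketch in Python. It is not mumi's code and does not use
Xapian's API; the corpus, the `is:`/`severity:` filter names, the
scoring rule, and the `search` function are all invented for the
example. It only shows the general idea: boolean terms include or
exclude documents without affecting rank, free-text words are combined
with OR and ranked by a score, and the result limit is applied to the
ranked list.

```python
import heapq

# Toy corpus: each document has boolean filter terms (exact-match metadata)
# and free-text terms. All names and data here are hypothetical.
DOCS = {
    1: {"filters": {"is:open", "severity:normal"},
        "text": ["guix", "build", "fails", "build"]},
    2: {"filters": {"is:done"}, "text": ["guix", "build", "fails"]},
    3: {"filters": {"is:open"}, "text": ["emacs", "crash"]},
    4: {"filters": {"is:open"}, "text": ["guix", "pull", "slow"]},
}


def search(filters, words, limit):
    """Return up to `limit` document ids, best match first."""
    scored = []
    for doc_id, doc in DOCS.items():
        # Boolean terms only include or exclude a document; they add no weight.
        if not set(filters) <= doc["filters"]:
            continue
        # Free-text words are OR-ed: any match counts, more matches rank higher.
        # (A simple occurrence count stands in for a real relevance weight.)
        score = sum(doc["text"].count(word) for word in words)
        if score > 0:
            scored.append((score, doc_id))
    # The limit is applied to the ranked list, so only the top hits are kept.
    # Ties are broken by the lower document id.
    top = heapq.nlargest(limit, scored, key=lambda pair: (pair[0], -pair[1]))
    return [doc_id for _, doc_id in top]


if __name__ == "__main__":
    print(search({"is:open"}, ["guix", "build"], limit=2))
```

In this toy model, document 3 is open but matches neither word, so it
is dropped, while document 4 matches only one word and still appears,
ranked below document 1. That is the behaviour an implicit OR with
relevance ordering is meant to give.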