Home
- First release to Maven Central https://search.maven.org/search?q=querqy
- Allow configurations (rules.txt, ...) to be gzipped (#77)
- Preparations for Maven Central repository:
- New FieldAwareWhiteSpaceQuerqyParser, which can handle field names in query strings
- Bugfix: Purely negative queries not boosted (#69)
Porting the changes from Querqy 4.5.lucene810.1 and 4.5.lucene810.0 (Solr 8.1) to Solr 8.0
- Bugfix: adding more than one boost parameter caused an exception. (Fixed for Solr/Lucene 8.1.0 and later, but not yet for 8.0.0.)
- Added a replace rewriter that allows replacing input terms (#64)
- Rule selection now supports 'levels' when sorting/limiting rules (#53)
This release fixes a bug that could lead to no results for multi-term queries with 'minimum should match' set to 100%. (#60)
Querqy 4.4.lucene810.0 is compatible with Solr/Lucene 8.1.0 and 8.1.1.
If you are migrating from a Solr/Lucene version < 8.0.0, you will have to re-test your search result orders. See the release note for Querqy 4.4.lucene800.0 for details.
- #55: This Querqy version no longer includes com.jayway.jsonpath:json-path as part of the jar-with-dependency artefact as it is now provided by Solr.
Querqy 4.4.lucene800.0 is compatible with Solr/Lucene 8.0.0 only.
If you migrate from an earlier Solr/Lucene version, you will have to re-test your search result orders and possibly have to adjust boost factors in Querqy's common rules.
This is due to two changes in Lucene:
By default, Lucene uses the BM25 similarity for scoring. As of version 8.0, Lucene now fully implements term index statistics, such as 'average field length', and uses them in BM25. That means that even if you were not using Querqy, you would have to retest your search result orders when migrating to Lucene/Solr 8.
For example, the following change can be observed for the Edismax query parser:
Indexed documents (using WhitespaceTokenizer):
doc 1 { f1: "a" }
doc 2 { f1: "a" }
doc 3 { f2: "a" }
doc 4 { f3: "k a" }
doc 5 { f1: "a" }
doc 6 { f2: "y" }
query: q=a
qf=f1 f2 f3
defType=edismax
Search result
- prior to Solr 8:
doc 3, doc 1, doc 2, doc 5, doc 4
- from Solr 8:
doc 3, doc 4, doc 1, doc 2, doc 5
The Lucene/Solr developers decided to no longer allow negative scores. This was a prerequisite for speeding up queries that don't need document counts (such as result counts or facets). All out-of-the-box scorers are now guaranteed to return a non-negative value, and using negative boosts in custom scorers has become impossible by design.
As a consequence, we had to change the implementation of UP/DOWN boosts in the Common Rules query rewriter. While the implementation prior to Solr/Lucene 8 just multiplied the positive or negative boost with the boost query score, the new implementation looks like this:
    if (boost < 0):
        score = abs(boost) * (1f - (1f - 1f/(boost query score + 1f)))
    else:
        score = abs(boost) * (1f - (1f/(boost query score + 1f)))
where boost is the positive or negative boost factor and boost query score is the similarity score (default: BM25) of the boost query.
For negative boosts, this will add a positive score to documents that don't match the boost query.
Given the same boost factor, the boosting effect will be weaker in the new Common Rules UP/DOWN implementation.
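The formula can be sketched in Java to see how it behaves for a few boost query scores. This is a minimal illustration and not Querqy's actual code: the class name `UpDownBoostSketch`, the method `score` and the sample values are hypothetical, and the boost query score is assumed to be non-negative and 0 for documents that do not match the boost query.

```java
public final class UpDownBoostSketch {

    /**
     * Hypothetical helper mirroring the formula in the release note.
     *
     * @param boost           positive (UP) or negative (DOWN) boost factor
     * @param boostQueryScore similarity score of the boost query,
     *                        assumed >= 0 and 0 when the document does not match
     */
    public static float score(float boost, float boostQueryScore) {
        // Lies in (0, 1]; equals 1 when the boost query does not match.
        float inverse = 1f / (boostQueryScore + 1f);
        if (boost < 0) {
            // 1 - (1 - inverse) simplifies to inverse.
            return Math.abs(boost) * inverse;
        }
        return Math.abs(boost) * (1f - inverse);
    }

    public static void main(String[] args) {
        float[] boostQueryScores = {0f, 1f, 9f};
        for (float s : boostQueryScores) {
            System.out.println("boost query score " + s
                    + ": UP(10) -> " + score(10f, s)
                    + ", DOWN(10) -> " + score(-10f, s));
        }
    }
}
```

In this sketch the result always stays between 0 and abs(boost). A document that does not match the boost query receives 0 from an UP rule and the full abs(boost) from a DOWN rule, and a DOWN contribution shrinks towards 0 as the boost query score grows.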
The new implementation guarantees the following:
Given the rules

    a =>
      DOWN(factor f1): x
      UP(factor f2 = f1): y

the two documents

    doc 1: "a x y"
    doc 2: "a k l"

and

    query=a

then the scores of doc 1 and doc 2 will be the same: UP and DOWN cancel each
other out in doc 1, whereas for doc 2 these rules do not apply because the
document contains neither x nor y.
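The guarantee can be checked numerically against the UP/DOWN formula given earlier. The sketch below is illustrative only and rests on assumptions not stated in the release note: the UP and DOWN contributions are simply added up, the boost queries for x and y produce the same similarity score in doc 1, and a non-matching boost query scores 0. The class name, method names and sample values are hypothetical.

```java
public final class UpDownGuaranteeSketch {

    /** Boost formula from the release note; boostQueryScore is assumed >= 0. */
    public static float boostScore(float boost, float boostQueryScore) {
        float inverse = 1f / (boostQueryScore + 1f);
        return boost < 0
                ? Math.abs(boost) * inverse          // DOWN
                : Math.abs(boost) * (1f - inverse);  // UP
    }

    /**
     * Combined contribution of DOWN(f): x and UP(f): y for one document,
     * assuming the two contributions are summed.
     */
    public static float totalBoost(float f, float scoreX, float scoreY) {
        return boostScore(-f, scoreX) + boostScore(f, scoreY);
    }

    public static void main(String[] args) {
        float f = 5f;    // shared boost factor (f1 = f2), hypothetical value
        float s = 0.7f;  // assumed equal score of x and y in doc 1, hypothetical value

        float doc1 = totalBoost(f, s, s);    // doc 1 contains x and y
        float doc2 = totalBoost(f, 0f, 0f);  // doc 2 contains neither

        System.out.println("doc 1 boost contribution: " + doc1);
        System.out.println("doc 2 boost contribution: " + doc2);
    }
}
```

Under these assumptions both documents receive a combined contribution of f, up to floating-point rounding, so their relative order is decided by the score of the main query a alone. If x and y scored differently in doc 1, the two contributions would no longer sum to exactly f.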