SOLR Index-Time Boost Facts
I had a “string” type field in my SOLR index that I wanted to boost at index time. For this field, my schema.xml looked like:
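A definition along these lines, reconstructed from the corrected version later in this post, so treat the exact attributes as an assumption:

```xml
<!-- Assumed starting point: the same field as the fixed definition, minus omitNorms -->
<field name="myBoostedField" type="string" indexed="true" stored="true" multiValued="true"/>
```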
My documents had the field tags with a boost defined, eg:
<field name="myBoostedField" boost="7.0">value</field>
Unfortunately, the boost had no effect. I tried everything I could think of to track down the problem before I finally read the description of the omitNorms attribute.
omitNorms was effectively true for this field: myBoostedField is of type “string”, and the fieldType definition of “string” sets omitNorms="true" by default.
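For reference, the “string” type in the example schema that ships with Solr is declared roughly as follows. This is a sketch of the stock definition; your own schema.xml may differ, so check it rather than relying on this.

```xml
<!-- Stock-style definition of the "string" type: norms are omitted for every
     field of this type unless the field itself overrides omitNorms -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
```

An omitNorms attribute on an individual field overrides the value inherited from its fieldType, which is why the fix can be made on the field alone.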
Boosting finally worked for me when I changed my field definition to:
<field name="myBoostedField" type="string" indexed="true" stored="true" multiValued="true" omitNorms="false"/>
Fact 1: Norms must be retained for a field (omitNorms="false") for an index-time boost on it to take effect.
Now, as you can see in the definition of myBoostedField, it is a multiValued field. So every document could have more than one value for myBoostedField and, interestingly, each value could carry a different boost, eg:
<field name="myBoostedField" boost="7.0">value1</field>
<field name="myBoostedField" boost="8.0">value2</field>
<field name="myBoostedField" boost="4.0">value3</field>
Now, Lucene does not internally store a boost for each value of a field; in fact, it stores a single one per field per document. So in this situation, a consolidated boost is calculated for myBoostedField for each document and stored accordingly. As a result, the individual values 7, 8 and 4 are lost.
Fact 2: An index-time boost on a value of a multiValued field applies to all values of that field, not to individual values.
I eventually worked around this problem as well, since I wanted to retain a separate boost for each value. I had to create a new field for each value, so I defined a dynamicField in my schema.xml.
eg:
<dynamicField name="myBoostedField*" type="string" indexed="true" stored="true" omitNorms="false"/>
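With that pattern in place, a document can carry each value in its own field, so each value keeps its own boost. A minimal sketch of such an update message, where the “id” field and its value are hypothetical and stand in for whatever unique key your schema uses:

```xml
<add>
  <doc>
    <!-- "id" is a hypothetical unique key field, not taken from the original setup -->
    <field name="id">doc-1</field>
    <!-- One field per value: each name matches the myBoostedField* pattern
         and carries its own index-time boost -->
    <field name="myBoostedField1" boost="7.0">value1</field>
    <field name="myBoostedField2" boost="8.0">value2</field>
    <field name="myBoostedField3" boost="4.0">value3</field>
  </doc>
</add>
```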
Then I changed my solrconfig.xml to add a qf (query fields) parameter for my input query, and it works like a charm. My qf is: myBoostedField1^1.0 myBoostedField2^1.0 myBoostedField3^1.0
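In solrconfig.xml, that qf would typically sit in the defaults of a dismax request handler. The following is only a sketch: the handler name “/boosted” is hypothetical, and the handler class depends on your Solr version (releases from around the time of this post used solr.DisMaxRequestHandler instead of solr.SearchHandler with defType set to dismax).

```xml
<!-- Hypothetical handler name; adjust the class to match your Solr version -->
<requestHandler name="/boosted" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the dismax query parser so that qf is honoured -->
    <str name="defType">dismax</str>
    <!-- Search all three per-value fields with equal query-time weight;
         the index-time boosts stored in the norms do the differentiating -->
    <str name="qf">myBoostedField1^1.0 myBoostedField2^1.0 myBoostedField3^1.0</str>
  </lst>
</requestHandler>
```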
About this entry
You’re currently reading “SOLR Index-Time Boost Facts,” an entry on /kapil/blog
- Published:
- 1.20.08 / 7am
- Tags:
- lucene, performance, solr