The scoring feature prioritizes and sorts search results by their relevance to the search query. The scoring formula combines several factors. The formula used to calculate the score value is shown below.
Score for document d = ∑ (over each query term t) tf(t in d) × idf(t) × boost(t.field in d) × lengthNorm(t.field in d)
The table below describes each function and how it is calculated.
Function | Description |
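tf(t in d) = sqrt(freq) | Term frequency factor for the term (t) in the document (d), where freq is the number of times the term occurs in the document. The more often a term occurs in a document, the higher that document's score. |
idf(t) = log(numDocs/(docFreq+1)) + 1 | Inverse document frequency of the term, where numDocs is the total number of documents in the index and docFreq is the number of documents that contain the term. Terms that appear in fewer documents contribute more to the score. |
boost(t.field in d) | Field boost, as set during indexing. Boosting gives higher priority to a term or field. This is useful in similarity search for giving higher priority to the most important fields. |
lengthNorm(t.field in d) = 1/sqrt(numTerms) | Normalization value of a field, where numTerms is the number of terms in the field. A match in a shorter field therefore scores higher. |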
Search results are ranked by their score value: documents with higher scores rank higher, and documents with lower scores rank lower.
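To make the formula concrete, here is a minimal Python sketch of the four functions and the resulting ranking. It is an illustration only, not the search engine's actual implementation. It assumes that each document is a single field represented as a list of terms, that log is the natural logarithm, and that one boost value (default 1.0) applies to the whole field. The names score, rank, and corpus are hypothetical.

```python
import math
from collections import Counter

def tf(freq):
    # Term frequency factor: sqrt of the term's occurrence count.
    return math.sqrt(freq)

def idf(num_docs, doc_freq):
    # Inverse document frequency (natural log assumed here).
    return math.log(num_docs / (doc_freq + 1)) + 1

def length_norm(num_terms):
    # Shorter fields get a larger normalization value.
    return 1 / math.sqrt(num_terms)

def score(query_terms, doc_terms, corpus, boost=1.0):
    # Sum the per-term products over every query term found in the document.
    counts = Counter(doc_terms)
    total = 0.0
    for term in query_terms:
        freq = counts[term]
        if freq == 0:
            continue  # a term absent from the document contributes nothing
        doc_freq = sum(1 for d in corpus if term in d)
        total += (tf(freq) * idf(len(corpus), doc_freq)
                  * boost * length_norm(len(doc_terms)))
    return total

def rank(query_terms, corpus):
    # Return document indexes ordered from highest to lowest score.
    scores = [score(query_terms, doc, corpus) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: -scores[i])

corpus = [
    ["apple", "banana"],
    ["apple", "apple", "apple", "cherry"],
    ["banana", "cherry"],
]
print(rank(["apple"], corpus))  # [1, 0, 2]
```

In this toy corpus the second document ranks first for the query "apple": its higher term frequency (sqrt(3)) outweighs the smaller length normalization of its longer field, while the third document, which does not contain the term, scores 0.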