Diffstat (limited to 'content/posts/book-review-relevant-search.md')
-rw-r--r-- | content/posts/book-review-relevant-search.md | 74 |
1 files changed, 74 insertions, 0 deletions
diff --git a/content/posts/book-review-relevant-search.md b/content/posts/book-review-relevant-search.md
new file mode 100644
index 0000000..6241ed6
--- /dev/null
+++ b/content/posts/book-review-relevant-search.md
@@ -0,0 +1,74 @@
+---
+categories: [blog, "book review"]
+title: "[Book review] Relevant Search"
+date: 2021-05-06T16:35:08+07:00
+tags: [book, review, search, programming, algorithm]
+---
+
+So I decided to review books as I write. As people say, you understand
+things better when you share them with others.
+
+Each review will contain:
+
+- metadata: book name, author(s), ISBN, genres, language (please tell me if
+  there is any other helpful information)
+- summary: a wrap-up of the book's content; it should be no more than 5
+  subsections of 150 words each
+- comments: my thoughts on the book -- what I like and don't like about it
+
+# Metadata
+
+| Book    | Relevant Search: With applications for Solr and Elasticsearch |
+|---------|------------------------------|
+| Authors | Doug Turnbull, John Berryman |
+| ISBN    | 9781617292774                |
+| Genres  | Programming                  |
+| Language| English                      |
+
+# Summary
+## The search relevance problem
+
+Given an increasingly large amount of information, it is hard for users to
+retrieve what they need. Relevance scoring is therefore essential for
+search engines.
+
+In general, relevance engineers have to identify the most important
+features describing the content, the user, or the search query, transfer those
+features to the search engine, then measure what's relevant to the search by
+crafting signals, and finally balance the weights of the signals to rank the
+results.
+
+Unfortunately, it is a challenging problem. Each search application
+serves a different type of content and thus has different expectations for
+relevance. Consequently, there is no silver bullet to solve this problem.
+Even information retrieval, the academic field that thoroughly studies this
+problem, offers no one-size-fits-all solution. Relevance is strongly tied to
+the domain and the application's purpose.
+
+## Tackling the problem
+
+The book approaches the problem first by a top-down analysis of how a typical
+search engine works. It then shows how a search query is processed by the
+search engine. After providing basic knowledge of how search works, the authors
+give some examples of relevance score tuning and show how it helps improve
+the relevance of the search results. Not stopping at the technical view, the
+authors also approach the problem from a business view: they note that
+interdisciplinary collaboration is important in order to define and increase
+relevance.
+
+# Comments
+
+## What I like
+
+The book approaches the problem from various views: a business view, an
+algorithmic view, and a practical view (with examples). The book accentuates
+the diversity of problems and thereby encourages readers to think critically
+about their own problems. While it suggests that search results should be
+influenced by sponsors, it also notes that without balance this leads to failure.
+
+## What I don't like
+
+Its structure and flow are somewhat unclear to me. I think some chapters could
+be re-ordered to make it more logical. Also, I find weighing sponsors' priorities
+over customers' unethical, but that is probably just a harsh truth of this
+society rather than the authors' view.