Diffstat (limited to 'content/posts/book-review-relevant-search.md')
-rw-r--r-- | content/posts/book-review-relevant-search.md | 74 |
1 files changed, 0 insertions, 74 deletions
diff --git a/content/posts/book-review-relevant-search.md b/content/posts/book-review-relevant-search.md
deleted file mode 100644
index 6241ed6..0000000
--- a/content/posts/book-review-relevant-search.md
+++ /dev/null
@@ -1,74 +0,0 @@
----
-categories: [blog, "book review"]
-title: "[Book review] Relevant Search"
-date: 2021-05-06T16:35:08+07:00
-tags: [book, review, search, programming, algorithm]
----
-
-So I decided to review books as I write. As people say, you would understand
-things better when you share it with each other.
-
-Each review will contain:
-
-- metadata: book name, author(s), ISBN, genres, language (please tell if there
-  are some more helpful information)
-- summary: wrap up the content of the book; it should not no more than 5
-  subsections of 150 words each
-- comments: my thoughts on the book -- what I like and don't like about it
-
-# Metadata
-
-| Book | Relevant Search: With applications for Solr and Elasticsearch |
-|---------|------------------------------|
-| Authors | Doug Turnbull, John Berryman |
-| ISBN | 9781617292774 |
-| Genres | Programming |
-| Language| English |
-
-# Summary
-## The search relevance problem
-
-Given an increasingly large amount of information, it is infeasible for users
-to retrieve what they needed. Relevance scoring is therefore essential for
-search engines.
-
-In general, the relevance engineers have to identify the most important
-features describing the content, the user, or the search query, transfer those
-features to the search engine, then measure what's relevant to the search by
-crafting signals and finally balance the weights of the signals to rank the
-results.
-
-Unfortunately, it is a challenging problem. Each search application
-serves a different type of content and thus has different expectation for
-relevance. Consequently, there is no silver bullet to solve this problem.
-Even the academic field that thoroughly study this problem, information
-retrieval is not a one-size-fit all solution. Relevance is strongly tied with
-the field and the application purpose.
-
-## Tackling the problem
-
-The book approaches the problem first by a top-down analysis of how a typical
-search engine works. It then shows how a search query is processed by the
-search engine. After providing basic knowledge of how search work, the authors
-give some examples of relevance score tuning and show how it helps improving
-the relevance of the search results. Not stopping at the technical view, the
-authors also approach the problem from business view: they note that
-interdiscipline collaboration is important in order to define and increase
-relevance.
-
-# Comments
-
-## What I like
-
-The book approaches the problem from various views: business view, algorithmic
-view, and practical view (giving examples). The book accentuates the diversity
-of problems and thereby encouraging readers to critically think of their own
-problems. While it suggests that search results should be influenced by
-sponsors, it also notes that without balance that will as well lead to failure.
-
-## What I don't like
-
-Its structure is somewhat unclear and flow to me. I think some chapters can be
-re-ordered so it's more logical. Also, I find weighing sponsors' priorities
-over customers' unethical, but that is probably just a harsh truth in this
-society rather than the authors' view.