Diffstat (limited to 'content/posts/book-review-relevant-search.md')
-rw-r--r-- content/posts/book-review-relevant-search.md | 74
1 files changed, 74 insertions, 0 deletions
diff --git a/content/posts/book-review-relevant-search.md b/content/posts/book-review-relevant-search.md
new file mode 100644
index 0000000..6241ed6
--- /dev/null
+++ b/content/posts/book-review-relevant-search.md
@@ -0,0 +1,74 @@
+---
+categories: [blog, "book review"]
+title: "[Book review] Relevant Search"
+date: 2021-05-06T16:35:08+07:00
+tags: [book, review, search, programming, algorithm]
+---
+
+So I decided to review books as I write. As people say, you understand
+things better when you share them with others.
+
+Each review will contain:
+
+- metadata: book name, author(s), ISBN, genres, language (please tell me if
+    any other information would be helpful)
+- summary: a wrap-up of the book's content; it should be no more than 5
+    subsections of 150 words each
+- comments: my thoughts on the book -- what I like and don't like about it
+
+# Metadata
+
+| Book | Relevant Search: With applications for Solr and Elasticsearch |
+|---------|------------------------------|
+| Authors | Doug Turnbull, John Berryman |
+| ISBN    | 9781617292774                |
+| Genres  | Programming                  |
+| Language| English                      |
+
+# Summary
+## The search relevance problem
+
+Given an increasingly large amount of information, it is infeasible for users
+to retrieve what they need on their own.  Relevance scoring is therefore
+essential for search engines.
+
+In general, relevance engineers have to identify the most important
+features describing the content, the user, or the search query, transfer those
+features to the search engine, then measure what's relevant to the search by
+crafting signals, and finally balance the weights of the signals to rank the
+results.
+
+Unfortunately, it is a challenging problem.  Each search application
+serves a different type of content and thus has different expectations for
+relevance.  Consequently, there is no silver bullet to solve this problem.
+Even information retrieval, the academic field that thoroughly studies this
+problem, offers no one-size-fits-all solution.  Relevance is strongly tied to
+the field and the application's purpose.
+
+## Tackling the problem
+
+The book approaches the problem first by a top-down analysis of how a typical
+search engine works.  It then shows how a search query is processed by the
+search engine.  After providing basic knowledge of how search works, the authors
+give some examples of relevance score tuning and show how it helps improve
+the relevance of the search results.  Not stopping at the technical view, the
+authors also approach the problem from a business view: they note that
+interdisciplinary collaboration is important in order to define and increase
+relevance.
+
+# Comments
+
+## What I like
+
+The book approaches the problem from various views: business view, algorithmic
+view, and practical view (giving examples). The book accentuates the diversity
+of problems, thereby encouraging readers to think critically about their own
+problems.  While it suggests that search results should be influenced by
+sponsors, it also notes that, without balance, this will lead to failure too.
+
+## What I don't like
+
+Its structure and flow are somewhat unclear to me.  I think some chapters could
+be re-ordered to make it more logical.  Also, I find weighing sponsors' priorities
+over customers' unethical, but that is probably just a harsh truth in this
+society rather than the authors' view.