From 6670e5a5b6d51d78f105430e416155e73c9a09a9 Mon Sep 17 00:00:00 2001 From: Ngô Ngọc Đức Huy Date: Thu, 6 May 2021 16:48:57 +0700 Subject: Add a book review --- .../2021-05-02-book-review-relevant-search.md | 62 ------------------ .../2021-05-06-book-review-relevant-search.md | 74 ++++++++++++++++++++++ 2 files changed, 74 insertions(+), 62 deletions(-) delete mode 100644 content/posts/2021-05-02-book-review-relevant-search.md create mode 100644 content/posts/2021-05-06-book-review-relevant-search.md diff --git a/content/posts/2021-05-02-book-review-relevant-search.md b/content/posts/2021-05-02-book-review-relevant-search.md deleted file mode 100644 index 0f4ef18..0000000 --- a/content/posts/2021-05-02-book-review-relevant-search.md +++ /dev/null @@ -1,62 +0,0 @@ ---- -categories: [blog, "book review"] -title: "[Book review] Relevant Search" -date: 2021-05-02T21:35:08+07:00 -tags: [book, review, search, programming, algorithm] -draft: true ---- - -So I decided to review books as I write. As people say, you would understand -things better when you share it with each other. - -Each review will contain: - -- metadata: book name, author(s), ISBN, genres, language (please tell if there - are some more helpful information) -- summary: wrap up the content of the book; it should not no more than 5 - subsections of 150 words each -- comments: my thoughts on the book -- what I like and don't like about it - -# Metadata - -| Book | Relevant Search: With applications for Solr and Elasticsearch | -|---------|------------------------------| -| Authors | Doug Turnbull, John Berryman | -| ISBN | 9781617292774 | -| Genres | Programming | -| Language| English | - -# Summary -## The search relevance problem - -Given an increasingly large amount of information, it is infeasible for users -to retrieve what they needed. Relevance scoring is therefore essential for -search engines. 
- -In general, the relevance engineers have to identify the most important -features describing the content, the user, or the search query, transfer those -features to the search engine, then measure what's relevant to the search by -crafting signals and finally balance the weights of the signals to rank the -results. - -Unfortunately, it is a challenging problem. Each search application -serves a different type of content and thus has different expectation for -relevance. Consequently, there is no silver bullet to solve this problem. -Even the academic field that thoroughly study this problem, information -retrieval is not a one-size-fit all solution. Relevance is strongly tied with -the field and the application purpose. - -## Tackling the problem - -The book approaches the problem first by a top-down analysis of how a typical -search engine works. - -## Taming token -## Multifield search -## Term-centric search -## Shaping relevance function -## Providing relevance feedback -## Designing a relevance-focused search application -## The relevance-centered enterprise -## Semantic and personalized search -# Comments diff --git a/content/posts/2021-05-06-book-review-relevant-search.md b/content/posts/2021-05-06-book-review-relevant-search.md new file mode 100644 index 0000000..6241ed6 --- /dev/null +++ b/content/posts/2021-05-06-book-review-relevant-search.md @@ -0,0 +1,74 @@ +--- +categories: [blog, "book review"] +title: "[Book review] Relevant Search" +date: 2021-05-06T16:35:08+07:00 +tags: [book, review, search, programming, algorithm] +--- + +So I decided to review books as I write. As people say, you understand +things better when you share them with others. 
+ +Each review will contain: + +- metadata: book name, author(s), ISBN, genres, language (please tell me if there + is any other helpful information) +- summary: wrap up the content of the book; it should be no more than 5 + subsections of 150 words each +- comments: my thoughts on the book -- what I like and don't like about it + +# Metadata + +| Book | Relevant Search: With applications for Solr and Elasticsearch | +|---------|------------------------------| +| Authors | Doug Turnbull, John Berryman | +| ISBN | 9781617292774 | +| Genres | Programming | +| Language| English | + +# Summary +## The search relevance problem + +Given an increasingly large amount of information, it is infeasible for users +to retrieve what they need. Relevance scoring is therefore essential for +search engines. + +In general, relevance engineers have to identify the most important +features describing the content, the user, or the search query, transfer those +features to the search engine, then measure what's relevant to the search by +crafting signals and finally balance the weights of the signals to rank the +results. + +Unfortunately, it is a challenging problem. Each search application +serves a different type of content and thus has different expectations for +relevance. Consequently, there is no silver bullet to solve this problem. +Even the academic field that thoroughly studies this problem, information +retrieval, offers no one-size-fits-all solution. Relevance is strongly tied to +the field and the application purpose. + +## Tackling the problem + +The book approaches the problem first by a top-down analysis of how a typical +search engine works. It then shows how a search query is processed by the +search engine. After providing basic knowledge of how search works, the authors +give some examples of relevance score tuning and show how it helps improve +the relevance of the search results. 
Not stopping at the technical view, the +authors also approach the problem from a business view: they note that +interdisciplinary collaboration is important in order to define and increase +relevance. + +# Comments + +## What I like + +The book approaches the problem from various views: business view, algorithmic +view, and practical view (giving examples). The book accentuates the diversity +of problems, thereby encouraging readers to think critically about their own +problems. While it suggests that search results should be influenced by +sponsors, it also notes that, without balance, this will also lead to failure. + +## What I don't like + +Its structure and flow are somewhat unclear to me. I think some chapters could be +re-ordered so it's more logical. Also, I find weighing sponsors' priorities +over customers' unethical, but that is probably just a harsh truth in this +society rather than the authors' view. -- cgit 1.4.1