author | Nguyễn Gia Phong <mcsinyx@disroot.org> | 2022-01-09 21:47:22 +0700 |
---|---|---|
committer | Nguyễn Gia Phong <mcsinyx@disroot.org> | 2022-01-09 21:47:22 +0700 |
commit | 9fd639eff5e47e8e15776f1974f0fcb9337b12f6 (patch) | |
tree | 908b0f776a2b6cccf6ff8ef3a6643d7a7d3cc39f /blog | |
parent | cb35d1b5811aac349fd4d09bc3c0d666bd7ebeae (diff) | |
download | site-9fd639eff5e47e8e15776f1974f0fcb9337b12f6.tar.gz |
Officially introduce commenting
Diffstat (limited to 'blog')
-rw-r--r-- | blog/butter.md | 2 |
-rw-r--r-- | blog/reply.md | 373 |
-rw-r--r-- | blog/teredo.md | 2 |
-rw-r--r-- | blog/threa.md | 2 |
4 files changed, 376 insertions, 3 deletions
diff --git a/blog/butter.md b/blog/butter.md
index ec1af10..1009273 100644
--- a/blog/butter.md
+++ b/blog/butter.md
@@ -2,7 +2,7 @@
 rss = "How I reinstalled NixOS on Btrfs with an amnesiac root and backed up my data"
 date = Date(2021, 11, 14)
-tags = ["backup", "btrfs", "fun", "nixos"]
+tags = ["backup", "btrfs", "fun", "nixos", "recipe"]
 +++
 
 # NixOS on Btrfs+tmpfs
 
diff --git a/blog/reply.md b/blog/reply.md
new file mode 100644
index 0000000..609ab97
--- /dev/null
+++ b/blog/reply.md
@@ -0,0 +1,373 @@
++++
+rss = "Comments for Static Sites without JavaScript via Emails"
+date = Date(2022, 1, 9)
+tags = ["fun", "recipe"]
++++
+
+# Comments for Static Sites without JavaScript
+
+> I'm open for criticism\
+> But really, is it any room for criticism?
+
+Recently, I've switched my [feed] reader from [Newsboat] to [Liferea].
+The latter has a GUI and some extra features which make the experience
+a lot more comfy. For instance, custom enclosure handling lets me
+finally migrate all of my YouTube subscriptions to [Atom] and *conveniently*
+browse and watch videos using [mpv]. Image support also allows me
+to directly view web comics.[^image] One of them, [The Monster Under
+the Bed][TMUTB],[^nsfw] does not embed the strips in its feed, but it
+has comments.
+
+Yes, [RSS] includes support for `<comments>`, and I was not aware of it
+until [very recently][spark]. I suppose many other people late to
+the (web feed) party aren't either. Since the rise of static sites,
+feeds have regained popularity, enough for [Google to reconsider
+its direction][android]. Compared to RSS or Atom, the alternatives have
+the following shortcomings:
+
+* [Usenet] is generally obsolete to most people.
+* [Mailing list] messages are immutable.
+* Fora and social media are silos.[^silo]
+* Social media are designed for ephemeral discussions.
+* Instant messaging is awful for archival.
+
+On the other hand, news feeds are commonly read-only: only a few readers
+can render comments and even fewer are able to post one. On the server side,
+a dynamic server is needed to accept comments. Traditionally, it's the same
+as the system serving the website. Although this works, it is significantly
+more costly than a server dedicated to static sites, which scale a lot better.
+
+[Hackers] have come up with multiple workarounds such as using [microblogging]
+or [instant messaging][cactus] to add comments to their static sites,
+but all require client-side code execution, which is an option for neither RSS
+nor Atom. Furthermore, [JavaScript hurts portability and performance][curlpit]
+on the WWW, hence it should be avoided unless it is absolutely impossible
+to implement a feature otherwise. Commenting is not an exception.
+
+What follows is my adventure implementing a comment section for this very blog.
+If you're also up to the task, I think you should view what I did
+as an inspiration (rather than a reference) and not be afraid
+to experiment until you are satisfied.
+
+\toc
+
+## Choosing Back-End
+
+As mentioned earlier, static site or not, there still needs to be
+a dynamic component to accept incoming replies. HTTP requests would be
+the most portable since all netizens obviously have a web browser, but those
+are what we're trying to replace here. What else does everyone have nowadays?
+Something so common that it can be used to identify people upon
+service registrations? Exactly, emails and phone numbers!
+
+OK, Imma stop horsing around. My back-end of choice would be emails.
+It's global, it's cheap and federated. Cellular services almost fit the bill,
+except that it would cost an arm and a leg for one to comment around the web
+every day via SMS, whose character limit does not facilitate thoughtful
+discussions either. As for fora, social media or instant messaging,
+no platform has nearly as large a user base as electronic mail.
+
+![HTML is often a trojan horse for JavaScript](/assets/html5-js.png)
+
+It's not like any email would fit the comment section though. Especially
+not the HTML kind with a few hundred kilobytes of embedded CSS, JS
+and non-content images. From the security standpoint alone 'tis already
+a no-go. A light markup language like Markdown[^mime] would be much better.
+
+One great thing about using a mature technology like email is that we have
+all use cases covered. Filtering, exporting and parsing emails work out of the box
+regardless of one's provider, [MUA] and programming preferences. I have
+a SourceHut account with which I can create mailing lists on demand
+so I'm using it; however there's no reason exporting from your private inbox
+is any more difficult, presuming you have set up [offline email].
+
+!!! note "Tips and tricks"
+
+    Speaking of SourceHut, exporting a mailing list archive is rather easy:
+    one could either use the button on the web UI or download from the API.
+    As the operation is not exactly cost-free, the former is protected
+    by a [CSRF] token and the latter by [OAuth 2.0]. If you are a fellow
+    [sr.ht] user, you can use [acurl] on the build service with the URL
+    from the [GraphQL] `query { me { lists { results { name, archive } } } }`.
+
+## Designing Data Flow
+
+I promise, this sounds bigger than it really is, but first,
+let's have a glance at how static generators work. Typically,
+there are three points at which templating happens:
+
+1. Converting individual articles into HTML *content*
+2. Inserting each article's content into a page template
+   to create a complete HTML document
+3. Inserting multiple HTML contents into one RSS or Atom feed template
+
+At completion, two kinds of output are generated: website and web feed.
+Similarly, comments have to be rendered for both targets: an HTML
+comment section for web browsing and a separate RSS feed for each article's
+`<wfw:commentRss>`.[^wfw] Therefore, injections should be done separately
+at stages 2 and 3. The overall process of static site generation
+with email comments is illustrated as follows.
+
+![Data transformation during generation process](/assets/formbox.svg)
+
+For clarity, HTML and RSS input templates for comments and their parent page
+and web feed are omitted. The path to each *comment feed* output being injected
+into the respective *web feed item* is also not shown in the figure.
+
+## Implementation
+
+At the time of writing, this personal website of mine was generated
+by [Julia] [Franklin], which was neither fast[^speed] nor [semantic],
+but was the only one I knew supporting LaTeX prerendering out of the box.
+Franklin is also rather [extendable] via Julia functions.
+
+### Accepting Replies
+
+Let's start with how each article can be programmatically and uniquely
+identified. By default in RSS, a [GUID][^guid] is the permanent URL
+of the associated web page. I am not exactly a creative person, so I mirrored
+this idea, although I only used the part that differs between URLs, i.e. minus
+the scheme, network location and trailing `index.html` (Franklin always
+appends it to the target path of any source file that is neither `index.md`
+nor `index.html`):
+
+```julia
+dir_url() = strip(dirname(locvar(:fd_url)), '/')
+message_id() = "%3C$(dir_url())@cnx%3E"
+```
+
+For maximum portability, threading relies on the email's
+`In-Reply-To` header, which expects a message ID matching
+`<.+@.+>`. Once again, to avoid having to think, I opted for
+the path difference for the left hand side and my nickname `cnx`
+for the right. The `mailto` URI could then be constructed accordingly:
+
+```julia
+using Printf: @sprintf
+
+function hfun_mailto_comment()
+    @sprintf("mailto:%s?%s=%s&%s=Re: %s",
+             "~cnx/site@lists.sr.ht",
+             "In-Reply-To", message_id(),
+             "Subject", locvar(:title))
+end
+```
+
+The anchor was then added to the page footer:
+
+```html
+<a href="{{mailto_comment}}"
+   title="Reply via email">{{author}}</a>
+```
+
+### Rendering Comments
+
+This is when the fun begins. Julia's standard library does not include
+an email parser, and I doubt your favorite language does either,
+unless it is named after a British comedy troupe. Python is often described
+as *batteries included*, or at least it used to be (seemingly the consensus among
+current core devs has shifted towards [favoring third-party libraries][3rd]).
+
+!!! note "Off-topic rambling"
+
+    Standard library inclusion wasn't really the deal breaker here though.
+    I still needed a Markdown engine and an HTML sanitizer (because Markdown
+    can include HTML), and AFAICT no stdlib has them. The real issue was
+    with the lack of Julia packaging on most distributions (apart from Guix),
+    and most certainly [not on NixOS], my current distro. For the same reason
+    the idea of rewriting Franklin in Python has been running in my head
+    for a while now. Python packaging is much more downstream-friendly
+    and, unlike with Julia, compilation overhead is almost non-existent.
+
+On the other hand, it's trivial to pipe an external program's output to Julia,
+e.g. ``readchomp(`echo foo bar`)`` would give you the string "foo bar". Thus,
+the to-be-written *comment generator* should take (the path to) a mail box,
+the message ID of the article and a template, and write the result to stdout.
+Argument parsing is, again, thankfully in Python's stdlib:
+
+```python
+from argparse import ArgumentParser
+from pathlib import Path
+from urllib.parse import unquote
+
+parser = ArgumentParser()
+parser.add_argument('mbox')
+parser.add_argument('id', type=unquote)
+parser.add_argument('template', type=Path)
+args = parser.parse_args()
+```
+
+I then parsed the [mbox] into a mapping indexed by parent message IDs
+as follows. The IDs in the mbox are not percent-encoded, which is why I needed
+to URL-unquote the input message ID.
+
+```python
+from collections import defaultdict
+from email.utils import parsedate_to_datetime
+from mailbox import mbox
+
+date = lambda m: parsedate_to_datetime(m['Date']).date()
+archive = defaultdict(list)
+for message in sorted(mbox(args.mbox), key=date):
+    archive[message['In-Reply-To']].append(message)
+```
+
+As said earlier, arbitrary HTML content is not exactly suitable for comments.
+However, it is undeniable that HTML emails have taken over the world
+and compromises must be made: allowing `multipart/alternative` of both
+`text/plain` and `text/html`. It is not the only multipart type: messages with
+attachments and cryptographic signatures are multipart too. Since we are only interested
+in the plaintext part, it is actually easier done than said to extract it:
+
+```python
+from bleach import clean, linkify
+from markdown import markdown
+
+def get_body(message):
+    if message.is_multipart():
+        for payload in map(get_body, message.get_payload()):
+            if payload is not None: return payload
+    elif message.get_content_type() == 'text/plain':
+        body = message.get_payload(decode=True).decode()  # assumes UTF-8
+        return clean(linkify(markdown(body, output_format='html5')),
+                     tags=..., protocols=...)
+    return None
+```
+
+Now all that's left is to render that body and relevant headers
+as an HTML segment or an RSS item. This is when we revisit the template.
+Jinja is probably the most popular template engine in Python, thanks to Flask
+and its Django-like syntax, but its complexity is rather unnecessary. Instead, I went with the built-in
+`str.format`.
+
+![Double braces are brilliant, but I prefer single ones](/assets/format.jpg)
+
+What are templates for, exactly? Not the complete document, apparently,
+because that would differ from article to article and increase the complexity
+of injection. Nor a single comment, as comments are threaded into trees
+(or a forest) and their relationship can be useful. We gotta [meet
+in tha middle] and use recursive templates instead, e.g. for nested comments:
+
+```html
+<div class=comment>
+  ...
+  {children}
+</div>
+```
+
+To render linear comments, such as for `<wfw:commentRss>`, simply move
+the children out of the item as follows.
+
+```xml
+<item>
+  ...
+</item>
+{children}
+```
+
+The remaining substitutions are mostly just extracted from the email's headers.
+Another bit that needs some extra decisions, though, is the parameters
+for the `mailto` URI to reply to each comment:
+
+* `In-Reply-To` set to current `Message-Id`
+* `Cc` set to current `Reply-To` (if it exists) or `From`
+* `Subject` is inherited, with `Re:` prepended if missing
+
+This is getting boring with a lot of trivial code, so I'll leave you
+with a pointer to the completed script named [formbox] and move on
+to more interesting stuff.
+
+### Injecting Comments
+
+Inserting HTML comment sections is pretty simple. First I wrote a simple
+Julia function `render_comments` calling `formbox` under the hood, then
+
+```julia
+hfun_comments_rendered() = render_comments("comment.html")
+```
+
+`comments_rendered` is then injected below the article. For RSS,
+it took a few extra steps:
+
+1. Insert `render_comments("comment.xml")` into the comment feed template
+   `comments.xml` (notice they are two different templates) and write it
+   next to the article's output `index.html`
+2. Insert the path of the written comment feed into the `<wfw:commentRss>` tag
+   in the article's feed item
+
+That's it!
+
+## Moderation
+
+I don't want a *Terms of Service* page, it'd feel too corporate
+for my *personal* website, so I will list the rules here:
+
+1. Please be excellent to each other. Disagreements are okay,
+   personal insults are not.
+2. Stay on topic. If you want to publicly discuss something else
+   with me, start a new thread on a [mailing list]
+   or reach me via social media.
+3. [Use plaintext emails] and do not top post. Markdown inline markup,
+   block quotes, lists and code blocks are supported.
+4. Comments are implied to be under [CC BY-SA 4.0] unless declared otherwise.
+5. I reserve the right to remove any comment I don't like.
+   I generally don't delete comments, but if you want to exercise
+   your freedom of speech, publish it yourself.
+6. I do not warrant the availability of the comments either.
+   I will try my best but one day all comments may just disappear,
+   just like this website itself. Archive what you deem important.
+7. These rules are subject to change according to my personal liking
+   without notice.
+
+Replies will only be rendered on the website and feed after I see them,
+so please expect a delay of at least 24 hours. If you are eager to reply
+to each other, subscribe to the [site's mailing list] instead.
+
+[^image]: TBF there are image preview scripts in Newsboat's [contrib].
+[^nsfw]: Content warning: occasionally NSFW
+[^silo]: Federation is getting there for social media; not so much for fora.
+[^mime]: But don't use [text/markdown] for your emails.
+[^wfw]: Unfortunately there's no equivalent for Atom.
+[^speed]: Over 30 seconds to generate a few hundred kB of web pages.
+[^guid]: Not to be confused with the micro soft hijacked term for [UUID].
+
+[feed]: https://en.wikipedia.org/wiki/Web_feed
+[Newsboat]: https://newsboat.org
+[Liferea]: https://lzone.de/liferea
+[Atom]: https://en.wikipedia.org/wiki/Atom_(Web_standard)
+[mpv]: https://mpv.io
+[TMUTB]: https://themonsterunderthebed.net
+[RSS]: https://www.rssboard.org/rss-specification
+[spark]: https://nixnet.social/notice/AEO3fYbuzYCJl85eD2
+[android]: https://www.theregister.com/2021/05/20/google_rss_chrome_android
+[Mailing list]: https://en.wikipedia.org/wiki/Mailing_list
+[Usenet]: https://en.wikipedia.org/wiki/Usenet
+[Hackers]: https://en.wikipedia.org/wiki/Hacker
+[microblogging]: https://carlschwan.eu/2020/12/29/adding-comments-to-your-static-blog-with-mastodon
+[cactus]: https://cactus.chat
+[curlpit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
+[MUA]: https://en.wikipedia.org/wiki/Email_client
+[offline email]: https://drewdevault.com/2021/05/17/aerc-with-mbsync-postfix.html
+[CSRF]: https://en.wikipedia.org/wiki/Cross-site_request_forgery
+[OAuth 2.0]: https://man.sr.ht/meta.sr.ht/oauth.md
+[sr.ht]: https://sr.ht
+[acurl]: https://man.sr.ht/builds.sr.ht/manifest.md#tasks
+[GraphQL]: https://lists.sr.ht/graphql
+[wfw]: https://web.archive.org/web/20050301040756/http://www.sellsbrothers.com/spout/#exposingRssComments
+[Julia]: https://julialang.org
+[Franklin]: https://franklinjl.org
+[semantic]: https://github.com/tlienart/Franklin.jl/issues/936
+[extendable]: https://franklinjl.org/syntax/utils
+[GUID]: https://www.rssboard.org/rss-profile#element-channel-item-guid
+[3rd]: https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068
+[not on NixOS]: https://github.com/NixOS/nixpkgs/issues/20649
+[mbox]: https://datatracker.ietf.org/doc/html/rfc4155
+[meet in tha middle]: https://genius.com/Timbaland-meet-in-tha-middle-lyrics
+[formbox]: https://sr.ht/~cnx/formbox
+[Use plaintext emails]: https://useplaintext.email
+[mailing list]: https://lists.sr.ht/~cnx/misc
+[CC BY-SA 4.0]: https://creativecommons.org/licenses/by-sa/4.0
+[site's mailing list]: https://lists.sr.ht/~cnx/site
+[contrib]: https://drewdevault.com/2020/06/06/Add-a-contrib-directory.html
+[text/markdown]: https://blog.brixit.nl/markdown-email
+[UUID]: https://en.wikipedia.org/wiki/Universally_unique_identifier
diff --git a/blog/teredo.md b/blog/teredo.md
index d283977..0140f4d 100644
--- a/blog/teredo.md
+++ b/blog/teredo.md
@@ -1,7 +1,7 @@
 +++
 rss = "Teredo tunnel simulation in virtual machines"
 date = Date(2020, 7, 3)
-tags = ["fun", "ipv6", "tunnel"]
+tags = ["fun", "ipv6", "recipe"]
 +++
 
 # Teredo Tunnel Simulation
diff --git a/blog/threa.md b/blog/threa.md
index 4694a42..e0c9d70 100644
--- a/blog/threa.md
+++ b/blog/threa.md
@@ -1,7 +1,7 @@
 +++
 rss = "Raku's concision demonstrated in form of a tutorial"
 date = Date(2021, 7, 3)
-tags = ["clipboard", "fun", "raku"]
+tags = ["clipboard", "fun", "raku", "recipe"]
 +++
 
 # Writing a Clipboard Manager
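The new `blog/reply.md` describes recursive `str.format` templates with a `{children}` placeholder but leaves the rendering step to the linked script. What follows is only a minimal sketch of that idea and not the code of formbox itself. It assumes an `archive` mapping keyed by `In-Reply-To`, as built in the post; the `{author}` and `{body}` fields, the two template strings, the `make` helper and the sample message IDs are all hypothetical.

```python
from collections import defaultdict
from email.message import Message
from html import escape

# Hypothetical templates: nested for HTML, flat for a comment feed.
NESTED = '<div class=comment><b>{author}</b>: {body}{children}</div>'
LINEAR = '<item>{author}: {body}</item>{children}'


def render(archive, parent_id, template):
    """Format every reply to parent_id, recursing into its own replies."""
    return ''.join(
        template.format(author=escape(message['From']),
                        body=escape(message.get_payload()),
                        children=render(archive, message['Message-Id'],
                                        template))
        for message in archive[parent_id])


def make(message_id, parent_id, author, text):
    """Build a sample plain-text message (illustration only)."""
    message = Message()
    message['Message-Id'] = message_id
    message['In-Reply-To'] = parent_id
    message['From'] = author
    message.set_payload(text)
    return message


# Same shape as the post's archive: parent message ID -> list of replies.
archive = defaultdict(list)
for m in (make('<1@example>', '<blog/reply@cnx>', 'alice', 'First!'),
          make('<2@example>', '<1@example>', 'bob', 'Second.')):
    archive[m['In-Reply-To']].append(m)

nested = render(archive, '<blog/reply@cnx>', NESTED)
linear = render(archive, '<blog/reply@cnx>', LINEAR)
```

With `{children}` inside the element, replies nest under their parent; moving it after the element flattens the thread in the same depth-first order, which is what the two templates in the post rely on. Real templates containing literal braces would need them doubled for `str.format`.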