+++
rss = "Comments for Static Sites without JavaScript via Emails"
date = Date(2022, 1, 9)
tags = ["fun", "recipe", "net"]
+++

# Comments for Static Sites without JavaScript

> I'm open for criticism\
> But really, is it any room for criticism?

Recently, I've switched my [feed] reader from [Newsboat] to [Liferea]. The latter has a GUI and some extra features which make the experience a lot more comfy. For instance, custom enclosure handling finally lets me migrate all of my YouTube subscriptions to [Atom] and *conveniently* browse and watch videos using [mpv]. Image support also allows me to directly view web comics.[^image]

One of them, [The Monster Under the Bed][TMUTB],[^nsfw] does not embed the strips in its feed, but it has comments. Yes, [RSS] includes support for `<comments>`, and I was not aware of it until [very recently][spark]. I suppose many other people late to the (web feed) party weren't either.

Since the rise of static sites, feeds have regained popularity, enough for even [Google to reconsider its direction][android]. Compared to RSS or Atom, alternatives have the following shortcomings:

* [Usenet] is generally obsolete to most people.
* [Mailing list] messages are immutable.
* Fora and social media are silos.[^silo]
* Social media are designed for ephemeral discussions.
* Instant messaging is awful for archival.

On the other hand, news feeds are commonly read-only: only a few readers can render comments and even fewer are able to post one. On the server side, a dynamic server is needed to accept comments. Traditionally, it's the same as the system serving the website. Although this works, it is significantly more costly than a server dedicated to static sites, which scale a lot better.

[Hackers] have come up with multiple workarounds such as using [microblogging] or [instant messaging][cactus] to add comments to their static sites, but all require client-side code execution, which is an option for neither RSS nor Atom.
Furthermore, [JavaScript hurts portability and performance][curlpit] on the WWW, hence it should be avoided unless it is absolutely impossible to implement a feature otherwise. Commenting is not an exception.

Following is my adventure implementing a comment section for this very blog. If you're also up to the task, I think you should view what I did as an inspiration (rather than a reference) and don't be afraid to experiment until you're satisfied.

\toc

## Choosing Back-End

As mentioned earlier, static site or not, there still needs to be a dynamic component to accept incoming replies. HTTP requests would be the most portable since all netizens obviously have a web browser, but those are what we're trying to replace here. What else does everyone have nowadays? Something so common that it can be used to identify people upon service registration? Exactly, emails and phone numbers!

OK, Imma stop horsing around. My back-end of choice would be email. It's global, it's cheap and federated. Cellular services almost fit the bill, except that it would cost an arm and a leg to comment around the web every day via SMS, whose character limit does not facilitate thoughtful discussion either. As for fora, social media or instant messaging, no platform has nearly as large a user base as electronic mail.

![HTML is often a trojan horse for JavaScript](/assets/html5-js.png)

It's not like any email would fit the comment section though. Especially not the HTML kind with a few hundred kilobytes of embedded CSS, JS and non-content images. From the security standpoint alone 'tis already a no-go. A light markup language like Markdown[^mime] would be much better.

One great thing about using a mature technology like email is that we have all use cases covered. Filtering, exporting and parsing emails work out of the box regardless of one's provider, [MUA] and programming preferences.
I have a SourceHut account with which I can create mailing lists on demand so I'm using it; however there's no reason exporting from your private inbox would be any more difficult, presuming you have set up [offline email].

!!! note "Tips and tricks"
    Speaking of SourceHut, exporting a mailing list archive is rather easy: one can either use the button on the web UI or download from the API. As the operation is not exactly cost-free, the former is protected by a [CSRF] token and the latter by [OAuth 2.0]. If you are a fellow [sr.ht] user, you can use [acurl] on the build service with the URL from the [GraphQL] query `query { me { lists { results { name, archive } } } }`.

## Designing Data Flow

I promise, this sounds bigger than it really is, but first, let's have a glance at how static generators work. Typically, templating happens at three stages:

1. Converting individual articles into HTML *content*
2. Inserting each article content into a page template to create a complete HTML document
3. Inserting multiple HTML contents into one RSS or Atom feed template

At completion, two kinds of output are generated: website and web feed. Similarly, comments have to be rendered for both targets: an HTML comment section for web browsing and a separate RSS feed for each article's `<wfw:commentRss>`.[^wfw] Therefore, injections should be done separately at stages 2 and 3. The overall process of static site generation with email comments is illustrated as follows.

![Data transformation during generation process](/assets/formbox.svg)

For clarity, HTML and RSS input templates for comments and their parent page and web feed are omitted. The path to each *comment feed* output being injected into the respective *web feed item* is also not shown in the figure.

## Implementation

At the time of writing, this personal website of mine was generated by [Julia] [Franklin], which was neither fast[^speed] nor [semantic], but was the only one I knew supporting LaTeX prerendering out of the box.
Franklin is also rather [extendable] via Julia functions.

### Accepting Replies

Let's start with how each article can be programmatically and uniquely identified. By default in RSS, a [GUID][^guid] is the permanent URL of the associated web page. I am not exactly a creative person, so I mirrored this idea, although I only used the part that differs between URLs, i.e. minus the scheme, network location and trailing `index.html` (Franklin always appends it to the target path of any source file that is neither `index.md` nor `index.html`):

```julia
# Path of the current page relative to the site root, without index.html
dir_url() = strip(dirname(locvar(:fd_url)), '/')
# <path@cnx> with the angle brackets percent-encoded for use in a URI
message_id() = "%3C$(dir_url())@cnx%3E"
```

For maximum portability, threads are identified through the email's `In-Reply-To` header, which expects a message ID, which in turn must match `<.+@.+>`. Once again, to avoid having to think, I opted for the path difference for the left-hand side and my nickname `cnx` for the right. The `mailto` URI could then be constructed accordingly:

```julia
using Printf: @sprintf

# Franklin exposes hfun_* functions to templates, here as {{mailto_comment}}
function hfun_mailto_comment()
    @sprintf("mailto:%s?%s=%s&%s=Re: %s",
             "~cnx/site@lists.sr.ht",
             "In-Reply-To", message_id(),
             "Subject", locvar(:title))
end
```

The anchor was then added to the page foot:

```html
{{author}}
```

### Rendering Comments

This is when the fun begins. Julia's standard library does not include an email parser, and I doubt your favorite language's does either, unless it is named after a British comedy troupe. Python is often described as *batteries included*, or at least it used to be (seemingly the consensus among current core devs has shifted towards [favoring third-party libraries][3rd]).

!!! note "Off-topic rambling"
    Standard library inclusion wasn't really the deal breaker here though. I still needed a Markdown engine and an HTML sanitizer (because Markdown can include HTML), and AFAICT no stdlib has them. The real issue was the lack of Julia packaging on most distributions (apart from Guix), and most certainly [not on NixOS], my current distro.
    For the same reason the idea of rewriting Franklin in Python has been running in my head for a while now. Python packaging is much more downstream-friendly and, unlike with Julia, compilation overhead is almost non-existent.

On the other hand, it's trivial to pipe an external program's output to Julia, e.g. ``readchomp(`echo foo bar`)`` would give you the string "foo bar". Thus, the to-be-written *comment generator* should take (the path to) a mail box, the message ID of the article and a template, and write the result to stdout. Argument parsing is, again, thankfully in Python's stdlib:

```python
from argparse import ArgumentParser
from pathlib import Path
from urllib.parse import unquote

parser = ArgumentParser()
parser.add_argument('mbox')
# The ID arrives percent-encoded (%3C...%3E), so decode it while parsing
parser.add_argument('id', type=unquote)
parser.add_argument('template', type=Path)
args = parser.parse_args()
```

I then parsed the [mbox] into a mapping indexed by parent message IDs as follows. The IDs stored in the mail box are not percent-encoded, which is why the input message ID had to be unquoted as well.

```python
from collections import defaultdict
from email.utils import parsedate_to_datetime
from mailbox import mbox

# Sort key: the calendar date from the Date header
date = lambda m: parsedate_to_datetime(m['Date']).date()
# Map each parent message ID to the list of its direct replies
archive = defaultdict(list)
for message in sorted(mbox(args.mbox), key=date):
    archive[message['In-Reply-To']].append(message)
```

As said earlier, arbitrary HTML content is not exactly suitable for comments. However, it is undeniable that HTML emails have taken over the world and a compromise must be made: allowing `multipart/alternative` of both `text/plain` and `text/html`. It is not the only multipart kind: attachments and cryptographic signatures are carried in multipart messages as well.
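To make that structure concrete, here is a minimal sketch using only Python's stdlib `email` package. The message, its sender address and the helper `content_types` are made up for illustration and are not part of the actual generator.

```python
from email.message import EmailMessage

# Hypothetical comment, built in memory purely for illustration
message = EmailMessage()
message['From'] = 'reader@example.com'
message.set_content('I *disagree*, see https://example.com\n')
# Adding an HTML alternative turns the message into multipart/alternative
message.add_alternative('<p>I <em>disagree</em></p>\n', subtype='html')


def content_types(message):
    """List the content type of the message and each of its parts."""
    return [part.get_content_type() for part in message.walk()]


types = content_types(message)
print(types)
```

The output lists the container type followed by its two alternatives. A signed message or one with attachments would presumably nest further in the same manner, which is why a recursive search is a natural fit.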
Since we are only interested in the plaintext part, it is actually easier done than said to extract it:

```python
from bleach import clean, linkify
from markdown import markdown


def get_body(message):
    """Return the sanitized HTML rendering of the first text/plain part."""
    if message.is_multipart():
        # Depth-first search through the parts
        for payload in map(get_body, message.get_payload()):
            if payload is not None:
                return payload
    elif message.get_content_type() == 'text/plain':
        # decode=True yields bytes; UTF-8 is assumed here
        body = message.get_payload(decode=True).decode()
        return clean(linkify(markdown(body, output_format='html5')),
                     tags=..., protocols=...)
    return None
```

Now all that's left is to render that body and relevant headers as an HTML segment or an RSS item. This is when we revisit the template. Jinja is probably the most popular in Python, thanks to Flask and its resemblance to Django's template language, but its complexity is rather unnecessary. Instead, I went with the built-in `str.format`.

![Double braces are brilliant, but I prefer single ones](/assets/format.jpg)

What are templates for, exactly? Not the complete document, apparently, because that would differ from article to article and increase the complexity of injection. Nor a single comment, as comments are threaded into trees (or a forest) and their relationship can be useful. We gotta [meet in tha middle] and use recursive templates instead, e.g. for nested comments:

```html
... {children}
```

To render linear comments, such as for `<wfw:commentRss>`, simply move the children out of the item as follows.

```xml
... {children}
```

The rest of the substitutions are mostly just extracted from the email's headers. Another bit that needs some extra decisions, though, is the parameters for the `mailto` URI to reply to each comment:

* `In-Reply-To` set to current `Message-Id`
* `Cc` set to current `Reply-To` (if exists) or `From`
* `Subject` is inherited, with `Re:` prepended if missing

This is getting boring with a lot of trivial code, so I'll leave you with a pointer to the completed script named [formbox] and move on to more interesting stuff.

### Injecting Comments

Inserting HTML comment sections is pretty simple. First I wrote a simple Julia function `render_comments` calling `formbox` under the hood, then

```julia
hfun_comments_rendered() = render_comments("comment.html")
```

`comments_rendered` is then injected below the article. For RSS, it took two extra steps:

1. Insert `render_comments("comment.xml")` into the comment feed template `comments.xml` (notice they are two different templates) and write it next to the article's output `index.html`
2. Insert the path of the written comment feed into the `<wfw:commentRss>` tag in the article's feed item

That's it!

## Moderation

I don't want a *Terms of Service* page, it'd feel too corporate for my *personal* website, so I will list the rules here:

1. Please be excellent to each other. Disagreements are okay, personal insults are not.
2. Stay on topic. If you want to publicly discuss something else with me, start a new thread on a [mailing list] or reach me via social media.
3. [Use plaintext emails] and do not top post. Markdown inline markup, block quotes, lists and code blocks are supported.
4. Comments are implied to be under [CC BY-SA 4.0] unless declared otherwise.
5. I reserve the right to remove any comment I don't like. I generally don't delete comments, but if you want to exercise your freedom of speech, publish it yourself.
6. 
   I do not warrant the availability of the comments either. I will try my best but one day all comments may just disappear, just like this website itself. Archive what you deem important.
7. These rules are subject to change according to my personal liking without notice.

Replies will only be rendered on the website and feed after I see them, so please expect a delay of at least 24 hours. If you are eager to reply to each other, subscribe to the [site's mailing list] instead.

[^image]: TBF there are image preview scripts in Newsboat's [contrib].
[^nsfw]: Content warning: occasionally NSFW
[^silo]: Federation is getting there for social media; not so much for fora.
[^mime]: But don't use [text/markdown] for your emails.
[^wfw]: Unfortunately there's no equivalent for Atom.
[^speed]: Over 30 seconds to generate a few hundred kB of web pages.
[^guid]: Not to be confused with the Microsoft-hijacked term for [UUID].

[feed]: https://en.wikipedia.org/wiki/Web_feed
[Newsboat]: https://newsboat.org
[Liferea]: https://lzone.de/liferea
[Atom]: https://en.wikipedia.org/wiki/Atom_(Web_standard)
[mpv]: https://mpv.io
[TMUTB]: https://themonsterunderthebed.net
[RSS]: https://www.rssboard.org/rss-specification
[spark]: https://nixnet.social/notice/AEO3fYbuzYCJl85eD2
[android]: https://www.theregister.com/2021/05/20/google_rss_chrome_android
[Mailing list]: https://en.wikipedia.org/wiki/Mailing_list
[Usenet]: https://en.wikipedia.org/wiki/Usenet
[Hackers]: https://en.wikipedia.org/wiki/Hacker
[microblogging]: https://carlschwan.eu/2020/12/29/adding-comments-to-your-static-blog-with-mastodon
[cactus]: https://cactus.chat
[curlpit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
[MUA]: https://en.wikipedia.org/wiki/Email_client
[offline email]: https://drewdevault.com/2021/05/17/aerc-with-mbsync-postfix.html
[CSRF]: https://en.wikipedia.org/wiki/Cross-site_request_forgery
[OAuth 2.0]: https://man.sr.ht/meta.sr.ht/oauth.md
[sr.ht]: https://sr.ht
[acurl]: https://man.sr.ht/builds.sr.ht/manifest.md#tasks
[GraphQL]: https://lists.sr.ht/graphql
[wfw]: https://web.archive.org/web/20050301040756/http://www.sellsbrothers.com/spout/#exposingRssComments
[Julia]: https://julialang.org
[Franklin]: https://franklinjl.org
[semantic]: https://github.com/tlienart/Franklin.jl/issues/936
[extendable]: https://franklinjl.org/syntax/utils
[GUID]: https://www.rssboard.org/rss-profile#element-channel-item-guid
[3rd]: https://discuss.python.org/t/adopting-recommending-a-toml-parser/4068
[not on NixOS]: https://github.com/NixOS/nixpkgs/issues/20649
[mbox]: https://datatracker.ietf.org/doc/html/rfc4155
[meet in tha middle]: https://genius.com/Timbaland-meet-in-tha-middle-lyrics
[formbox]: https://sr.ht/~cnx/formbox
[Use plaintext emails]: https://useplaintext.email
[mailing list]: https://lists.sr.ht/~cnx/misc
[CC BY-SA 4.0]: https://creativecommons.org/licenses/by-sa/4.0
[site's mailing list]: https://lists.sr.ht/~cnx/site
[contrib]: https://drewdevault.com/2020/06/06/Add-a-contrib-directory.html
[text/markdown]: https://blog.brixit.nl/markdown-email
[UUID]: https://en.wikipedia.org/wiki/Universally_unique_identifier