Blog about px.cnx.gdn

author: Nguyễn Gia Phong <mcsinyx@disroot.org> 2023-03-17 21:32:11 +0900
committer: Nguyễn Gia Phong <mcsinyx@disroot.org> 2023-03-17 21:32:11 +0900
commit: f4835fb34397f54bf98325a3bd25acfb81d1d1c3 (patch)
tree: ee1601ba73db6bbe6399c4573f274ae095896711 /blog
parent: 1c0d20e291b0ea077b1e62d8d9b69e4951e96a42 (diff)
download: site-f4835fb34397f54bf98325a3bd25acfb81d1d1c3.tar.gz
2 files changed, 551 insertions, 1 deletions
diff --git a/blog/dedep.md b/blog/dedep.md
index a77dee7..c251880 100644
--- a/blog/dedep.md
+++ b/blog/dedep.md
@@ -132,7 +132,7 @@ join me in a De-Dependency December and fight for the users!
 [Fead]: https://trong.loang.net/~cnx/fead
 [SSG]: https://en.wikipedia.org/wiki/Static_site_generator
 [power]: https://www.youtube.com/watch?v=3Mpyias9ek4
-[context]: https://media.handmade-seattle.com/context-is-everything
+[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything
 [standards]: https://xkcd.com/927
 [utilities libraries]: https://raku-advent.blog/2021/12/11/unix_philosophy_without_leftpad_part2
 [old]: https://wiki.debian.org/DontBreakDebian#Don.27t_suffer_from_Shiny_New_Stuff_Syndrome
diff --git a/blog/pixml.md b/blog/pixml.md
new file mode 100644
index 0000000..79b8129
--- /dev/null
+++ b/blog/pixml.md
@@ -0,0 +1,550 @@
++++
+rss = "Comments for Static Sites without JavaScript via Emails"
+date = Date(2023, 3, 17)
+tags = ["fun", "recipe"]
++++
+
+# XML and Photo Gallery Generation: A Love Story
+
+> I'm just a language, whose style sheets are good\
+> Oh, Lord, please, don't let me be misunderstood
+
+!!! note "Tips"
+
+    As usual, the article starts with a text wall of random rambling.
+    If you are only interested in the technical aspects, feel free to skip
+    the first two sections.
+
+\toc
+
+## Introduction
+
+Neural-optic live streaming probably, no, definitely offers
+the most photorealistic graphics one can set eyes on.  [CGI] is just
+a pathetic mimic, and photography or videography is no more
+than a poor plagiarism attempt when compared to quantum ray-tracing
+and other advanced physics simulations^W happenings.
+
+On the other hand, we humen are rather shite at replaying visual memories,
+whilst ([bit rot] aside) media can be archived [for forever].  Besides,
+many of us are too busy to *touch grass* or go see cool things
+as regularly as we wish to.  This is how an industry based on showing us
+[mundane stuff] or [obvious bullcrap] can still manage to make tens
+of thousands of [craploads] each year and why the interwebs are flooded
+with pictures of cats, kitties and pussies.
+
+Finding new shits means dopamine dispensation and that's why
+[they are dope][new is always better].  As a model netizen, I adhere
+to the web's social contract of mutual [shitposting] so that everyone
+can have a piece.  Every blue moon, I also enjoy posting more quality
+stuff like what you are reading right now, should you ignore the number
+of [Mozart] references in the last three paragraphs.
+
+## Motivation
+
+Some other times, I also want to share the living things and sceneries
+I encounter in the [new][move] place.  My camera was gifted by my father
+before I moved and yet I shared more photos [with strangers][pixelfed]
+than with my family.  The PixelFed instance I landed on irreversibly
+shrank and lossily compressed them, while dumping 5 MB images to the family
+chat room just feels weird, hence I decided to gather the decency
+to build a photo gallery to show my loved ones (and admittedly,
+flex with online strangers).
+
+There are not many [CMS] in the wild for photo hosting, and those often
+act as a walled garden and/or a social network.
+Building and hosting a new one is quite overkill, thus the obvious
+solution left would be generating a static site.  Out of the gazillion [SSG],
+I couldn't find any that meets my requirements:
+
+1. Generate a [web feed]
+2. Automate filling [image] title and alt text
+3. Offer fine-grain control for permanent [pagination]
+4. Generate thumbnails with custom size and name
+
+I mean, they perhaps exist, but the number I had to try and fight through
+would cost more time than writing the web pages and feed by hand.
+So I wrote them from scratch.  Y'all can stand up and clap now!
+
+## Preliminary
+
+Yes, I really started with writing [XHTML] and [Atom] by hand.
+A web page has the following structure with namespaces omitted
+and denoted in WXML ([Wisp]$\times$[SXML]) so I don't have
+to close the tags (have I given up on XML too early?-).
+
+!!! note "Syntax hints"
+
+    For the uninitiated, any indentation or colon in Wisp represents
+    an additional nest level, while a dot escapes the nesting.  The at signs
+    are used by SXML to denote attributes, which may remind you of [XPath].
+    For example, the anchor to the previous page is `<a href="41">PREV</a>`.
+
+```
+html
+  head
+    link
+      @ : rel "alternate"
+          type "application/atom+xml"
+          href "/atom.xml"
+    ...
+  body
+    nav
+      a : @ : href "41"
+        . "PREV"
+      h1 "PAGE 42"
+      a : @ : href "43"
+        . "NEXT"
+    article
+      @ : id "foobar"
+      h2
+        a : @ : href "#foobar"
+          . "foobar"
+      a : @ : href "/42/foo.jpg"
+          img
+            @ : src "/42/foo.small.jpg"
+                alt "pic of foo"
+                title "pic of foo"
+      a : @ : href "/42/bar.jpg"
+          ...
+    article ...
+    ...
+    footer ...
+```
+
+So far, adding an `article` is not yet too cumbersome: there's only a bit
+of redundancy for permanent links and the nesting level is acceptable
+with the deepest being `/html/body/article/a/img`.  It gets more repetitive
+once we publish it to the linked Atom feed:
+
+```
+feed
+  entry
+    link
+      @ : rel "alternate"
+          type "application/xhtml+xml"
+          href "https://gallery.example/42/#foobar"
+    id "https://gallery.example/42/#foobar"
+    title "foobar"
+    content
+      @ : type "xhtml"
+      div
+        img
+          @ : src "https://gallery.example/42/foo.jpg"
+              alt "pic of foo"
+              title "pic of foo"
+        img ...
+    updated ...
+  entry ...
+  ...
+```
+
+Since web feeds are standalone documents, they must always use absolute URLs.
+(Welp that's not entirely true, [XML Base] does exist, but not all readers
+support it, and more importantly, certain elements such as `atom:id` disallow
+relative references.)  In addition, whilst the web page links a thumbnail
+to the original image to save bandwidth, the feed can be consumed one post
+at a time, which thus points to the full size version.  Therefore,
+copying the markup to embed it inside the Atom is error-prone and doesn't
+exactly spark joy.
+
+!!! note "Fun fact"
+
+    What does spark joy is that we can embed XHTML directly into the web feed,
+    which means the content is still XML and we don't need to quote it in CDATA.
+    For other sites where contents don't accumulate up to hundreds of megabytes,
+    this will allow us to slap some (SPOILER ALERT!) stylesheet on the Atom feed
+    and let the user agent render it in a [human-readable form][XSL].
+
+## Approach
+
+I actually already spoiled it in the epigraph,[^spoiler] but for the sake
+of completeness let us [discuss a few possible solutions][efficiency].
+What I wanted was to reduce the redundancy of manual input, in other words,
+a system transforming a custom information-dense format to standard
+yet sparser ones, which in this case are XHTML and Atom.  Given some new photos
+and their relevant data, the purpose was to minimize the publishing friction.
+
+It's worth mentioning that the goal was not to minimize the input format,
+the transformation speed, or feedback latency, but all of the above,
+plus the cost of constructing the tool, incrementally as our requirements
+slightly change over time.  Our choice for the base [programming system]
+shall affect each and every one of these aspects and more.
+
+Some technical dimensions are more equal than others, though.
+For this use case, IMHO an immediate feedback loop should be given
+the number one priority, not only because it'd be frustrating
+to have to complete multiple rituals just to preview the changes,
+but also as watching and reflecting file system changes is (sadly still)
+a difficult problem.
+
+For Linux[^interjection] there's [inotify] which doesn't suck,
+except when it does and misses events, and the standard POSIX build tool
+[make] relies on [mtime which is also flaky][mtime].  Some SSG
+work around this by spinning up a server with a more sophisticated
+caching mechanism and even include an HTTP server sending out refresh events.
+Implementing such a system is easily [more expensive][automation] than doing
+the original task manually.
+
+Luckily, there is another way.  *After* the birth of imperative
+DOM manipulation programs running on VM inside browsers (Ecma scripts),
+there came a (now forgotten) art of purely functional DOM transformation.
+More specifically, [XSLT] can declaratively transform any XML document
+to another, and its best part is that modern browsers natively support it,
+i.e. there's no difference between editing the input document
+and the hypothetical output XHTML.  For better portability
+and rendering performance, we can still generate the latter
+ahead-of-time (AoT) during deployment.
+
+## Implementation
+
+Going back to the example, the input format could boil down
+to a more concise XML file, e.g. `42/index.xml`:
+
+```
+page
+  @ : prev "41"
+      curr "42"
+      next "43"
+  post : @ : title "foobar"
+         picture
+           @ : filename "foo"
+               desc "pic of foo"
+         picture ...
+         ...
+         time ...
+  post ...
+  ...
+```
+
+### Page Generation
+
+The stylesheet should then be declared at the beginning of the file,
+so that the user agent can automatically fetch and apply it
+to render the output XHTML:
+
+```
+<?xml-stylesheet href="/page.xslt" type="text/xsl"?>
+```
+
+XSLT is essentially a templating language, similar to PHP (which is also older)
+and template libraries in your favorite languages.  For the ease of reading,
+I will let the target document's namespace be the default, while aliasing
+the transformation one as `xsl`.  The stylesheet for the web pages would
+look something like the following, which should be self-explanatory.
+
+```
+xsl:stylesheet
+  xsl:template : @ : match "/page"
+    xsl:variable : @ : name "base"
+      xsl:text "/"
+      xsl:value-of : @ : select "@curr"
+      xsl:text "/"
+    html
+      head ...
+      body
+        nav
+          xsl:if : @ : test "@prev != ''"
+            a : @ : href "/{@prev}/"
+              . "PREV"
+          h1 : xsl:text "PAGE "
+               xsl:value-of : @ : select "@curr"
+          xsl:if : @ : test "@next != ''"
+            ...
+        xsl:for-each : @ : select "post"
+          xsl:variable : @ : name "id"
+            xsl:value-of
+              @ : select "translate(@title, ' ', '-')"
+          article
+            @ : id "{$id}"
+            h2
+              a : @ : href "#{$id}"
+                  xsl:value-of : @ : select "@title"
+            xsl:for-each : @ : select "picture"
+              a : @ : href "{$base}{@filename}.jpg"
+                  img
+                    @ : src "{$base}{@filename}.small.jpg"
+                        alt "{@desc}"
+                        title "{@desc}"
+        footer ...
+```
+
+### Feed Generation
+
+Similarly, for Atom entries on a single page,
+
+```
+xsl:stylesheet
+  xsl:variable : @ : name "root"
+    . "https://gallery.example/"
+  xsl:template : @ : match "/page"
+    xsl:variable : @ : name "base"
+      xsl:value-of : @ : select "$root"
+      xsl:value-of : @ : select "@curr"
+      xsl:text "/"
+    xsl:for-each : @ : select "post"
+      xsl:variable : @ : name "url"
+        xsl:value-of : @ : select "$base"
+        xsl:text "#"
+        xsl:value-of
+          @ : select "translate(@title, ' ', '-')"
+      entry
+        link
+          @ : rel "alternate"
+              type "application/xhtml+xml"
+              href "{$url}"
+        id : xsl:value-of : @ : select "$url"
+        title : xsl:value-of : @ : select "@title"
+        content
+          @ : type "xhtml"
+          div
+            xsl:for-each : @ : select "picture"
+              img
+                @ : src "{$base}{@filename}.jpg"
+                    alt "{@desc}"
+                    title "{@desc}"
+        updated : xsl:value-of : @ : select "time"
+```
+
+The trickier part here is concatenating the entries together.
+Simple enough, instead of linking to the stylesheet in the data,
+we can read XML files directly from XSLT.
+
+```
+xsl:template
+  @ : match "/"
+  ...
+  xsl:apply-templates
+    @ : select "document('42/index.xml')/page"
+  xsl:apply-templates ...
+  ...
+```
+
+This allows us to do other cool things, such as embedding SVG in XHTML
+to make use of the parent element's [currentcolor], while keeping
+the source files separate.  It is especially useful for monochromatic icons,
+e.g.
+
+```
+xsl:copy-of : @ : select "document('cc.svg')/*"
+xsl:copy-of : @ : select "document('by.svg')/*"
+xsl:copy-of : @ : select "document('sa.svg')/*"
+```
+
+### Thumbnail Generation
+
+So far, we have met three out of the [four requirements](#motivation),
+the only thing left is creating the thumbnails.  Inspired by Ethan Dalool,
+I am going for [fairly large ones of 1024 px in width][big thumbs],
+
+> large enough to comfortably browse the photos without clicking through
+> to the big version of each, and the thumbnails are decently light
+> and not too jpeggy at about 125-150 kilobytes on average.
+
+At such size, I can aim for around ten photoes[^toes] per page
+while maintaining a somewhat decent load time.  Plus, since the width
+of images is hardcoded, page [margin] could be automatically inferred
+to never stretch them.
+
+```css
+html {
+    box-sizing: border-box;
+    margin: auto;
+    max-width: calc(1024px + 2ch);
+}
+body { margin: 0 1ch }
+```
+
+To generate the thumbnails, I use [epeg] together with `make` for wildcarding:
+
+```
+PICTURES := $(filter-out %.small.jpg $(PREFIX)/%.jpg, $(wildcard */*.jpg))
+THUMBNAILS := $(patsubst %.jpg,%.small.jpg,$(PICTURES))
+
+%.small.jpg: %.jpg
+	epeg -w 1024 -p -q 80 $< $@
+```
+
+The Makefile also defines rules for AoT compilation using [xsltproc]
+for the web pages and feed.  Apparently no feed reader supports XSLT,
+and for pages runtime processing negatively affects the performance
+due to the multiple round trips for the stylesheet and the vector icons.
+
+```
+DATA := $(wildcard */index.xml) index.xml
+PAGES := $(patsubst %.xml,%.xhtml,$(DATA))
+OUTPUTS := $(THUMBNAILS) $(PAGES) atom.xml
+
+all: $(OUTPUTS)
+
+index.xml: $(LATEST)/index.xml
+	ln -fs $< $@
+
+%.xhtml: %.xml page.xslt
+	xsltproc page.xslt $< > $@
+
+atom.xml: atom.xslt $(DATA) $(wildcard *.svg)
+	xsltproc atom.xslt > atom.xml
+```
+
+The [full implementation][src] is deployed to [px.cnx.gdn],
+mirrored to the [OpenNIC] domain [pix.sinyx.indy] reusing
+the former's TLS certificate, because CA/Browser Forum
+disallows support for domains not recognized by ICANN and no
+[CA for OpenNIC] is mature enough.
+
+## Discussion
+
+> *Okay you built your site using XML macros, so what?
+> The syntax is clunky and you hate it so much yourself
+> that not even a single line of code example here is in actual XML.
+> Doesn't seem like a love story to me!*
+
+Like all relationships, it's not that simple.  I've learned to not judge
+a book by its cover and come to the understanding that XML is the (ugly)
+equivalent of [sexp].[^sex]  Unlike afterthoughts such as C preprocessors,
+[Django]-like templates, or even the Wisp-lookalike syntax of [Slim],
+XML stylesheets share the documents' data structure.  To put it another way,
+one can use XSLT to generate XSLT from XSLT.  Do I need it in this case
+or ever at all?  Probably not, but that certainly makes XSL a lot more
+attractive in my eyes.
+
+Furthermore, the tooling for XML is highly mature, from editors to linters
+and processors to rendering engines.  You'd be lying to say you ain't
+fascinated that tis possible to directly feed browsers pure data
+instead of markup representations.  More than that, one can have
+entirely static API endpoints that are both human- and machine-readable.
+
+> *XSL is just declarative JS!  You are so blinded
+> by your lust for functional programming that you have
+> become [the very thing you swore to destroy](/blog/reply)!*
+
+My distaste for Ecma scripts is not due to DOM manipulation.
+Sure, I do find in-place modification inelegant for documents,
+but if only that were the only issue.  I block them on most sites
+because they can interact with many things other than just the DOM,
+imposing [privacy] and [security] risks while [fucking up the UX].
+
+Architecturally, Ecma scripts enable the absolute bloody worst possible
+kind of web pages with zero data at all, fetching tiny pieces of content
+in JSON and turning performance [to shit].  The user agents then try to salvage
+efficiency by turning themselves into a distributed system component
+and adding optimizations that shall never be (ab)used for the sake of users.
+O ye [cycle of doom]!
+
+Note that one can make a similar mistake with XSL regarding the number
+of round trips, and XML stylesheets can provide the same front-end/back-end
+separation.  Both can be used to provide hot loading during development
+and AoT rendering in production (if not all, then many JS libraries support
+pre-rendering, ignoring the monstrous [dependency graph](/blog/dedep)).
+At the end of the day, it's not a matter of technology but of principle:
+to act in the [users' best interest].
+
+> *There is nothing complex about the photo gallery,
+> any existing SSG can do the same with minor tweaks!
+> You never needed to write a new one to begin with!*
+
+I am wondering the same myself, but keep in mind there are details
+I've been hiding in the examples.  I went all-in for the semantic web
+with the hope for best portability and accessibility.  One thing
+I haven't mentioned is the `lang` attribute, e.g. `en`, `vi` or `fr`
+depending on the post.  Adding this to the web pages requires the SSG
+to be somewhat modular, and it is even harder for the web feed.
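+
+A rough sketch of how that could look, in the same WXML notation as above.
+The `lang` attribute on `post` and its name are assumptions for illustration,
+not necessarily what the deployed stylesheets do.  The input would gain
+one attribute per post:
+
+```
+post : @ : title "foobar"
+           lang "vi"
+       picture ...
+       ...
+```
+
+The page stylesheet could then copy it onto the `article`, whereas
+the feed one would need the namespaced `xml:lang` that Atom expects:
+
+```
+article
+  @ : id "{$id}"
+      lang "{@lang}"
+  ...
+
+entry
+  @ : xml:lang "{@lang}"
+  ...
+```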
+
+Moreover, generic SSG are not designed to handle the difference
+in content between a page's `article` and the feed's corresponding `entry`,
+nor for having multiple posts in a single page.  Pagination is
+also commonly implemented backwards, i.e. page 2 being the second latest one,
+making it impossible to avoid link rot.
+
+Not to suggest that the majority of SSG are poorly designed, just that
+from a certain amount of [context] difference, tis cheaper to just redesign
+from scratch.  This is not about XSL vs Go/Python/JS for SSG or web dev
+in general, but this specific and happen-to-be-far-from-complex case.
+
+## Conclusion
+
+At the time of writing, XML has pretty much been superseded by JSON or YAML,
+for better or worse.  I have no love for YAML for obvious reasons,
+but it also saddens me to sometimes see JSON being solely used as a container
+for HTML.  I hope that this essay can [awaken something in you] about XML
+and remind you about the semantic web in your next project.  It worked out
+for me, maybe it'll work out for you too!
+
+The story between XML and my photo gallery is a fond love story.
+They were born for each other, there was no drama, everything just werkt.
+Their romance inspires me to better appreciate stability and maturity,
+and value those right in front of my eyes that I had been *too blind to see*.
+Anyway, this is getting too long, so Imma end it with another [song].
+
+> Lookin' for perfect\
+> Surrounded by artificial\
+> You're the closest thing to real I've seen\
+> Sure, everyone has their problems\
+> That's a given\
+> Yours are the easiest to tolerate
+
+[^spoiler]: If you know, you know.
+[^interjection]: Yup, just the kernel.
+[^toes]: *Thumb*nails, pho*toes*, get it?-)
+[^sex]: Or conventionally in most Lisp 1's, `sex?`.
+
+[CGI]: https://en.wikipedia.org/wiki/Computer-generated_imagery
+[bit rot]: https://en.wikipedia.org/wiki/Data_degradation
+[for forever]: https://xkcd.com/1683
+[mundane stuff]: https://en.wikipedia.org/wiki/Drama
+[obvious bullcrap]: https://en.wikipedia.org/wiki/Fiction
+[craploads]: https://antifandom.com/how-i-met-your-mother/wiki/Crapload
+[new is always better]: https://www.youtube.com/watch?v=1SNRULEnTVQ
+[shitposting]: https://fe.disroot.org/@mcsinyx
+[Mozart]: https://peervideo.club/w/uByA7Czy7PWYMqnu8FgXvW
+
+[move]: https://github.com/zig-community/user-map/pull/120
+[pixelfed]: https://fotofed.nl/cnx
+[CMS]: https://en.wikipedia.org/wiki/Content_management_system
+[SSG]: https://en.wikipedia.org/wiki/Static_site_generator
+[web feed]: https://en.wikipedia.org/wiki/Web_feed
+[image]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
+[pagination]: https://en.wikipedia.org/wiki/Pagination
+
+[XHTML]: https://en.wikipedia.org/wiki/XHTML
+[Atom]: https://www.rfc-editor.org/rfc/rfc4287
+[Wisp]: https://www.draketo.de/software/wisp
+[SXML]: https://okmij.org/ftp/Scheme/SXML.html
+[XPath]: https://www.w3.org/TR/xpath
+[XML Base]: https://www.w3.org/TR/xmlbase
+[XSL]: https://simonesilvestroni.com/blog/build-a-human-readable-rss-with-jekyll
+
+[efficiency]: https://xkcd.com/1445
+[programming system]: https://programming-journal.org/2023/7/13
+[inotify]: https://man7.org/linux/man-pages/man7/inotify.7.html
+[make]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html
+[mtime]: https://apenwarr.ca/log/20181113
+[automation]: https://xkcd.com/1319
+[XSLT]: https://www.w3.org/standards/xml/transformation
+
+[currentcolor]: https://developer.mozilla.org/en-US/docs/Web/CSS/color_value#currentcolor_keyword
+[big thumbs]: https://voussoir.net/writing/sharing_photos
+[epeg]: https://github.com/mattes/epeg
+[margin]: https://en.wikipedia.org/wiki/Margin_(typography)
+[xsltproc]: https://gnome.pages.gitlab.gnome.org/libxslt/xsltproc.html
+[src]: https://trong.loang.net/~cnx/px
+[px.cnx.gdn]: https://px.cnx.gdn
+[OpenNIC]: https://www.opennic.org
+[pix.sinyx.indy]: https://pix.sinyx.indy
+[CA for OpenNIC]: https://wiki.opennic.org/opennic/tls
+
+[sexp]: https://en.wikipedia.org/wiki/S-expression
+[Django]: https://docs.djangoproject.com/en/dev/topics/templates
+[Slim]: https://github.com/slim-template/slim
+[privacy]: https://en.wikipedia.org/wiki/Mouse_tracking
+[security]: https://react-etc.net/entry/exploiting-speculative-execution-meltdown-spectre-via-javascript
+[fucking up the UX]: https://meta.stackexchange.com/q/2980/698165
+[to shit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
+[cycle of doom]: https://en.wikipedia.org/wiki/Wirth%27s_law
+[users' best interest]: https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys
+[link rot]: https://en.wikipedia.org/wiki/Link_rot
+[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything
+
+[awaken something in you]: https://www.youtube.com/watch?v=F3QPWrLFsOA
+[song]: https://www.youtube.com/watch?v=5LvOdWi3Qno