author | Nguyễn Gia Phong <mcsinyx@disroot.org> | 2023-03-17 21:32:11 +0900 |
---|---|---|
committer | Nguyễn Gia Phong <mcsinyx@disroot.org> | 2023-03-17 21:32:11 +0900 |
commit | f4835fb34397f54bf98325a3bd25acfb81d1d1c3 (patch) | |
tree | ee1601ba73db6bbe6399c4573f274ae095896711 /blog | |
parent | 1c0d20e291b0ea077b1e62d8d9b69e4951e96a42 (diff) | |
Blog about px.cnx.gdn
Diffstat (limited to 'blog')
-rw-r--r-- | blog/dedep.md | 2 |
-rw-r--r-- | blog/pixml.md | 550 |
2 files changed, 551 insertions, 1 deletions
diff --git a/blog/dedep.md b/blog/dedep.md
index a77dee7..c251880 100644
--- a/blog/dedep.md
+++ b/blog/dedep.md
@@ -132,7 +132,7 @@ join me in a De-Dependency December and fight for the users!
 [Fead]: https://trong.loang.net/~cnx/fead
 [SSG]: https://en.wikipedia.org/wiki/Static_site_generator
 [power]: https://www.youtube.com/watch?v=3Mpyias9ek4
-[context]: https://media.handmade-seattle.com/context-is-everything
+[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything
 [standards]: https://xkcd.com/927
 [utilities libraries]: https://raku-advent.blog/2021/12/11/unix_philosophy_without_leftpad_part2
 [old]: https://wiki.debian.org/DontBreakDebian#Don.27t_suffer_from_Shiny_New_Stuff_Syndrome
diff --git a/blog/pixml.md b/blog/pixml.md
new file mode 100644
index 0000000..79b8129
--- /dev/null
+++ b/blog/pixml.md
@@ -0,0 +1,550 @@
++++
+rss = "Comments for Static Sites without JavaScript via Emails"
+date = Date(2023, 3, 17)
+tags = ["fun", "recipe"]
++++
+
+# XML and Photo Gallery Generation: A Love Story
+
+> I'm just a language, whose style sheets are good\
+> Oh, Lord, please, don't let me be misunderstood
+
+!!! note "Tips"
+
+    As usual, the article starts with a text wall of random rambling.
+    If you are only interested in the technical aspects, feel free to skip
+    the first two sections.
+
+\toc
+
+## Introduction
+
+Neural-optic live streaming probably, no, definitely offers
+the most photorealistic graphics one can set eyes on. [CGI] is just
+a pathetic mimic, and photography or videography is no more
+than a poor plagiarism attempt when compared to quantum ray-tracing
+and other advanced physics simulations^W happenings.
+
+On the other hand, we humen are rather shite at replaying visual memories,
+whilst ([bit rot] aside) media can be archived [for forever]. Besides,
+many of us are too busy to *touch grass* or go see cool things
+as regularly as we wish to.
+This is how an industry based on showing us
+[mundane stuff] or [obvious bullcrap] can still manage to make tens
+of thousands of [craploads] each year and why the interwebs are flooded
+with pictures of cats, kitties and pussies.
+
+Finding new shits means dopamine dispensation and that's why
+[they are dope][new is always better]. As a model netizen, I adhere
+to the web's social contract of mutual [shitposting] so that everyone
+can have a piece. Every blue moon, I also enjoy posting more quality
+stuff like what you are reading right now, should you ignore the number
+of [Mozart] references in the last three paragraphs.
+
+## Motivation
+
+Some other times, I also want to share the living things and sceneries
+I encounter in the [new][move] place. My camera was gifted by my father
+before I moved and yet I shared more photos [with strangers][pixelfed]
+than with my family. The PixelFed instance I landed on irreversibly
+shrank and lossily compressed them, while dumping 5 MB images to the family
+chat room just feels weird, hence I decided to gather the decency
+to build a photo gallery to show my loved ones (and admittedly,
+flex with online strangers).
+
+There are not many [CMS] in the wild for photo hosting, and those that exist
+often act as a walled garden and/or a social network.
+Building and hosting a new one is quite overkill, thus the obvious
+solution left would be generating a static site. Out of the gazillion [SSG],
+I couldn't find any that meets my requirements:
+
+1. Generate a [web feed]
+2. Automate filling [image] title and alt text
+3. Offer fine-grained control for permanent [pagination]
+4. Generate thumbnails with custom size and name
+
+I mean, they perhaps exist, but the number I had to try and fight through
+would cost more time than writing the web pages and feed by hand.
+So I wrote them from scratch. Y'all can stand up and clap now!
+
+## Preliminary
+
+Yes, I really started with writing [XHTML] and [Atom] by hand.
+A web page has the following structure with namespaces omitted
+and denoted in WXML ([Wisp]$\times$[SXML]) so I don't have
+to close the tags (have I given up on XML too early?-).
+
+!!! note "Syntax hints"
+
+    For the uninitiated, any indentation or colon in Wisp represents
+    an additional nest level, while a dot escapes the nesting. The at signs
+    are used by SXML to denote attributes, which may remind you of [XPath].
+    For example, the anchor to the previous page is `<a href="41">PREV</a>`.
+
+```
+html
+  head
+    link
+      @ : rel "alternate"
+          type "application/atom+xml"
+          href "/atom.xml"
+    ...
+  body
+    nav
+      a : @ : href "41"
+          . "PREV"
+      h1 "PAGE 42"
+      a : @ : href "43"
+          . "NEXT"
+    article
+      @ : id "foobar"
+      h2
+        a : @ : href "#foobar"
+            . "foobar"
+      a : @ : href "/42/foo.jpg"
+          img
+            @ : src "/42/foo.small.jpg"
+                alt "pic of foo"
+                title "pic of foo"
+      a : @ : href "/42/bar.jpg"
+          ...
+    article ...
+    ...
+    footer ...
+```
+
+So far, adding an `article` is not yet too cumbersome: there's only a bit
+of redundancy for permanent links and the nesting level is acceptable
+with the deepest being `/html/body/article/a/img`. It gets more repetitive
+once we publish it to the linked Atom feed:
+
+```
+feed
+  entry
+    link
+      @ : rel "alternate"
+          type "application/xhtml+xml"
+          href "https://gallery.example/42/#foobar"
+    id "https://gallery.example/42/#foobar"
+    title "foobar"
+    content
+      @ : type "xhtml"
+      div
+        img
+          @ : src "https://gallery.example/42/foo.jpg"
+              alt "pic of foo"
+              title "pic of foo"
+        img ...
+    updated ...
+  entry ...
+  ...
+```
+
+Since web feeds are standalone documents, they must always use absolute URLs.
+(Welp that's not entirely true, [XML Base] does exist, but not all readers
+support it, and more importantly, certain elements such as `atom:id` disallow
+relative references.)
+In addition, whilst the web page links a thumbnail
+to the original image to save bandwidth, the feed can be consumed one post
+at a time, and thus points to the full size version. Therefore,
+copying the markup to embed it inside the Atom feed is error-prone and doesn't
+exactly spark joy.
+
+!!! note "Fun fact"
+
+    What does spark joy is that we can embed XHTML directly into the web feed,
+    which means the content is still XML and we don't need to quote it in CDATA.
+    For other sites where contents don't accumulate up to hundreds of megabytes,
+    this will allow us to slap some (SPOILER ALERT!) stylesheet on the Atom feed
+    and let the user agent render it in a [human-readable form][XSL].
+
+## Approach
+
+I actually already spoiled it in the epigraph,[^spoiler] but for the sake
+of completeness let us [discuss a few possible solutions][efficiency].
+What I wanted was to reduce the redundancy of manual input, in other words,
+a system transforming a custom information-dense format to standard
+yet sparser ones, which in this case are XHTML and Atom. Given some new photos
+and their relevant data, the purpose was to minimize the publishing friction.
+
+It's worth mentioning that the goal was not to minimize just the input format,
+the transformation speed, or the feedback latency, but all of the above,
+plus the cost of constructing the tool, incrementally as our requirements
+slightly change over time. Our choice for the base [programming system]
+shall affect each and every one of these aspects and more.
+
+Some technical dimensions are more equal than others, though.
+For this use case, IMHO an immediate feedback loop should be given
+the number one priority, not only because it'd be frustrating
+to have to complete multiple rituals just to preview the changes,
+but also as watching and reflecting file system changes is (sadly still)
+a difficult problem.
+
+For Linux[^interjection] there's [inotify] which doesn't suck,
+except when it does and misses events, and the standard POSIX build tool
+[make] relies on [mtime which is also flaky][mtime]. Some SSG
+work around this by spawning a server with a more sophisticated
+caching mechanism and even include an HTTP server sending out refresh events.
+Implementing such a system is easily [more expensive][automation] than doing
+the original task manually.
+
+Luckily, there is another way. *After* the birth of imperative
+DOM manipulation programs running on VM inside browsers (Ecma scripts),
+there came a (now forgotten) art of purely functional DOM transformation.
+More specifically, [XSLT] can declaratively transform any XML document
+to another, and its best part is that modern browsers natively support it,
+i.e. there's no difference between editing the input document
+and the hypothetical output XHTML. For better portability
+and rendering performance, we can still generate the latter
+ahead-of-time (AoT) during deployment.
+
+## Implementation
+
+Going back to the example, the input format could boil down
+to a more concise XML file, e.g. `42/index.xml`:
+
+```
+page
+  @ : prev "41"
+      curr "42"
+      next "43"
+  post : @ : title "foobar"
+    picture
+      @ : filename "foo"
+          desc "pic of foo"
+    picture ...
+    ...
+    time ...
+  post ...
+  ...
+```
+
+### Page Generation
+
+The stylesheet should then be declared at the beginning of the file,
+so that the user agent can automatically fetch and apply it
+to render the output XHTML:
+
+```
+<?xml-stylesheet href="/page.xslt" type="text/xsl"?>
+```
+
+XSLT is essentially a templating language, similar to PHP (which is also older)
+and template libraries in your favorite languages. For the ease of reading,
+I will let the target document's namespace be the default, while aliasing
+the transformation one as `xsl`. The stylesheet for the web pages would
+look something like the following, which should be self-explanatory.
+
+```
+xsl:stylesheet
+  xsl:template : @ : match "/page"
+    xsl:variable : @ : name "base"
+      xsl:text "/"
+      xsl:value-of : @ : select "@curr"
+      xsl:text "/"
+    html
+      head ...
+      body
+        nav
+          xsl:if : @ : test "@prev != ''"
+            a : @ : href "/{@prev}/"
+                . "PREV"
+          h1 : xsl:text "PAGE "
+               xsl:value-of : @ : select "@curr"
+          xsl:if : @ : test "@next != ''"
+            ...
+        xsl:for-each : @ : select "post"
+          xsl:variable : @ : name "id"
+            xsl:value-of
+              @ : select "translate(@title, ' ', '-')"
+          article
+            @ : id "{$id}"
+            h2
+              a : @ : href "#{$id}"
+                  xsl:value-of : @ : select "@title"
+            xsl:for-each : @ : select "picture"
+              a : @ : href "{$base}{@filename}.jpg"
+                  img
+                    @ : src "{$base}{@filename}.small.jpg"
+                        alt "{@desc}"
+                        title "{@desc}"
+        footer ...
+```
+
+### Feed Generation
+
+Similarly, for Atom entries on a single page,
+
+```
+xsl:stylesheet
+  xsl:variable : @ : name "root"
+    . "https://gallery.example/"
+  xsl:template : @ : match "/page"
+    xsl:variable : @ : name "base"
+      xsl:value-of : @ : select "$root"
+      xsl:value-of : @ : select "@curr"
+      xsl:text "/"
+    xsl:for-each : @ : select "post"
+      xsl:variable : @ : name "url"
+        xsl:value-of : @ : select "$base"
+        xsl:text "#"
+        xsl:value-of
+          @ : select "translate(@title, ' ', '-')"
+      entry
+        link
+          @ : rel "alternate"
+              type "application/xhtml+xml"
+              href "{$url}"
+        id : xsl:value-of : @ : select "$url"
+        title : xsl:value-of : @ : select "@title"
+        content
+          @ : type "xhtml"
+          div
+            xsl:for-each : @ : select "picture"
+              img
+                @ : src "{$base}{@filename}.jpg"
+                    alt "{@desc}"
+                    title "{@desc}"
+        updated : xsl:value-of : @ : select "time"
+```
+
+The trickier part here is concatenating the entries together.
+Simple enough, instead of linking to the stylesheet in the data,
+we can read XML files directly from XSLT.
+
+```
+xsl:template
+  @ : match "/"
+  ...
+  xsl:apply-templates
+    @ : select "document('42/index.xml')/page"
+  xsl:apply-templates ...
+  ...
+```
+
+This allows us to do other cool things, such as embedding SVG in XHTML
+to make use of the parent element's [currentcolor], while keeping
+the source files separate. It is especially useful for monochromatic icons,
+e.g.
+
+```
+xsl:copy-of : @ : select "document('cc.svg')/*"
+xsl:copy-of : @ : select "document('by.svg')/*"
+xsl:copy-of : @ : select "document('sa.svg')/*"
+```
+
+### Thumbnail Generation
+
+So far, we have met three out of the [four requirements](#motivation);
+the only thing left is creating the thumbnails. Inspired by Ethan Dalool,
+I am going for [fairly large ones of 1024 px in width][big thumbs],
+
+> large enough to comfortably browse the photos without clicking through
+> to the big version of each, and the thumbnails are decently light
+> and not too jpeggy at about 125-150 kilobytes on average.
+
+At such size, I can aim for around ten photoes[^toes] per page
+while maintaining a somewhat decent load time. Plus, since the width
+of images is hardcoded, page [margin] could be automatically inferred
+to never stretch them.
+
+```css
+html {
+    box-sizing: border-box;
+    margin: auto;
+    max-width: calc(1024px + 2ch);
+}
+body { margin: 0 1ch }
+```
+
+To generate the thumbnails, I use [epeg] together with `make` for wildcarding:
+
+```
+PICTURES := $(filter-out %.small.jpg $(PREFIX)/%.jpg, $(wildcard */*.jpg))
+THUMBNAILS := $(patsubst %.jpg,%.small.jpg,$(PICTURES))
+
+%.small.jpg: %.jpg
+	epeg -w 1024 -p -q 80 $< $@
+```
+
+The Makefile also defines rules for AoT compilation using [xsltproc]
+for the web pages and feed. Apparently no feed reader supports XSLT,
+and for pages runtime processing negatively affects the performance
+due to the multiple round trips for the stylesheet and the vector icons.
+
+```
+DATA := $(wildcard */index.xml) index.xml
+PAGES := $(patsubst %.xml,%.xhtml,$(DATA))
+OUTPUTS := $(THUMBNAILS) $(PAGES) atom.xml
+
+all: $(OUTPUTS)
+
+index.xml: $(LATEST)/index.xml
+	ln -fs $< $@
+
+%.xhtml: %.xml page.xslt
+	xsltproc page.xslt $< > $@
+
+atom.xml: atom.xslt $(DATA) $(wildcard *.svg)
+	xsltproc atom.xslt > atom.xml
+```
+
+The [full implementation][src] is deployed to [px.cnx.gdn],
+mirrored to the [OpenNIC] domain [pix.sinyx.indy] reusing
+the former's TLS certificate, because CA/Browser Forum
+disallows support for domains not recognized by ICANN and no
+[CA for OpenNIC] is mature enough.
+
+## Discussion
+
+> *Okay you built your site using XML macros, so what?
+> The syntax is clunky and you hate it so much yourself
+> that not even a single line of code example here is in actual XML.
+> Doesn't seem like a love story to me!*
+
+Like all relationships, it's not that simple. I've learned to not judge
+a book by its cover and come to the understanding that XML is the (ugly)
+equivalent of [sexp].[^sex] Unlike afterthoughts such as C preprocessors,
+[Django]-like templates, or even the Wisp-lookalike syntax of [Slim],
+XML stylesheets are in the same data structure as the documents
+they transform. To put it another way,
+one can use XSLT to generate XSLT from XSLT. Do I need it in this case
+or ever at all? Probably not, but that certainly makes XSL a lot more
+attractive in my eyes.
+
+Furthermore, the tooling for XML is highly mature, from editors to linters
+and processors to rendering engines. It'd be lying to say you ain't
+fascinated that tis possible to directly feed browsers pure data
+instead of markup representations. More than that, one can have
+entirely static API endpoints that are both human- and machine-readable.
+
+> *XSL is just declarative JS! You are so blinded
+> by your lust for functional programming that you have
+> become [the very thing you swore to destroy](/blog/reply)!*
+
+My distaste for Ecma scripts is not due to DOM manipulation.
+Sure, I do find in-place modification inelegant for documents,
+but if only that were the only issue. I block them on most sites
+because they can interact with many things other than just the DOM,
+posing [privacy] and [security] risks while [fucking up the UX].
+
+Architecturally, Ecma scripts enable the absolute bloody worst possible
+kind of web pages with zero data at all, fetching tiny pieces of content
+in JSON and turning performance [to shit]. The user agents then try to salvage
+efficiency by turning themselves into a distributed system component
+and adding optimizations that shall never be (ab)used for the sake of users.
+O ye [cycle of doom]!
+
+Note that one can make a similar mistake with XSL regarding the number
+of round trips, and XML stylesheets can provide the same front-end/back-end
+separation. Both can be used to provide hot loading during development
+and AoT rendering in production (if not all, then many JS libraries support
+pre-rendering, ignoring the monstrous [dependency graph](/blog/dedep)).
+At the end of the day, it's not a matter of technology but of principle:
+to be in the [users' best interest].
+
+> *There is nothing complex about the photo gallery,
+> any existing SSG can do the same with minor tweaks!
+> You never needed to write a new one to begin with!*
+
+I am wondering the same myself, but keep in mind there are details
+I've been hiding in the example. I went all-in for the semantic web
+with the hope for best portability and accessibility. One thing
+I haven't mentioned is the `lang` attribute, e.g. `en`, `vi` or `fr`
+depending on the post. Adding this to the web pages requires the SSG
+to be somewhat modular, and it is even harder for the web feed.
+
+Moreover, generic SSG are not designed to handle the difference
+in content between a page's `article` and the feed's corresponding `entry`,
+nor to have multiple posts in a single page. Pagination is
+also commonly implemented backwards, i.e.
+page 2 being the second latest one,
+making it impossible to avoid [link rot].
+
+Not to suggest that the majority of SSG are poorly designed, just that
+from a certain amount of [context] difference, tis cheaper to just redesign
+from scratch. This is not about XSL vs Go/Python/JS for SSG or web dev
+in general, but this specific and happen-to-be-far-from-complex case.
+
+## Conclusion
+
+At the time of writing, XML has pretty much been superseded by JSON or YAML,
+for better or worse. I have no love for YAML for obvious reasons,
+but it also saddens me to sometimes see JSON being solely used as a container
+for HTML. I hope that this essay can [awaken something in you] about XML
+and remind you about the semantic web in your next project. It worked out
+for me, maybe it'll work out for you too!
+
+The story between XML and my photo gallery is a fond love story.
+They were born for each other, there was no drama, everything just werkt.
+Their romance inspires me to better appreciate stability and maturity,
+and value those right in front of my eyes that I had been *too blind to see*.
+Anyway, this is getting too long, so Imma end it with another [song].
+
+> Lookin' for perfect\
+> Surrounded by artificial\
+> You're the closest thing to real I've seen\
+> Sure, everyone has their problems\
+> That's a given\
+> Yours are the easiest to tolerate
+
+[^spoiler]: If you know, you know.
+[^interjection]: Yup, just the kernel.
+[^toes]: *Thumb*nails, pho*toes*, get it?-)
+[^sex]: Or conventionally in most Lisp 1's, `sex?`.
+
+[CGI]: https://en.wikipedia.org/wiki/Computer-generated_imagery
+[bit rot]: https://en.wikipedia.org/wiki/Data_degradation
+[for forever]: https://xkcd.com/1683
+[mundane stuff]: https://en.wikipedia.org/wiki/Drama
+[obvious bullcrap]: https://en.wikipedia.org/wiki/Fiction
+[craploads]: https://antifandom.com/how-i-met-your-mother/wiki/Crapload
+[new is always better]: https://www.youtube.com/watch?v=1SNRULEnTVQ
+[shitposting]: https://fe.disroot.org/@mcsinyx
+[Mozart]: https://peervideo.club/w/uByA7Czy7PWYMqnu8FgXvW
+
+[move]: https://github.com/zig-community/user-map/pull/120
+[pixelfed]: https://fotofed.nl/cnx
+[CMS]: https://en.wikipedia.org/wiki/Content_management_system
+[SSG]: https://en.wikipedia.org/wiki/Static_site_generator
+[web feed]: https://en.wikipedia.org/wiki/Web_feed
+[image]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
+[pagination]: https://en.wikipedia.org/wiki/Pagination
+
+[XHTML]: https://en.wikipedia.org/wiki/XHTML
+[Atom]: https://www.rfc-editor.org/rfc/rfc4287
+[Wisp]: https://www.draketo.de/software/wisp
+[SXML]: https://okmij.org/ftp/Scheme/SXML.html
+[XPath]: https://www.w3.org/TR/xpath
+[XML Base]: https://www.w3.org/TR/xmlbase
+[XSL]: https://simonesilvestroni.com/blog/build-a-human-readable-rss-with-jekyll
+
+[efficiency]: https://xkcd.com/1445
+[programming system]: https://programming-journal.org/2023/7/13
+[inotify]: https://man7.org/linux/man-pages/man7/inotify.7.html
+[make]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html
+[mtime]: https://apenwarr.ca/log/20181113
+[automation]: https://xkcd.com/1319
+[XSLT]: https://www.w3.org/standards/xml/transformation
+
+[currentcolor]: https://developer.mozilla.org/en-US/docs/Web/CSS/color_value#currentcolor_keyword
+[big thumbs]: https://voussoir.net/writing/sharing_photos
+[epeg]: https://github.com/mattes/epeg
+[margin]: https://en.wikipedia.org/wiki/Margin_(typography)
+[xsltproc]: https://gnome.pages.gitlab.gnome.org/libxslt/xsltproc.html
+[src]: https://trong.loang.net/~cnx/px
+[px.cnx.gdn]: https://px.cnx.gdn
+[OpenNIC]: https://www.opennic.org
+[pix.sinyx.indy]: https://pix.sinyx.indy
+[CA for OpenNIC]: https://wiki.opennic.org/opennic/tls
+
+[sexp]: https://en.wikipedia.org/wiki/S-expression
+[Django]: https://docs.djangoproject.com/en/dev/topics/templates
+[Slim]: https://github.com/slim-template/slim
+[privacy]: https://en.wikipedia.org/wiki/Mouse_tracking
+[security]: https://react-etc.net/entry/exploiting-speculative-execution-meltdown-spectre-via-javascript
+[fucking up the UX]: https://meta.stackexchange.com/q/2980/698165
+[to shit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
+[cycle of doom]: https://en.wikipedia.org/wiki/Wirth%27s_law
+[users' best interest]: https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys
+[link rot]: https://en.wikipedia.org/wiki/Link_rot
+[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything
+
+[awaken something in you]: https://www.youtube.com/watch?v=F3QPWrLFsOA
+[song]: https://www.youtube.com/watch?v=5LvOdWi3Qno
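
For readers who would rather see the feed transformation spelled out imperatively, here is a rough Python analogue of what the `/page` template of the feed stylesheet in `blog/pixml.md` derives from one page file. It is only a sketch under assumptions, not the author's implementation (which is XSLT): the sample document, the `entries` helper and the exact shape of the `time` element are hypothetical, and `https://gallery.example/` is the placeholder root used in the post.

```python
import xml.etree.ElementTree as ET

# Assumed root URL, the same placeholder the post uses.
ROOT = "https://gallery.example/"

# Hypothetical 42/index.xml in the concise input format, as plain XML.
PAGE = """\
<page prev="41" curr="42" next="43">
  <post title="foo bar">
    <picture filename="foo" desc="pic of foo"/>
    <picture filename="bar" desc="pic of bar"/>
    <time>2023-03-17T12:00:00Z</time>
  </post>
</page>
"""


def entries(page_xml, root=ROOT):
    """Collect, per post, the values the feed template would emit."""
    page = ET.fromstring(page_xml)
    # Absolute base of the page, like the `base` variable in the stylesheet.
    base = f"{root}{page.get('curr')}/"
    result = []
    for post in page.findall("post"):
        title = post.get("title")
        # Same effect as XPath translate(@title, ' ', '-').
        url = f"{base}#{title.replace(' ', '-')}"
        # The feed points at full-size images, not the thumbnails.
        images = [(f"{base}{pic.get('filename')}.jpg", pic.get("desc"))
                  for pic in post.findall("picture")]
        result.append({"id": url, "title": title, "images": images,
                       "updated": post.findtext("time")})
    return result


feed = entries(PAGE)
print(feed[0]["id"])  # https://gallery.example/42/#foo-bar
```

The same permalink serves as both the `link` target and the `atom:id` of the entry, which is why the stylesheet computes it once in a variable and reuses it.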