+++
rss = "How I make my photo gallery in XML and what's lovely about it"
date = Date(2023, 3, 17)
tags = ["fun", "recipe", "net"]
+++

# XML and Photo Gallery Generation: A Love Story

> I'm just a language, whose style sheets are good\
> Oh, Lord, please, don't let me be misunderstood

!!! note "Tips"
    As usual, the article starts with a text wall of random rambling.
    If you are only interested in the technical aspects,
    feel free to skip the first two sections.

\toc

## Introduction

Neural-optic live streaming probably, no, definitely offers the most photorealistic graphics one can set eyes on. [CGI] is just a pathetic mimic, and photography or videography is no more than a poor plagiarism attempt when compared to quantum ray-tracing and other advanced physics simulations^W happenings.

On the other hand, we humen are rather shite at replaying visual memories, whilst ([bit rot] aside) media can be archived [for forever]. Besides, many of us are too busy to *touch grass* or go see cool things as regularly as we wish to.

This is how an industry based on showing us [mundane stuff] or [obvious bullcrap] can still manage to make tens of thousands of [craploads] each year and why the interwebs are flooded with pictures of cats, kitties and pussies. Finding new shits means dopamine dispensation and that's why [they are dope][new is always better].

As a model netizen, I adhere to the web's social contract of mutual [shitposting] so that everyone can have a piece. Every blue moon, I also enjoy posting more quality stuff like what you are reading right now, should you ignore the number of [Mozart] references in the last three paragraphs.

## Motivation

Some other times, I also want to share the living things and sceneries I encounter in the [new][move] place. My camera was gifted by my father before I moved and yet I shared more photos [with strangers][pixelfed] than with my family.
The PixelFed instance I landed on irreversibly shrank and lossily compressed them, while dumping 5 MB images to the family chat room just feels weird, hence I decided to gather the decency to build a photo gallery to show my loved ones (and admittedly, flex with online strangers).

There are not many [CMS] in the wild for photo hosting, and they often act as a walled garden and/or a social network. Building and hosting a new one is quite overkill, thus the obvious solution left would be generating a static site. Out of the gazillion [SSG], I couldn't find any that meets my requirements:

1. Generate a [web feed]
2. Automate filling [image] title and alt text
3. Offer fine-grained control for permanent [pagination]
4. Generate thumbnails with custom size and name

I mean, they perhaps exist, but the number I had to try and fight through would cost more time than writing the web pages and feed by hand. So I wrote them from scratch. Y'all can stand up and clap now!

## Preliminary

Yes, I really started with writing [XHTML] and [Atom] by hand. A web page has the following structure with namespaces omitted and denoted in WXML ([Wisp]$\times$[SXML]) so I don't have to close the tags (have I given up on XML too early?-).

!!! note "Syntax hints"
    For the uninitiated, any indentation or colon in Wisp
    represents an additional nest level, while a dot escapes the nesting.
    The at signs are used by SXML to denote attributes,
    which may remind you of [XPath]. For example,
    the anchor to the previous page is `<a href="41">PREV</a>`.

```
html
  head
    link
      @ : rel "alternate"
        type "application/atom+xml"
        href "/atom.xml"
    ...
  body
    nav
      a : @ : href "41"
        . "PREV"
      h1 "PAGE 42"
      a : @ : href "43"
        . "NEXT"
    article
      @ : id "foobar"
      h2
        a : @ : href "#foobar"
          . "foobar"
      a : @ : href "/42/foo.jpg"
        img
          @ : src "/42/foo.small.jpg"
            alt "pic of foo"
            title "pic of foo"
      a : @ : href "/42/bar.jpg"
        ...
    article
      ...
    ...
    footer
      ...
```

So far, adding an `article` is not yet too cumbersome: there's only a bit of redundancy for permanent links and the nesting level is acceptable with the deepest being `/html/body/article/a/img`. It gets more repetitive once we publish it to the linked Atom feed:

```
feed
  entry
    link
      @ : rel "alternate"
        type "application/xhtml+xml"
        href "https://gallery.example/42/#foobar"
    id "https://gallery.example/42/#foobar"
    title "foobar"
    content
      @ : type "xhtml"
      div
        img
          @ : src "https://gallery.example/42/foo.jpg"
            alt "pic of foo"
            title "pic of foo"
        img
          ...
    updated ...
  entry
    ...
  ...
```

Since web feeds are standalone documents, they must always use absolute URLs. (Welp that's not entirely true, [XML Base] does exist, but not all readers support it, and more importantly, certain elements such as `atom:id` disallow relative references.) In addition, whilst the web page links a thumbnail to the original image to save bandwidth, the feed can be consumed one post at a time, which thus points to the full size version. Therefore, copying the markup to embed it inside the Atom is error-prone and doesn't exactly spark joy.

!!! note "Fun fact"
    What does spark joy is that we can embed XHTML directly
    into the web feed, which means the content is still XML
    and we don't need to quote it in CDATA. For other sites
    where contents don't accumulate up to hundreds of megabytes,
    this will allow us to slap some (SPOILER ALERT!) stylesheet on the Atom feed
    and let the user agent render it in a [human-readable form][XSL].

## Approach

I actually already spoiled it in the epigraph,[^spoiler] but for the sake of completeness let us [discuss a few possible solutions][efficiency]. What I wanted was to reduce the redundancy of manual input, in other words, a system transforming a custom information-dense format to standard yet sparser ones, which in this case are XHTML and Atom. Given some new photos and their relevant data, the purpose was to minimize the publishing friction.
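
To put a number on that redundancy, here is a minimal sketch in Python, used purely for illustration and not what the gallery itself runs on. It derives the strings repeated across the two hand-written snippets from three inputs. The function name `derive`, the dictionary keys and the space-to-hyphen slug rule are all assumptions made up for this example; only the URL shapes come from the snippets.

```python
from urllib.parse import urljoin

# Example host from the feed snippet; every other name below is hypothetical.
ROOT = "https://gallery.example/"


def derive(page, title, filename):
    """Expand one compact record into the strings both outputs repeat."""
    base = f"/{page}/"
    # Assumed slug rule: spaces become hyphens so the title can serve as an id.
    anchor = title.replace(" ", "-")
    return {
        # Used by the web page, relative to the site root
        "page_anchor": f"#{anchor}",
        "page_thumbnail": f"{base}{filename}.small.jpg",
        "page_original": f"{base}{filename}.jpg",
        # Used by the feed, which needs absolute URLs
        "feed_id": urljoin(ROOT, f"{page}/#{anchor}"),
        "feed_image": urljoin(ROOT, f"{page}/{filename}.jpg"),
    }


if __name__ == "__main__":
    for key, value in derive("42", "foobar", "foo").items():
        print(f"{key}: {value}")
```

Three values in, five strings out, and that is before counting the alt and title texts that are duplicated verbatim.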

It's worth mentioning that the goal was not to minimize the input format, the transformation speed, or feedback latency, but all of the above, plus the cost of constructing the tool, incrementally as our requirements slightly change over time. Our choice for the base [programming system] shall affect each and every one of these aspects and more.

Some technical dimensions are [more equal] than others, though. For this use case, IMHO an immediate feedback loop should be given the number one priority, not only because it'd be frustrating to have to complete multiple rituals just to preview the changes, but also as watching and reflecting file system changes is (sadly still) a difficult problem. For Linux[^interjection] there's [inotify] which doesn't suck, except when it does and misses events,[^entr] and the standard POSIX build tool [make] relies on [mtime which is also flaky][mtime]. Some SSG work around this by spawning up a server with a more sophisticated caching mechanism and even include an HTTP server sending out refresh events. Implementing such a system is easily [more expensive][automation] than doing the original task manually.

Luckily, there is another way. *After* the birth of imperative DOM manipulation programs running on VM inside browsers (Ecma scripts), there came a (now forgotten) art of purely functional DOM transformation. More specifically, [XSLT] can declaratively transform any XML document to another, and its best part is that modern browsers natively support it, i.e. there's no difference between editing the input document and the hypothetical output XHTML. For better portability and rendering performance, we can still generate the latter ahead-of-time (AoT) during deployment.

## Implementation

Going back to the example, the input format could boil down to a more concise XML file, e.g. `42/index.xml`:

```
page
  @ : prev "41"
    curr "42"
    next "43"
  post
    @ : title "foobar"
      time ...
    picture
      @ : filename "foo"
        desc "pic of foo"
    picture
      ...
    ...
  post
    ...
  ...
```

### Page Generation

The stylesheet should then be declared at the beginning of the file, so that the user agent can automatically fetch and apply it to render the output XHTML:

```
<?xml-stylesheet type="text/xsl" href="/page.xslt"?>
```

XSLT is essentially a templating language, similar to PHP (which is also older) and template libraries in your favorite languages. For ease of reading, I will let the target document's namespace be the default, while aliasing the transformation one as `xsl`. The stylesheet for the web pages would look something like the following, which should be self-explanatory.

```
xsl:stylesheet
  xsl:template : @ : match "/page"
    xsl:variable : @ : name "base"
      xsl:text "/"
      xsl:value-of : @ : select "@curr"
      xsl:text "/"
    html
      head
        ...
      body
        nav
          xsl:if : @ : test "@prev != ''"
            a : @ : href "/{@prev}/"
              . "PREV"
          h1 : xsl:text "PAGE "
            xsl:value-of : @ : select "@curr"
          xsl:if : @ : test "@next != ''"
            ...
        xsl:for-each : @ : select "post"
          xsl:variable : @ : name "id"
            xsl:value-of
              @ : select "translate(@title, ' ', '-')"
          article
            @ : id "{$id}"
            h2
              a : @ : href "#{$id}"
                xsl:value-of : @ : select "@title"
            xsl:for-each : @ : select "picture"
              a : @ : href "{$base}{@filename}.jpg"
                img
                  @ : src "{$base}{@filename}.small.jpg"
                    alt "{@desc}"
                    title "{@desc}"
        footer
          ...
```

### Feed Generation

Similarly, for Atom entries on a single page,

```
xsl:stylesheet
  xsl:variable : @ : name "root"
    . "https://gallery.example/"
  xsl:template : @ : match "/page"
    xsl:variable : @ : name "base"
      xsl:value-of : @ : select "$root"
      xsl:value-of : @ : select "@curr"
      xsl:text "/"
    xsl:for-each : @ : select "post"
      xsl:variable : @ : name "url"
        xsl:value-of : @ : select "$base"
        xsl:text "#"
        xsl:value-of
          @ : select "translate(@title, ' ', '-')"
      entry
        link
          @ : rel "alternate"
            type "application/xhtml+xml"
            href "{$url}"
        id : xsl:value-of : @ : select "$url"
        title : xsl:value-of : @ : select "@title"
        content
          @ : type "xhtml"
          div
            xsl:for-each : @ : select "picture"
              img
                @ : src "{$base}{@filename}.jpg"
                  alt "{@desc}"
                  title "{@desc}"
        updated : xsl:value-of : @ : select "@time"
```

The trickier part here is concatenating the entries together. Simple enough, instead of linking to the stylesheet in the data, we can read XML files directly from XSLT.

```
xsl:template
  @ : match "/"
  ...
  xsl:apply-templates
    @ : select "document('42/index.xml')/page"
  xsl:apply-templates
    ...
  ...
```

This allows us to do other cool things, such as embedding SVG in XHTML to make use of the parent element's [currentcolor], while keeping the source files separate. It is especially useful for monochromatic icons, e.g.

```
xsl:copy-of : @ : select "document('cc.svg')/*"
xsl:copy-of : @ : select "document('by.svg')/*"
xsl:copy-of : @ : select "document('sa.svg')/*"
```

### Thumbnail Generation

So far, we have met three out of the [four requirements](#motivation), the only thing left is creating the thumbnails. Inspired by Ethan Dalool, I am going for [fairly large ones of 1024 px in width][big thumbs],

> large enough to comfortably browse the photos without clicking through
> to the big version of each, and the thumbnails are decently light
> and not too jpeggy at about 125-150 kilobytes on average.

At such a size, I can aim for around ten photoes[^toes] per page while maintaining a somewhat decent load time. Plus, since the width of images is hardcoded, page [margin] could be automatically inferred to never stretch them.

```css
html {
    box-sizing: border-box;
    margin: auto;
    max-width: calc(1024px + 2ch);
}
body { margin: 0 1ch }
```

To generate the thumbnails, I use [epeg] together with `make` for wildcarding:

```
PICTURES := $(filter-out %.small.jpg $(PREFIX)/%.jpg, $(wildcard */*.jpg))
THUMBNAILS := $(patsubst %.jpg,%.small.jpg,$(PICTURES))

%.small.jpg: %.jpg
	epeg -w 1024 -p -q 80 $< $@
```

The Makefile also defines rules for AoT compilation using [xsltproc] for the web pages and feed. Apparently no feed reader supports XSLT, and for pages runtime processing negatively affects the performance due to the multiple round trips for the stylesheet and the vector icons.

```
DATA := $(wildcard */index.xml) index.xml
PAGES := $(patsubst %.xml,%.xhtml,$(DATA))
OUTPUTS := $(THUMBNAILS) $(PAGES) atom.xml

all: $(OUTPUTS)

index.xml: $(LATEST)/index.xml
	ln -fs $< $@

%.xhtml: %.xml page.xslt
	xsltproc page.xslt $< > $@

atom.xml: atom.xslt $(DATA) $(wildcard *.svg)
	xsltproc atom.xslt > atom.xml
```

The [full implementation][src] is deployed to [px.cnx.gdn], mirrored to the [OpenNIC] domain [pix.sinyx.indy] reusing the former's TLS certificate, because CA/Browser Forum disallows support for domains not recognized by ICANN and no [CA for OpenNIC] is mature enough.

## Discussion

> *Okay you built your site using XML macros, so what?
> The syntax is clunky and you hate it so much yourself
> that not even a single line of code example here is in actual XML.
> Doesn't seem like a love story to me!*

Like all relationships, it's not that simple. I've learned to not judge a book by its cover and come to the understanding that XML is the (ugly) equivalent of [sexp].[^sex] Unlike afterthoughts such as C preprocessors, [Django]-like templates, or even the Wisp-lookalike syntax of [Slim], XML stylesheets are in the same data structure as the documents they transform. To put it another way, one can use XSLT to generate XSLT from XSLT. Do I need it in this case or ever at all?
Probably not, but that certainly makes XSL a lot more attractive in my eyes.

Furthermore, the tooling for XML is highly mature, from editors to linters and processors to rendering engines. It'd be lying to say you ain't fascinated that tis possible to directly feed browsers pure data instead of markup representations. More than that, one can have entirely static API endpoints that are both human- and machine-readable.

> *XSL is just declarative JS! You are so blinded
> by your lust for functional programming that you have
> become [the very thing you swore to destroy](/blog/reply)!*

My distaste for Ecma scripts is not due to DOM manipulation. Sure, I do find in-place modification inelegant for documents, but if only that were the only issue. I block them on most sites because they can interact with many things other than just the DOM, posing [privacy] and [security] risks while [fucking up the UX].

Architecturally, Ecma scripts enable the absolute bloody worst possible kind of web pages with zero data at all, fetching tiny pieces of content in JSON and turning performance [to shit]. The user agents then try to salvage efficiency by turning themselves into a distributed system component and adding optimizations that shall never be (ab)used for the sake of users. O ye [cycle of doom]!

Note that one can make a similar mistake with XSL regarding the number of round trips, and XML stylesheets can provide the same front-end/back-end separation. Both can be used to provide hot loading during development and AoT rendering in production (if not all, then many JS libraries support pre-rendering, ignoring the monstrous [dependency graph](/blog/dedep)). At the end of the day, it's not a matter of technology but of principle: to be in the [users' best interest].

> *There is nothing complex about the photo gallery,
> any existing SSG can do the same with minor tweaks!
> You never needed to write a new one to begin with!*

I am wondering the same myself, but keep in mind there are details I've been hiding in the example. I went all-in for the semantic web with the hope for best portability and accessibility. One thing I haven't mentioned is the `lang` attribute, e.g. `en`, `vi` or `fr` depending on the post. Adding this to the web pages requires the SSG to be somewhat modular, and it is even harder for the web feed.

Moreover, generic SSG are not designed to handle the difference in content between a page's `article` and the feed's corresponding `entry`, nor for having multiple posts in a single page. Pagination is also commonly implemented backwards, i.e. page 2 being the second latest one, making it impossible to avoid [link rot].

Not to suggest that the majority of SSG are poorly designed, just that past a certain amount of [context] difference, tis cheaper to just redesign from scratch. This is not about XSL vs Go/Python/JS for SSG or web dev in general, but this specific and happen-to-be-far-from-complex case.

## Conclusion

At the time of writing, XML has pretty much been superseded by JSON or YAML, for better or worse. I have no love for YAML for obvious reasons, but it also saddens me to sometimes see JSON being solely used as a container for HTML. I hope that this essay can [awaken something in you] about XML and remind you about the semantic web in your next project. It worked out for me, maybe it'll work out for you too!

The story between XML and my photo gallery is a fond love story. They were born for each other, there was no drama, everything just werkt. Their romance inspires me to better appreciate stability and maturity, and value those right in front of my eyes that I had been *too blind to see*.

Anyway, this is getting too long, so Imma end it with another [song].

> Lookin' for perfect\
> Surrounded by artificial\
> You're the closest thing to real I've seen\
> Sure, everyone has their problems\
> That's a given\
> Yours are the easiest to tolerate

[^spoiler]: If you know, you know.
[^interjection]: Yup, just the kernel.
[^entr]: But in case it works for you, check out [entr].
[^toes]: *Thumb*nails, pho*toes*, get it?-)
[^sex]: Or conventionally in most Lisp 1's, `sex?`.

[CGI]: https://en.wikipedia.org/wiki/Computer-generated_imagery
[bit rot]: https://en.wikipedia.org/wiki/Data_degradation
[for forever]: https://xkcd.com/1683
[mundane stuff]: https://en.wikipedia.org/wiki/Drama
[obvious bullcrap]: https://en.wikipedia.org/wiki/Fiction
[craploads]: https://antifandom.com/how-i-met-your-mother/wiki/Crapload
[new is always better]: https://www.youtube.com/watch?v=1SNRULEnTVQ
[shitposting]: https://fe.disroot.org/@mcsinyx
[Mozart]: https://peervideo.club/w/uByA7Czy7PWYMqnu8FgXvW
[move]: https://github.com/zig-community/user-map/pull/120
[pixelfed]: https://fotofed.nl/cnx
[CMS]: https://en.wikipedia.org/wiki/Content_management_system
[SSG]: https://en.wikipedia.org/wiki/Static_site_generator
[web feed]: https://en.wikipedia.org/wiki/Web_feed
[image]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
[pagination]: https://en.wikipedia.org/wiki/Pagination
[XHTML]: https://en.wikipedia.org/wiki/XHTML
[Atom]: https://www.rfc-editor.org/rfc/rfc4287
[Wisp]: https://www.draketo.de/software/wisp
[SXML]: https://okmij.org/ftp/Scheme/SXML.html
[XPath]: https://www.w3.org/TR/xpath
[XML Base]: https://www.w3.org/TR/xmlbase
[XSL]: https://simonesilvestroni.com/blog/build-a-human-readable-rss-with-jekyll
[efficiency]: https://xkcd.com/1445
[programming system]: https://programming-journal.org/2023/7/13
[more equal]: https://en.wikipedia.org/wiki/Animal_Farm
[inotify]: https://man7.org/linux/man-pages/man7/inotify.7.html
[make]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html
[mtime]:
https://apenwarr.ca/log/20181113
[automation]: https://xkcd.com/1319
[XSLT]: https://www.w3.org/standards/xml/transformation
[currentcolor]: https://developer.mozilla.org/en-US/docs/Web/CSS/color_value#currentcolor_keyword
[big thumbs]: https://voussoir.net/writing/sharing_photos
[epeg]: https://github.com/mattes/epeg
[margin]: https://en.wikipedia.org/wiki/Margin_(typography)
[xsltproc]: https://gnome.pages.gitlab.gnome.org/libxslt/xsltproc.html
[src]: https://trong.loang.net/~cnx/px
[px.cnx.gdn]: https://px.cnx.gdn
[OpenNIC]: https://www.opennic.org
[pix.sinyx.indy]: https://pix.sinyx.indy
[CA for OpenNIC]: https://wiki.opennic.org/opennic/tls
[sexp]: https://en.wikipedia.org/wiki/S-expression
[Django]: https://docs.djangoproject.com/en/dev/topics/templates
[Slim]: https://github.com/slim-template/slim
[privacy]: https://en.wikipedia.org/wiki/Mouse_tracking
[security]: https://react-etc.net/entry/exploiting-speculative-execution-meltdown-spectre-via-javascript
[fucking up the UX]: https://meta.stackexchange.com/q/2980/698165
[to shit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
[cycle of doom]: https://en.wikipedia.org/wiki/Wirth%27s_law
[users' best interest]: https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys
[link rot]: https://en.wikipedia.org/wiki/Link_rot
[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything
[awaken something in you]: https://www.youtube.com/watch?v=F3QPWrLFsOA
[song]: https://www.youtube.com/watch?v=5LvOdWi3Qno
[entr]: https://eradman.com/entrproject