+++
rss = "How I make my photo gallery in XML and what's lovely about it"
date = Date(2023, 3, 17)
tags = ["fun", "recipe", "net"]
+++

# XML and Photo Gallery Generation: A Love Story

> I'm just a language, whose style sheets are good\
> Oh, Lord, please, don't let me be misunderstood

!!! note "Tips"

    As usual, the article starts with a text wall of random rambling.
    If you are only interested in the technical aspects, feel free to skip
    the first two sections.

\toc

## Introduction

Neural-optic live streaming probably, no, definitely offers
the most photorealistic graphics one can set eyes on.  [CGI] is just
a pathetic mimic, and photography or videography is no more
than a poor plagiarism attempt when compared to quantum ray-tracing
and other advanced physics simulations^W happenings.

On the other hand, we humen are rather shite at replaying visual memories,
whilst ([bit rot] aside) media can be archived [for forever].  Besides,
many of us are too busy to *touch grass* or go see cool things
as regularly as we wish to.  This is how an industry based on showing us
[mundane stuff] or [obvious bullcrap] can still manage to make tens
of thousands of [craploads] each year and why the interwebs are flooded
with pictures of cats, kitties and pussies.

Finding new shits means dopamine dispensation and that's why
[they are dope][new is always better].  As a model netizen, I adhere
to the web's social contract of mutual [shitposting] so that everyone
can have a piece.  Every blue moon, I also enjoy posting more quality
stuff like what you are reading right now, should you ignore the number
of [Mozart] references in the last three paragraphs.

## Motivation

Some other times, I also want to share the living things and sceneries
I encounter in the [new][move] place.  My camera was a gift from my father
before I moved, and yet I shared more photos [with strangers][pixelfed]
than with my family.  The PixelFed instance I landed on irreversibly
shrank and lossily compressed them, while dumping 5 MB images to the family
chat room just feels weird, hence I decided to gather the decency
to build a photo gallery to show my loved ones (and admittedly,
flex with online strangers).

There are not many [CMS] in the wild for photo hosting,
and they often act as a walled garden and/or a social network.
Building and hosting a new one is quite overkill, thus the obvious
solution left would be generating a static site.  Out of the gazillion [SSG],
I couldn't find any that meets my requirements:

1. Generate a [web feed]
2. Automate filling [image] title and alt text
3. Offer fine-grain control for permanent [pagination]
4. Generate thumbnails with custom size and name

I mean, they perhaps exist, but the number I had to try and fight through
would cost more time than writing the web pages and feed by hand.
So I wrote them from scratch.  Y'all can stand up and clap now!

## Preliminary

Yes, I really started with writing [XHTML] and [Atom] by hand.
A web page has the following structure with namespaces omitted
and denoted in WXML ([Wisp]$\times$[SXML]) so I don't have
to close the tags (have I given up on XML too early?-).

!!! note "Syntax hints"

    For the uninitiated, any indentation or colon in Wisp represents
    an additional nesting level, while a dot escapes the nesting.  The at signs
    are used by SXML to denote attributes, which may remind you of [XPath].
    For example, the anchor to the previous page is `<a href="41">PREV</a>`.

```
html
  head
    link
      @ : rel "alternate"
          type "application/atom+xml"
          href "/atom.xml"
    ...
  body
    nav
      a : @ : href "41"
        . "PREV"
      h1 "PAGE 42"
      a : @ : href "43"
        . "NEXT"
    article
      @ : id "foobar"
      h2
        a : @ : href "#foobar"
          . "foobar"
      a : @ : href "/42/foo.jpg"
          img
            @ : src "/42/foo.small.jpg"
                alt "pic of foo"
                title "pic of foo"
      a : @ : href "/42/bar.jpg"
          ...
    article ...
    ...
    footer ...
```

So far, adding an `article` is not yet too cumbersome: there's only a bit
of redundancy for permanent links and the nesting level is acceptable
with the deepest being `/html/body/article/a/img`.  It gets more repetitive
once we publish it to the linked Atom feed:

```
feed
  entry
    link
      @ : rel "alternate"
          type "application/xhtml+xml"
          href "https://gallery.example/42/#foobar"
    id "https://gallery.example/42/#foobar"
    title "foobar"
    content
      @ : type "xhtml"
      div
        img
          @ : src "https://gallery.example/42/foo.jpg"
              alt "pic of foo"
              title "pic of foo"
        img ...
    updated ...
  entry ...
  ...
```

Since web feeds are standalone documents, they must always use absolute URLs.
(Welp that's not entirely true, [XML Base] does exist, but not all readers
support it, and more importantly, certain elements such as `atom:id` disallow
relative references.)  In addition, whilst the web page links a thumbnail
to the original image to save bandwidth, the feed can be consumed one post
at a time, so its entries point straight to the full-size version.  Therefore,
copying the markup to embed it inside the Atom is error-prone and doesn't
exactly spark joy.
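
To make the mismatch concrete, here is a minimal Python sketch (not part
of the gallery itself) of the URLs one picture needs in each document.
`ROOT` is the example domain used above, and the two helper names are made up
for illustration.

```python
from urllib.parse import urljoin

ROOT = "https://gallery.example/"  # example domain from the feed snippet above


def page_links(page, filename):
    """Root-relative (href, src) pair for the web page: original and thumbnail."""
    return f"/{page}/{filename}.jpg", f"/{page}/{filename}.small.jpg"


def feed_src(page, filename):
    """Absolute URL of the full-size image, which is what a feed entry embeds."""
    href, _thumbnail = page_links(page, filename)
    return urljoin(ROOT, href)


print(page_links("42", "foo"))
print(feed_src("42", "foo"))
```

Same picture, three different strings, and that is before the permanent links
get involved.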

!!! note "Fun fact"

    What does spark joy is that we can embed XHTML directly into the web feed,
    which means the content is still XML and we don't need to quote it in CDATA.
    For other sites where contents don't accumulate up to hundreds of megabytes,
    this will allow us to slap some (SPOILER ALERT!) stylesheet on the Atom feed
    and let the user agent render it in a [human-readable form][XSL].

## Approach

I actually already spoiled it in the epigraph,[^spoiler] but for the sake
of completeness let us [discuss a few possible solutions][efficiency].
What I wanted was to reduce the redundancy of manual input, in other words,
a system transforming a custom information-dense format to standard
yet sparser ones, which in this case are XHTML and Atom.  Given some new photos
and their relevant data, the purpose was to minimize the publishing friction.

It's worth mentioning that the goal was not to minimize just the input format,
the transformation time, or the feedback latency, but all of the above,
plus the cost of constructing the tool, incrementally as our requirements
slightly change over time.  Our choice for the base [programming system]
shall affect each and every one of these aspects and more.

Some technical dimensions are [more equal] than others, though.
For this use case, IMHO an immediate feedback loop should be given
the number one priority, not only because it'd be frustrating
to have to complete multiple rituals just to preview the changes,
but also as watching and reflecting file system changes is (sadly still)
a difficult problem.

For Linux[^interjection] there's [inotify] which doesn't suck,
except when it does and misses events,[^entr] and the standard POSIX build tool
[make] relies on [mtime which is also flaky][mtime].  Some SSG
work around this by spinning up a server with a more sophisticated
caching mechanism and even include an HTTP server sending out refresh events.
Implementing such a system is easily [more expensive][automation] than doing
the original task manually.
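
For a feel of why mtime alone is shaky, below is a rough Python sketch
of the staleness rule `make` applies, as I understand it: rebuild when
the target is missing or older than any prerequisite.  The function name
is hypothetical and real implementations have more corner cases.

```python
import os


def is_stale(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    try:
        target_mtime = os.stat(target).st_mtime_ns
    except FileNotFoundError:
        return True
    # Equal timestamps count as up to date, so an edit landing within
    # the file system's timestamp granularity can go unnoticed.
    return any(os.stat(p).st_mtime_ns > target_mtime for p in prerequisites)
```

For instance, `is_stale("42/index.xhtml", ["42/index.xml", "page.xslt"])`
would decide whether a page needs regenerating.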

Luckily, there is another way.  *After* the birth of imperative
DOM manipulation programs running on VMs inside browsers (Ecma scripts),
there came a (now forgotten) art of purely functional DOM transformation.
More specifically, [XSLT] can declaratively transform any XML document
to another, and its best part is that modern browsers natively support it,
i.e. there's no difference between editing the input document
and the hypothetical output XHTML.  For better portability
and rendering performance, we can still generate the latter
ahead-of-time (AoT) during deployment.

## Implementation

Going back to the example, the input format could boil down
to a more concise XML file, e.g. `42/index.xml`:

```
page
  @ : prev "41"
      curr "42"
      next "43"
  post
    @ : title "foobar"
        time ...
    picture
      @ : filename "foo"
          desc "pic of foo"
    picture ...
    ...
  post ...
  ...
```

### Page Generation

The stylesheet should then be declared at the beginning of the file,
so that the user agent can automatically fetch and apply it
to render the output XHTML:

```
<?xml-stylesheet href="/page.xslt" type="text/xsl"?>
```

XSLT is essentially a templating language, similar to PHP (which is also older)
and template libraries in your favorite languages.  For the ease of reading,
I will let the target document's namespace be the default, while aliasing
the transformation one as `xsl`.  The stylesheet for the web pages would
look something like the following, which should be self-explanatory.

```
xsl:stylesheet
  xsl:template : @ : match "/page"
    xsl:variable : @ : name "base"
      xsl:text "/"
      xsl:value-of : @ : select "@curr"
      xsl:text "/"
    html
      head ...
      body
        nav
          xsl:if : @ : test "@prev != ''"
            a : @ : href "/{@prev}/"
              . "PREV"
          h1 : xsl:text "PAGE "
               xsl:value-of : @ : select "@curr"
          xsl:if : @ : test "@next != ''"
            ...
        xsl:for-each : @ : select "post"
          xsl:variable : @ : name "id"
            xsl:value-of
              @ : select "translate(@title, ' ', '-')"
          article
            @ : id "{$id}"
            h2
              a : @ : href "#{$id}"
                  xsl:value-of : @ : select "@title"
            xsl:for-each : @ : select "picture"
              a : @ : href "{$base}{@filename}.jpg"
                  img
                    @ : src "{$base}{@filename}.small.jpg"
                        alt "{@desc}"
                        title "{@desc}"
        footer ...
```

### Feed Generation

Similarly, for Atom entries on a single page,

```
xsl:stylesheet
  xsl:variable : @ : name "root"
    . "https://gallery.example/"
  xsl:template : @ : match "/page"
    xsl:variable : @ : name "base"
      xsl:value-of : @ : select "$root"
      xsl:value-of : @ : select "@curr"
      xsl:text "/"
    xsl:for-each : @ : select "post"
      xsl:variable : @ : name "url"
        xsl:value-of : @ : select "$base"
        xsl:text "#"
        xsl:value-of
          @ : select "translate(@title, ' ', '-')"
      entry
        link
          @ : rel "alternate"
              type "application/xhtml+xml"
              href "{$url}"
        id : xsl:value-of : @ : select "$url"
        title : xsl:value-of : @ : select "@title"
        content
          @ : type "xhtml"
          div
            xsl:for-each : @ : select "picture"
              img
                @ : src "{$base}{@filename}.jpg"
                    alt "{@desc}"
                    title "{@desc}"
        updated : xsl:value-of : @ : select "@time"
```

The trickier part here is concatenating the entries together.
Simple enough, instead of linking to the stylesheet in the data,
we can read XML files directly from XSLT.

```
xsl:template
  @ : match "/"
  ...
  xsl:apply-templates
    @ : select "document('42/index.xml')/page"
  xsl:apply-templates ...
  ...
```

This allows us to do other cool things, such as embedding SVG in XHTML
to make use of the parent element's [currentcolor], while keeping
the source files separate.  It is especially useful for monochromatic icons,
e.g.

```
xsl:copy-of : @ : select "document('cc.svg')/*"
xsl:copy-of : @ : select "document('by.svg')/*"
xsl:copy-of : @ : select "document('sa.svg')/*"
```
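
The concatenation of entries from several pages can be sketched in Python
as well, again only as an illustration with the standard library.
`build_feed` and `entries` are hypothetical names, the pages are passed in
as strings instead of being read via `document()`, and feed-level metadata
and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

ROOT = "https://gallery.example/"  # example domain used throughout


def entries(page):
    """Yield one Atom-like entry per post of a parsed page element."""
    base = f"{ROOT}{page.get('curr')}/"
    for post in page.findall("post"):
        url = base + "#" + post.get("title").replace(" ", "-")
        entry = ET.Element("entry")
        ET.SubElement(entry, "link", rel="alternate",
                      type="application/xhtml+xml", href=url)
        ET.SubElement(entry, "id").text = url
        ET.SubElement(entry, "title").text = post.get("title")
        content = ET.SubElement(entry, "content", type="xhtml")
        div = ET.SubElement(content, "div")
        for picture in post.findall("picture"):
            desc = picture.get("desc")
            # Full-size image with an absolute URL, unlike on the web page.
            ET.SubElement(div, "img", alt=desc, title=desc,
                          src=f"{base}{picture.get('filename')}.jpg")
        ET.SubElement(entry, "updated").text = post.get("time")
        yield entry


def build_feed(sources):
    """Concatenate the entries of every page source, in the given order."""
    feed = ET.Element("feed")
    for source in sources:
        feed.extend(list(entries(ET.fromstring(source))))
    return feed
```

Calling `build_feed` with the contents of `42/index.xml` and `41/index.xml`
would give a single `feed` element holding the entries of both pages.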

### Thumbnail Generation

So far, we have met three out of the [four requirements](#motivation);
the only thing left is creating the thumbnails.  Inspired by Ethan Dalool,
I am going for [fairly large ones of 1024 px in width][big thumbs],

> large enough to comfortably browse the photos without clicking through
> to the big version of each, and the thumbnails are decently light
> and not too jpeggy at about 125-150 kilobytes on average.

At such size, I can aim for around ten photoes[^toes] per page
while maintaining a somewhat decent load time.  Plus, since the width
of the images is hardcoded, the page [margin] can be automatically inferred
to never stretch them.

```css
html {
    box-sizing: border-box;
    margin: auto;
    max-width: calc(1024px + 2ch);
}
body { margin: 0 1ch }
```

To generate the thumbnails, I use [epeg] together with `make` for wildcarding:

```
PICTURES := $(filter-out %.small.jpg $(PREFIX)/%.jpg, $(wildcard */*.jpg))
THUMBNAILS := $(patsubst %.jpg,%.small.jpg,$(PICTURES))

%.small.jpg: %.jpg
	epeg -w 1024 -p -q 80 $< $@
```

The Makefile also defines rules for AoT compilation using [xsltproc]
for the web pages and feed.  Apparently no feed reader supports XSLT,
and for pages, runtime processing negatively affects the performance
due to the multiple round trips for the stylesheet and the vector icons.

```
DATA := $(wildcard */index.xml) index.xml
PAGES := $(patsubst %.xml,%.xhtml,$(DATA))
OUTPUTS := $(THUMBNAILS) $(PAGES) atom.xml

all: $(OUTPUTS)

index.xml: $(LATEST)/index.xml
	ln -fs $< $@

%.xhtml: %.xml page.xslt
	xsltproc page.xslt $< > $@

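# xsltproc wants an input document even though the "/" template only
# reads pages through document(); index.xml is an assumed stand-in.
atom.xml: atom.xslt $(DATA) $(wildcard *.svg)
	xsltproc atom.xslt index.xml > $@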
```

The [full implementation][src] is deployed to [px.cnx.gdn],
mirrored to the [OpenNIC] domain [pix.sinyx.indy] reusing
the former's TLS certificate, because CA/Browser Forum
disallows support for domains not recognized by ICANN and no
[CA for OpenNIC] is mature enough.

## Discussion

> *Okay you built your site using XML macros, so what?
> The syntax is clunky and you hate it so much yourself
> that not even a single line of code example here is in actual XML.
> Doesn't seem like a love story to me!*

Like all relationships, it's not that simple.  I've learned to not judge
a book by its cover and come to the understanding that XML is the (ugly)
equivalent of [sexp].[^sex]  Unlike afterthoughts such as C preprocessors,
[Django]-like templates, or even the Wisp-lookalike syntax of [Slim],
XML stylesheets are written in the very data structure they transform.
To put it another way,
one can use XSLT to generate XSLT from XSLT.  Do I need it in this case
or ever at all?  Probably not, but that certainly makes XSL a lot more
attractive in my eyes.

Furthermore, the tooling for XML is highly mature, from editors to linters
and processors to rendering engines.  It'd be lying to say you ain't
fascinated that tis possible to directly feed browsers pure data
instead of markup representations.  More than that, one can have
entirely static API endpoints that are both human- and machine-readable.

> *XSL is just declarative JS!  You are so blinded
> by your lust for functional programming that you have
> become [the very thing you swore to destroy](/blog/reply)!*

My distaste for Ecma scripts is not due to DOM manipulation.
Sure, I do find in-place modification inelegant for documents,
but if only that were the only issue.  I block them on most sites
because they can interact with many things other than just the DOM,
posing [privacy] and [security] risks while [fucking up the UX].

Architecturally, Ecma scripts enable the absolute bloody worst possible
kind of web pages with zero data at all, fetching tiny pieces of content
in JSON and turning performance [to shit].  The user agents then try to salvage
efficiency by turning themselves into a distributed system component
and adding optimizations that shall never be (ab)used for the sake of users.
O ye [cycle of doom]!

Note that one can make a similar mistake with XSL regarding the number
of round trips, and XML stylesheets can provide the same front-end/back-end
separation.  Both can be used to provide hot loading during development
and AoT rendering in production (if not all, then many JS libraries support
pre-rendering, ignoring the monstrous [dependency graph](/blog/dedep)).
At the end of the day, it's not a matter of technology but of principle:
to be in the [users' best interest].

> *There is nothing complex about the photo gallery,
> any existing SSG can do the same with minor tweaks!
> You never needed to write a new one to begin with!*

I am wondering the same myself, but keep in mind there are details
I've been hiding in the examples.  I went all-in for the semantic web
with the hope for best portability and accessibility.  One thing
I haven't mentioned is the `lang` attribute, e.g. `en`, `vi` or `fr`
depending on the post.  Adding this to the web pages requires the SSG
to be somewhat modular, and it is even harder for the web feed.

Moreover, generic SSG are not designed to handle the difference
in content between a page's `article` and the feed's corresponding `entry`,
nor to have multiple posts in a single page.  Pagination is
also commonly implemented backwards, i.e. page 2 being the second latest one,
making it impossible to avoid [link rot].
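
The arithmetic behind that complaint fits in a few lines of Python.
This sketch assumes a fixed number of posts per page for simplicity, which
the gallery does not necessarily do, and both function names are hypothetical.

```python
def forward_page(index, per_page):
    """Page of the post at 0-based chronological index, oldest on page 1."""
    return index // per_page + 1


def backward_page(index, total, per_page):
    """Page of the same post when page 1 always holds the latest posts."""
    return (total - 1 - index) // per_page + 1


# The oldest post stays put when counting forwards ...
print(forward_page(0, 10))
# ... but moves as soon as an eleventh post pushes it off the first page.
print(backward_page(0, 10, 10), backward_page(0, 11, 10))
```

Counting forwards, a post's page depends only on its own position,
so a link to it keeps working no matter how many posts come after.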

Not to suggest that the majority of SSG are poorly designed, just that
from a certain amount of [context] difference, tis cheaper to just redesign
from scratch.  This is not about XSL vs Go/Python/JS for SSG or web dev
in general, but this specific and happen-to-be-far-from-complex case.

## Conclusion

At the time of writing, XML has pretty much been superseded by JSON or YAML,
for better or worse.  I have no love for YAML for obvious reasons,
but it also saddens me to sometimes see JSON being solely used as a container
for HTML.  I hope that this essay can [awaken something in you] about XML
and remind you about the semantic web in your next project.  It worked out
for me, maybe it'll work out for you too!

The story between XML and my photo gallery is a fond love story.
They were born for each other, there was no drama, everything just werkt.
Their romance inspires me to better appreciate stability and maturity,
and value those right in front of my eyes that I had been *too blind to see*.
Anyway, this is getting too long, so Imma end it with another [song].

> Lookin' for perfect\
> Surrounded by artificial\
> You're the closest thing to real I've seen\
> Sure, everyone has their problems\
> That's a given\
> Yours are the easiest to tolerate

[^spoiler]: If you know, you know.
[^interjection]: Yup, just the kernel.
[^entr]: But in case it works for you, check out [entr].
[^toes]: *Thumb*nails, pho*toes*, get it?-)
[^sex]: Or conventionally in most Lisp 1's, `sex?`.

[CGI]: https://en.wikipedia.org/wiki/Computer-generated_imagery
[bit rot]: https://en.wikipedia.org/wiki/Data_degradation
[for forever]: https://xkcd.com/1683
[mundane stuff]: https://en.wikipedia.org/wiki/Drama
[obvious bullcrap]: https://en.wikipedia.org/wiki/Fiction
[craploads]: https://antifandom.com/how-i-met-your-mother/wiki/Crapload
[new is always better]: https://www.youtube.com/watch?v=1SNRULEnTVQ
[shitposting]: https://fe.disroot.org/@mcsinyx
[Mozart]: https://peervideo.club/w/uByA7Czy7PWYMqnu8FgXvW

[move]: https://github.com/zig-community/user-map/pull/120
[pixelfed]: https://fotofed.nl/cnx
[CMS]: https://en.wikipedia.org/wiki/Content_management_system
[SSG]: https://en.wikipedia.org/wiki/Static_site_generator
[web feed]: https://en.wikipedia.org/wiki/Web_feed
[image]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/Img
[pagination]: https://en.wikipedia.org/wiki/Pagination

[XHTML]: https://en.wikipedia.org/wiki/XHTML
[Atom]: https://www.rfc-editor.org/rfc/rfc4287
[Wisp]: https://www.draketo.de/software/wisp
[SXML]: https://okmij.org/ftp/Scheme/SXML.html
[XPath]: https://www.w3.org/TR/xpath
[XML Base]: https://www.w3.org/TR/xmlbase
[XSL]: https://simonesilvestroni.com/blog/build-a-human-readable-rss-with-jekyll

[efficiency]: https://xkcd.com/1445
[programming system]: https://programming-journal.org/2023/7/13
[more equal]: https://en.wikipedia.org/wiki/Animal_Farm
[inotify]: https://man7.org/linux/man-pages/man7/inotify.7.html
[make]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html
[mtime]: https://apenwarr.ca/log/20181113
[automation]: https://xkcd.com/1319
[XSLT]: https://www.w3.org/standards/xml/transformation

[currentcolor]: https://developer.mozilla.org/en-US/docs/Web/CSS/color_value#currentcolor_keyword
[big thumbs]: https://voussoir.net/writing/sharing_photos
[epeg]: https://github.com/mattes/epeg
[margin]: https://en.wikipedia.org/wiki/Margin_(typography)
[xsltproc]: https://gnome.pages.gitlab.gnome.org/libxslt/xsltproc.html
[src]: https://trong.loang.net/~cnx/px
[px.cnx.gdn]: https://px.cnx.gdn
[OpenNIC]: https://www.opennic.org
[pix.sinyx.indy]: https://pix.sinyx.indy
[CA for OpenNIC]: https://wiki.opennic.org/opennic/tls

[sexp]: https://en.wikipedia.org/wiki/S-expression
[Django]: https://docs.djangoproject.com/en/dev/topics/templates
[Slim]: https://github.com/slim-template/slim
[privacy]: https://en.wikipedia.org/wiki/Mouse_tracking
[security]: https://react-etc.net/entry/exploiting-speculative-execution-meltdown-spectre-via-javascript
[fucking up the UX]: https://meta.stackexchange.com/q/2980/698165
[to shit]: https://unixsheikh.com/articles/so-called-modern-web-developers-are-the-culprits.html
[cycle of doom]: https://en.wikipedia.org/wiki/Wirth%27s_law
[users' best interest]: https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys
[link rot]: https://en.wikipedia.org/wiki/Link_rot
[context]: https://guide.handmade-seattle.com/c/2021/context-is-everything

[awaken something in you]: https://www.youtube.com/watch?v=F3QPWrLFsOA
[song]: https://www.youtube.com/watch?v=5LvOdWi3Qno
[entr]: https://eradman.com/entrproject