htmx is composable??

I wrote an HTMX app and it was easy to develop a powerful plugin system within it. That surprised me. I had assumed that JSON-driven REST APIs were the only way to make composable web APIs. In my mind, HTMX blends the backend and frontend together into one monolithic component. It seemed counterintuitive.

Let me tell you about it.

The Streamlit Prototype

Before the New Year I decided to hack on an idea. I wanted a social media client for Mastodon that displays my feed in a way that suits me — surface the information I’m trying to track and de-prioritize everything else. Basically the reverse of how Big Tech opimizes their algorithms. I call it Fossil.

So I spent about 3:30 hours and produced a working app using streamlit. Streamlit was an amazing experience, it certainly streamlined the proof of concept phase. When I wrote about it, someone on HN said they liked the idea of having their own algorithm, they just didn’t like what I made. What a good thought! I should turn this into a pluggable framework for creating social media algorithms!

So now my goal is to make a pluggable framework, where anyone can make their own algorithm.

The Plug-in Framework

As I rewrote fossil in HTMX, I designed for a pluggable interface. The algorithm part was easy — 3rd parties can write a Python class that implements a few abstract methods. It’s all Python, so it’s pretty straightforward.

But what if someone needs a new SQL table? Like maybe they need to cache some kind of statistics about users (e.g. topics they post about, authoritative posts, etc.). Well, they can probably just run CREATE TABLE statements in the constructor of the class. Seems fine.

graph LR subgraph server FastAPI SQLite end SQLite --> FastAPI --> HTMX

Right, but what if they want to add buttons in the UI? e.g. If a user can mark a post as belonging to the “political nonsense” topic, then we could train a model to identify posts we don’t want to see. But that means the plugin would need to add buttons to the UI to provide that kind of feedback.

When I first saw Simon Wilison’s llm tool, I loved how easy it was to install plugins. Just pip install. I want the same ease here too. The thing is, with components that span UI, backend and database, that tends to be a tough sell.

With fossil plugins, it’s become straightforward to work on any part of the stack:

  1. UI elements — write verbatim HTML or Jinja templates, packaged into a plugin
  2. API endpoints — register them via a decorator API
  3. DB tables — Create them during plugin initialization
  4. AI algorithms — register them via the API

That’s neat. The whole stack.

graph TD fossil-->ui[UI Plugins] api[API endpoints]-->fossil db[DB tables]-->fossil fossil-->ai[AI Algorithms]

toot_debug.py

As a very short example, this is a real plugin in fossile core. It adds the ability to click a button and see what the Mastodon JSON message looks like in the server terminal. I use it a lot for developing Fossil.

import json
from fastapi import responses
from fossil_mastodon import plugins, core


# Metadata
plugin = plugins.Plugin(
    name="Toot Debug Button",
    description="Adds a button to toots that prints the toot's JSON to the server's console.",
)


# An API endpoint. The `plugin.api_operation` object is a FastAPI app.
@plugin.api_operation.post("/plugins/toot_debug/{id}")
async def toots_debug(id: int):
    toot = core.Toot.get_by_id(id)
    if toot is not None:
        print(json.dumps(toot.orig_dict, indent=2))
    # Feedback that the button was clicked. This 
    # will replace the text of the button.
    return responses.HTMLResponse("<div>💯</div>")


# A UI plugin. The bits of HTML are included into the `/index` response.
@plugin.toot_display_button
def get_response(toot: core.Toot, context: plugins.RenderContext) -> responses.Response:
    return responses.HTMLResponse(f"""
        <button hx-post="/plugins/toot_debug/{ toot.id }">🪲</button>
    """)

That provides an API endpoint, as well as a bit of HTML that instructs how the API endpoint is incorporated into the application.

My Confusion

I think of APIs like UNIX-style CLI programs — a collection of tiny parts that are easy to combine in ways the creators never thought of. Plugin systems, on the other hand, are defined by their composability. Monoliths generally aren’t composable. I’m describing HTMX as monolithic because I tend to push all program logic into the backend, all in once place.

The problem is, I wasn’t comparing against just REST APIs, I was comparing against React + REST.

graph LR React-->API-->React

So, while an API might be extremely composable on it’s own, the combination of React + an API isn’t just monolithic, it’s a monolith split across a distributed system. And those are extremly non-composable.

Individual React components are very composable. But when you combine the requirements that I need, spanning the full stack, you find yourself in what I like to describe as a distributed system, since state is split between the client and server.

I’ve spent a fair amount of time working with distributed systems. It’s just regular programming, just that everything is harder. Exceptions don’t bubble up, errors can be indistinguishable from latency, systems don’t compose, error handling doesn’t have a single best approach, even retries are harder than they should be.

HTMX as Configuration

Stepping back, it feels like the HTML is more like a configuration language, with instructions for how all the pieces fit together. There is state, but it’s hidden within the engine that interprets my declarative configuration (a.k.a the browser).

Years ago, in .NET and Java, it was popular to use an Inversion of Control container with XML configuration that declared and configured different classes and objects. I think it largely went out of style because it’s complicated, or at least more complicated than it needed to be.

The HTML I write with HTMX feels a bit like IoC configuration, in that describes how all the program components fit together. But it’s more functional, because it also describes how the UI is laid out. When I look at it as configuration, it’s clear why it’s easy to make a plugin system in it. It is a plugin system.

Conclusion

Thinking of HTMX as a sort of configuration helps me understand it’s contributions to program composability. I’m not sure if that helps anyone else, but the entire framework makes more sense to me since I’ve started thinking about it that way. The HTMX site talks about [HTATEOAS][hateaos], which is a different phrasing this — the HTML is the application state.

Discussion