Markup First: Describe the Structure, Then the Behaviour

When I lay out a screen, I ask what the structure looks like before I ask which component to use. That ordering sounds trivial, but it decides how maintainable and how debuggable everything downstream turns out to be. Most framework tutorials teach it the other way round: here’s a component, now pour your data into it. The structure ends up smeared across the component’s lifecycle and state management, and you lose your intuition for what the browser actually renders.

Structure first, behaviour wrapped on top

Markup languages — HTML, Markdown, XML — are, at heart, descriptions of what a document is made of and how its pieces nest. That’s a declarative statement of structure, separate from how anything works. I like to let this layer settle first: a heading is a heading, a list is a list, a form is a form. Get the semantics right and you get most of accessibility, SEO, and readability for free.

Behaviour should be a thin layer wrapped over that. A button that fires a request and drops the returned fragment into some container doesn’t actually need a component framework underneath it. htmx does this cleanly — interactions are expressed as HTML attributes, the server decides what comes back, and the client holds almost no state. I’m not saying everything should be written this way. Highly interactive, offline, canvas, or local-first apps genuinely need client-side state management; there’s no dodging that. But for most content- and form-driven pages, the default should be “structure first, behaviour thin,” not reflexively standing up an SPA.
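A minimal sketch of what that thin layer can look like, assuming htmx is loaded on the page. The hx-get, hx-target, and hx-swap attributes are htmx's own; the /fragments/orders URL, the orders container, and the renderOrders function are hypothetical names invented for illustration. The server side is shown as a plain function so the sketch runs without a web framework.

```javascript
// Page markup: the behaviour lives in attributes, not in client-side code.
// hx-get issues the request, hx-target names the container, and
// hx-swap="innerHTML" replaces the container's contents with the response.
// The URL and the id are hypothetical.
const pageMarkup = `
<button hx-get="/fragments/orders" hx-target="#orders" hx-swap="innerHTML">
  Load orders
</button>
<div id="orders"></div>
`;

// Escape text before interpolating it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical server-side renderer: what the server would send back for
// GET /fragments/orders. It is already-formed HTML, ready to drop in.
function renderOrders(orders) {
  const items = orders
    .map((order) => `  <li>${escapeHtml(order.title)}</li>`)
    .join("\n");
  return `<ul>\n${items}\n</ul>`;
}

console.log(pageMarkup);
console.log(renderOrders([{ title: "Tea & biscuits" }, { title: "<b>Ink</b>" }]));
```

The client holds no list of orders at any point. The only thing it knows is where to ask and where to put the answer.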

When the server hands back a piece of already-formed HTML rather than a blob of JSON for the client to reassemble into DOM, the mental overhead drops sharply. With the former, what you see in devtools is what you get; with the latter, you have to simulate the render in your head before you understand why the page looks the way it does.
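To make the contrast concrete, here is a small sketch with a hypothetical list of notes. Both paths end at the same markup, but on the JSON path that markup only exists after client code has run, so the payload in the network tab is one render step away from the DOM. Every name here (notes, respondWithHtml, respondWithJson, renderOnClient) is invented for illustration.

```javascript
// Hypothetical data; on a real site it would come from storage.
const notes = [
  { id: 1, text: "Draw the tree" },
  { id: 2, text: "Pick tools <later>" },
];

// Escape text before placing it inside markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Shared template, so the two paths differ only in where it runs.
function toListMarkup(list) {
  const rows = list
    .map((note) => `<li data-id="${note.id}">${escapeHtml(note.text)}</li>`)
    .join("");
  return `<ul class="notes">${rows}</ul>`;
}

// Path A: the server renders. The response body is the markup itself.
function respondWithHtml(list) {
  return toListMarkup(list);
}

// Path B: the server sends data. The response body is JSON...
function respondWithJson(list) {
  return JSON.stringify({ notes: list });
}

// ...and the client has to run a render step before any markup exists.
function renderOnClient(responseBody) {
  const data = JSON.parse(responseBody);
  return toListMarkup(data.notes);
}

console.log(respondWithHtml(notes)); // path A: the response is the markup
console.log(respondWithJson(notes)); // path B: the response is data only
console.log(renderOnClient(respondWithJson(notes))); // path B after rendering
```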

An attribute is not a property

One distinction has been blurred by frameworks for years, and it matters: an HTML attribute and a DOM property are not the same thing.

An attribute is what you write in the markup: it describes the element’s initial state. After the browser parses the HTML, it builds DOM nodes, and a property on a node is the live, mutable, runtime state. The two often share a name while their values diverge. The classic case is input: you write value="hello" in the HTML, the user types, and element.value becomes the new content, but element.getAttribute("value") is still hello. One is the initial markup; the other is the current state. The checked attribute sits on the same fault line: it sets the initial state (mirrored by the defaultChecked property), while the checked property tracks the live one. Even the names can differ: the class attribute surfaces as the className and classList properties.

<input value="hello">
<!-- after the user changes it to world -->
<!-- element.value                 => "world"  (DOM property, live) -->
<!-- element.getAttribute("value") => "hello"  (HTML attribute, initial) -->
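The same split can be sketched outside the browser. The class below is a toy model, not the real DOM: ToyInput is a hypothetical stand-in that copies one piece of a text input's behaviour, namely that the value attribute supplies the default, and the value property stops following the attribute once it has been written directly.

```javascript
// Toy stand-in for a text <input>. Not the real DOM; all names are hypothetical.
class ToyInput {
  constructor(attributes = {}) {
    this.attributes = new Map(Object.entries(attributes)); // what the markup said
    this.dirtyValue = null; // set once the property has been written
  }

  getAttribute(name) {
    return this.attributes.has(name) ? this.attributes.get(name) : null;
  }

  setAttribute(name, value) {
    this.attributes.set(name, String(value));
  }

  // Live state: the last value written to the property,
  // otherwise the default supplied by the attribute.
  get value() {
    if (this.dirtyValue !== null) return this.dirtyValue;
    return this.getAttribute("value") ?? "";
  }

  set value(next) {
    this.dirtyValue = String(next);
  }

  // Always mirrors the attribute, whatever the live value is.
  get defaultValue() {
    return this.getAttribute("value") ?? "";
  }
}

const input = new ToyInput({ value: "hello" });
input.value = "world"; // stands in for the user typing

console.log(input.value);                 // world
console.log(input.getAttribute("value")); // hello

input.setAttribute("value", "reset");     // change the markup side afterwards
console.log(input.value);                 // world (the property no longer follows)
console.log(input.defaultValue);          // reset
```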

A framework’s abstraction only reduces how often you touch this line directly; it doesn’t erase it. When you write value={state} in React, or set up two-way binding elsewhere, something underneath is still syncing a property to a state source on your behalf. The day a binding misbehaves, a form value won’t line up, or a third-party library mutates the DOM and fights the framework, you’re back at this distinction asking: did the attribute fail to update, or did something quietly change the property? People who know this line debug far faster, because they know which side to inspect.
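When a binding misbehaves, one way to check which side moved is to read both at once. The helper below is a hypothetical sketch, not a framework API: compareSides accepts anything with a getAttribute method and same-named properties, so in a browser console you could hand it a real element. It is meant for string-valued pairs such as value; boolean attributes like checked are represented differently on each side and would need their own comparison.

```javascript
// Hypothetical debugging helper: report attribute and property side by side.
function compareSides(element, names) {
  return names.map((name) => {
    const attribute = element.getAttribute(name); // markup side: string or null
    const property = element[name];               // runtime side
    return { name, attribute, property, diverged: attribute !== property };
  });
}

// A stub standing in for <input value="hello"> after the user typed "world".
// In a browser, pass a real element instead,
// for example document.querySelector("input").
const stubInput = {
  value: "world",
  getAttribute(name) {
    return name === "value" ? "hello" : null;
  },
};

console.log(compareSides(stubInput, ["value"]));
```

A true diverged flag only says the two sides no longer agree. Who changed which one is the next question, but at least you now know where to look.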

Start by drawing the DOM tree

In practice I’d suggest designing a screen by drawing the DOM tree first — on paper or in your head. What’s the root, how many blocks sit beneath it, what’s the semantic tag for each, where does interactivity belong. Once that tree is clear, which framework you reach for and whether you split components are later questions, and they get easier, because component boundaries usually fall right on the tree’s natural seams.
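One lightweight way to draw the tree is as nested data printed as an indented outline. The sketch below assumes a hypothetical search page; the tags, the notes, and the outline function are illustrative rather than a prescribed layout.

```javascript
// A hypothetical screen, described as a tree of semantic tags.
// `note` marks where interactivity belongs.
const pageTree = {
  tag: "main",
  children: [
    { tag: "h1" },
    {
      tag: "form",
      note: "submits the search",
      children: [{ tag: "label" }, { tag: "input" }, { tag: "button" }],
    },
    { tag: "ul", note: "results swapped in here", children: [{ tag: "li" }] },
  ],
};

// Walk the tree depth-first and return one indented line per node.
function outline(node, depth = 0) {
  const label = node.note ? `${node.tag}  (${node.note})` : node.tag;
  const line = "  ".repeat(depth) + label;
  const childLines = (node.children ?? []).flatMap((child) =>
    outline(child, depth + 1)
  );
  return [line, ...childLines];
}

console.log(outline(pageTree).join("\n"));
// main
//   h1
//   form  (submits the search)
//     label
//     input
//     button
//   ul  (results swapped in here)
//     li
```

The two annotated nodes are the natural candidates for component boundaries, if components turn out to be needed at all.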

This makes maintenance much better. When something breaks, your mental model is aligned with the Elements panel in devtools — you can reason backwards from the real DOM rather than staring at a pile of component names and props, guessing what they flatten into. Debugging is, at bottom, finding the gap between the structure you imagined and the structure that exists; the closer those two are, the faster you find it.
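That gap can be made mechanical. As a rough sketch, the function below compares an imagined tree against an actual one and reports the first place they part ways. The tree shape ({ tag, children }) and the name firstDivergence are assumptions for illustration; in a browser, the actual tree would come from walking the real DOM.

```javascript
// Compare two { tag, children } trees depth-first.
// Returns a description of the first mismatch, or null if the trees match.
function firstDivergence(expected, actual, path = expected.tag) {
  if (expected.tag !== actual.tag) {
    return `${path}: expected <${expected.tag}>, found <${actual.tag}>`;
  }
  const want = expected.children ?? [];
  const got = actual.children ?? [];
  const shared = Math.min(want.length, got.length);
  for (let i = 0; i < shared; i++) {
    const childPath = `${path} > ${want[i].tag}[${i}]`;
    const found = firstDivergence(want[i], got[i], childPath);
    if (found) return found;
  }
  if (want.length !== got.length) {
    return `${path}: expected ${want.length} children, found ${got.length}`;
  }
  return null;
}

// What I pictured: a list directly under main.
const imagined = {
  tag: "main",
  children: [{ tag: "h1" }, { tag: "ul", children: [{ tag: "li" }] }],
};

// What was rendered (hypothetical): a wrapper div crept in around the list.
const rendered = {
  tag: "main",
  children: [{ tag: "h1" }, { tag: "div", children: [{ tag: "ul" }] }],
};

console.log(firstDivergence(imagined, rendered));
// main > ul[1]: expected <ul>, found <div>
```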

There’s a longer-running observation too. The thicker the software stack gets and the more layers the toolchain grows, the more users and engineers drift back to thinner, more transparent forms of expression. Plain text, Markdown, and markup languages endure because you can read them directly, diff them, and put them under version control to see how each line changed. A screen produced through ten layers of abstraction is hard to diff when it breaks: which layer changed what? When a piece of HTML or Markdown breaks, you see it at a glance.

In the end this isn’t nostalgia or a belief that older things were better. It’s that structure-first artefacts are controllable and explainable: controllable because the layering is simple and each layer owns a small, clear responsibility; explainable because the markup is just sitting there, with no need to run a framework to learn the outcome. Those two are the whole point.

Takeaways

  • Design a screen by describing structure first and wrapping behaviour on top — not by starting from framework components.
  • An HTML attribute is the initial markup state; a DOM property is the live runtime state. Frameworks reduce but never erase the line.
  • People who understand the attribute/property line debug faster, because they know which side to inspect.
  • Start by drawing the DOM tree; your mental model stays aligned with devtools, so gaps are quick to find.
  • As stacks thicken, plain text and markup languages endure because they’re readable, diffable, and version-controllable.

Sheng’s take, drafted with Claude · part of the 2026-06-13 blog renovation, paint still drying.