Tags and attributes
Two primitives, a handful of rules, and you can read any HTML ever written.
HTML has two primitives. Tags mark up regions of content; attributes attach extra information to a tag. That's the whole language. Once you internalize the two, every element you'll ever meet — from <input> to <picture> to <dialog> — is just a name plus attributes plus (sometimes) content.
A normal tag is a pair: an opening tag, content, a closing tag. The closing tag repeats the tag name with a / after the <, and never carries attributes.
<p>This paragraph has <strong>emphasized</strong> text inside it.</p>
Tags can nest, but they can't cross. <p><strong></p></strong> is malformed — <strong> opens before <p> closes, but closes after. The browser will silently rewrite this for you, and the resulting DOM may not be what you expected.
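A minimal side-by-side sketch of the nesting rule; the sentence text is made up for illustration. The first snippet crosses its tags, the second closes the inner element before the outer one.

```html
<!-- Malformed: <strong> opens inside <p> but closes outside it -->
<p>Some <strong>bold text.</p></strong>

<!-- Well-formed: the inner element closes before the outer one -->
<p>Some <strong>bold text.</strong></p>
```

Inspecting the first snippet in a browser's DOM view is a quick way to see what the parser's repair actually produced.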
A handful of tags don't have content and don't have a closing form. They're called void elements: <br>, <hr>, <img>, <input>, <meta>, <link>, <source>, <track>, <area>, <base>, <col>, <embed>, <wbr>. You write them once and they're done — there's nothing to wrap.
<img src="cat.jpg" alt="A tabby cat asleep on a keyboard">
<input type="email" name="email">
XHTML required self-closing syntax on these elements (<img />), and the habit carried over into a lot of older HTML. HTML5 doesn't require the trailing slash on void elements; <img> is the standard form, and the parser simply ignores the slash if it's there. Use whichever your project's style guide prefers and stop thinking about it.
Attributes
Attributes go inside the opening tag, after the tag name, separated by whitespace. The standard form is name="value" — value in double quotes, no spaces around the =.
<a href="/about" target="_blank" rel="noopener">About</a>
Three small rules cover most edge cases:
- Quoting. Double quotes are conventional, single quotes are equivalent, and unquoted values work only if the value contains no whitespace, quotes, =, <, >, or backticks (<input type=text> is legal, <p class="intro lead"> requires quotes). Pick one form for the project and stick to it.
- Order doesn't matter. <a href="/x" rel="noopener"> and <a rel="noopener" href="/x"> are identical to the parser.
- Boolean attributes. Some attributes are toggles: they're either present or absent. <input disabled>, <option selected>, <details open>. The HTML form is just the attribute name. The XHTML-flavored disabled="disabled" works too but reads as superstition; the modern form is the bare name.
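As a small illustration of the quoting and ordering rules, the three tags below should all produce the same element. The attribute values (a text field named q) are placeholders chosen for this sketch.

```html
<!-- Double quotes, single quotes, and unquoted values; attribute order varies too -->
<input type="text" name="q">
<input name='q' type='text'>
<input type=text name=q>
```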
disabled="false" does not unset a boolean attribute. The attribute is present, so it's true — the value is ignored. To remove the disabled state, remove the attribute entirely.
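A rough sketch of what that looks like in practice, assuming a page with one field and one button. The ids (email, unlock) and the button label are invented for this example.

```html
<input id="email" type="email" disabled="false">
<button id="unlock" type="button">Enable the field</button>

<script>
  const field = document.getElementById("email");

  // The attribute is present, so the field is disabled despite the "false" value.
  console.log(field.disabled); // true

  document.getElementById("unlock").addEventListener("click", () => {
    // Removing the attribute is what actually enables the field.
    // Setting field.disabled = false has the same effect.
    field.removeAttribute("disabled");
  });
</script>
```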
Global attributes
A handful of attributes work on any element. Worth knowing because they show up everywhere:
- id. A unique identifier within the document. Used by CSS (#main), JavaScript (getElementById), and the URL hash (/page#section). Must be unique; if two elements share an id, CSS #id selectors still match both, but getElementById returns only the first one and the URL hash scrolls only to the first.
- class. A space-separated list of class names. Not unique: many elements can share a class. The hook CSS reaches for most.
- lang. Overrides the document's lang attribute for a region of content. Useful when you have a quoted phrase in another language.
- hidden. When present, the element isn't rendered and isn't read by screen readers. Equivalent to display: none from a styling perspective, but built into the platform.
- title. A tooltip-style hint shown on hover. Not for critical information; users on touch devices and screen readers won't always see it.
- data-*. Any attribute starting with data- is yours. The platform ignores them; your JavaScript can read them via element.dataset.
<article id="post-2024" class="post featured" data-author="ada">
  <h2>HTML is older than I thought</h2>
  <p lang="fr">« quelque chose en français »</p>
  <p hidden>This won't render until JS removes the attribute.</p>
</article>
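To show the JavaScript side of data-* and hidden, here is a pared-down sketch of the same article with a script attached. The data-published-on attribute and its date are hypothetical additions, included only to show how hyphenated names map onto dataset.

```html
<article id="post-2024" data-author="ada" data-published-on="2024-01-15">
  <h2>HTML is older than I thought</h2>
  <p hidden>This won't render until JS removes the attribute.</p>
</article>

<script>
  const post = document.getElementById("post-2024");

  // data-author is exposed as dataset.author
  console.log(post.dataset.author); // "ada"

  // Hyphenated names become camelCase: data-published-on -> publishedOn
  console.log(post.dataset.publishedOn); // "2024-01-15"

  // Removing the hidden attribute makes the paragraph render.
  post.querySelector("p[hidden]").removeAttribute("hidden");
</script>
```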
Quick check: suppose two elements on a page share id="cta". The CSS rule #cta { color: red } works on both. What about document.getElementById("cta")?
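One way to settle the question is to try it. The page below is a minimal sketch with made-up link text and URLs; opening it and checking the console should show that the selector matches both links while getElementById hands back only the first.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Duplicate id demo</title>
  <style>
    #cta { color: red }
  </style>
</head>
<body>
  <!-- Invalid on purpose: two elements share one id -->
  <a id="cta" href="/signup">Sign up</a>
  <a id="cta" href="/pricing">See pricing</a>

  <script>
    // Selectors match every element with the id.
    console.log(document.querySelectorAll("#cta").length); // 2

    // getElementById returns the first match in document order.
    console.log(document.getElementById("cta").textContent); // "Sign up"
  </script>
</body>
</html>
```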