HTML basics · 2 / 8

The shape of a document

Doctype, head, body, charset, viewport — the contract every HTML file signs.

Almost every HTML file starts with the same five or six lines, and almost every beginner copies them without knowing what they do. They aren't ceremony — each one tells the browser something it would otherwise have to guess at, and guessing the wrong way produces pages that render in subtly broken modes for the rest of their life.

index.html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Page title</title>
</head>
<body>
  <!-- visible content -->
</body>
</html>

The doctype line

<!doctype html> is not a tag and not an HTML element. It's a declaration the browser reads before it starts parsing, and its only job is to say "use standards mode."

If you omit it, the browser falls into "quirks mode": a compatibility mode that emulates the non-standard behavior of late-1990s browsers such as Netscape Navigator 4 and Internet Explorer 5. Percentage heights resolve differently, table sizing changes, tables stop inheriting font styles from the body, and lengths written without units are accepted. Quirks mode is a museum exhibit you don't want your page wandering into.

Capitalization doesn't matter (<!DOCTYPE html> is equally valid), and there's no closing tag — it's a one-shot signal at the top of the file.
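
To see which mode a page landed in, you can ask the browser directly. The sketch below reads document.compatMode, a standard DOM property; the page title and the id "mode" are made up for illustration. Delete the first line and reload to watch the result flip.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Rendering mode check</title>
</head>
<body>
  <p id="mode"></p>
  <script>
    // document.compatMode is "BackCompat" in quirks mode and
    // "CSS1Compat" otherwise (standards and limited-quirks mode).
    const mode = document.compatMode === "BackCompat" ? "quirks" : "standards";
    document.getElementById("mode").textContent = "Rendering mode: " + mode;
  </script>
</body>
</html>
```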

check your understanding
You inherit a project where every page omits <!doctype html>. The CSS layout looks slightly off across the site. What's the most likely first cause to check?

html, head, body

<html> is the root element — every other element nests inside it. The lang attribute on <html> tells screen readers, search engines, and translation tools what language the content is in. Skipping it is how you end up with English pages read aloud with a Spanish accent.
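
lang is a global attribute, so it can also sit on any element to mark a passage in a different language from the rest of the page. A minimal sketch, with invented sentence content:

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Language example</title>
</head>
<body>
  <!-- The page is English; the span marks a Spanish phrase -->
  <p>The menu lists <span lang="es">tortilla de patatas</span> as a starter.</p>
</body>
</html>
```

A screen reader that honors lang can switch pronunciation for the span and switch back afterwards.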

<head> holds metadata — information about the page that doesn't appear on the page. Title, character encoding, viewport hints, links to stylesheets, scripts to load, social-share images. Nothing in <head> renders into the visible layout.

<body> holds the content — everything the user sees and interacts with. Headings, paragraphs, images, forms, the lot.
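
Here is a slightly fuller sketch of the split. The file names, the example.com URL, and the Open Graph image tag are placeholders chosen for illustration, not requirements.

```html
<!doctype html>
<html lang="en">
<head>
  <!-- Metadata: about the page, not shown in the page -->
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Page title</title>
  <link rel="stylesheet" href="styles.css">
  <script src="app.js" defer></script>
  <meta property="og:image" content="https://example.com/share.png">
</head>
<body>
  <!-- Content: what the user sees and interacts with -->
  <h1>Heading</h1>
  <p>Paragraph text.</p>
  <img src="photo.jpg" alt="Description of the photo">
</body>
</html>
```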

[Figure: html contains head (title, meta, link: metadata) and body (h1, p, main: content).]
The browser parses head metadata before painting body content.

The split exists so the browser can do useful work before the body finishes downloading: render the tab title, fetch the stylesheets, set the encoding correctly. Mixing content into <head> (a stray <p>, for example) breaks that contract and forces the parser to abort the head and start the body early.
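
A sketch of that failure, using a made-up page: the stray <p> closes the head implicitly, so everything written after it lands in the body, whatever the source says.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <p>Oops, content in the head</p>
  <title>Page title</title>
  <link rel="stylesheet" href="styles.css">
</head>
<body>
  <h1>Hello</h1>
  <!--
    Resulting DOM, as built by the HTML parsing rules:
      head: meta
      body: p, title, link, h1
    The closing head tag and the body start tag in the source
    are treated as parse errors.
  -->
</body>
</html>
```

Open the page in the browser's element inspector to confirm where each element ended up.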

check your understanding
Where does <link rel="stylesheet" href="..."> belong?

Charset and viewport

Two <meta> tags belong in every modern document.

<meta charset="utf-8"> declares the file's character encoding. UTF-8 covers every script in human use — Latin, Cyrillic, Arabic, Chinese, every emoji. Skip it and the browser guesses, usually wrongly, and your apostrophes turn into mojibake the first time someone pastes in a smart quote.
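
The smart-quote failure looks like this in practice. The sketch assumes the file itself is saved as UTF-8; the sentence is invented.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Encoding example</title>
</head>
<body>
  <!-- The apostrophe below is U+2019, stored in UTF-8 as bytes E2 80 99 -->
  <p>It’s working.</p>
  <!-- Read as windows-1252 instead, those bytes display as: Itâ€™s working. -->
</body>
</html>
```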

<meta name="viewport" content="width=device-width, initial-scale=1"> tells mobile browsers to render at the device's natural width instead of pretending to be a 980px desktop and zooming out. Without it, your responsive CSS does nothing on a phone — the browser is rendering everything at desktop size and shrinking the result.
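
A minimal sketch of why the tag matters, with a hypothetical class name and a 600px breakpoint picked for illustration. With the viewport meta in place, a phone reports its real width and gets the single-column layout. Remove the tag and the phone reports roughly 980px, the media query matches, and the two-column layout is drawn and shrunk to fit.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <title>Viewport example</title>
  <style>
    /* Mobile-first: one column by default */
    .cards { display: grid; grid-template-columns: 1fr; gap: 1rem; }
    /* Two columns once the viewport is at least 600px wide */
    @media (min-width: 600px) {
      .cards { grid-template-columns: 1fr 1fr; }
    }
  </style>
</head>
<body>
  <main class="cards">
    <article>Card one</article>
    <article>Card two</article>
  </main>
</body>
</html>
```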

Watch out

Both metas must be inside <head>, and charset must come within the first 1024 bytes of the file — the parser uses it to decide how to read everything that follows. Put them at the top of the head, ahead of <title>.

check your understanding
You build a page with mobile-first CSS, but on a phone the layout still looks like the desktop version, scaled down. What's missing?

The document outline

Inside <body>, the structure of headings (<h1> to <h6>) and sectioning elements (<article>, <section>, <nav>, <aside>) gives the page an outline: a hierarchical table of contents that tools can derive from the markup.

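Older versions of the HTML specification treated a few elements as sectioning roots: <body>, <blockquote>, <details>, <dialog>, <fieldset>, <figure>, <td>. Headings inside them didn't bleed out into the surrounding outline, so a <h1> inside a <blockquote> counted as "the heading of that quote", not "another top-level heading of the document." Browsers and screen readers never implemented that outline algorithm, and it was removed from the HTML Standard in 2022. In practice, tools go by each heading's level, so pick levels that make sense wherever the heading sits.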

You'll meet semantic sections in lesson 7. For now the takeaway is small: the outline is real, screen readers and crawlers walk it, and laying out your headings in a sensible order pays off the moment one of those tools starts reading your page.
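
A sketch of a sensible heading order, with invented headings and placeholder text: one <h1> for the page, <h2> for each major section, <h3> for subsections, and no skipped levels.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Heading order example</title>
</head>
<body>
  <h1>Growing tomatoes</h1>
  <section>
    <h2>Choosing a variety</h2>
    <p>...</p>
    <h3>Determinate varieties</h3>
    <p>...</p>
  </section>
  <section>
    <h2>Watering</h2>
    <p>...</p>
  </section>
</body>
</html>
```

Read top to bottom, the headings alone form a usable table of contents, which is roughly what a screen reader's heading list presents.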

check your understanding
A page has one <h1> for the article, then a <blockquote> containing its own <h1>. What does the document outline see at the top level?