MDX, Content Collections, and Content-as-Architecture

Why dropping Markdown files into a folder does not make a publishing system, and how to build one that scales.

By Interface Atlas Team

The most common failure mode when adopting MDX or Markdown in a JavaScript framework is treating it like a glorified word processor.

Teams configure a parser, drop fifty .mdx files into a /content folder, and assume they have a publishing system. Six months later, they have an untyped pile of posts. The frontmatter is inconsistent. Guide pages use a different layout than blog posts, but only because an author remembered to import the correct component. Reorganizing the documentation requires a regex search across the entire repository.

When your product is reading material, your content model is your architecture.

The Content-as-Architecture Thesis

In a file-based publication like Interface Atlas, you must stop viewing the src/content/ folder as a storage bucket. It is a database.

If it is a database, it needs a schema. It needs relationships. It needs constraints. In Astro, we enforce this architecture using Content Collections.

The Repository as a Publishing System

graph TD
    subgraph The Database Layer
        C1[Guides Collection] -->|Validates via Zod| S1(Guide Schema)
        C2[Glossary Collection] -->|Validates via Zod| S2(Glossary Schema)
    end
    
    subgraph The Graph Layer
        S1 -->|topicKeys| T[Topic Hubs]
        S1 -->|glossaryKeys| C2
    end
    
    subgraph The Presentation Layer
        T --> R1[Astro Topic Route]
        C1 --> R2[Astro Guide Route]
        C2 --> R3[Astro Glossary Route]
    end

Moving from Markdown to Content Collections

The Untyped Pile (The Failure Mode)

Before Content Collections, frameworks loaded Markdown files via glob imports, and you got whatever frontmatter the author typed. If an author wrote diffculty: hard instead of difficulty: Advanced, the build still passed, but the page silently dropped the difficulty badge. Content logic ended up hidden inside fragile UI components that tried to parse broken data.
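To make the failure concrete, here is a minimal sketch in plain TypeScript. The renderDifficultyBadge helper and the Frontmatter type are hypothetical names invented for illustration; they stand in for any UI component that reads unvalidated frontmatter.

```typescript
// Hypothetical helper: stands in for a UI component that reads raw, unvalidated frontmatter.
type Frontmatter = Record<string, unknown>;

const KNOWN_LEVELS: string[] = ["Foundation", "Intermediate", "Advanced"];

function renderDifficultyBadge(frontmatter: Frontmatter): string {
  const level = frontmatter["difficulty"];
  // No schema upstream, so the component has to guess whether the data is usable.
  if (typeof level !== "string" || KNOWN_LEVELS.indexOf(level) === -1) {
    return ""; // Silent fallback: no badge, no warning, no failed build.
  }
  return `<span class="badge">${level}</span>`;
}

const correct: Frontmatter = { title: "Typed Content", difficulty: "Advanced" };
const misspelled: Frontmatter = { title: "Typed Content", diffculty: "hard" };

const correctBadge = renderDifficultyBadge(correct);
const misspelledBadge = renderDifficultyBadge(misspelled);
```

The misspelled entry produces an empty string rather than an error, which is exactly why the problem goes unnoticed until a reader spots the missing badge.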

The Typed Collection (The Solution)

Astro’s Content Collections fix this by introducing a strict boundary between content and rendering.

// src/content.config.ts
import { defineCollection } from 'astro:content';
import { glob } from 'astro/loaders';
import { z } from 'astro/zod';

// Each MDX file under src/content/guides becomes one validated entry.
const guides = defineCollection({
  loader: glob({ pattern: "**/*.mdx", base: "./src/content/guides" }),
  schema: z.object({
    title: z.string(),
    description: z.string(),
    topicKeys: z.array(z.string()), // Enforces the graph relationship
    difficulty: z.enum(["Foundation", "Intermediate", "Advanced"]), // Prevents typos
  }),
});

// Astro only registers collections exposed through this named export.
export const collections = { guides };

The architectural consequence: content errors are now build errors. If an author omits the topicKeys array or misspells a difficulty value, the Astro build fails with a validation error, so the site cannot deploy in a broken state.
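The behavior can be sketched without Astro. The block below is a hand-rolled stand-in for the schema check, not Astro's or Zod's implementation, and validateGuide and GuideData are hypothetical names. It only illustrates the principle that invalid frontmatter throws before any page renders.

```typescript
// Hand-rolled stand-in for the Zod schema above; illustrative only, not Astro's internals.
const DIFFICULTIES = ["Foundation", "Intermediate", "Advanced"] as const;
type Difficulty = (typeof DIFFICULTIES)[number];

interface GuideData {
  title: string;
  description: string;
  topicKeys: string[];
  difficulty: Difficulty;
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Returns typed data, or throws an error listing every problem found in the entry.
function validateGuide(id: string, raw: Record<string, unknown>): GuideData {
  const problems: string[] = [];
  const { title, description, topicKeys, difficulty } = raw;

  if (typeof title !== "string") problems.push("title: expected string");
  if (typeof description !== "string") problems.push("description: expected string");
  if (!isStringArray(topicKeys)) problems.push("topicKeys: expected array of strings");
  if (!DIFFICULTIES.some((allowed) => allowed === difficulty)) {
    problems.push("difficulty: expected one of " + DIFFICULTIES.join(", "));
  }

  if (problems.length > 0) {
    throw new Error(`Invalid frontmatter in "${id}": ${problems.join("; ")}`);
  }
  return raw as unknown as GuideData;
}

// A valid entry passes through with a precise type.
const valid = validateGuide("typed-content", {
  title: "Typed Content",
  description: "Why schemas matter.",
  topicKeys: ["architecture"],
  difficulty: "Advanced",
});

// The misspelled entry stops the "build" instead of rendering a broken page.
let buildError = "";
try {
  validateGuide("broken-guide", {
    title: "Broken Guide",
    description: "Missing topicKeys and a misspelled difficulty key.",
    diffculty: "hard",
  });
} catch (error) {
  buildError = error instanceof Error ? error.message : String(error);
}
```

The design point is the location of the failure: the error surfaces at the content boundary with the entry id attached, not inside a component that has to guess at the shape of its data.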

Frontmatter Is the Relational Schema

Once you have typed collections, frontmatter stops being just a place to put the title. It becomes the relational schema that builds your content graph.

In Interface Atlas, we do not hardcode “Related Guides” at the bottom of MDX files. We use frontmatter keys:

---
title: "Information Architecture"
topicKeys: ["architecture", "seo"]
glossaryKeys: ["taxonomy"]
relatedGuideKeys: ["what-frontend-architecture-really-is"]
---

Because these keys are validated by the collection schema, the routing layer can generate the related links, the topic hub chips, and the glossary tooltips automatically.
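Here is a minimal sketch of how a routing layer might turn those keys into links. Everything in it is an assumption made for illustration: the GuideEntry shape, the /guides/ URL pattern, the helper names, and the title of the second sample guide. In a real Astro project the entries would come from the collection rather than an inline array.

```typescript
// Hypothetical entry shape, loosely modeled on a validated collection entry.
interface GuideEntry {
  id: string; // assumed to match the key other guides use to reference this one
  data: { title: string; topicKeys: string[]; relatedGuideKeys: string[] };
}

interface Link {
  href: string;
  label: string;
}

function indexById(guides: GuideEntry[]): Record<string, GuideEntry> {
  const byId: Record<string, GuideEntry> = {};
  for (const guide of guides) byId[guide.id] = guide;
  return byId;
}

// Resolves relatedGuideKeys into links; a dangling key is an error, not a dead link.
function resolveRelatedGuides(entry: GuideEntry, byId: Record<string, GuideEntry>): Link[] {
  return entry.data.relatedGuideKeys.map((key) => {
    const target = byId[key];
    if (!target) {
      throw new Error(`"${entry.id}" references unknown guide "${key}"`);
    }
    return { href: `/guides/${target.id}/`, label: target.data.title };
  });
}

// Inverts topicKeys into topic -> guide ids, the data a topic hub page needs.
function groupByTopic(guides: GuideEntry[]): Record<string, string[]> {
  const hubs: Record<string, string[]> = {};
  for (const guide of guides) {
    for (const topic of guide.data.topicKeys) {
      (hubs[topic] = hubs[topic] || []).push(guide.id);
    }
  }
  return hubs;
}

// Sample data for illustration only.
const informationArchitecture: GuideEntry = {
  id: "information-architecture",
  data: {
    title: "Information Architecture",
    topicKeys: ["architecture", "seo"],
    relatedGuideKeys: ["what-frontend-architecture-really-is"],
  },
};
const frontendArchitecture: GuideEntry = {
  id: "what-frontend-architecture-really-is",
  data: { title: "What Frontend Architecture Really Is", topicKeys: ["architecture"], relatedGuideKeys: [] },
};

const allGuides = [informationArchitecture, frontendArchitecture];
const relatedLinks = resolveRelatedGuides(informationArchitecture, indexById(allGuides));
const topicHubs = groupByTopic(allGuides);
```

One caveat for the schema side: Zod object schemas strip unrecognized keys by default, so glossaryKeys and relatedGuideKeys would need to be declared in the collection schema before they reach code like this.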

The Git-Based Editorial Workflow

When the repository is the CMS, the Git workflow is the editorial workflow.

  1. Drafting: An author creates a branch and writes an MDX file.
  2. Review: The Pull Request becomes the editorial review. Reviewers can comment on prose the same way they comment on code.
  3. Validation: CI/CD runs the Astro build. The schema validates the frontmatter, and MDX validates the component syntax.
  4. Publishing: Merging to main triggers a static deployment to the edge.

This workflow fails when teams try to force non-technical editors to use GitHub directly without a tool like Decap CMS, or when the MDX files become so polluted with complex JSX that authors can no longer read the text.

When MDX Is Not Enough

MDX is a hybrid medium. It allows you to embed complex interactive components directly into long-form text. But you must know when to stop.

If an MDX file contains 40 lines of prose and 300 lines of complex React state management, you have crossed a boundary. The content is now subservient to the application logic.

The Rule: Keep MDX files focused on reading. If an interactive widget requires complex local state, abstract it into an Island component and pass it simple props from the MDX file.

Explore the supporting systems: