Worker Size Optimization
How ArgoBox stays under Cloudflare's 3 MiB Worker size limit while running 100+ API routes
Cloudflare Workers on the free tier have a hard 3 MiB compressed size limit. ArgoBox runs 100+ API routes and multiple SSR pages through a single Worker. This doc covers how the site stays under that limit despite having 400+ content entries.
The Problem
Astro 5 introduced a "data layer" that bundles all content collection data into the SSR Worker when any SSR code path imports from astro:content. With ~402 content entries totaling ~4.5 MB of Markdown, the data layer chunk alone compiled to 10.9 MB uncompressed (~3.5 MiB compressed) -- well over CF's 3 MiB limit.
The root cause: a single import { getCollection } from 'astro:content' in any SSR-rendered page or API route pulls the entire data layer into the Worker bundle. It does not tree-shake by collection. If one API route imports getCollection, every collection's data ships.
Architecture: Prerender vs SSR
In Astro's output: 'static' mode (ArgoBox now uses this), all pages are prerendered to static HTML by default. Only pages that explicitly export prerender = false become SSR and run as part of the Worker.
Admin pages that are pure client-side HTML + JS (inline <script> tags, fetch() calls to API routes) do not need SSR. They work fine as static files.
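For illustration, a minimal static admin page of this kind might look like the following sketch (the /api/dashboard/status route exists in this codebase, but the response field shown is a placeholder):

```astro
---
// No `export const prerender = false` here, so in output: 'static' mode
// this page compiles to plain HTML at build time.
---
<div id="status">loading...</div>
<script>
  // All dynamic behavior happens client-side against the SSR API routes.
  const el = document.getElementById('status');
  fetch('/api/dashboard/status')
    .then((r) => r.json())
    // `d.state` is a hypothetical field, shown only for illustration.
    .then((d) => { el.textContent = String(d.state); });
</script>
```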
Only 4 admin pages actually require SSR because they use server-side Astro APIs:
| Page | Why SSR |
|---|---|
| auth-bounce | Uses Astro.redirect() |
| servers/[slug] | Uses Astro.params, Astro.request, D1 queries |
| servers/masaimara | Reads Astro.request.headers |
| dashboard-profiles/edit/[id] | Dynamic route via Astro.params |
Everything else -- blog posts, journal entries, docs, the homepage, admin panels -- prerenders at build time and serves from the CDN.
Solution: content-api.ts
src/lib/content-api.ts is a drop-in replacement for getCollection() that never imports from astro:content. It provides three functions:
```ts
// Metadata only -- synchronous, no network calls, reads from build-time JSON
const posts = getCollectionMeta('posts');

// Full entries with body text -- async, fetches from Gitea API
const fullPosts = await getCollectionSSR('posts');

// Single entry with body text
const entry = await getEntrySSR('posts', '2026-01-15-my-post.md');
```
How it works
Build time: scripts/build-content-index.mjs walks all content directories, parses YAML frontmatter from each .md/.mdx file, and writes src/data/content-index.json. This file contains every entry's id, slug, collection, and data (frontmatter fields) but no body text. For 402 entries across 6 collections, the index is ~197 KB.
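The index builder can be sketched as follows. This is a simplified TypeScript sketch, not the real .mjs script: the toy parseFrontmatter below only handles flat key: value lines (an assumption for brevity; the real script parses full YAML).

```typescript
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Toy frontmatter parser: handles only flat `key: value` pairs.
function parseFrontmatter(md: string): Record<string, string> {
  const m = md.match(/^---\n([\s\S]*?)\n---/);
  if (!m) return {};
  const data: Record<string, string> = {};
  for (const line of m[1].split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) data[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return data;
}

// Walk one collection directory and emit metadata entries with no body text.
// The real script does this for every collection, then writes the combined
// array to src/data/content-index.json.
function indexCollection(dir: string, collection: string) {
  return readdirSync(dir)
    .filter((f) => /\.mdx?$/.test(f))
    .map((file) => ({
      id: file,
      slug: file.replace(/\.mdx?$/, ""),
      collection,
      data: parseFrontmatter(readFileSync(join(dir, file), "utf8")),
    }));
}
```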
Runtime: getCollectionMeta() reads directly from the imported JSON -- synchronous, no I/O. When body text is needed, getCollectionSSR() and getEntrySSR() fetch raw Markdown from the Gitea API via src/lib/content-backend.ts, strip frontmatter, and return the body. Fetches run in parallel with a batch size of 15.
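The batching strategy can be sketched with a small helper. Names here are illustrative stand-ins, not the actual content-api.ts internals, and the frontmatter-stripping regex is an assumption:

```typescript
// Run `fn` over `items` at most `batchSize` at a time (15 in ArgoBox),
// collecting results in input order. Each batch runs in parallel;
// batches themselves run sequentially.
async function inBatches<T, R>(
  items: T[],
  batchSize: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const out: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    out.push(...(await Promise.all(batch.map(fn))));
  }
  return out;
}

// Strip a leading `--- ... ---` frontmatter block before returning the body.
function stripFrontmatter(md: string): string {
  return md.replace(/^---\n[\s\S]*?\n---\n?/, "");
}
```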
In development: content-backend.ts detects import.meta.env.DEV and reads files from the local filesystem instead of Gitea.
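A sketch of that switch (the function name and Gitea URL shape are assumptions; the real code keys off import.meta.env.DEV, which is parameterized here so the sketch runs outside Vite):

```typescript
import { readFileSync } from "node:fs";

// Dev: read Markdown from the local filesystem.
// Prod: fetch raw Markdown from the Gitea API (base URL is a placeholder).
export async function readRawMarkdown(
  path: string,
  opts: { dev: boolean; giteaRawBase?: string },
): Promise<string> {
  if (opts.dev) {
    return readFileSync(path, "utf8");
  }
  const res = await fetch(`${opts.giteaRawBase}/${path}`);
  if (!res.ok) throw new Error(`Gitea fetch failed: ${res.status}`);
  return res.text();
}
```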
The critical invariant: zero SSR code imports from astro:content. The data layer is only used by prerendered pages (which don't count toward Worker size because they compile to static HTML at build time).
Solution: Prerender Everything Possible
72 admin pages previously had export const prerender = false despite not using any SSR APIs. Removing that export from those pages moved them from the Worker bundle to static HTML.
Module source files under modules/ (e.g., modules/pentest/, modules/api-dashboard/) also had to be fixed. The install-modules.sh script copies module files into src/ at prebuild time -- if the source .astro files contained prerender = false, it would get re-added on every build. The fix was applied at the source.
Solution: public-docs-loader.ts
The /docs/ route presents sanitized documentation to the public. These files live in src/content/public-docs/ and need to be loaded both in getStaticPaths() (module scope) and in the page template (instance scope).
Astro splits frontmatter into two scopes: module scope (where getStaticPaths lives) and instance scope (where the page renders). Variables declared in one scope are not accessible in the other.
src/lib/public-docs-loader.ts solves this by putting the glob in an importable module:
```ts
const modules = import.meta.glob('/src/content/public-docs/**/*.md', { eager: true });

export const allPublicDocs: PublicDoc[] = Object.entries(modules).map(
  ([path, mod]: [string, any]) => ({
    slug: getSlug(path),
    data: mod.frontmatter,
    Content: mod.Content,
    getHeadings: mod.getHeadings,
  })
);
```
Both scopes import allPublicDocs from the same module. Because public-docs is loaded via import.meta.glob instead of getCollection('public-docs'), it does not pull in the data layer. The glob resolves at build time for prerendered pages, keeping the docs out of the Worker entirely.
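The getSlug helper used above is not shown in the snippet; one plausible implementation (the exact slug rules are an assumption) strips everything up to the public-docs directory plus the file extension:

```typescript
// Map a glob key like "/src/content/public-docs/setup/install.md"
// to a doc slug like "setup/install". Hypothetical implementation.
function getSlug(path: string): string {
  return path.replace(/^.*\/public-docs\//, "").replace(/\.md$/, "");
}
```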
Results
| Metric | Before | After |
|---|---|---|
| Worker uncompressed | 10.9 MB | 6.1 MB |
| Worker compressed | ~3.5 MiB | ~1.0 MiB |
| CF limit | 3 MiB | 3 MiB |
| Content index size | N/A | 197 KB |
The compressed Worker went from over the limit to well under it, with room to grow.
Key Files
| File | Role |
|---|---|
| src/lib/content-api.ts | SSR-safe content access (getCollectionMeta, getCollectionSSR, getEntrySSR) |
| src/lib/content-backend.ts | File I/O abstraction -- local FS in dev, Gitea API in production |
| scripts/build-content-index.mjs | Prebuild script that generates src/data/content-index.json |
| src/data/content-index.json | Build-time metadata for all 6 collections (gitignored, regenerated each build) |
| src/lib/public-docs-loader.ts | import.meta.glob loader for public docs, shared across Astro scopes |
Rules for Future Development
Never import from astro:content in SSR code. Any file that ends up in the Worker bundle must not touch the data layer. Use getCollectionSSR() or getCollectionMeta() from content-api.ts instead.
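As a sketch, an SSR API route that respects this rule looks like the following. The index contents and route shape are illustrative stand-ins, with the JSON import replaced by an inline array so the snippet is self-contained:

```typescript
// Stand-in for src/data/content-index.json, normally imported as JSON.
type Meta = {
  id: string;
  slug: string;
  collection: string;
  data: Record<string, unknown>;
};

const contentIndex: Meta[] = [
  {
    id: "2026-01-15-my-post.md",
    slug: "2026-01-15-my-post",
    collection: "posts",
    data: { title: "Hi" },
  },
];

// Synchronous metadata lookup -- no astro:content import, so no data layer
// ends up in the Worker bundle.
function getCollectionMeta(collection: string): Meta[] {
  return contentIndex.filter((e) => e.collection === collection);
}

// Simplified shape of an SSR route handler (Astro's APIRoute, minus types).
export async function GET(): Promise<Response> {
  const posts = getCollectionMeta("posts");
  return new Response(JSON.stringify({ count: posts.length }), {
    headers: { "content-type": "application/json" },
  });
}
```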
Only add prerender = false when the page actually needs it. The page must use Astro.request, Astro.redirect(), Astro.params on a dynamic route, or server-side bindings (KV, D1). If it just fetches from API routes via client-side JS, it does not need SSR.
Module source files in modules/ must not have prerender = false for .astro pages. The install-modules.sh script copies these files into src/ on every build. If the source has the SSR flag, the flag reappears even after manual removal.
The prebuild chain runs in this order:
packages/argonaut → tsc (compile @argonaut/core — dist/ is gitignored)
→ install-modules.sh
→ sanitize-public-docs.js
→ build-content-index.mjs
→ build-rag-stats.js
→ security-scan.mjs
The @argonaut/core TypeScript compilation must run first because it is a local workspace package ("file:packages/argonaut" in package.json) whose dist/ directory is gitignored. CF Pages clones a fresh repo with no compiled output — if the prebuild doesn't compile the package, Vite fails with Failed to resolve entry for package "@argonaut/core".
content-index.json must be generated after install-modules.sh (which may add new content) and after sanitize-public-docs.js (which generates the public-docs collection). The ordering in package.json enforces this.
Incident: 2026-03-08
Two issues hit simultaneously, each masking the other:
1. @argonaut/core not compiled -- the prebuild script didn't include tsc for the argonaut package. CF Pages got Failed to resolve entry for package "@argonaut/core". Fix: added cd packages/argonaut && npx tsc && cd ../.. as the first prebuild step.
2. 12 MB data layer in Worker -- once the build got past the argonaut error, the deploy failed with "Your Worker exceeded the size limit of 3 MiB." Root cause: src/pages/api/dashboard/status.ts (an SSR route with prerender = false) imported getCollection directly from astro:content, pulling the entire 12,165 KiB content data layer into the Worker bundle. Fix: replaced with getCollectionMeta() from content-api.ts.
The data layer doesn't tree-shake. Even one SSR file importing astro:content bundles every collection. This is why content-api.ts exists — use it for all SSR code.