HTML to PDF library vs API — when self-host wins, when it doesn't

21pdf Engineering · 2026-04-24 · 12 min read · Comparisons

Every few months, someone on our team or a customer’s team proposes “we should just self-host Puppeteer — how hard can it be?” The answer is harder than you’d guess, for longer than you’d want, and this post is the structured version of the rant I’ve given several times.

It’s not a trick question. Self-hosting HTML-to-PDF rendering is the right answer for some teams. It’s the wrong answer for most. The line between them is clearer than it looks once you separate the engineering work from the cloud bill.

TL;DR

Self-host wins when all three hold: volume > 100k PDFs/day, PDF rendering is core to your product, and you have a dedicated engineer comfortable with browser infrastructure.
Managed API wins when any of the above is false, including “we have the engineer but they should be building the actual product”.
The hidden cost of self-hosting is operational: Chromium memory leaks, font management, SSRF hardening, queue back-pressure, on-call for 3am alerts. None of it is hard; all of it adds up.
The crossover volume is roughly 100k PDFs/day (~3M/month). Below that, paying $29-$499/month for a managed API is cheaper than the engineer-hours required to do it right.
Gotenberg is the most turnkey self-hosted option — a Docker image that wraps Chromium via a clean HTTP API. Start there if you must self-host.

The real choice

The question isn’t “library vs API” in the abstract. It’s a specific engineering decision with three variables:

How much PDF rendering am I doing? Measured in PDFs/day at peak, not average.
How central is PDF rendering to my product? A billing SaaS that renders invoices is in the “central” bucket. A collaboration tool that renders “export to PDF” when users click the button is in the “peripheral” bucket.
What’s my team’s operational appetite? Specifically, who wakes up at 3am when the PDF queue backs up or the Chromium pool OOMs?

The table below walks through six combinations of those three variables. The self-host vs API tradeoff is different in each.

Volume	Centrality	Op appetite	Recommendation
< 10k/day	Peripheral	Low	Managed API ($9-$29/mo) — no-brainer
< 10k/day	Central	Low	Managed API ($29/mo) — spend engineering on the real product
< 10k/day	Central	High	Managed API first, revisit at 10× growth
10k-100k/day	Peripheral	Low	Managed API ($69-$299/mo) — at this volume you still win
10k-100k/day	Central	High	Either works; lean self-host if you want control
> 100k/day	Central	High	Self-host (Gotenberg or custom) — vendor margins start to hurt

In the table above, the first four rows go to managed APIs; self-hosting only starts to make sense in the last two. Most SaaS teams live in the first four rows and don’t know it because they overestimate their volume and underestimate the operational tax.

What “library” means, concretely

“Library” in this post covers every option where you run the Chromium pool yourself:

Puppeteer (Node.js, Google-maintained): npm i puppeteer, browser.newPage(), page.pdf(). Bundles a specific Chromium build and is tightly coupled to that version.

Playwright (Node/Python/Java/.NET, Microsoft): similar API, multi-browser (though PDF only on Chromium). More ergonomic for complex interaction scenarios.

chromedp (Go): talks the Chrome DevTools Protocol (CDP) directly, no Puppeteer abstraction. Good if your stack is Go.

Gotenberg: Docker image wrapping Chromium and LibreOffice, plus PDF manipulation tools. Clean HTTP API, no client library required. Most turnkey.

WeasyPrint (Python): its own rendering engine, not Chromium-based. Decent for simple reports, and it handles CSS paged media (@page margin boxes included), but it runs no JavaScript, its flexbox and grid support trails browsers, and it has no container queries.

Prince XML: commercial, CSS Paged Media Level 3-compliant, typeset quality. $3800+/server licence. Used by publishers.

Neither WeasyPrint nor wkhtmltopdf is the right answer for greenfield work in 2026: WeasyPrint for the CSS limitations above, wkhtmltopdf because upstream is dead.

The discussion below centres on Puppeteer/Playwright/Gotenberg — the three Chromium-based libraries that are credible self-host options today.

Cost model, with real numbers

Let’s cost out a concrete scenario: 50,000 PDFs/day (~1.5M/month), mixed sync and async.

Managed API option (21pdf Business tier)

50,000/day × 30 = 1.5M/month — above the 50k/month Business tier
Needs a custom plan at this volume, usually in the $199-$499/month range
Engineering time: ~2 hours/month on integration maintenance
Total monthly: ~$299 + 2hrs eng

Self-host option (Puppeteer on Kubernetes)

5 worker pods (2 vCPU, 4GB RAM each): ~$150/month cloud cost
Load balancer: ~$20/month
Redis queue: ~$15/month
Object storage (1.5M PDFs/month × avg 200KB = 300GB): ~$7/month S3 + $15/month egress
Monitoring (Datadog / Prometheus+Grafana): ~$50/month (small)
Engineering time to set up: 40-80 hours
Engineering time to maintain: 5-15 hours/month for upgrades, incident response, font updates, SSRF reviews, version pinning
Total monthly: ~$255 cloud + 10hrs eng

At a fully-loaded engineer cost of $150/hr, that’s 10 hrs × $150 = $1,500/month in engineering time plus $255 cloud. Total: $1,755/month for self-host vs ~$600/month ($299 + 2 hrs × $150) for the managed API.

The cloud bill is roughly the same. The engineering time is the difference. Self-hosting pays off when the cloud bill replaces a larger API bill AND you genuinely have spare engineering capacity.

At 500,000 PDFs/day (~15M/month), the gap closes:

Managed API: ~$2000-$5000/month at volume-negotiated rate
Self-host: ~$500 cloud + 15hrs eng = ~$2,750/month

Approximately a wash. Now the decision is about control and product centrality, not cost.
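
The 50,000 PDFs/day comparison is easy to rerun with your own inputs. Here is a minimal sketch, assuming the same simple model used above (a flat monthly bill plus engineer-hours at a loaded rate). `CostOption`, `monthlyCost` and `breakEvenApiBill` are illustrative names, not part of any API.

```typescript
// Monthly cost = recurring bill + engineering hours at a fully-loaded hourly rate.
// All names are illustrative; plug in your own numbers.
interface CostOption {
  name: string;
  monthlyBillUsd: number;   // vendor invoice or cloud bill
  engHoursPerMonth: number; // ongoing maintenance, not one-off setup
}

function monthlyCost(option: CostOption, rateUsdPerHour: number): number {
  return option.monthlyBillUsd + option.engHoursPerMonth * rateUsdPerHour;
}

// API bill at which both options cost the same, holding engineering hours fixed.
function breakEvenApiBill(
  selfHosted: CostOption,
  managedEngHours: number,
  rateUsdPerHour: number,
): number {
  return monthlyCost(selfHosted, rateUsdPerHour) - managedEngHours * rateUsdPerHour;
}

const RATE = 150;
const managed: CostOption = { name: 'managed API', monthlyBillUsd: 299, engHoursPerMonth: 2 };
const selfHost: CostOption = { name: 'self-host', monthlyBillUsd: 255, engHoursPerMonth: 10 };

for (const option of [managed, selfHost]) {
  console.log(`${option.name}: $${monthlyCost(option, RATE)}/month`);
}
console.log(`break-even API bill: $${breakEvenApiBill(selfHost, 2, RATE)}/month`);
```

With these inputs the managed API comes to $599/month and self-hosting to $1,755/month, so under this model self-hosting only wins once the API bill passes about $1,455/month. The model ignores the 40-80 hours of setup time, which pushes the break-even higher still.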

What you have to build yourself (beyond Chromium)

If you self-host, you’re not done when page.pdf() works. You’re done when all of these work:

1. Process pool with recycling

One Chromium per worker, create a tab per request, close tab after. Recycle the whole process every N requests or M minutes because Chromium leaks memory. Typical N=500-2000, M=24hrs.

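// Sketch only: the real thing is hundreds of lines.
import puppeteer, { type Browser } from 'puppeteer';

type Pooled = { browser: Browser; uses: number; spawned: number };

class BrowserPool {
  private static readonly MAX_USES = 2000;
  private static readonly MAX_AGE_MS = 86_400_000; // 24 hours
  private browsers: Pooled[] = [];

  async acquire(): Promise<Browser> {
    await this.reap();
    const eligible = this.browsers.find((b) => !this.expired(b));
    if (eligible) { eligible.uses++; return eligible.browser; }
    return this.spawn();
  }

  private async spawn(): Promise<Browser> {
    const browser = await puppeteer.launch({
      // --no-sandbox is common in containers but unsafe for untrusted input;
      // prefer a non-root user with the sandbox on. Add other flags as needed.
      args: ['--no-sandbox', '--disable-dev-shm-usage'],
    });
    this.browsers.push({ browser, uses: 1, spawned: Date.now() });
    return browser;
  }

  private expired(b: Pooled): boolean {
    return b.uses >= BrowserPool.MAX_USES || Date.now() - b.spawned >= BrowserPool.MAX_AGE_MS;
  }

  // Close browsers past their use count or age; replacements spawn on demand.
  // A real pool also waits for in-flight tabs to finish before closing.
  private async reap(): Promise<void> {
    const dead = this.browsers.filter((b) => this.expired(b));
    this.browsers = this.browsers.filter((b) => !this.expired(b));
    await Promise.all(dead.map((b) => b.browser.close().catch(() => undefined)));
  }
}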

2. SSRF hardening

If you accept URL inputs, you need:

HTTP-boundary blocking of private IP ranges before fetch
In-browser request interception to re-check every sub-request (DNS rebinding defence)
Handling for redirects, WebSocket, EventSource, dynamic imports

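// Boundary check example.
// isPrivateIP, isLinkLocal, isLoopback and shouldBlockSSRF are your own helpers.
import { lookup } from 'node:dns/promises';

async function isSSRF(url: string): Promise<boolean> {
  const u = new URL(url);
  if (u.protocol !== 'http:' && u.protocol !== 'https:') return true;
  // URL keeps the brackets on IPv6 literals; strip them before resolving.
  const host = u.hostname.replace(/^\[|\]$/g, '');
  // Check every address the hostname resolves to, not just the first.
  const addrs = await lookup(host, { all: true });
  return addrs.some(({ address }) => isPrivateIP(address) || isLinkLocal(address) || isLoopback(address));
}

// In-page interception. Interception must be enabled first, otherwise
// abort() and continue() throw.
await page.setRequestInterception(true);
page.on('request', (req) => {
  if (shouldBlockSSRF(req.url())) {
    void req.abort();
  } else {
    void req.continue();
  }
});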

The real implementations are longer. Missing either layer is a CVE waiting to happen.
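
For a sense of what the address checks involve, here is one possible shape, collapsed into a single hypothetical function `isBlockedAddress`. It is a sketch under a loud simplification: it only understands dotted-quad IPv4 and rejects everything else, so a real service needs separate IPv6 handling (including IPv4-mapped addresses).

```typescript
// Parses a dotted-quad IPv4 literal into four octets, or returns null.
function parseIPv4(ip: string): number[] | null {
  const parts = ip.split('.');
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const n = Number(part);
    if (n > 255) return null;
    octets.push(n);
  }
  return octets;
}

// True when the address must not be fetched on a customer's behalf.
// Anything that is not well-formed IPv4 is blocked (fail closed).
function isBlockedAddress(ip: string): boolean {
  const octets = parseIPv4(ip);
  if (octets === null) return true;
  const [a, b] = octets;
  return (
    a === 0 ||                            // 0.0.0.0/8 "this network"
    a === 10 ||                           // 10.0.0.0/8 private
    a === 127 ||                          // 127.0.0.0/8 loopback
    (a === 100 && b >= 64 && b <= 127) || // 100.64.0.0/10 carrier-grade NAT
    (a === 169 && b === 254) ||           // 169.254.0.0/16 link-local, incl. cloud metadata
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12 private
    (a === 192 && b === 168)              // 192.168.0.0/16 private
  );
}
```

The same function has to run at both layers: once on the resolved address at the HTTP boundary, and again on every sub-request inside the browser.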

3. Queue and back-pressure

Don’t accept a request if the pool is saturated — return 429 with Retry-After. Enforce per-customer concurrency limits to prevent noisy-neighbour issues.
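
A minimal in-memory sketch of that admission check follows. The class name, limits and the fixed Retry-After value are all assumptions for illustration; a multi-instance deployment would keep these counters in something shared such as Redis.

```typescript
// Admission control: reject instead of queueing when the pool or a single
// customer is at its limit. All names and limits here are illustrative.
type Admission =
  | { ok: true; release: () => void }
  | { ok: false; status: 429; retryAfterSeconds: number };

class AdmissionGate {
  private total = 0;
  private perCustomer = new Map<string, number>();
  private readonly maxTotal: number;
  private readonly maxPerCustomer: number;
  private readonly retryAfterSeconds: number;

  constructor(maxTotal: number, maxPerCustomer: number, retryAfterSeconds = 5) {
    this.maxTotal = maxTotal;
    this.maxPerCustomer = maxPerCustomer;
    this.retryAfterSeconds = retryAfterSeconds;
  }

  tryAcquire(customerId: string): Admission {
    const mine = this.perCustomer.get(customerId) ?? 0;
    if (this.total >= this.maxTotal || mine >= this.maxPerCustomer) {
      return { ok: false, status: 429, retryAfterSeconds: this.retryAfterSeconds };
    }
    this.total++;
    this.perCustomer.set(customerId, mine + 1);
    let released = false;
    return {
      ok: true,
      release: () => {
        if (released) return; // releasing twice must not free two slots
        released = true;
        this.total--;
        const left = (this.perCustomer.get(customerId) ?? 1) - 1;
        if (left <= 0) this.perCustomer.delete(customerId);
        else this.perCustomer.set(customerId, left);
      },
    };
  }
}
```

The request handler calls `tryAcquire` before touching the browser pool, maps a rejection to a 429 response with a `Retry-After` header, and calls `release` in a `finally` block once the render finishes or fails.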

4. Font management

Ship a font catalogue with your Docker image. Update it when customers report missing glyphs. Handle CDN-fetched web fonts (wait_for_network_idle, or inline as base64).

5. Error classification

Distinguish “customer HTML is broken” (4xx) from “our service is broken” (5xx). Alert silencing, retry strategy, and alert routing all depend on this.
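
One way to make that split explicit is a single mapping from failure kind to response and alerting behaviour. The taxonomy and status codes below are assumptions chosen for illustration, not a standard; the point is that the decision lives in one place.

```typescript
// Illustrative failure taxonomy; the kinds and status codes are assumptions.
type FailureKind =
  | 'invalid_input'       // malformed request or options
  | 'target_unreachable'  // customer URL fails DNS or returns an error
  | 'browser_crashed'     // our Chromium process died mid-render
  | 'storage_failed';     // our upload of the finished PDF failed

interface Disposition {
  httpStatus: number;
  retryable: boolean; // safe for the client or queue to retry unchanged
  alert: boolean;     // counts toward on-call alerting
}

function classify(kind: FailureKind): Disposition {
  switch (kind) {
    case 'invalid_input':
      return { httpStatus: 400, retryable: false, alert: false };
    case 'target_unreachable':
      return { httpStatus: 422, retryable: false, alert: false };
    case 'browser_crashed':
      return { httpStatus: 503, retryable: true, alert: true };
    case 'storage_failed':
      return { httpStatus: 502, retryable: true, alert: true };
  }
}
```

Timeouts are the awkward case and are left out on purpose: a page that never settles is usually the customer's problem, but a spike of timeouts across many customers is yours.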

6. Metrics + alerting

Render duration histogram (p50/p95/p99), queue depth, worker memory, render success rate. Alert on anomalies without waking you up for normal traffic spikes.
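
If you have not worked with latency percentiles before, this is what p50/p95/p99 mean over a window of raw samples. It is a sketch of the nearest-rank definition only; in production you would normally let your metrics system compute these from histogram buckets rather than sorting samples yourself.

```typescript
// Nearest-rank percentile over a window of render durations (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((x, y) => x - y);
  // 1-based rank; multiply before dividing to avoid floating-point drift.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Example window: 100 renders taking 1ms, 2ms, ... 100ms.
const durations = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(percentile(durations, 50), percentile(durations, 95), percentile(durations, 99));
```

Alerting on p95 or p99 over a few minutes, rather than on individual slow renders, is what keeps a normal traffic spike from paging anyone.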

7. Output storage

Stream PDFs to S3/R2/GCS. Handle upload failures. Lifecycle policies for retention. Signed URLs for delivery.

8. Auth

If the service is internal only, this is easy. If you expose it to your customers (many SaaS do), you need API-key auth, rate limits, billing integration.

Each of these is tractable — nothing exotic. The cumulative cost of doing them all well, maintaining them, and iterating on them is what pays for the managed API.

The Gotenberg shortcut

If you’re going to self-host, Gotenberg is the starting point. It’s a Docker image that exposes Chromium’s PDF path over a clean HTTP API. You get:

Chromium rendering with sensible defaults
Network-idle waiting
CSS @page support
Multi-format support (Word/Excel/PowerPoint via LibreOffice, if you need it)
Built-in URL and HTML endpoints
OpenTelemetry hooks

You miss:

Multi-tenant auth and quotas
SSRF hardening beyond basic URL allowlisting
Queue/back-pressure built-in

For an internal service (employees render PDFs through an auth’d front-end), Gotenberg is excellent — 10 minutes to Docker-up, working PDF service. For an external service (customers hit it with API keys), you wrap Gotenberg with your own auth/SSRF/queue layer.

Gotenberg deployment sketch

# docker-compose.yml
services:
  gotenberg:
    image: gotenberg/gotenberg:8
    ports: ["3000:3000"]
    restart: unless-stopped
    command:
      - "gotenberg"
      - "--api-port=3000"
      - "--api-timeout=60s"
      - "--chromium-disable-javascript=false"
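      # SSRF: pattern-based deny of private ranges for URL inputs.
      # Matches literal hosts only; it does not catch hostnames that resolve to private IPs.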
      - "--chromium-allow-list=^https?://.*$"
      - "--chromium-deny-list=^https?://(localhost|127\\..*|10\\..*|172\\.1[6-9]\\..*|172\\.2[0-9]\\..*|172\\.3[0-1]\\..*|192\\.168\\..*|169\\.254\\..*).*$"

Call it from your app:

curl --request POST \
  --url http://gotenberg:3000/forms/chromium/convert/html \
  --form files=@"index.html" \
  --form 'paperWidth=8.27' \
  --form 'paperHeight=11.69' \
  --form 'marginTop=0.79' \
  --output out.pdf
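
The same request from application code might look like the sketch below, assuming Node 18+ where `fetch`, `FormData` and `Blob` are globals. The function names, the default base URL (taken from the compose service name) and the 55-second client timeout are assumptions; the endpoint and form fields are the ones in the curl call.

```typescript
// Builds the same multipart form as the curl call. The uploaded file is
// named index.html, which is what Gotenberg's HTML route looks for.
function buildHtmlForm(html: string): FormData {
  const form = new FormData();
  form.append('files', new Blob([html], { type: 'text/html' }), 'index.html');
  form.append('paperWidth', '8.27');   // inches (A4)
  form.append('paperHeight', '11.69');
  form.append('marginTop', '0.79');
  return form;
}

// Posts the form and returns the PDF bytes. The client timeout is kept
// below the server-side --api-timeout=60s so the client gives up first.
async function renderHtml(html: string, baseUrl = 'http://gotenberg:3000'): Promise<Uint8Array> {
  const res = await fetch(`${baseUrl}/forms/chromium/convert/html`, {
    method: 'POST',
    body: buildHtmlForm(html),
    signal: AbortSignal.timeout(55_000),
  });
  if (!res.ok) {
    throw new Error(`Gotenberg returned ${res.status}: ${await res.text()}`);
  }
  return new Uint8Array(await res.arrayBuffer());
}
```

Let `fetch` set the multipart `Content-Type` header itself; setting it by hand drops the boundary parameter and the upload fails to parse.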

And you’ve got a self-hosted HTML-to-PDF service in less than a day.

Scaling Gotenberg

For higher volume:

Run 3-10 Gotenberg replicas behind a load balancer
Queue requests in Redis/Sidekiq/Celery rather than hitting Gotenberg synchronously
Monitor Chromium memory per container; recycle containers aged 24h+ with rolling restarts
Set --api-timeout=60s and enforce client-side timeouts below that

Kubernetes makes the scaling story ergonomic; plain Docker Compose works for smaller deployments. Either way, you own the uptime.

What you specifically lose with self-host

Beyond engineering hours, there are capabilities most managed APIs ship that you rebuild from scratch:

Multi-region rendering

If your customers span continents, you want rendering close to them (reduces fetch latency for url inputs). Managed APIs like PDFShift and DocRaptor run multi-region pools. Self-hosting this requires deploying to multiple regions and routing intelligently — weeks of work.

Hosted SSRF hardening with a track record

Vendors have been hit by SSRF CVEs, filed them, patched, and learned. Your new deployment starts at zero. If SSRF is a compliance concern, vendor track record matters.

Uptime SLA you can point at

Customer asks “what’s your PDF generator’s SLA?” Vendor has a public status page. You have “uhh, I’ll check Grafana.” Public SLAs matter for enterprise sales.

Fast patch cadence on Chromium CVEs

Chromium announces a critical renderer CVE. Vendors patch within 72 hours (because their customers demand it). Your Docker image update hits whenever you next rebuild.

Global font set

Vendor has Noto Sans with every script, plus Inter, Roboto, system fonts. You have whatever’s in your Debian base image. Miss a script and your customer’s Devanagari / CJK / Arabic content renders as rectangles.

Each of these is buyable/buildable, but each costs engineer-time. Sum them, and you have the vendor’s margin.

What you specifically gain with self-host

There are real reasons to self-host too. If one of these applies, self-hosting is legitimately correct:

Data residency

Customer HTML containing PII must not leave your infrastructure. Healthcare, legal, financial, some government contexts. Managed APIs can typically promise EU or US residency but not, say, “data never leaves our AWS account.” Self-host gives you that.

Truly custom rendering behaviour

You need a Chromium flag or patch that vendors don’t expose (rare but happens — heavily-customised CJK handling, specific colour-profile output, PDF/A-3 with embedded XML for Factur-X, etc.).

Cost at massive scale

At 1M+ PDFs/day, vendor margins stop being trivial. A $0.002-per-PDF vendor price × 30M/month = $60K/month. Self-hosting at that volume is ~$5K/month infra + 1 FTE partial attention = ~$15K/month. $45K/month in your pocket is real money.

Integrated with your internal infra

If your service needs to render PDFs with access to internal data (private Kubernetes networks, internal DNS, VPN-only databases), the Chromium fetch has to be inside that perimeter. Managed APIs can’t do this; self-host can.

Offline / air-gapped

Government, military, classified-data contexts. No external API calls at all. Self-host is the only option.

A practical migration path

If you’re self-hosted and considering an API, or vice versa, here’s the cleanest migration pattern:

From self-host to API

Introduce a PdfRenderer interface in your code with two implementations: SelfHostRenderer (your current Puppeteer wrapper) and ApiRenderer (calls the managed API).
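Route 1% of traffic to the API via a feature flag. Compare outputs on a sample by rasterising pages and diffing them, not byte-for-byte: PDFs embed creation timestamps and document IDs, so the raw bytes differ on every render. The rendered pages should be nearly identical for Chromium-based vendors.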
Ramp up percentage as comparison looks clean. Monitor render-time distributions for regressions.
At 100%, deprecate the self-hosted path. Keep the code for a month in case you need to roll back.

From API to self-host

Same interface pattern.
Deploy Gotenberg in your cluster; test with internal traffic.
Compare output fidelity, render times, memory behaviour for a week.
Cut over production traffic gradually. Keep the API contract live for 30 days as fallback.

In both directions, the interface-first approach means the cutover is a configuration change, not a code rewrite. Worth investing in early.
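
A sketch of the interface and the percentage router, under stated assumptions: `PdfRenderer` is the interface named above, while `RolloutRenderer`, `bucketOf` and the routing key are hypothetical names. The percentage is read through a callback so a feature-flag system can change it without a deploy.

```typescript
// The interface both implementations satisfy.
interface PdfRenderer {
  render(html: string): Promise<Uint8Array>;
}

// Stable FNV-1a-style hash over UTF-16 code units, so the same key always
// lands in the same bucket across processes and restarts.
function bucketOf(key: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) % 100; // 0..99
}

// Sends the configured percentage of routing keys to the candidate renderer.
class RolloutRenderer implements PdfRenderer {
  private readonly current: PdfRenderer;
  private readonly candidate: PdfRenderer;
  private readonly percentToCandidate: () => number;

  constructor(current: PdfRenderer, candidate: PdfRenderer, percentToCandidate: () => number) {
    this.current = current;
    this.candidate = candidate;
    this.percentToCandidate = percentToCandidate;
  }

  pick(routingKey: string): PdfRenderer {
    return bucketOf(routingKey) < this.percentToCandidate() ? this.candidate : this.current;
  }

  // Route by a stable key such as a customer ID; falls back to the HTML itself.
  render(html: string, routingKey: string = html): Promise<Uint8Array> {
    return this.pick(routingKey).render(html);
  }
}
```

Hashing a customer ID rather than picking at random means each customer sees one renderer consistently during the ramp, which makes any rendering difference they report easy to attribute.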

When to revisit the decision

Even once you’ve decided, revisit annually:

From API side, revisit when:

Your bill exceeds $5k/month (start evaluating self-host)
You hit a vendor rate limit or feature gap you can’t work around
You want multi-region rendering and your vendor doesn’t offer it
Compliance or data residency requirements harden

From self-host side, revisit when:

The engineer maintaining it leaves the team
You’ve had > 2 Chromium-related incidents in a quarter
Your volume dropped below 50k/day (cost advantage weakens)
A new vendor launches a specifically-relevant capability

Before you self-host, try 21pdf

100 PDFs/month free tier. $29 Pro for 10,000/month — cheaper than the engineering hours you’d spend on a weekend setting up Puppeteer.


Closing

Self-hosting HTML-to-PDF rendering isn’t hard. Doing it well is time-consuming. That’s the whole tradeoff.

If your volume is real, your team is large, and PDF rendering is a core capability, self-host with eyes open — and pick Gotenberg as the starting point. You’ll save money and gain control.

If your volume is modest, your team is lean, and PDFs are one of a dozen things your product does, use a managed API and spend the engineering time on what actually differentiates your product. The managed API market is competitive enough that no vendor can safely gouge you.

The worst outcome is the middle path — a half-built self-hosted service that nobody has time to maintain, silently producing subtly wrong PDFs until a customer escalates. Pick one side and commit.

— 21pdf Engineering

Frequently asked questions

Is it cheaper to self-host Puppeteer than to use an HTML-to-PDF API?

Not until you're doing serious volume. For under 10k PDFs/day, a managed API at $29-$99/month is cheaper than a single engineer's partial attention on a self-hosted Puppeteer fleet. The crossover point is usually around 100k PDFs/day — below that, you're paying hidden engineering tax to save cash.

What's the best self-hosted HTML-to-PDF tool?

Gotenberg (Docker-packaged Chromium wrapper) is the most turnkey. Puppeteer/Playwright you run yourself gives the most control. Prince XML is the best for quality but commercial-licensed. WeasyPrint is viable for simple reports but lacks modern CSS. Pick Gotenberg unless you have a specific reason to run raw.

How much infrastructure does self-hosted PDF rendering need?

For 100k PDFs/day, budget: 5-10 worker nodes (each ~2 vCPUs, 4GB RAM), object storage (S3/R2/GCS), a Redis queue, a load balancer, monitoring. ~$300-$800/month in cloud costs + engineering time to operate it. A managed API at this volume is typically $500-$2000/month — the break-even on engineering time is the deciding factor.

Can I use Puppeteer in AWS Lambda or Vercel Functions?

Yes, with caveats. @sparticuz/chromium is a stripped-down Chromium designed to fit within Lambda's 250MB unzipped deployment-package limit. Cold starts are 2-4 seconds; warm requests 500ms-1s. Works for occasional-use PDF generation; not a good fit for high-volume or low-latency workloads. Vercel Functions are a similar story, also using @sparticuz/chromium.

Does Gotenberg support the same features as managed APIs?

Mostly. Gotenberg wraps Chromium's PDF path, so CSS @page, network-idle waiting, and margin options all work. It lacks: built-in SSRF hardening beyond basic URL allowlisting, quota/concurrency gates, hosted auth, multi-tenant isolation. For internal use it's fine; as a multi-tenant service you'd rebuild these.

Is Puppeteer harder to secure than a managed API?

Yes. If you accept URLs from customers, you need: HTTP-boundary private-IP blocking, in-page request interception for DNS rebinding, process sandboxing (Chromium's --no-sandbox is unsafe in multi-tenant contexts), resource limits, timeout handling. Vendors have spent engineer-years on this; you'd start from zero.

What's the biggest hidden cost of self-hosting PDF rendering?

Operational on-call. Chromium leaks memory, crashes occasionally, needs font updates, segfaults on some inputs. Someone has to wake up when the render queue backs up at 3am. Most founding teams underestimate this cost until it's too late.