You write documentation once in Markdown. You need it as PDF for stakeholders, HTML for your website, and DOCX for client reviews. So you manually convert it. Every. Single. Time. Until a minor change breaks something, or the generated files drift from the source, or someone forgets to regenerate them entirely.
Sound familiar? You're not alone. Documentation conversion is one of the most tedious, error-prone tasks in software development. But it doesn't have to be. With Docker and Pandoc, you can build a publication pipeline that converts Markdown to any format automatically — every time code changes, without manual intervention.
Why Documentation Automation Matters
Let's talk about the real cost of manual documentation. You spend two hours writing documentation in Markdown. Then you spend another hour converting it to PDF, copying it to the right locations, and updating the client portal. A week later, someone finds a typo. You update the Markdown, regenerate the PDF... and forget to update the DOCX version. Now you have three versions of the "same" document, each corrected at a different time, and no guarantee that they match.
Multiply that by every team member, every project, every week. The hours add up. But more importantly, inconsistent documentation erodes trust. When stakeholders receive different versions of the "same" document, they start questioning what else is inconsistent.
The solution isn't working harder — it's working smarter. Automate the conversion. Write once. Generate everything. Keep it in sync automatically. That's the power of a documentation pipeline built on Docker and Pandoc.
What is Pandoc and Why Docker?
Pandoc calls itself the "universal document converter", and it earns that title. It converts between over 40 formats: Markdown, HTML, LaTeX, EPUB, DOCX, RST, AsciiDoc, and more. It can also write PDF, although PDF is an output format only. You write in lightweight Markdown, and Pandoc transforms it into whatever format your audience needs.
Here's the problem: installing Pandoc is easy, but installing all its dependencies is not. By default, PDF generation goes through a LaTeX distribution, and a full installation can consume 4+ gigabytes of disk space. Different projects might need different LaTeX packages. Different team members might have different versions. Version conflicts create subtle differences in output that are nearly impossible to debug.
Docker solves this elegantly. Package Pandoc with all its dependencies into a single Docker image. Now everyone on your team — Windows, macOS, Linux — uses the exact same environment. The output is always consistent. No more "it works on my machine" for documentation.
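One caveat: the "exact same environment" guarantee only holds if everyone pulls the same image version. An image reference without a tag resolves to latest, which moves as new releases are published. The following is a minimal sketch of a guard a build script could run first. The function name is_pinned is hypothetical, and the tag 3.1 in the comment is illustrative only, so check Docker Hub for the tags that are actually published.

```shell
# Succeed only if an image reference names an explicit version tag or digest.
# "pandoc/core" and "pandoc/core:latest" are rejected; "pandoc/core:3.1" passes.
is_pinned() {
  ref=${1##*/}   # drop registry/namespace so a registry port is not read as a tag
  case "$ref" in
    *@sha256:*) return 0 ;;   # pinned by digest
    *:latest)   return 1 ;;   # explicit, but a moving tag
    *:*)        return 0 ;;   # explicit version tag
    *)          return 1 ;;   # no tag: Docker falls back to "latest"
  esac
}
```

A script that keeps the image name in a variable (say, PANDOC_IMAGE, also a made-up name) could call is_pinned "$PANDOC_IMAGE" || exit 1 before its first docker run.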
Getting Started: Your First Docker Pandoc Conversion
Let's start with the basics. You have a Markdown file called README.md. You want to convert it to HTML. Here's all it takes:
docker run --rm -v "$(pwd):/data" pandoc/core README.md -o README.html
That's it. No local Pandoc installation and no dependencies beyond Docker itself. The Docker image contains everything Pandoc needs to run. The -v flag mounts your current directory at /data, the image's working directory, so the container can read and write your files.
Want PDF output instead? Use the latex variant:
docker run --rm -v "$(pwd):/data" pandoc/latex README.md -o README.pdf
And DOCX for your manager who refuses to use anything else?
docker run --rm -v "$(pwd):/data" pandoc/core README.md -o README.docx
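The three commands above differ only in the image name, and that follows from the output extension. A small wrapper can make the choice for you. What follows is a sketch rather than anything shipped with Pandoc: the function names image_for_output and pandoc_docker are hypothetical, and it assumes PDF is the only output that needs the pandoc/latex image. It also passes --user so that, on Linux hosts, the generated files belong to you rather than to root.

```shell
# Choose the Pandoc image from the output file's extension.
# Assumption: only PDF output needs LaTeX; everything else uses pandoc/core.
image_for_output() {
  case "${1##*.}" in
    pdf) echo "pandoc/latex" ;;
    *)   echo "pandoc/core" ;;
  esac
}

# Usage: pandoc_docker INPUT OUTPUT [extra pandoc options...]
pandoc_docker() {
  input=$1
  output=$2
  shift 2
  docker run --rm \
    --user "$(id -u):$(id -g)" \
    -v "$(pwd):/data" \
    "$(image_for_output "$output")" \
    "$input" -o "$output" "$@"
}
```

With the functions loaded into your shell, pandoc_docker README.md README.pdf --toc runs the LaTeX image, while pandoc_docker README.md README.docx runs the smaller core image.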
Building a Publication Pipeline
Converting one file manually is nice. But what about when you have an entire documentation folder? That's where a publication pipeline shines. Let me show you how to build one that generates all your formats automatically.
Step 1: Create a Batch Conversion Script
Create a script that converts all Markdown files in your docs folder to multiple formats:
#!/bin/bash
# Convert every Markdown file in docs/ to HTML, PDF, and DOCX.
set -euo pipefail   # stop at the first failed conversion
shopt -s nullglob   # skip the loop if docs/ contains no .md files

mkdir -p output/pdf output/html output/docx

for file in docs/*.md; do
  filename=$(basename "$file" .md)
  echo "Converting: $filename.md"

  # Convert to HTML
  docker run --rm -v "$(pwd):/data" pandoc/core "docs/$filename.md" -o "output/html/$filename.html" --standalone --toc

  # Convert to PDF
  docker run --rm -v "$(pwd):/data" pandoc/latex "docs/$filename.md" -o "output/pdf/$filename.pdf" --pdf-engine=xelatex --toc

  # Convert to DOCX
  docker run --rm -v "$(pwd):/data" pandoc/core "docs/$filename.md" -o "output/docx/$filename.docx"
done

echo "All conversions complete!"
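The script reconverts every document on every run and starts three containers per file, which gets slow as the docs folder grows. One option is to skip outputs that are already newer than their source. This is a minimal sketch, assuming file modification times reflect real edits; needs_rebuild and convert_html_if_stale are hypothetical helper names, and the -nt comparison is supported by bash and most sh implementations.

```shell
# Succeed when OUTPUT is missing or older than SOURCE.
# Usage: needs_rebuild SOURCE OUTPUT
needs_rebuild() {
  [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Convert docs/NAME.md to HTML only when the existing output is stale.
# Usage: convert_html_if_stale NAME
convert_html_if_stale() {
  if needs_rebuild "docs/$1.md" "output/html/$1.html"; then
    docker run --rm -v "$(pwd):/data" pandoc/core \
      "docs/$1.md" -o "output/html/$1.html" --standalone --toc
  else
    echo "Up to date: $1.html"
  fi
}
```

Inside the loop, the HTML step would become convert_html_if_stale "$filename"; the PDF and DOCX steps can be guarded the same way.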
Step 2: Create a Makefile for Easy Commands
Makefiles make your pipeline even more user-friendly:
# Recipe lines must be indented with a tab character, not spaces.
.PHONY: all html pdf docx clean

all: html pdf docx

html:
	@mkdir -p output/html
	@for f in docs/*.md; do filename=$$(basename "$$f" .md); docker run --rm -v "$$(pwd):/data" pandoc/core "$$f" -o "output/html/$$filename.html" --standalone --toc || exit 1; done

pdf:
	@mkdir -p output/pdf
	@for f in docs/*.md; do filename=$$(basename "$$f" .md); docker run --rm -v "$$(pwd):/data" pandoc/latex "$$f" -o "output/pdf/$$filename.pdf" --pdf-engine=xelatex --toc || exit 1; done

docx:
	@mkdir -p output/docx
	@for f in docs/*.md; do filename=$$(basename "$$f" .md); docker run --rm -v "$$(pwd):/data" pandoc/core "$$f" -o "output/docx/$$filename.docx" || exit 1; done

clean:
	rm -rf output
Now everyone on your team can run simple commands: make html, make pdf, or make all. No one needs to remember the Docker commands or the directory structure.
CI/CD Integration: Documentation on Autopilot
The real power comes when you integrate your documentation pipeline with CI/CD. Every time someone pushes code, your documentation regenerates automatically. No one forgets. No one misses an update. The documentation is always in sync with the code.
GitHub Actions Workflow
Here's a GitHub Actions workflow that generates documentation on every push:
name: Build Documentation
on:
  push:
    branches:
      - main
    paths:
      - 'docs/**'
      - '.github/workflows/docs.yml'
jobs:
  build-docs:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Build HTML documentation
        run: |
          mkdir -p output/html
          for file in docs/*.md; do
            filename=$(basename "$file" .md)
            docker run --rm -v "${{ github.workspace }}:/data" pandoc/core "$file" -o "output/html/$filename.html" --standalone --toc
          done
      - name: Build PDF documentation
        run: |
          mkdir -p output/pdf
          for file in docs/*.md; do
            filename=$(basename "$file" .md)
            docker run --rm -v "${{ github.workspace }}:/data" pandoc/latex "$file" -o "output/pdf/$filename.pdf" --pdf-engine=xelatex --toc
          done
      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: documentation
          path: output/
This workflow runs on every push to main that changes a file under docs/ or the workflow file itself. It converts each Markdown file to HTML and PDF, then uploads the results as a single artifact named documentation. You can add further steps to deploy the HTML to your website or attach the PDFs to releases.
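Before uploading, it can be worth asserting that each source file actually has a non-empty output in every expected format, so a misconfigured path fails loudly instead of shipping an incomplete artifact. The following is a sketch, assuming the docs/ and output/<format>/ layout used in this post; check_outputs is a hypothetical function name.

```shell
# Report expected output files that are missing or empty.
# Returns 1 if any are, 0 otherwise.
# Usage: check_outputs DOCS_DIR OUTPUT_DIR FORMAT...
check_outputs() {
  docs_dir=$1
  out_dir=$2
  shift 2
  missing=0
  for src in "$docs_dir"/*.md; do
    [ -e "$src" ] || continue          # glob matched nothing
    name=$(basename "$src" .md)
    for fmt in "$@"; do
      target="$out_dir/$fmt/$name.$fmt"
      if [ ! -s "$target" ]; then      # -s: file exists and is not empty
        echo "missing or empty: $target"
        missing=1
      fi
    done
  done
  return "$missing"
}
```

Called as check_outputs docs output html pdf in a final workflow step, a non-zero exit status fails the job before anything is published.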
Advanced: Custom Docker Images for Enterprise
The official Pandoc images work for most cases, but sometimes you need custom LaTeX packages, proprietary fonts, or specialized templates. That's when you build your own Docker image.
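FROM pandoc/latex
# Add custom LaTeX packages. The two names below are placeholders:
# replace them with the packages your templates actually need.
RUN tlmgr install fonts-my-company package-custom
# Copy custom templates
COPY templates/ /templates/
# Set the working directory, where documents are mounted at run time
WORKDIR /data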
Now your entire organization uses the same documentation generation environment. The same templates. The same fonts. The same packages. Consistent documentation, no matter who writes it or where they work.
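Building and using such an image could look like the sketch below. The image tag mycompany/pandoc and the template file name report.latex are placeholders invented for this example; only the /templates path comes from the Dockerfile above. The --template option is Pandoc's standard way to select a template file.

```shell
# Placeholder image tag; override by setting IMAGE in the environment.
IMAGE=${IMAGE:-mycompany/pandoc}

# Build the image from the Dockerfile in the current directory.
build_image() {
  docker build -t "$IMAGE" .
}

# Convert a Markdown file to PDF with a template baked into the image.
# Usage: convert_with_template INPUT.md OUTPUT.pdf
convert_with_template() {
  docker run --rm -v "$(pwd):/data" "$IMAGE" \
    "$1" -o "$2" \
    --template=/templates/report.latex --pdf-engine=xelatex
}
```

Run build_image once (or in CI whenever the Dockerfile changes), then call convert_with_template docs/guide.md output/pdf/guide.pdf for each document.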
Frequently Asked Questions
What formats can Pandoc convert between?
Pandoc supports over 40 formats including Markdown, HTML, LaTeX, EPUB, DOCX, ODT, RST, AsciiDoc, Textile, and many more. Most conversions take a single command. A few formats, PDF among them, are output-only.
Do I need LaTeX installed for PDF output?
By default, yes: Pandoc produces PDF through a LaTeX engine, although other engines can be selected with --pdf-engine. However, using the pandoc/latex Docker image means you don't install LaTeX on your machine; it's all contained in the Docker image. This avoids a multi-gigabyte local LaTeX installation and eliminates version conflicts.
How do I integrate Pandoc with GitHub Actions?
Use Docker containers directly in your GitHub Actions workflow. The example in this blog shows a complete workflow that checks out code, runs Docker-based Pandoc conversions, and uploads the results as artifacts. You can customize it to deploy HTML to websites or attach PDFs to releases.
Can I use custom templates with Pandoc?
Absolutely. Pandoc supports custom templates for HTML, PDF, DOCX, and other formats. You can create your own template or modify existing ones. For Docker-based workflows, copy your templates into the container or mount them as volumes.
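For the volume approach, a sketch along these lines mounts a local templates/ folder read-only alongside the document folder. The function name and the template file name page.html are assumptions made for illustration; --template is Pandoc's standard option for selecting a template file.

```shell
# Convert Markdown to standalone HTML using a template mounted from the host.
# Mounts ./templates read-only at /templates inside the container.
# Usage: convert_with_mounted_template INPUT.md OUTPUT.html
convert_with_mounted_template() {
  docker run --rm \
    -v "$(pwd):/data" \
    -v "$(pwd)/templates:/templates:ro" \
    pandoc/core "$1" -o "$2" \
    --standalone --template=/templates/page.html
}
```

A starting point for page.html can be exported from Pandoc's built-in default with docker run --rm pandoc/core -D html > templates/page.html and then edited.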
What's the difference between pandoc/core and pandoc/latex?
pandoc/core contains Pandoc without LaTeX. It can output HTML, EPUB, DOCX, and other formats that don't require LaTeX. pandoc/latex builds on pandoc/core and adds a minimal LaTeX installation, enabling PDF output; further LaTeX packages can be added with tlmgr. Choose based on your output needs.
