Transform documentation sites
into AI-ready markdown
Most AIs can't navigate documentation like humans. DocFetch converts entire documentation websites into clean, single-file markdown with intelligent llm.txt indexing.
$ npm install -g doc-fetch
$ doc-fetch --url https://react.dev/learn --output docs.md --llm-txt
Why DocFetch?
AI/LLM Optimized
Single-file consumption with clean, structured markdown. Perfect token efficiency for LLM context windows.
LLM.txt Indexing
Intelligent semantic categorization. Your AI agents know whether they're reading an API reference or a tutorial.
One Command
Replace hours of manual copy-pasting with a single CLI command. Concurrent fetching with configurable depth.
Clean Extraction
Automatically strips navigation, headers, footers, ads, and buttons. Only the content matters.
Smart Classification
Automatic page classification: APIs, guides, references, examples. With semantic descriptions.
Production Ready
Respects robots.txt, applies rate limiting, and runs cross-platform. Multiple installation options.
Installation
Usage
Basic Usage
# Fetch entire documentation site
doc-fetch --url https://golang.org/doc/ --output ./docs/golang-full.md
# With LLM.txt generation for AI optimization
doc-fetch --url https://react.dev/learn --output docs.md --llm-txt
Advanced Usage
doc-fetch \
--url https://docs.example.com \
--depth 4 \
--concurrent 10 \
--llm-txt \
--user-agent "MyBot/1.0"
Command Options
| Flag | Short | Description | Default |
|---|---|---|---|
| --url | -u | Base URL to fetch documentation from | Required |
| --output | -o | Output file path | docs.md |
| --depth | -d | Maximum crawl depth | 2 |
| --concurrent | -c | Number of concurrent fetchers | 3 |
| --llm-txt | | Generate AI-friendly llm.txt index | false |
| --user-agent | | Custom user agent string | DocFetch/1.0 |
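For scripted use, the flags in the table can be assembled programmatically. The following is a minimal Python sketch, not part of DocFetch itself: the helper name `build_doc_fetch_command` is hypothetical, and it assumes `doc-fetch` is installed and on your PATH. It uses only the flags and defaults listed above.

```python
import shlex


def build_doc_fetch_command(url, output="docs.md", depth=2, concurrent=3,
                            llm_txt=False, user_agent=None):
    """Assemble a doc-fetch argument list from the documented flags.

    Hypothetical helper: defaults mirror the Command Options table.
    """
    if not url:
        raise ValueError("--url is required")
    cmd = [
        "doc-fetch",
        "--url", url,
        "--output", output,
        "--depth", str(depth),
        "--concurrent", str(concurrent),
    ]
    if llm_txt:
        # Boolean flag: present means enabled.
        cmd.append("--llm-txt")
    if user_agent is not None:
        # Omitted when None so the tool's own default applies.
        cmd += ["--user-agent", user_agent]
    return cmd


cmd = build_doc_fetch_command(
    "https://docs.example.com",
    depth=4,
    concurrent=10,
    llm_txt=True,
    user_agent="MyBot/1.0",
)
# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))
```

To actually run the command, the list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems.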
Real-World Examples
Go Documentation
doc-fetch --url https://golang.org/doc/ \
--output ./docs/go-documentation.md \
--depth 4 --llm-txt
Complete Go documentation with language spec, tutorials, and API references.
React Documentation
doc-fetch --url https://react.dev/learn \
--output ./docs/react-learn.md \
--concurrent 10 --llm-txt
React learn section with all tutorials and guides in one file.
Your Project Docs
doc-fetch --url https://your-project.com/docs/ \
--output ./internal/docs.md \
--llm-txt
Fetch your own project's documentation for AI agent training.
How It Works
Link Discovery
Parses the base URL to find all internal documentation links
Content Fetching
Downloads all pages concurrently with respect for robots.txt
HTML Cleaning
Removes non-content elements (navigation, headers, footers)
Markdown Conversion
Converts cleaned HTML to structured markdown
Classification
Categorizes pages as API, GUIDE, REFERENCE, or EXAMPLE
Output Generation
Combines all docs into one file + generates llm.txt index
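The first step above, link discovery, can be illustrated with a short sketch. This is not DocFetch's actual implementation, which is not shown here; it is one plausible approach in Python using only the standard library. It assumes that "internal documentation links" means links on the same host and under the base URL's path, and the function name `discover_internal_links` is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def discover_internal_links(base_url, html):
    """Return sorted absolute URLs on the same host and under the base path.

    Assumption: same host plus same path prefix counts as "internal".
    """
    base = urlparse(base_url)
    collector = _LinkCollector()
    collector.feed(html)
    found = set()
    for href in collector.hrefs:
        # Resolve relative links and drop #fragments so pages are not duplicated.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if (parsed.scheme in ("http", "https")
                and parsed.netloc == base.netloc
                and parsed.path.startswith(base.path)):
            found.add(absolute)
    return sorted(found)


page = """
<a href="/docs/intro">Intro</a>
<a href="guide/setup#install">Setup</a>
<a href="https://other.example.org/x">External</a>
<a href="/blog/post">Blog</a>
<a href="mailto:team@example.com">Mail</a>
"""
for link in discover_internal_links("https://docs.example.com/docs/", page):
    print(link)
```

A crawler would apply this to each fetched page and stop following links once the configured depth is reached.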
How llm.txt Supercharges Your AI
Without DocFetch
"What does the net/http package do?"
Your AI has no idea. It needs to guess or ask you to provide context.
With DocFetch
"Check the [API] net/http section in llm.txt"
Your AI knows exactly where to look: "HTTP client/server implementation"
Example llm.txt Output
# llm.txt - AI-friendly documentation index
[GUIDE] Getting Started
https://golang.org/doc/install
Covers installation, setup, and first program.
[REFERENCE] Language Specification
https://golang.org/ref/spec
Complete Go language specification and syntax.
[API] net/http
https://pkg.go.dev/net/http
HTTP client/server implementation.
Stop wasting time copying documentation
Start building AI agents with complete knowledge.
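To show how an agent might consume the index, here is a minimal Python sketch that parses the llm.txt layout from the example output above. It assumes each entry is exactly three lines (a `[CATEGORY] Title` line, a URL, and a description) and that lines starting with `#` are comments; the real file format may differ, and the names `IndexEntry` and `parse_llm_txt` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class IndexEntry:
    category: str
    title: str
    url: str
    description: str


# Matches header lines such as "[API] net/http".
_HEADER = re.compile(r"^\[([A-Z]+)\]\s+(.+)$")


def parse_llm_txt(text):
    """Parse three-line entries: header, URL, description (assumed layout)."""
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and "#" comment lines.
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    entries = []
    i = 0
    while i < len(lines):
        match = _HEADER.match(lines[i])
        if match and i + 2 < len(lines):
            entries.append(
                IndexEntry(match.group(1), match.group(2), lines[i + 1], lines[i + 2])
            )
            i += 3
        else:
            # Skip lines that do not start a complete entry.
            i += 1
    return entries


sample = """\
# llm.txt - AI-friendly documentation index
[GUIDE] Getting Started
https://golang.org/doc/install
Covers installation, setup, and first program.
[REFERENCE] Language Specification
https://golang.org/ref/spec
Complete Go language specification and syntax.
[API] net/http
https://pkg.go.dev/net/http
HTTP client/server implementation.
"""

entries = parse_llm_txt(sample)
api_entries = [e for e in entries if e.category == "API"]
print(api_entries[0].title, api_entries[0].url)
```

Filtering by category, as in the last lines, is what lets an agent jump straight to the `[API] net/http` entry instead of scanning the whole file.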