Universal web scraper with LLM-ready markdown, RAG chunking, PDF/DOCX support.
npx -y thecrawler
Add this server entry to the mcpServers object in your Claude Desktop config, then restart the app.
{
  "mcpServers": {
    "io-github-manchittlab-thecrawler": {
      "command": "npx",
      "args": [
        "-y",
        "thecrawler"
      ]
    }
  }
}
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
No remote HTTP endpoint is advertised. Use the package or stdio setup shown in Install.
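If you would rather script the config change than edit the file by hand, the merge could look roughly like the sketch below. The server name and entry are copied from the config above; the helper names (`default_config_path`, `merge_server_entry`, `install`) are hypothetical, and treating every non-Windows platform as macOS is an assumption.

```python
import json
import os
import sys
from pathlib import Path

# Entry copied from the listing's config snippet.
SERVER_NAME = "io-github-manchittlab-thecrawler"
SERVER_ENTRY = {"command": "npx", "args": ["-y", "thecrawler"]}


def default_config_path() -> Path:
    """Return the Claude Desktop config path for Windows or macOS."""
    if sys.platform == "win32":
        return Path(os.environ["APPDATA"]) / "Claude" / "claude_desktop_config.json"
    # Assumption: any non-Windows platform uses the macOS location.
    return (Path.home() / "Library" / "Application Support"
            / "Claude" / "claude_desktop_config.json")


def merge_server_entry(config: dict) -> dict:
    """Return a copy of config with the TheCrawler entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[SERVER_NAME] = SERVER_ENTRY  # other servers are left untouched
    merged["mcpServers"] = servers
    return merged


def install(path: Path) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_server_entry(existing), indent=2),
                    encoding="utf-8")
```

Calling `install(default_config_path())` would apply the change; Claude Desktop still needs a restart afterwards. Merging into a copy keeps any servers and settings already present in the file.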
TheCrawler is an MCP server that provides a universal web scraper with LLM-ready markdown output, RAG chunking, and PDF/DOCX support. It supports STDIO transport.
Use the generated config in Install. This server runs with npx -y thecrawler; add any required environment variables before starting your client.
Choose the Claude Desktop tab in Install and copy the config for npx -y thecrawler. Add required environment variables before starting Claude Desktop.
Choose the Claude Code tab in Install and copy the config for npx -y thecrawler. Add required environment variables before starting Claude Code.
Choose the Codex tab in Install and copy the config for npx -y thecrawler. Add required environment variables before starting Codex.
Choose the Cursor or VS Code tab in Install and copy the config for npx -y thecrawler. Add required environment variables before starting Cursor or VS Code.
TheCrawler uses STDIO transport. Use the package or command config in Install.
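To make the STDIO transport concrete: an MCP client starts the server as a child process and exchanges newline-delimited JSON-RPC 2.0 messages over its stdin and stdout. The sketch below shows that handshake in minimal form. The protocol version string and the client identity are assumptions for illustration, and nothing here is confirmed against TheCrawler's actual behavior.

```python
import json
import subprocess

# Command taken from the listing. On Windows the executable may need
# to be "npx.cmd" when launched without a shell.
SERVER_COMMAND = ["npx", "-y", "thecrawler"]


def build_initialize_request(request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `initialize` request as sent by MCP clients."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumption: the server accepts this protocol revision.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }


def frame(message: dict) -> bytes:
    """Encode one message as a single newline-terminated line of UTF-8 JSON."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def handshake(command: list[str] = SERVER_COMMAND) -> dict:
    """Spawn the server, send `initialize`, and return the first reply line."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    try:
        proc.stdin.write(frame(build_initialize_request()))
        proc.stdin.flush()
        # Raises json.JSONDecodeError if the server exits without replying.
        return json.loads(proc.stdout.readline())
    finally:
        proc.terminate()
        proc.wait()
```

A real client would follow a successful reply with an `notifications/initialized` notification and keep the process alive; this sketch stops after the first response to stay short.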
TheCrawler's inventory of tools, resources, and prompts is listed when the MCP endpoint exposes them. Some servers require auth first.
TheCrawler does not advertise a verified auth requirement. If discovery fails, it may still need provider login, an API key, a bearer token, or a session header.
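Discovery in MCP means sending the `tools/list`, `resources/list`, and `prompts/list` requests after initialization and inspecting the replies. The sketch below builds those requests and summarizes each reply, which can help tell an empty inventory apart from a failure. The helper names are hypothetical, and the listing does not say which of these methods TheCrawler implements.

```python
import json

# List methods defined by MCP for enumerating what a server exposes.
DISCOVERY_METHODS = ("tools/list", "resources/list", "prompts/list")


def build_discovery_requests(first_id: int = 2) -> list[dict]:
    """Build one JSON-RPC 2.0 request per discovery method, with unique ids."""
    return [
        {"jsonrpc": "2.0", "id": first_id + offset, "method": method}
        for offset, method in enumerate(DISCOVERY_METHODS)
    ]


def summarize(response: dict) -> str:
    """Describe one reply: item count on success, error details on failure."""
    if "error" in response:
        err = response["error"]
        return f"failed ({err.get('code')}): {err.get('message')}"
    result = response.get("result", {})
    # Each list method returns its items under a key named after the kind.
    for key in ("tools", "resources", "prompts"):
        if key in result:
            return f"{len(result[key])} {key}"
    return "empty result"


def to_lines(requests: list[dict]) -> str:
    """Serialize requests as newline-delimited JSON, one message per line."""
    return "".join(json.dumps(req) + "\n" for req in requests)
```

A reply carrying an `error` object for every method is the case where a missing login, API key, or token is worth investigating; an error for only one method more likely means the server does not offer that capability.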
| Package | Registry | Version | Inputs |
|---|---|---|---|
| thecrawler (stdio) | npm | 0.1.1 | None advertised |