Your first crawl
When you create a bot, MimicBot automatically enqueues a Temporal workflow that crawls the `clientUrl` domain, extracts content, chunks it for retrieval, and indexes it. You don't need to trigger anything yourself.
Status transitions
A bot moves through three states during its first crawl:
| Status | Meaning |
|---|---|
| `draft` | Bot created, crawl not started yet. Usually lasts under 5 seconds. |
| `indexing` | Crawl in progress. Depending on site size, can take from seconds to tens of minutes. |
| `ready` | Crawl finished and content is searchable. The widget will now answer questions. |
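A client that needs to wait for the first crawl can poll the bot's status until it reaches `ready`. The sketch below is a minimal illustration, not MimicBot client code: `wait_until_ready` and its parameters are hypothetical names, and `get_status` is a callable you supply, because how the status is read is not covered in this section. The timeout and polling interval defaults are arbitrary choices.

```python
import time
from typing import Callable

# The three statuses listed in the table above.
KNOWN_STATUSES = ("draft", "indexing", "ready")


def wait_until_ready(
    get_status: Callable[[], str],
    timeout_s: float = 1800.0,   # assumption: 30 minutes covers "tens of minutes"
    interval_s: float = 5.0,     # assumption: arbitrary polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it returns "ready" or the timeout expires.

    get_status is hypothetical: supply any callable that returns the bot's
    current status string.
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status not in KNOWN_STATUSES:
            raise ValueError(f"unexpected bot status: {status!r}")
        if status == "ready":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"bot still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to exercise without real waiting.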
How the crawl works
- `botIndexingWorkflow` in `worker-ts` orchestrates the crawl.
- HTML fetching and action extraction are delegated to a Python activity in `worker-py`.
- Each discovered page becomes a row in the `pages` table with a chunked vector index.
- Legacy `bot_actions` (forms, newsletter signups) are discovered with status `review`; you activate them manually later.
Trigger a manual re-crawl
If you update your site content and want the bot to see the changes, re-crawl manually:
curl -X POST https://api.mimicbot.app/api/bots/{botId}/crawl \
-H "Authorization: Bearer $MIMICBOT_TOKEN"
The response is `202 Accepted`; the crawl runs asynchronously.
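The same request can be made from a script. The following is a rough Python equivalent of the curl call using only the standard library; the function names `build_recrawl_request` and `trigger_recrawl` are hypothetical, and the 30-second timeout is an arbitrary choice. Only the URL, method, and `Authorization` header come from the command above.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.mimicbot.app"


def build_recrawl_request(bot_id: str, token: str) -> urllib.request.Request:
    """Build the POST request for a manual re-crawl (hypothetical helper)."""
    # Percent-encode the bot ID so it is safe to place in the URL path.
    safe_id = urllib.parse.quote(bot_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/api/bots/{safe_id}/crawl",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def trigger_recrawl(bot_id: str, token: str) -> int:
    """Send the request and return the HTTP status code (202 when accepted)."""
    request = build_recrawl_request(bot_id, token)
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


# Example usage (performs a real network call):
#   import os
#   status = trigger_recrawl("your-bot-id", os.environ["MIMICBOT_TOKEN"])
```

A 202 status only confirms that the crawl was queued, not that it has finished.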
Limits
- `crawlConfig.maxPages`: defaults to 500, maximum 5000.
- `crawlConfig.respectRobotsTxt`: defaults to true. Pages disallowed by `robots.txt` are skipped.
- Scheduled crawls (`daily`, `weekly`): not yet wired. Manual re-crawl is the only supported mode today.
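To make the two configurable limits concrete, here is a small local sketch of what they imply. It is an illustration under stated assumptions, not the crawler's implementation: `effective_max_pages`, `filter_allowed`, and the `ExampleCrawler` user agent are hypothetical, and rejecting out-of-range values is a choice made here because the API's behavior for such values is not described above. The robots.txt check uses Python's standard `urllib.robotparser`.

```python
from typing import Iterable, List, Optional
from urllib.robotparser import RobotFileParser

DEFAULT_MAX_PAGES = 500    # documented default for crawlConfig.maxPages
MAX_PAGES_CEILING = 5000   # documented maximum


def effective_max_pages(requested: Optional[int] = None) -> int:
    """Return the page limit to use, falling back to the default."""
    if requested is None:
        return DEFAULT_MAX_PAGES
    # Assumption: out-of-range values are rejected locally.
    if not 1 <= requested <= MAX_PAGES_CEILING:
        raise ValueError(f"maxPages must be between 1 and {MAX_PAGES_CEILING}")
    return requested


def filter_allowed(
    urls: Iterable[str],
    robots_txt: str,
    respect_robots_txt: bool = True,   # mirrors the documented default
    user_agent: str = "ExampleCrawler",  # hypothetical user agent
) -> List[str]:
    """Drop URLs that the given robots.txt body disallows."""
    if not respect_robots_txt:
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]
```

Running `filter_allowed` against your own `robots.txt` is a quick way to estimate which pages a crawl with `respectRobotsTxt` enabled would skip.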