
Crawls API

A crawl job represents one execution of MimicBot's indexing workflow for a bot. Jobs start in three situations, each recorded in the job's trigger field: automatically when a new source is added (source_added), manually via the Trigger a crawl endpoint below (manual), and on a schedule when the bot's crawlConfig.schedule is set to daily or weekly (scheduled). Each job runs as a Temporal workflow. The workflowId returned from the trigger endpoint is the Temporal workflow id, exposed on the job object as temporalWorkflowId, which you can use to correlate logs.

Endpoints

POST /api/bots/{botId}/crawl: Trigger a crawl
GET /api/bots/{botId}/crawl-jobs: List crawl jobs

The crawl job object

| Field | Type | Description |
| --- | --- | --- |
| id | string (uuid) | Internal job id. |
| botId | string (uuid) | Owning bot. |
| sourceId | string \| null | Source that triggered the job, if scoped to one. null for multi-source manual recrawls. |
| temporalWorkflowId | string | Temporal workflow id for log correlation. |
| trigger | "manual" \| "scheduled" \| "source_added" | What kicked off this run. |
| status | "queued" \| "running" \| "completed" \| "failed" \| "cancelled" \| "degraded" | Lifecycle. |
| pagesDiscovered | number | Pages the crawler found. |
| pagesCrawled | number | Pages successfully fetched and indexed. |
| pagesFailed | number | Pages that errored during fetch or parse. |
| pagesSkippedUnchanged | number | Pages whose content hash matched and were not re-embedded. |
| startedAt | string | ISO 8601 start timestamp. |
| finishedAt | string \| null | ISO 8601 completion timestamp, or null if still running. |
| error | string \| null | Failure reason when status is failed. |
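Clients that consume this object may want a typed model that rejects unexpected values early. The following is a minimal Python sketch, not an official SDK: the CrawlJob class, the parse_crawl_job helper, and the snake_case attribute names are hypothetical, while the JSON field names and allowed values are taken from the field list above.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

# Allowed values copied from the crawl job field list.
TRIGGERS = frozenset({"manual", "scheduled", "source_added"})
STATUSES = frozenset(
    {"queued", "running", "completed", "failed", "cancelled", "degraded"}
)


@dataclass(frozen=True)
class CrawlJob:
    """Hypothetical client-side model of one crawl job object."""

    id: str
    bot_id: str
    source_id: Optional[str]
    temporal_workflow_id: str
    trigger: str
    status: str
    pages_discovered: int
    pages_crawled: int
    pages_failed: int
    pages_skipped_unchanged: int
    started_at: str
    finished_at: Optional[str]
    error: Optional[str]


def parse_crawl_job(raw: Mapping[str, Any]) -> CrawlJob:
    """Convert one decoded JSON job object into a CrawlJob.

    Raises ValueError for a trigger or status outside the documented sets,
    and KeyError if a documented field is missing.
    """
    if raw["trigger"] not in TRIGGERS:
        raise ValueError(f"unknown trigger: {raw['trigger']!r}")
    if raw["status"] not in STATUSES:
        raise ValueError(f"unknown status: {raw['status']!r}")
    return CrawlJob(
        id=raw["id"],
        bot_id=raw["botId"],
        source_id=raw["sourceId"],
        temporal_workflow_id=raw["temporalWorkflowId"],
        trigger=raw["trigger"],
        status=raw["status"],
        pages_discovered=raw["pagesDiscovered"],
        pages_crawled=raw["pagesCrawled"],
        pages_failed=raw["pagesFailed"],
        pages_skipped_unchanged=raw["pagesSkippedUnchanged"],
        started_at=raw["startedAt"],
        finished_at=raw["finishedAt"],
        error=raw["error"],
    )
```

Timestamps are kept as strings here so the sketch makes no assumption about the exact ISO 8601 variant the API emits.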

Trigger a crawl

POST /api/bots/{botId}/crawl

Starts a manual full recrawl across every source attached to the bot. The request returns immediately with the new job id and its Temporal workflow id; the crawl itself runs asynchronously, so poll the List crawl jobs endpoint for progress.

Request

curl -X POST "https://api.mimicbot.app/api/bots/$BOT_ID/crawl" \
  -H "Authorization: Bearer $MIMICBOT_TOKEN"

Response 202

{
  "jobId": "8f3b...",
  "workflowId": "bot-9c2e-crawl-1712998400000"
}

Errors: 401 UNAUTHENTICATED, 403 NO_AGENCY, 404 NOT_FOUND, 429 RATE_LIMITED (see Rate limits), 503 TEMPORAL_UNAVAILABLE. See Errors.
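The same request can be issued from Python using only the standard library. This is a sketch under stated assumptions rather than an official client: build_trigger_request and trigger_crawl are hypothetical helper names, the host is the one shown in the curl example, and treating 429 and 503 as worth retrying is an assumption, since this page does not specify retry behavior.

```python
import json
import urllib.error
import urllib.request

API_BASE = "https://api.mimicbot.app"  # host taken from the curl example


def build_trigger_request(bot_id: str, token: str) -> urllib.request.Request:
    """Build the POST request for a manual recrawl without sending it."""
    return urllib.request.Request(
        f"{API_BASE}/api/bots/{bot_id}/crawl",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def trigger_crawl(bot_id: str, token: str) -> dict:
    """Send the request and return the decoded body (jobId and workflowId)."""
    request = build_trigger_request(bot_id, token)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        # Assumption: 429 RATE_LIMITED and 503 TEMPORAL_UNAVAILABLE may succeed
        # on a later attempt; the other documented errors will not.
        retryable = exc.code in (429, 503)
        raise RuntimeError(
            f"crawl trigger failed: HTTP {exc.code} (retryable={retryable})"
        ) from exc
```

Splitting request construction from sending keeps the URL and header logic checkable without network access.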

List crawl jobs

GET /api/bots/{botId}/crawl-jobs

Returns up to the 20 most recent crawl jobs for a bot, ordered by startedAt descending.

Request

curl "https://api.mimicbot.app/api/bots/$BOT_ID/crawl-jobs" \
  -H "Authorization: Bearer $MIMICBOT_TOKEN"

Response 200

{
  "jobs": [
    {
      "id": "8f3b...",
      "botId": "9c2e...",
      "sourceId": null,
      "temporalWorkflowId": "bot-9c2e-crawl-1712998400000",
      "trigger": "manual",
      "status": "running",
      "pagesDiscovered": 42,
      "pagesCrawled": 12,
      "pagesFailed": 0,
      "pagesSkippedUnchanged": 0,
      "startedAt": "2026-04-13T10:00:00.000Z",
      "finishedAt": null,
      "error": null
    }
  ]
}

Errors: 401 UNAUTHENTICATED, 403 NO_AGENCY, 404 NOT_FOUND, 429 RATE_LIMITED (see Rate limits). See Errors.
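Because the trigger endpoint only returns a job id, tracking a crawl to completion means polling this list and looking for that id. The sketch below shows one possible polling loop in Python. Several parts are assumptions: wait_for_job, find_job, and the injected fetch_jobs callable are hypothetical names, the polling interval and limit are arbitrary, and "degraded" is treated as a final status even though this page lists it without defining it. Since the list holds at most 20 jobs, an older job can drop out of it, which the sketch reports as a LookupError.

```python
import time
from typing import Any, Callable, Mapping, Optional, Sequence

Job = Mapping[str, Any]

# Assumption: these statuses are final. "degraded" appears in the status enum
# but its meaning is not documented here, so adjust this set if needed.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled", "degraded"})


def find_job(jobs: Sequence[Job], job_id: str) -> Optional[Job]:
    """Return the job with the given id from a "jobs" array, or None."""
    for job in jobs:
        if job["id"] == job_id:
            return job
    return None


def wait_for_job(
    fetch_jobs: Callable[[], Sequence[Job]],
    job_id: str,
    interval_s: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Job:
    """Poll until the job reaches a terminal status and return it.

    fetch_jobs is a caller-supplied function that performs the GET request
    and returns the decoded "jobs" array.
    """
    for attempt in range(max_polls):
        job = find_job(fetch_jobs(), job_id)
        if job is None:
            # Not among the most recent jobs returned by the endpoint.
            raise LookupError(f"job {job_id} not in the crawl job list")
        if job["status"] in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_s)
    raise TimeoutError(f"job {job_id} still not finished after {max_polls} polls")
```

Passing fetch_jobs and sleep as parameters keeps the loop independent of any particular HTTP client and lets the interval respect whatever rate limit applies.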