OptSensBot
OptSensBot is the website scanner of the OptSens consent management platform. It visits websites whose owners added them to OptSens, detects the cookies, scripts and trackers those sites set, and turns the results into the site's cookie declaration and compliance checks.
What is OptSensBot
OptSensBot only visits customer-authorized domains. A scan runs when a site owner adds their domain to OptSens, starts a scan from the dashboard, or has scheduled rescans enabled on their plan. OptSensBot does not crawl the open web, does not follow links to other domains and does not index content.
The same agent performs three jobs:
| Job | What it does |
|---|---|
| Cookie scan | Loads a limited set of pages and records which cookies, scripts, iframes and trackers appear |
| Implementation check | Confirms the OptSens banner is installed and consent signals fire correctly |
| Consent Mode v2 check | Confirms Google Consent Mode v2 defaults and updates are set correctly |
User agent
OptSensBot renders pages with a real browser engine. Its user agent contains a standard browser signature plus the OptSensBot product token and a link to this page:
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; compatible; OptSensBot/1.0; +https://docs.optsens.com/bot) Chrome/120.0.0.0 Safari/537.36
Match on the token when filtering:
OptSensBot/1.0
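As a minimal sketch of token matching in a log filter or middleware, the Python below looks for the product token in a User-Agent header. The function name and the version-agnostic pattern are illustrative choices, not part of OptSens. Match the literal OptSensBot/1.0 instead if you want to pin the version.

```python
import re
from typing import Optional

# Matches the product token and captures its version number.
# The leading word boundary avoids matching tokens such as "NotOptSensBot/1.0".
OPTSENSBOT_TOKEN = re.compile(r"\bOptSensBot/(\d+(?:\.\d+)*)")


def optsensbot_version(user_agent: Optional[str]) -> Optional[str]:
    """Return the OptSensBot version found in a User-Agent header, or None."""
    match = OPTSENSBOT_TOKEN.search(user_agent or "")
    return match.group(1) if match else None
```

A match only means the request claims to be OptSensBot. The user agent alone does not prove the sender.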
Verify that a request is really from OptSensBot
A user-agent string can be spoofed. Use one of these signals to verify the sender.
Web Bot Auth (signed requests)
Every OptSensBot request is cryptographically signed using HTTP Message
Signatures (RFC 9421) with an Ed25519 key. The request carries
Signature, Signature-Input and Signature-Agent headers, and the
public key is published in our signed key directory:
https://docs.optsens.com/.well-known/http-message-signatures-directory
Cloudflare and other providers that support Web Bot Auth verify these signatures automatically. If a request claims to be OptSensBot but its signature does not validate against our key directory, it is not from us.
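If your provider does not verify Web Bot Auth for you, a cheap pre-check can at least reject requests that claim to be OptSensBot but carry no signature headers at all. The sketch below is not signature verification: it does not fetch the key directory or check the Ed25519 signature, which requires an RFC 9421 implementation. The function name is hypothetical, and the expectation that the Signature-Agent value references docs.optsens.com is an assumption to confirm against real traffic.

```python
from typing import Mapping

REQUIRED_HEADERS = ("signature", "signature-input", "signature-agent")

# Assumption: the Signature-Agent value references the host that serves
# the key directory. Confirm this against real OptSensBot requests.
EXPECTED_AGENT_HOST = "docs.optsens.com"


def has_optsensbot_signature_headers(headers: Mapping[str, str]) -> bool:
    """Cheap pre-check only: this does NOT validate the signature."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # All three Web Bot Auth headers must be present and non-empty.
    if any(not lowered.get(name) for name in REQUIRED_HEADERS):
        return False
    return EXPECTED_AGENT_HOST in lowered["signature-agent"].lower()
```

A request that fails this check can be treated as unverified. A request that passes still needs real signature validation before it is trusted.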
Published IP addresses
The current scanner addresses are published in a machine-readable list:
https://docs.optsens.com/optsensbot.json
The list is kept current and may change, so fetch it fresh rather than copying values into long-lived rules. Prefer Web Bot Auth where available: IP addresses can change; signatures do not.
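The following Python sketch shows one way to fetch the list on demand and test a client address against it. The layout of optsensbot.json is not described on this page, so the "prefixes" key holding CIDR strings is purely an assumption, and the function names are hypothetical. Adjust parse_prefixes to the real file.

```python
import ipaddress
import json
from urllib.request import urlopen

IP_LIST_URL = "https://docs.optsens.com/optsensbot.json"


def parse_prefixes(document: dict) -> list:
    """Read CIDR strings from the ASSUMED {"prefixes": [...]} layout."""
    return [
        ipaddress.ip_network(prefix, strict=False)
        for prefix in document.get("prefixes", [])
    ]


def fetch_prefixes(url: str = IP_LIST_URL, timeout: float = 10.0) -> list:
    """Download the list fresh instead of hard-coding addresses."""
    with urlopen(url, timeout=timeout) as response:
        return parse_prefixes(json.load(response))


def is_listed(client_ip: str, networks: list) -> bool:
    """Return True if client_ip falls inside any of the given networks."""
    address = ipaddress.ip_address(client_ip)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(address in network for network in networks)
```

In practice you would cache the result of fetch_prefixes for a short period and refresh it on a schedule, rather than downloading the list on every request.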
What OptSensBot collects and why
During a scan, OptSensBot records:
- Cookies set while loading the scanned pages, with name, provider, expiry and type
- Third-party scripts, iframes and tracking requests that appear
- The URLs of the scanned pages
This data becomes the domain's cookie declaration and compliance reports inside the site owner's OptSens account. OptSensBot does not submit forms, does not create accounts, does not download files and does not store page content beyond the detection results.
Scans and your analytics
Detection requires executing your pages like a real browser, and a scan
can therefore register a few pageviews in your analytics tool. To keep
reports clean: in GA4, define internal traffic by the addresses from
optsensbot.json (Admin > Data streams > Configure tag settings >
Define internal traffic) and filter it out. In tools that filter by
user agent, exclude OptSensBot.
Crawl behavior
- Pages render in a real Chromium browser with JavaScript executed. Cookies set by scripts and cookies set by your server, including HttpOnly cookies, are both detected.
- Scans start from the homepage and the sitemap of the registered domain.
- A scan covers a bounded number of pages, set by the page budget of the site's OptSens plan, runs once and then stops. There is no continuous crawling.
- Scan frequency follows the domain's plan: a scan at onboarding, manual scans from the dashboard, and scheduled rescans. Results appear in the dashboard when the scan completes. Business plans can also subscribe to the scan.completed webhook.
- OptSensBot reads robots.txt (RFC 9309) and honors Disallow rules for the OptSensBot user agent, and Crawl-delay up to 10 seconds.
- Requests are throttled and a scan never produces meaningful load on the scanned site.
Pages behind a login
OptSensBot scans publicly reachable pages only. Pages behind a login, basic-auth-protected staging environments and members-only areas are not scanned. If your consent setup lives behind authentication, write to support@optsens.com and we will look at your case.
Allow OptSensBot
If your site sits behind a WAF, a bot manager or a challenge page, OptSensBot may be blocked and the scan returns empty or incomplete results. Allow it with whichever option your setup supports:
Cloudflare
OptSensBot supports Cloudflare's bot verification through Web Bot Auth. If your scans still fail on a Cloudflare-protected site, add an allow rule:
- Open your Cloudflare dashboard and select the zone.
- Go to Security > WAF > Custom rules (or IP Access Rules).
- Allow requests where the user agent contains OptSensBot, or allow the addresses from optsensbot.json.
Other firewalls and bot managers
Allow the OptSensBot/1.0 user-agent token together with the published
IP list. Vendors that support HTTP Message Signatures can verify our
requests against the key directory instead.
robots.txt
OptSensBot honors robots.txt. To exclude parts of your site from
scanning:
User-agent: OptSensBot
Disallow: /private/
Blocking OptSensBot entirely stops cookie scans, the implementation check and the Consent Mode v2 check for your own OptSens account. Use targeted rules rather than a full block.
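To preview how a rule set like the one above applies to the OptSensBot user agent, you can run it through Python's standard urllib.robotparser. This is only a sketch: the standard-library parser approximates common robots.txt behavior and may differ in edge cases from OptSensBot's own RFC 9309 handling. The may_scan helper and the example paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: OptSensBot
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file as a list of lines, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())


def may_scan(path: str) -> bool:
    """Return True if the rules above allow OptSensBot to fetch this path."""
    return parser.can_fetch("OptSensBot", path)
```

With these rules, may_scan("/private/report.html") is False while may_scan("/pricing") is True, and user agents other than OptSensBot are unaffected.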
If a scan comes back empty or incomplete
The most common cause is a bot challenge: when Cloudflare Turnstile, reCAPTCHA or a WAF serves OptSensBot a challenge page instead of your site, the scan finds nothing or only the challenge provider's own resources. Add an allow rule as described above and run the scan again. Pages that fail to load within the scan window are skipped. The rest of the scan still completes.
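When checking your own edge or server logs for scan requests, a rough heuristic can flag responses that were probably a block or a challenge rather than your page. The sketch below is an assumption-laden illustration, not an OptSens tool: Cloudflare marks challenge pages with a cf-mitigated: challenge response header, while the listed status codes are only typical of WAF blocks and rate limits and will also match ordinary errors.

```python
from typing import Mapping

# Assumption: status codes commonly returned by WAFs, rate limiters and
# challenge pages. Legitimate error responses use them too.
SUSPECT_STATUSES = frozenset({403, 429, 503})


def looks_blocked_or_challenged(status: int, headers: Mapping[str, str]) -> bool:
    """Heuristic: did the scanner likely receive a block or challenge page?"""
    lowered = {name.lower(): value.strip().lower() for name, value in headers.items()}
    # Cloudflare sets this header on challenge page responses.
    if lowered.get("cf-mitigated") == "challenge":
        return True
    return status in SUSPECT_STATUSES
```

If responses to OptSensBot requests match this check, add an allow rule and run the scan again.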
Contact
Questions about OptSensBot, crawl rate or a request you want verified: support@optsens.com.