- Monthly quotas cap how many requests you can make against each pool over a billing period. Quotas reset on your billing anchor day (usually the first of the month).
- Rate limits cap how many requests per minute you can make against each tier. Rate limits slide in real time and are separate from monthly quotas — a single request counts against both.
Monthly Quota Pools (Beta)
Endpoints are grouped into quota pools. Each pool has its own monthly allowance. During beta, every customer is on the same plan with the following limits:

| Pool | Beta monthly quota | Rate limit | Endpoints |
|---|---|---|---|
| search | 100,000 | 1,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest, GET /v1/owners, GET /v1/attorneys, GET /v1/firms, GET /v1/entities, GET /v1/entities/{id}/trademarks |
| read | 500,000 | 10,000/min | GET /v1/trademarks/{id}, POST /v1/trademarks/batch, GET /v1/trademarks/{id}/history, GET /v1/trademarks/{id}/changes, GET /v1/trademarks/{id}/related, GET /v1/trademarks/{id}/proceedings, GET /v1/trademarks/{id}/coverage, GET /v1/trademarks/{id}/source, GET /v1/owners/{id}, GET /v1/owners/{id}/*, GET /v1/attorneys/{id}, GET /v1/attorneys/{id}/*, GET /v1/firms/{id}, GET /v1/firms/{id}/*, GET /v1/proceedings, GET /v1/proceedings/{id} |
| monitoring | unmetered (per-watch resource quota instead) | 100/min | /v1/watches/*, /v1/alerts/*, /v1/webhooks/* (except test + delivery reads, which are utility) |
| check | 500,000 (not yet itemized in /v1/organization/usage) | 1,000/min | POST /v1/goods-services/suggest |
| reference | unmetered | 1,000/min | GET /v1/offices, GET /v1/jurisdictions, GET /v1/classifications, GET /v1/design-codes, GET /v1/event-types, GET /v1/deadline-rules, GET /v1/opposition-rules |
| utility | unmetered | 1,000/min | GET /v1/organization/*, GET /health/*, GET /docs, GET /v1/openapi.json, /mcp/* |

Additional pools (screening, clearance, image_search, export) exist in plan configuration for forward-compatibility, but no endpoints are shipped against them yet. They are not returned by /v1/organization/usage or /v1/organization/plan until the corresponding endpoints ship, and they will appear in this table at the same time. The check pool is live: its one shipped endpoint is POST /v1/goods-services/suggest.

Pool Rules of Thumb
- List endpoints (returning many results) are search.
- Detail endpoints (returning one resource, or a batch of IDs) are read.
- Static taxonomies (offices, classifications, design codes) are reference and do not count against any monthly quota.
- Dashboard / health / docs are utility and do not count against any monthly quota.
To check your current usage, call GET /v1/organization/usage. Your plan limits are also included in every /v1/organization/usage response under by_endpoint_type[*].limit.
Rate Limit Tiers
Rate limits are per-minute sliding-window caps applied independently of monthly quotas. Each request is evaluated against two independent sliding windows: your plan’s per-endpoint-type rpm (shared by every endpoint in that pool, regardless of HTTP method) and the transport-tier safety ceiling (shared by every endpoint in that tier, regardless of pool). A request is allowed only when both have headroom; each allowed request spends both. The per-endpoint-type limit is the one you will normally hit. For the beta plan:

| Endpoint type | Beta limit |
|---|---|
| search | 1,000/min |
| read | 10,000/min |
| monitoring | 100/min |
| utility / reference / check | 1,000/min |

The transport-tier safety ceilings are:

| Tier | Ceiling | Endpoints |
|---|---|---|
| Reads (tier-1) | 10,000/min | All GET/HEAD/OPTIONS requests |
| Search (tier-2) | 10,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest, POST /v1/classifications/suggest, POST /v1/goods-services/suggest |
| Writes (tier-3) | 1,000/min | All POST/PATCH/DELETE not in Search tier |
| MCP | 1,000/min | /mcp endpoints |
POST /v1/trademarks/batch belongs to the read pool (10,000/min) but, being a POST, is also subject to the writes ceiling (1,000/min).
POST /v1/trademarks is the search endpoint (POST is used only to accept JSON filter bodies too large for a URL) — it counts against the search rate tier and the search monthly quota pool, not writes. POST /v1/trademarks/batch is a bulk-lookup read and counts against the writes tier + read monthly pool.
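
The two-window check described above can be modeled in a few lines. This is a minimal client-side sketch for reasoning about the behavior, not the server's implementation; the names `SlidingWindow` and `allow` are hypothetical, and the exact eviction rule at the window boundary is an assumption.

```python
from collections import deque


class SlidingWindow:
    """Counts requests in the trailing `window_sec` seconds (hypothetical model)."""

    def __init__(self, limit, window_sec=60.0):
        self.limit = limit
        self.window_sec = window_sec
        self._stamps = deque()  # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have slid out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_sec:
            self._stamps.popleft()

    def has_headroom(self, now):
        self._evict(now)
        return len(self._stamps) < self.limit

    def spend(self, now):
        self._stamps.append(now)


def allow(pool, tier, now):
    """Allow only when both windows have headroom; an allowed request spends both."""
    if pool.has_headroom(now) and tier.has_headroom(now):
        pool.spend(now)
        tier.spend(now)
        return True
    return False  # a denied request spends neither window
```

For POST /v1/trademarks/batch, `pool` would be the read window (10,000/min) and `tier` the writes ceiling (1,000/min), so in this model the writes ceiling is the one that binds.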
Per-plan rate limits (future, post-beta)
When pricing launches, each plan will also advertise a per-plan rpm enforced alongside the tier ceiling (two independent windows; the binding one is whichever you are closest to exhausting). Beta customers see no change; paid plans see their advertised rpm become the enforced cap.

| Plan (future) | read | search | monitoring | other |
|---|---|---|---|---|
| Free | 60/min | 60/min | 60/min | 60/min |
| Starter | 300/min | 300/min | 300/min | 300/min |
| Pro | 1,000/min | 1,000/min | 1,000/min | 1,000/min |
| Enterprise | 5,000/min | 5,000/min | 5,000/min | 5,000/min |
| Beta (current) | 10,000/min | 1,000/min | 100/min | 1,000/min |
These paid-plan numbers are provisional and will be tuned based on beta usage data before pricing launch. If you hit them in testing, tell us — that’s valuable signal.
Rate Limit Headers
API responses include IETF-standard rate limit headers so you can monitor your usage in real time. One exception: the CDN-cached reference-data routes (/v1/offices, /v1/jurisdictions, /v1/classifications, /v1/design-codes, /v1/event-types) omit the RateLimit-* headers — they carry per-organization values that must not leak into shared cache hits.
| Header | Format | Description |
|---|---|---|
| RateLimit-Policy | {limit};w={windowSec}, comma-separated list | All policies in effect: your plan’s per-endpoint-type limit first, the transport-tier ceiling second (a single policy is shown when the two coincide). E.g., 1000;w=60 means 1,000 requests per 60-second window. |
| RateLimit | remaining={N}, reset={seconds} | Remaining requests and seconds until reset for the binding policy, whichever you are closest to exhausting. |
| Retry-After | {seconds} | Only present on 429 responses. Number of seconds to wait before retrying. |
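
As an illustration of the header formats in the table, the following sketch parses `RateLimit-Policy` and `RateLimit` values. The function names are hypothetical and the sample strings are made up to match the documented formats; real responses may carry additional parameters that this sketch does not handle.

```python
def parse_policies(value):
    """Parse a RateLimit-Policy value into a list of (limit, window_sec) pairs."""
    policies = []
    for item in value.split(","):
        limit_part, _, params = item.strip().partition(";")
        window = None
        for param in params.split(";"):
            key, _, val = param.strip().partition("=")
            if key == "w":
                window = int(val)
        if window is None:
            raise ValueError(f"policy without a w= window: {item!r}")
        policies.append((int(limit_part), window))
    return policies


def parse_ratelimit(value):
    """Parse a RateLimit value such as 'remaining=42, reset=17' into a dict of ints."""
    fields = {}
    for item in value.split(","):
        key, _, val = item.strip().partition("=")
        fields[key] = int(val)
    return fields
```

For example, `parse_policies("100;w=60, 10000;w=60")` yields the plan limit first and the tier ceiling second, matching the order described above.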
Daily Sub-Caps
In addition to the monthly quota pool, each pool has a daily sub-cap set at 10% of the monthly limit. This is a hard cap for all plans (including paid plans that allow monthly overage); its purpose is to prevent a single client from exhausting an entire month’s allowance in minutes.

| Pool | Beta monthly | Beta daily |
|---|---|---|
| search | 100,000 | 10,000 |
| read | 500,000 | 50,000 |
| check | 500,000 | 50,000 |
Daily Quota Headers
Every metered response includes three additional headers alongside the existing rate-limit headers:

| Header | Format | Description |
|---|---|---|
| X-Quota-Daily-Limit | integer | Total daily units allowed for this pool. |
| X-Quota-Daily-Remaining | integer | Remaining daily units. |
| X-Quota-Daily-Reset | ISO 8601 timestamp | Next UTC midnight, when the daily counter resets. |

X-Quota-Limit, X-Quota-Remaining, and X-Quota-Reset headers continue to report the monthly counter.
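
A client can turn the daily headers into a simple status summary. The sketch below is one way to do that under stated assumptions: `daily_quota_status` is a hypothetical helper, the headers are passed as a plain dict with exactly this casing, and the reset timestamp carries a UTC offset or a trailing `Z`.

```python
from datetime import datetime, timezone
from typing import Optional


def daily_quota_status(headers: dict, now: datetime) -> Optional[dict]:
    """Summarize the X-Quota-Daily-* headers, or return None when they are absent."""
    if "X-Quota-Daily-Limit" not in headers:
        return None  # the response did not carry daily quota headers
    limit = int(headers["X-Quota-Daily-Limit"])
    remaining = int(headers["X-Quota-Daily-Remaining"])
    # Assumption: an ISO 8601 timestamp with an offset; map a trailing "Z" to +00:00.
    reset_at = datetime.fromisoformat(
        headers["X-Quota-Daily-Reset"].replace("Z", "+00:00")
    )
    return {
        "used_fraction": (limit - remaining) / limit if limit else 1.0,
        "seconds_until_reset": max(0.0, (reset_at - now).total_seconds()),
    }
```

Pass `datetime.now(timezone.utc)` as `now` in real use; taking it as a parameter keeps the function easy to test.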
X-Quota-* headers count quota units (per day / per month), not requests per minute; don’t confuse them with the RateLimit-* headers, which are the per-minute sliding windows. The X-Quota-* headers are also omitted on responses that don’t spend quota: unmetered pools (reference data, utility/dashboard routes) and 404s for unmatched routes (a 404 never decrements quota, so emitting a “remaining” value there would be misleading).

429 with quota_scope
If a request exceeds either cap, you receive a 429 with error.type = "quota_exceeded". The error.quota_scope field tells you which counter was breached: "monthly" or "daily". The error.type stays "quota_exceeded" regardless; switch on quota_scope if you need to distinguish.
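
Switching on `quota_scope` might look like the sketch below. It assumes the parsed JSON body nests `type` and `quota_scope` under an `error` object, as the dotted field names suggest; the function name `quota_breach_scope` is hypothetical.

```python
from typing import Optional


def quota_breach_scope(status: int, body: dict) -> Optional[str]:
    """Return 'monthly' or 'daily' for a quota 429, else None."""
    if status != 429:
        return None
    error = body.get("error") or {}
    if error.get("type") != "quota_exceeded":
        return None  # some other 429, such as the per-minute rate limit
    return error.get("quota_scope")
```

A caller could use the result to decide whether to wait for the daily reset reported in X-Quota-Daily-Reset or to stop until the monthly counter resets.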
Per-Endpoint Classification
Each endpoint falls into the tier determined by its HTTP method and path. Notable classifications:

| Endpoint | Tier | Beta limit | Notes |
|---|---|---|---|
| GET /v1/trademarks | Search | 1,000/min | The list/search endpoint. Routes to the search tier (POST shape is an alias). |
| POST /v1/trademarks | Search | 1,000/min | Body-shaped search: counts against search tier + search quota pool, not writes. |
| GET /v1/trademarks/suggest | Search | 1,000/min | Explicitly routed to the search tier. |
| GET /v1/suggest | Search | 1,000/min | Cross-entity suggest (search tier). |
| POST /v1/trademarks/batch | Writes | 1,000/min | Bulk-lookup read: one request per batch regardless of size. Exempt from Idempotency-Key. |
| POST /v1/organization/api-keys | Writes | 1,000/min | Mint an API key. Requires Idempotency-Key. |
| GET /v1/watches, GET /v1/alerts | Reads | 100/min | Monitoring endpoint type: the 100/min plan limit binds before the reads tier ceiling. |
| POST /v1/watches, POST /v1/webhooks | Writes | 100/min | Monitoring endpoint type: same 100/min pool as monitoring reads. |

429 Response
When you exceed your rate limit, the API returns a 429 Too Many Requests status with details about when you can retry. The Retry-After response header and the retry_after field in the body both contain the number of seconds to wait.
Monitoring Usage
Check your current billing period usage and rate limit status with Get Usage.

by_endpoint_type reports used and limit for every shipped metered endpoint type in the current billing period; today that is exactly search and read. Pools that exist in plan config but have no shipped endpoints (screening, clearance, image_search, export) do not appear here, and are not returned by /v1/organization/plan either, until their routes ship. check requests (POST /v1/goods-services/suggest) are quota-metered (watch the X-Quota-* headers on those responses) but not yet itemized in by_endpoint_type. A limit of null means unlimited; 0 means the endpoint type is not allowed on your plan.
rate_limits is the per-endpoint-type requests-per-minute matrix the limiter enforces. The top-level rate_limit field is deprecated — it only ever reflected the search-tier RPM (1,000/min on beta), which made it disagree with the RateLimit-Policy header on non-search endpoints; use rate_limits (or the response headers) instead.
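
The `null` and `0` semantics of `limit` are easy to get wrong in client code, so here is a small sketch of how they might be handled. It assumes `by_endpoint_type` is an object keyed by endpoint type whose values carry `used` and `limit`; the actual response shape is not shown here, and the helper names are hypothetical.

```python
from typing import Optional


def remaining_units(entry: dict) -> Optional[int]:
    """Units left this billing period; None means unlimited."""
    limit = entry.get("limit")
    if limit is None:
        return None  # null limit: unlimited
    if limit == 0:
        return 0  # endpoint type not allowed on this plan
    return max(0, limit - entry["used"])


def pools_over(usage: dict, threshold: float = 0.8) -> list:
    """Names of endpoint types whose used/limit ratio is at or above `threshold`."""
    flagged = []
    for name, entry in usage["by_endpoint_type"].items():
        limit = entry.get("limit")
        # Skip unlimited (None) and disallowed (0) types: no ratio to compute.
        if limit and entry["used"] / limit >= threshold:
            flagged.append(name)
    return flagged
```

Running `pools_over` on a usage response fetched once per job run, rather than in a loop, is usually enough to raise an early warning.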
Avoid polling /v1/organization/usage in a tight loop. It is classified utility: unmetered against any monthly quota, but subject to the utility per-minute rate limit (1,000/min on beta).

Handling 429 in Code
The Signa TypeScript SDK handles 429 responses automatically with built-in retry logic (see SDK Error Handling). If you are implementing your own retry logic, wait for the Retry-After duration and retry with exponential backoff.
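
For clients that do not use the SDK, a hand-rolled retry loop could look like the sketch below. It is transport-agnostic: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`. Reading `retry_after` from the top level of the body is an assumption, and taking the larger of the server hint and the backoff delay is one possible policy, not the SDK's documented behavior.

```python
import time


def wait_seconds(headers, body, attempt, base=1.0, cap=60.0):
    """Seconds to wait before the next try; `attempt` is 0-based."""
    backoff = min(cap, base * 2 ** attempt)  # 1s, 2s, 4s, ... capped at `cap`
    hinted = headers.get("Retry-After", body.get("retry_after"))
    if hinted is not None:
        # Never retry sooner than the server asked.
        return max(float(hinted), backoff)
    return backoff


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(wait_seconds(headers, body, attempt))
    return status, headers, body
```

Injecting `sleep` keeps the loop testable without real delays; production code would leave the default.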
Best Practices
Use batch endpoints to reduce request count
A single batch request of 100 IDs counts as 1 request against your rate limit, compared to 100 individual GET requests. See Batch Get Trademarks.
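
Splitting a long ID list into batch-sized groups is a small helper. In this sketch the default size of 100 is taken from the example above rather than from a documented maximum, and `batches` is a hypothetical name.

```python
from typing import Iterator, Sequence


def batches(ids: Sequence[str], size: int = 100) -> Iterator[list]:
    """Yield consecutive slices of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

With 250 IDs this produces three groups, so three batch requests replace 250 individual lookups.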
Cache responses with ETags
Use targeted lookups instead of frequent searches
If you are periodically checking for trademark status changes, use Trademark History on specific marks rather than re-running broad searches.
Spread requests evenly
Bursting 500 requests in the first second of a window is more likely to trigger rate limiting than spreading them evenly across the minute. If you need to process a large batch, add a small delay (50 to 100 ms) between requests.
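
The delay can be derived from the per-minute limit instead of being hard-coded. The sketch below shows the arithmetic; the `safety` factor (staying at 80% of the limit by default) is an illustrative choice, and both function names are hypothetical.

```python
def pacing_interval(requests_per_minute: int, safety: float = 0.8) -> float:
    """Seconds between requests that keeps throughput at `safety` times the limit."""
    return 60.0 / (requests_per_minute * safety)


def send_times(count: int, requests_per_minute: int, safety: float = 0.8) -> list:
    """Offsets in seconds from the start at which to send `count` evenly spaced requests."""
    gap = pacing_interval(requests_per_minute, safety)
    return [i * gap for i in range(count)]
```

For a 1,000/min limit the default interval works out to 75 ms, which falls inside the range suggested above.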
Use separate API keys per concern
If your application has both a user-facing dashboard and a background sync job, create separate API keys for each. This prevents a background job from exhausting the rate limit that your dashboard users depend on.