# Rate Limits

> Understand and work with Grantex API rate limits.

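Grantex enforces rate limits per client IP address over 1-minute sliding windows. The limits protect the service from abuse while leaving headroom for normal usage. There are no per-key limits: because every limit is keyed by IP, all clients that share an IP address (for example, behind the same NAT or proxy) draw from the same budget.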
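
To make the windowing behavior concrete, here is a minimal client-side model of a per-IP sliding-window counter. It is a sketch for reasoning about the limits, not Grantex's server implementation; the `SlidingWindowLimiter` class and its method names are hypothetical.

```python
from collections import deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Illustrative per-IP sliding-window counter (not Grantex's server code)."""

    def __init__(self, limit: int, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = {}

    def allow(self, ip: str, now: float) -> bool:
        """Record a request at time `now` and report whether it fits the budget."""
        hits = self._hits.setdefault(ip, deque())
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # the real API would answer 429 here
        hits.append(now)
        return True
```

In this model, capacity frees up one request at a time as earlier requests age past 60 seconds, rather than all at once at a fixed boundary.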

## Default Limits

| Endpoint                     | Limit       | Notes                                      |
| ---------------------------- | ----------- | ------------------------------------------ |
| **All endpoints** (global)   | 100 req/min | Default for every route                    |
| `POST /v1/authorize`         | 10 req/min  | Stricter — creates auth requests           |
| `POST /v1/token`             | 20 req/min  | Stricter — exchanges codes for tokens      |
| `POST /v1/token/refresh`     | 20 req/min  | Stricter — refreshes grant tokens          |
| `GET /.well-known/jwks.json` | **Exempt**  | Public key distribution is never throttled |

## Response Headers

Every response includes rate limit headers so your application can track its budget:

| Header                  | Description                                               |
| ----------------------- | --------------------------------------------------------- |
| `X-RateLimit-Limit`     | Maximum requests allowed in the current window            |
| `X-RateLimit-Remaining` | Requests remaining in the current window                  |
| `X-RateLimit-Reset`     | Unix timestamp (seconds) when the window resets           |
| `Retry-After`           | Seconds to wait before retrying (only on `429` responses) |
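
If you call the API without an SDK, the headers in the table can be read directly. The following is a minimal sketch; the `RateLimitInfo` dataclass and `parse_rate_limit` helper are hypothetical names, and the code assumes all header values are plain integers as described above.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                         # X-RateLimit-Limit
    remaining: int                     # X-RateLimit-Remaining
    reset: int                         # X-RateLimit-Reset (Unix seconds)
    retry_after: Optional[int] = None  # Retry-After (only on 429 responses)


def parse_rate_limit(headers: Mapping[str, str]) -> Optional[RateLimitInfo]:
    """Return the parsed headers, or None if they are missing or malformed."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        info = RateLimitInfo(
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset=int(lowered["x-ratelimit-reset"]),
        )
    except (KeyError, ValueError):
        return None
    retry_after = lowered.get("retry-after")
    if retry_after is not None and retry_after.isdigit():
        info.retry_after = int(retry_after)
    return info
```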

## 429 Error Response

When you exceed a rate limit, the API returns a `429 Too Many Requests` status with the following body:

```json theme={null}
{
  "message": "Rate limit exceeded, retry in 42 seconds",
  "code": "BAD_REQUEST",
  "requestId": "a1b2c3d4-e5f6-..."
}
```

The `Retry-After` header gives the wait time in seconds. Read it from the header rather than parsing the number out of `message`.
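
For clients that talk to the API over raw HTTP, the wait time can be derived from the response roughly as follows. This is a sketch under assumptions: `retry_delay_seconds` is a hypothetical helper, and both the fallback to `X-RateLimit-Reset` and the 1-second default are illustrative choices rather than documented Grantex behavior.

```python
import time
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    now: Optional[float] = None,
    default: float = 1.0,
) -> Optional[float]:
    """Seconds to wait after a 429, or None if the response was not throttled."""
    if status != 429:
        return None
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after", "")
    if retry_after.isdigit():
        return float(retry_after)
    # Assumed fallback: wait until the window reset time if Retry-After is absent.
    reset = lowered.get("x-ratelimit-reset", "")
    if reset.isdigit():
        current = time.time() if now is None else now
        return max(0.0, int(reset) - current)
    # Assumed last resort when neither header is usable.
    return default
```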

## Reading Rate Limits from SDKs

All three SDKs automatically parse rate limit headers from every response. You can read them via `client.lastRateLimit` (TypeScript), `client.last_rate_limit` (Python), or `client.LastRateLimit()` (Go).

### After a Successful Call

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { Grantex } from '@grantex/sdk';

  const client = new Grantex({ apiKey: 'gx_...' });
  const agents = await client.agents.list();

  const rl = client.lastRateLimit;
  if (rl) {
    console.log(`${rl.remaining}/${rl.limit} requests left, resets at ${rl.reset}`);
  }
  ```

  ```python Python theme={null}
  from grantex import Grantex

  client = Grantex(api_key="gx_...")
  agents = client.agents.list()

  rl = client.last_rate_limit
  if rl:
      print(f"{rl.remaining}/{rl.limit} requests left, resets at {rl.reset}")
  ```

  ```go Go theme={null}
  client := grantex.NewClient("gx_...")
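  agents, err := client.Agents.List(ctx) // ctx: a context.Context from the caller
  if err != nil {
      log.Fatal(err)
  }
  _ = agents // placeholder: use the result here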

  if rl := client.LastRateLimit(); rl != nil {
      fmt.Printf("%d/%d requests left, resets at %d\n", rl.Remaining, rl.Limit, rl.Reset)
  }
  ```
</CodeGroup>
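
The values read this way can also be used to pause before the budget runs out, instead of waiting for a `429`. The sketch below is one possible approach; `pause_before_next_call` is a hypothetical helper, not part of the SDK, and it relies only on the `remaining` and `reset` fields shown above.

```python
import time
from typing import Optional


def pause_before_next_call(
    remaining: int, reset: int, now: Optional[float] = None
) -> float:
    """Seconds to sleep before the next request (0.0 while budget remains)."""
    if remaining > 0:
        return 0.0
    current = time.time() if now is None else now
    # Never return a negative delay if the reset time has already passed.
    return max(0.0, reset - current)


# Possible usage with the Python SDK attribute shown above:
#   rl = client.last_rate_limit
#   if rl:
#       time.sleep(pause_before_next_call(rl.remaining, rl.reset))
```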

### Handling 429 Errors

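When a `429` is returned, the SDK raises its API error type with the rate limit info attached, including the retry delay as `retryAfter` (TypeScript), `retry_after` (Python), or `RetryAfter` (Go):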

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { GrantexApiError } from '@grantex/sdk';

  try {
    await client.tokens.verify(token);
  } catch (err) {
    if (err instanceof GrantexApiError && err.rateLimit?.retryAfter) {
      console.log(`Rate limited — retry in ${err.rateLimit.retryAfter}s`);
    }
  }
  ```

  ```python Python theme={null}
  from grantex import GrantexApiError

  try:
      client.tokens.verify(token)
  except GrantexApiError as err:
      if err.rate_limit and err.rate_limit.retry_after:
          print(f"Rate limited — retry in {err.rate_limit.retry_after}s")
  ```

  ```go Go theme={null}
  // err is the error returned by an earlier SDK call.
  var apiErr *grantex.APIError
  if errors.As(err, &apiErr) && apiErr.RateLimit != nil && apiErr.RateLimit.RetryAfter > 0 {
      fmt.Printf("Rate limited — retry in %ds\n", apiErr.RateLimit.RetryAfter)
  }
  ```
</CodeGroup>

## Retry Strategy

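Retry `429` responses a bounded number of times. Wait for the `Retry-After` value when the error carries one, fall back to exponential backoff (1 s, 2 s, 4 s, ...) when it does not, and add random jitter so that clients that hit the limit together do not all retry at the same instant (the thundering-herd problem).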

<CodeGroup>
  ```typescript TypeScript theme={null}
  import { GrantexApiError } from '@grantex/sdk';

  async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return await fn();
      } catch (err: unknown) {
        if (!(err instanceof GrantexApiError) || err.statusCode !== 429 || attempt === maxRetries) throw err;

        const base = (err.rateLimit?.retryAfter ?? 2 ** attempt) * 1000;
        const jitter = Math.random() * 500;
        await new Promise((r) => setTimeout(r, base + jitter));
      }
    }
    throw new Error('Unreachable');
  }
  ```

  ```python Python theme={null}
  import time
  import random
  from grantex import GrantexApiError

  def with_retry(fn, max_retries=3):
      for attempt in range(max_retries + 1):
          try:
              return fn()
          except GrantexApiError as err:
              if err.status_code != 429 or attempt == max_retries:
                  raise

              base = float(err.rate_limit.retry_after if err.rate_limit and err.rate_limit.retry_after else 2 ** attempt)
              jitter = random.uniform(0, 0.5)
              time.sleep(base + jitter)
  ```

  ```go Go theme={null}
  package main

  import (
  	"errors"
  	"math"
  	"math/rand"
  	"time"

  	grantex "github.com/mishrasanjeev/grantex-go"
  )

  func withRetry[T any](fn func() (T, error), maxRetries int) (T, error) {
  	var zero T
  	for attempt := 0; attempt <= maxRetries; attempt++ {
  		result, err := fn()
  		if err == nil {
  			return result, nil
  		}

  		var apiErr *grantex.APIError
  		if !errors.As(err, &apiErr) || apiErr.StatusCode != 429 || attempt == maxRetries {
  			return zero, err
  		}

  		base := math.Pow(2, float64(attempt))
  		if apiErr.RateLimit != nil && apiErr.RateLimit.RetryAfter > 0 {
  			base = float64(apiErr.RateLimit.RetryAfter)
  		}
  		jitter := rand.Float64() * 0.5
  		time.Sleep(time.Duration((base+jitter)*1000) * time.Millisecond)
  	}
  	return zero, errors.New("unreachable")
  }
  ```
</CodeGroup>

## Best Practices

<Note>
  The JWKS endpoint (`/.well-known/jwks.json`) is exempt from rate limits. Prefer [offline token verification](/concepts/grant-token) over online `POST /v1/tokens/verify` calls to avoid hitting limits entirely.
</Note>

* **Cache tokens** — Grant tokens are valid JWTs. Store and reuse them until they expire instead of requesting new ones per operation.
* **Use offline verification** — Call `verifyGrantToken()` with the JWKS URI to validate tokens locally. The JWKS endpoint is never rate-limited.
* **Use webhooks instead of polling** — Subscribe to [webhook events](/guides/webhooks) like `grant.created` and `grant.revoked` rather than polling grant or audit endpoints.
* **Honor `Retry-After`** — When you receive a `429`, always use the `Retry-After` header value as your minimum wait time.
* **Spread requests** — If your system makes burst requests (e.g., batch token exchanges), add short delays between calls.
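
As an illustration of spreading requests, the sketch below paces a batch so that it stays at or under a per-minute limit. `run_paced` is a hypothetical helper, not part of any Grantex SDK, and it assumes the batch is the only traffic from your IP address; other requests from the same IP count against the same budget.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_paced(
    items: Iterable[T],
    call: Callable[[T], R],
    limit_per_minute: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call `call(item)` for each item, spacing calls evenly across a minute."""
    interval = 60.0 / limit_per_minute  # e.g. 3.0 s for a 20 req/min route
    results: List[R] = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(interval)  # no delay is needed before the first call
        results.append(call(item))
    return results
```

The `sleep` parameter is injectable so the pacing logic can be tested without real delays.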

## Self-Hosted Deployments

If you're running the Grantex auth service yourself, rate limits are fully configurable. The global limit is set in `apps/auth-service/src/server.ts` and per-route limits are set in individual route files.

See the [Self-Hosting guide](/guides/self-hosting) for deployment instructions.
