Configuration
The SDK uses an immutable ClientConfig dataclass for all configuration. You can pass parameters directly to the constructor, use environment variables, or build a config object manually.
Constructor Parameters
from datablue import DataBlue
client = DataBlue(
    api_url="https://api.datablue.dev",  # Base URL (default: https://api.datablue.dev)
    api_key="wh_your_api_key",           # API key (wh_ prefix)
    timeout=120.0,                       # Request timeout in seconds (default: 60)
    max_retries=5,                       # Max retry attempts (default: 3)
)
ClientConfig Object
For advanced control, build a ClientConfig and pass it to the constructor:
from datablue import DataBlue, ClientConfig
config = ClientConfig(
    api_url="https://api.datablue.dev",
    api_key="wh_your_api_key",
    timeout=120.0,
    max_retries=5,
    backoff_factor=1.0,  # Multiplier for exponential backoff (default: 0.5)
)
client = DataBlue(config=config)
Config from Environment
from datablue import DataBlue, ClientConfig
# Build config from DATABLUE_* env vars
config = ClientConfig.from_env()
# Use with either client type
sync_client = DataBlue(config=config)
from datablue import AsyncDataBlue, ClientConfig
config = ClientConfig.from_env()
async_client = AsyncDataBlue(config=config)
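To make the environment lookup concrete, here is a minimal sketch of how a from_env-style constructor could work. EnvConfigSketch is a hypothetical stand-in, not the SDK's ClientConfig, and it reads only DATABLUE_API_URL and DATABLUE_API_KEY, the two variables named on this page. Which variables, if any, map to the other fields is not stated here, so the sketch leaves them out.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_API_URL = "https://api.datablue.dev"


@dataclass(frozen=True)
class EnvConfigSketch:
    """Hypothetical stand-in for ClientConfig; not the SDK's class."""

    api_url: str = DEFAULT_API_URL
    api_key: Optional[str] = None

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "EnvConfigSketch":
        # Read from os.environ unless a mapping is supplied (handy in tests).
        source = os.environ if env is None else env
        api_url = source.get("DATABLUE_API_URL", DEFAULT_API_URL)
        return cls(
            api_url=api_url.rstrip("/"),  # mirrors the documented trailing-slash stripping
            api_key=source.get("DATABLUE_API_KEY"),
        )


# Passing a mapping avoids touching the real process environment.
sketch_config = EnvConfigSketch.from_env(
    {"DATABLUE_API_URL": "https://staging.datablue.dev/", "DATABLUE_API_KEY": "wh_test_key"}
)
print(sketch_config)
```

Accepting an optional mapping keeps the lookup testable without modifying os.environ; whether the real from_env() offers such a parameter is not documented here.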
Cloning Configs
Configs are immutable (frozen dataclass). Use clone() to create modified copies for different environments:
from datablue import DataBlue, ClientConfig
# Base config
prod = ClientConfig(
    api_url="https://api.datablue.dev",
    api_key="wh_prod_key",
    timeout=60.0,
    max_retries=3,
)
# Derive staging config (inherits everything except overrides)
staging = prod.clone(
    api_url="https://staging.datablue.dev",
    api_key="wh_staging_key",
)
# Derive a fast config for time-sensitive operations
fast = prod.clone(timeout=10.0, max_retries=1)
# Use each
with DataBlue(config=prod) as client:
    result = client.scrape("https://example.com")
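The clone-on-frozen-dataclass pattern can be illustrated with the standard library alone. The sketch below uses a hypothetical ConfigSketch class that copies the documented field names and defaults; it is not the SDK's implementation, and using dataclasses.replace inside clone() is an assumption about how such a method might be written.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical stand-in for ClientConfig, using the documented fields."""

    api_url: str = "https://api.datablue.dev"
    api_key: Optional[str] = None
    timeout: float = 60.0
    max_retries: int = 3
    backoff_factor: float = 0.5

    def clone(self, **overrides: Any) -> "ConfigSketch":
        # dataclasses.replace builds a new instance; the original is untouched.
        return replace(self, **overrides)


base = ConfigSketch(api_key="wh_prod_key")
fast_sketch = base.clone(timeout=10.0, max_retries=1)

try:
    base.timeout = 10.0  # direct assignment is rejected on a frozen dataclass
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

print(fast_sketch.timeout, base.timeout, mutation_blocked)
```

Because every derived config is a separate object, a base config can be shared between clients without one caller's overrides leaking into another's.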
ClientConfig Fields
| Field | Type | Default | Description |
|---|---|---|---|
| `api_url` | `str` | `https://api.datablue.dev` | Base URL of the DataBlue API (trailing slash auto-stripped) |
| `api_key` | `str \| None` | `None` | API key with `wh_` prefix |
| `timeout` | `float` | `60.0` | HTTP request timeout in seconds |
| `max_retries` | `int` | `3` | Maximum retry attempts on transient errors (429, 5xx, connection errors) |
| `backoff_factor` | `float` | `0.5` | Multiplier for exponential backoff: delay = factor * 2^attempt |
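As a worked example of the backoff formula, the sketch below computes the delay before each retry from delay = factor * 2^attempt. The helper backoff_delays is hypothetical, and it assumes attempts are counted from 0; the SDK may count from 1, cap the delay, or add jitter, none of which is specified here.

```python
from typing import List


def backoff_delays(factor: float, max_retries: int) -> List[float]:
    """Return factor * 2**attempt for attempt = 0 .. max_retries - 1."""
    # Assumption: attempts are zero-based; adjust the range if the SDK counts from 1.
    return [factor * (2 ** attempt) for attempt in range(max_retries)]


# Documented defaults: backoff_factor=0.5, max_retries=3
default_schedule = backoff_delays(0.5, 3)   # [0.5, 1.0, 2.0]
# Values from the ClientConfig example: backoff_factor=1.0, max_retries=5
custom_schedule = backoff_delays(1.0, 5)    # [1.0, 2.0, 4.0, 8.0, 16.0]

print(default_schedule, sum(default_schedule))
print(custom_schedule, sum(custom_schedule))
```

Under these assumptions the defaults add at most 3.5 seconds of waiting across all retries, while backoff_factor=1.0 with five retries adds up to 31 seconds, which is worth weighing against the request timeout.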
Custom API URL
Point the SDK at a specific DataBlue API base URL by setting the api_url:
# Direct constructor
with DataBlue(
    api_url="https://api.datablue.dev",
    api_key="wh_your_api_key",
) as client:
    result = client.scrape("https://example.com")
# Or via environment variables
export DATABLUE_API_URL=https://api.datablue.dev
export DATABLUE_API_KEY=wh_your_api_key
from datablue import DataBlue
with DataBlue.from_env() as client:
    result = client.scrape("https://example.com")
    print(result.data.markdown)
Default URL: The SDK defaults to https://api.datablue.dev. Set the URL explicitly when using a private environment or local development server.
Complete API Reference (v2.0.0)
| Method | Description |
|---|---|
| `scrape(url, **opts)` | Scrape a single URL, returns `ScrapeResult` |
| `crawl(url, **opts)` | Crawl a site (blocking with polling), returns `CrawlStatus` |
| `start_crawl(url, **opts)` | Start crawl (non-blocking), returns `CrawlJob` |
| `get_crawl_status(job_id)` | Poll crawl status, returns `CrawlStatus` |
| `cancel_crawl(job_id)` | Cancel an in-progress crawl |
| `search(query, **opts)` | Search the web (blocking with polling), returns `SearchStatus` |
| `start_search(query, **opts)` | Start search (non-blocking), returns `SearchJob` |
| `get_search_status(job_id)` | Poll search status, returns `SearchStatus` |
| `map(url, **opts)` | Discover URLs on a site, returns `MapResult` |
| `login(email, password)` | Authenticate with email/password, stores JWT internally |
| `close()` | Close the HTTP connection pool |
| `from_env()` | Class method: create client from `DATABLUE_*` env vars |
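The table distinguishes blocking calls (crawl, search) from a non-blocking start call followed by status polling. The sketch below shows one way such a polling loop could look. poll_until is a hypothetical helper, not part of the SDK, and the fake status strings stand in for CrawlStatus, whose fields are not listed on this page.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch_status: Callable[[], T],
    is_finished: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> T:
    """Call fetch_status repeatedly until is_finished accepts the result."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if is_finished(status):
            return status
        # Give up if the next sleep would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(interval)


# Stand-in status source; plain strings replace the real status object.
fake_statuses = iter(["running", "running", "completed"])
final_status = poll_until(
    lambda: next(fake_statuses),
    lambda status: status == "completed",
    interval=0.01,
    timeout=5.0,
)
print(final_status)
```

With the SDK, fetch_status would presumably wrap client.get_crawl_status(job_id) and is_finished would inspect the returned CrawlStatus. How to read the job ID from a CrawlJob and which field marks completion are not documented here, so check the installed package types before adapting this.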
AI-Ready SDK Surface
The SDK packages expose typed request options, response models, and error classes that match the REST API examples on this page. Use the installed package types plus these docs when generating integration code with an AI assistant.
Why typed docs matter: AI coding assistants can hallucinate API calls when they lack accurate signatures. DataBlue keeps the examples and SDK types aligned with the live API shape so that generated code uses real parameter names and response fields.