Configuration

The SDK uses an immutable ClientConfig dataclass for all configuration. You can pass parameters directly to the constructor, use environment variables, or build a config object manually.

Constructor Parameters

from datablue import DataBlue

client = DataBlue(
    api_url="https://api.datablue.dev",   # Base URL (default: https://api.datablue.dev)
    api_key="wh_your_api_key",             # API key (wh_ prefix)
    timeout=120.0,                          # Request timeout in seconds (default: 60)
    max_retries=5,                          # Max retry attempts (default: 3)
)

ClientConfig Object

For advanced control, build a ClientConfig and pass it to the constructor:

from datablue import DataBlue, ClientConfig

config = ClientConfig(
    api_url="https://api.datablue.dev",
    api_key="wh_your_api_key",
    timeout=120.0,
    max_retries=5,
    backoff_factor=1.0,                  # Multiplier for exponential backoff (default: 0.5)
)

client = DataBlue(config=config)

Config from Environment

from datablue import AsyncDataBlue, ClientConfig, DataBlue

# Build config from DATABLUE_* env vars
config = ClientConfig.from_env()

# Use with either client type
sync_client = DataBlue(config=config)
async_client = AsyncDataBlue(config=config)
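
As a rough illustration of what loading a config from the environment might involve, the sketch below builds a frozen dataclass from DATABLUE_* variables. SketchConfig and config_from_env are hypothetical stand-ins, not the SDK's implementation. Only DATABLUE_API_URL and DATABLUE_API_KEY are named on this page; DATABLUE_TIMEOUT, DATABLUE_MAX_RETRIES, and DATABLUE_BACKOFF_FACTOR are assumed names used for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for ClientConfig, using the documented defaults."""
    api_url: str = "https://api.datablue.dev"
    api_key: Optional[str] = None
    timeout: float = 60.0
    max_retries: int = 3
    backoff_factor: float = 0.5


def config_from_env(env: Optional[Mapping[str, str]] = None) -> SketchConfig:
    """Build a config from DATABLUE_* variables, falling back to the defaults."""
    env = os.environ if env is None else env
    defaults = SketchConfig()
    return SketchConfig(
        api_url=env.get("DATABLUE_API_URL", defaults.api_url),
        api_key=env.get("DATABLUE_API_KEY"),
        # Assumed variable names: the docs only name the URL and key variables.
        timeout=float(env.get("DATABLUE_TIMEOUT", defaults.timeout)),
        max_retries=int(env.get("DATABLUE_MAX_RETRIES", defaults.max_retries)),
        backoff_factor=float(
            env.get("DATABLUE_BACKOFF_FACTOR", defaults.backoff_factor)
        ),
    )
```

Passing a plain dict instead of os.environ keeps the sketch easy to exercise in tests, for example config_from_env({"DATABLUE_API_KEY": "wh_test"}).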

Cloning Configs

Configs are immutable (frozen dataclass). Use clone() to create modified copies for different environments:

from datablue import DataBlue, ClientConfig

# Base config
prod = ClientConfig(
    api_url="https://api.datablue.dev",
    api_key="wh_prod_key",
    timeout=60.0,
    max_retries=3,
)

# Derive staging config (inherits everything except overrides)
staging = prod.clone(
    api_url="https://staging.datablue.dev",
    api_key="wh_staging_key",
)

# Derive a fast config for time-sensitive operations
fast = prod.clone(timeout=10.0, max_retries=1)

# Pass whichever config fits the job to the client
with DataBlue(config=prod) as client:
    result = client.scrape("https://example.com")
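
To see why cloning is the natural way to vary an immutable config, here is a minimal sketch of the pattern using only the standard library. SketchConfig is a hypothetical stand-in for ClientConfig, and implementing clone() with dataclasses.replace is an assumption about one reasonable approach, not a statement of how the SDK does it.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for ClientConfig."""
    api_url: str = "https://api.datablue.dev"
    api_key: Optional[str] = None
    timeout: float = 60.0
    max_retries: int = 3

    def clone(self, **overrides) -> "SketchConfig":
        # dataclasses.replace builds a new instance; the original is untouched.
        return replace(self, **overrides)


prod = SketchConfig(api_key="wh_prod_key")
fast = prod.clone(timeout=10.0, max_retries=1)

try:
    prod.timeout = 5.0  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
```

Because every derived config is a separate object, a client created from prod cannot be affected by later changes made for staging or fast.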

ClientConfig Fields

Field           Type        Default                   Description
api_url         str         https://api.datablue.dev  Base URL of the DataBlue API (trailing slash auto-stripped)
api_key         str | None  None                      API key with wh_ prefix
timeout         float       60.0                      HTTP request timeout in seconds
max_retries     int         3                         Maximum retry attempts on transient errors (429, 5xx, connection errors)
backoff_factor  float       0.5                       Multiplier for exponential backoff: delay = factor * 2^attempt
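
The backoff formula in the table can be worked through with a few lines of code. This sketch assumes attempts are numbered from 0 and that no jitter or maximum delay is applied; neither detail is stated on this page. backoff_delays is a hypothetical helper, not part of the SDK.

```python
from typing import List


def backoff_delays(backoff_factor: float = 0.5, max_retries: int = 3) -> List[float]:
    """Return the wait in seconds before each retry: factor * 2^attempt."""
    # Assumption: the first retry uses attempt = 0.
    return [backoff_factor * 2 ** attempt for attempt in range(max_retries)]


default_delays = backoff_delays()            # [0.5, 1.0, 2.0]
custom_delays = backoff_delays(1.0, 5)       # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Under these assumptions the defaults add at most 3.5 seconds of waiting across all retries, while backoff_factor=1.0 with max_retries=5 adds up to 31 seconds.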

Custom API URL

Point the SDK at a specific DataBlue API base URL by setting api_url:

# Direct constructor
from datablue import DataBlue

with DataBlue(
    api_url="https://api.datablue.dev",
    api_key="wh_your_api_key",
) as client:
    result = client.scrape("https://example.com")

# Or via environment variables
export DATABLUE_API_URL=https://api.datablue.dev
export DATABLUE_API_KEY=wh_your_api_key

from datablue import DataBlue

with DataBlue.from_env() as client:
    result = client.scrape("https://example.com")
    print(result.data.markdown)

Default URL: The SDK defaults to https://api.datablue.dev. Set the URL explicitly when using a private environment or local development server.

Complete API Reference (v2.0.0)

Method                        Description
scrape(url, **opts)           Scrape a single URL, returns ScrapeResult
crawl(url, **opts)            Crawl a site (blocking with polling), returns CrawlStatus
start_crawl(url, **opts)      Start crawl (non-blocking), returns CrawlJob
get_crawl_status(job_id)      Poll crawl status, returns CrawlStatus
cancel_crawl(job_id)          Cancel an in-progress crawl
search(query, **opts)         Search the web (blocking with polling), returns SearchStatus
start_search(query, **opts)   Start search (non-blocking), returns SearchJob
get_search_status(job_id)     Poll search status, returns SearchStatus
map(url, **opts)              Discover URLs on a site, returns MapResult
login(email, password)        Authenticate with email/password, stores JWT internally
close()                       Close the HTTP connection pool
from_env()                    Class method: create client from DATABLUE_* env vars
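
The non-blocking methods (start_crawl, start_search) pair with the status methods in a poll loop. The sketch below shows one generic way to write that loop. poll_until is a hypothetical helper, not an SDK function, and the attribute names in the commented usage (job.id, status.status) and the status values are assumptions, since this page does not list the fields of CrawlJob or CrawlStatus.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until is_done(result) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)


# Hypothetical usage with the SDK (attribute names and values are assumed):
# job = client.start_crawl("https://example.com")
# final = poll_until(
#     lambda: client.get_crawl_status(job.id),
#     lambda s: s.status in {"completed", "failed"},
# )
```

When custom polling is not needed, the blocking crawl() and search() methods already poll internally according to the table above.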

AI-Ready SDK Surface

The SDK packages expose typed request options, response models, and error classes that match the REST API examples on this page. Use the installed package types plus these docs when generating integration code with an AI assistant.

Why typed docs matter: AI coding assistants hallucinate API calls when they lack accurate signatures. DataBlue keeps the examples and SDK types aligned with the live API shape so generated code uses real parameter names and response fields.