The official Python SDK for the DataFlare API.
- Typed Models: Full Pydantic schemas that mirror the Datasets API, giving precise IDE autocompletion.
- Connection Pools: Optimized httpx subclass logic that reuses TCP connections across requests.
- Resilient Requests: Automated retries (tenacity) with exponential backoff for rate-limit responses and transient network faults.
- Idiomatic Paginators: client.datasets.stream(...) injects cursors automatically and yields results as you iterate.
- Memory-safe Source Retrieval: For pipelines feeding Large Language Models directly from data archives, download_file(...) writes raw bytes to the file system in chunks instead of loading whole files into memory.
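
To make the cursor-following pattern behind the Idiomatic Paginators feature concrete, here is a minimal, self-contained sketch. It is not the SDK's implementation: stream_pages, fake_fetch_page, and the (items, next_cursor) page shape are assumptions made for this illustration only, and the in-memory dictionary stands in for a real API.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# Assumed page shape for this sketch: a list of items plus the cursor of the
# next page, or None when there are no more pages.
Page = Tuple[List[Dict], Optional[str]]


def stream_pages(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict]:
    """Yield items one at a time, following cursors until none is returned."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            break


# Hypothetical in-memory backend standing in for a paginated API.
_FAKE_PAGES: Dict[Optional[str], Page] = {
    None: ([{"id": 1}, {"id": 2}], "c1"),
    "c1": ([{"id": 3}], "c2"),
    "c2": ([{"id": 4}, {"id": 5}], None),
}


def fake_fetch_page(cursor: Optional[str]) -> Page:
    """Return the page stored under the given cursor."""
    return _FAKE_PAGES[cursor]


ids = [doc["id"] for doc in stream_pages(fake_fetch_page)]
print(ids)
```

Because the generator requests the next page only after the current one has been consumed, a caller that stops iterating early never triggers the remaining requests.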

Install the SDK with pip:

    # Standard REST client
    pip install dataflare-sdk
    # Include gRPC support
    pip install "dataflare-sdk[grpc]"

You will need a DataFlare API key. The SDK provides two ways to configure it securely:

Set the DF_API_KEY environment variable (or load it from a local .env file using python-dotenv):

    export DF_API_KEY="dfk_abc123"

If you pull secrets from an external vault, pass the key directly to the constructor:

    from df import DFClient

    client = DFClient(api_key="dfk_your_secret_key...")

With the key configured, a typical REST session streams documents and downloads their source files:

    from df import DFClient, AuthenticationError

    # Automatically discovers DF_API_KEY from the environment
    try:
        with DFClient() as client:
            # The generator handles pagination transparently
            for doc in client.datasets.stream("legal", search_term="التأمين", limit=100):
                print(f"Doc category: {doc.category} | Title: {doc.title} | Summary: {doc.summary} | Decision: {doc.decision}")
                # Download the raw source file to disk
                if doc.source_url:
                    client.datasets.download_file(
                        doc.source_url,
                        destination=f"./archives/{doc.id}.pdf"
                    )
    except AuthenticationError:
        print("Invalid API Key.")

For environments requiring persistent connections and reduced latency, the SDK provides a dynamic gRPC client that works out of the box using Server Reflection.

    from df import DFGRPCClient, AuthenticationError

    # Requires the extra dependencies: pip install "dataflare-sdk[grpc]"
    try:
        with DFGRPCClient() as client:
            # Issue a unary RPC instead of a REST request
            results, next_cursor = client.datasets.query(
                dataset="legal",
                limit=10,
            )
            for doc in results:
                print(f"Title: {getattr(doc, 'title', None)} | Summary: {getattr(doc, 'summary', None)}")
    except AuthenticationError:
        print("Invalid API Key.")

License: MIT. See the root LICENSE file for full terms.

Note: The SDK is free and open source. DataFlare API access requires a paid subscription. See dataflare.com/developers.