
Conversation

@jfw-ppi (Contributor) commented Nov 18, 2025:

What this PR does / why we need it:

Added batching configuration for the feature server's /push endpoint for offline store writes.

Which issue(s) this PR fixes:

Fixes #5683

@jfw-ppi requested a review from a team as a code owner on November 18, 2025 18:44
@ntkathole changed the title from "feat: added batching to feature server /push to offline store (#5683)" to "feat: Added batching to feature server /push to offline store (#5683)" on Nov 19, 2025

from feast.repo_config import FeastConfigBaseModel

class OfflinePushBatchingConfig(FeastConfigBaseModel):

A reviewer commented:

I think having a config is fine, but do we actually need it? We probably could have just passed these as optional args, no?

@jfw-ppi (Contributor, Author) commented Nov 23, 2025:

You're right. I noticed that FeatureLoggingConfig has 5 fields and figured that adding a config for 3 fields would also be justified. Do you want me to refactor it to use optional args?

@jfw-ppi (Contributor, Author) commented Nov 26, 2025:

> I think having a config is fine, but do we actually need it? We probably could have just passed these as optional args, no?

I refactored it as you wanted, so that there is no config. @franciscojavierarceo
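
For context, here is a rough sketch of the two shapes discussed in this thread. The class and base model names come from the diff context above; the field names (enabled, batch_size, batch_interval_seconds) mirror the test helper later in this PR, and the defaults and the second class name are illustrative assumptions, not the merged code.

    from typing import Optional

    from feast.repo_config import FeastConfigBaseModel


    # Shape discussed first: a dedicated config model, analogous to FeatureLoggingConfig.
    class OfflinePushBatchingConfig(FeastConfigBaseModel):
        enabled: bool = False
        batch_size: int = 100
        batch_interval_seconds: int = 60


    # Shape after the refactor: plain optional fields on the feature server config
    # (illustrative field names; the merged names may differ).
    class FeatureServerConfigSketch(FeastConfigBaseModel):
        offline_push_batching_enabled: bool = False
        offline_push_batch_size: Optional[int] = None
        offline_push_batch_interval_seconds: Optional[int] = None

The thread above settled on the second shape, i.e. no dedicated config model.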

@jfw-ppi force-pushed the 5683-configurable-batching-feature-server-push branch 5 times, most recently from 494912d to 107bf09, on November 26, 2025 21:42
@jfw-ppi force-pushed the 5683-configurable-batching-feature-server-push branch 2 times, most recently from c980db9 to 04dc34f, on November 30, 2025 11:32
…dev#5683)

Signed-off-by: Jacob Weinhold <29459386+jfw-ppi@users.noreply.github.com>

fix: formatting, lint errors (feast-dev#5683)
Signed-off-by: Jacob Weinhold <29459386+jfw-ppi@users.noreply.github.com>
@ntkathole force-pushed the 5683-configurable-batching-feature-server-push branch from 20079eb to 1a3ccbd on December 15, 2025 17:00
@jfw-ppi force-pushed the 5683-configurable-batching-feature-server-push branch from a60c605 to bb299d9 on December 28, 2025 21:19
return

batch_df = pd.concat(dfs, ignore_index=True)
self._buffers[key].clear()

A reviewer (Member) commented:

Does it make sense to move the clear inside try: self._store.push, so that the buffer gets cleared only after the write succeeds?

@jfw-ppi (Contributor, Author) replied:

Totally makes sense; it's done. Thanks for catching that.
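
In other words, the flush roughly becomes the following. This is a minimal sketch assuming a dict of per-key DataFrame buffers, as in the diff context above; the push arguments are illustrative, not the exact call in this PR.

    import pandas as pd

    from feast import FeatureStore
    from feast.data_source import PushMode


    def flush_key(store: FeatureStore, buffers: dict, key: str) -> None:
        dfs = buffers.get(key, [])
        if not dfs:
            return
        batch_df = pd.concat(dfs, ignore_index=True)
        try:
            store.push(key, batch_df, to=PushMode.OFFLINE)
            # Clear only after the offline write succeeded, so a failed push
            # does not silently drop the buffered rows.
            buffers[key].clear()
        except Exception:
            # Keep the buffer intact; a later flush will retry these rows.
            raise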


# NOTE: offline writes are currently synchronous only, so we call directly
try:
    self._store.push(

A reviewer (Member) commented:

What about splitting _flush_locked into two methods: one that extracts data (with lock) and one that does I/O (without lock)?

Something like:

  • Extracting the batch data while holding the lock
  • Releasing the lock before doing I/O
  • Re-enqueueing data if the write fails

@jfw-ppi (Contributor, Author) replied:

Thanks for the suggestion!
I split _flush_locked into _drain_locked (extract under lock) and _flush (I/O without lock). I also added _inflight to prevent concurrent flushes per key. On failure the drained batch is re‑enqueued so we don’t drop data.
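
Roughly, the pattern described here looks like the sketch below. The method and attribute names (_drain_locked, _flush, _inflight) are taken from the comment above; the push call, the key structure, and the re-enqueue details are assumptions, not the exact merged code.

    import threading
    from typing import Dict, List, Optional

    import pandas as pd

    from feast.data_source import PushMode


    class BatcherLockingSketch:
        """Illustrative sketch of splitting drain (under lock) from I/O (outside the lock)."""

        def __init__(self, store):
            self._store = store
            self._lock = threading.Lock()
            self._buffers: Dict[str, List[pd.DataFrame]] = {}
            self._inflight: set = set()

        def _drain_locked(self, key: str) -> Optional[List[pd.DataFrame]]:
            # Caller must hold self._lock. Skip keys that are already being flushed.
            if key in self._inflight or not self._buffers.get(key):
                return None
            self._inflight.add(key)
            dfs = self._buffers[key]
            self._buffers[key] = []
            return dfs

        def _flush(self, key: str) -> None:
            with self._lock:
                dfs = self._drain_locked(key)
            if dfs is None:
                return
            batch_df = pd.concat(dfs, ignore_index=True)
            try:
                # I/O runs outside the lock so /push handlers can keep buffering.
                self._store.push(key, batch_df, to=PushMode.OFFLINE)
            except Exception:
                with self._lock:
                    # Re-enqueue the drained batch so a failed write is not dropped.
                    self._buffers.setdefault(key, []).insert(0, batch_df)
            finally:
                with self._lock:
                    self._inflight.discard(key)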


# use a multi-row payload to ensure we test non-trivial dfs
resp = client.post("/push", json=push_body_many(push_mode, count=2, id_start=100))
assert resp.status_code == 200

A reviewer (Member) commented:

Optional, but I think it's good to return 202 when batching is enabled and offline writes are involved.

@jfw-ppi (Contributor, Author) replied:

Nice catch! It's done.
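
For illustration, the resulting status-code split could look like this minimal FastAPI sketch. The handler, the helper names, and the payload shape are hypothetical stand-ins; the real /push endpoint in feature_server.py is more involved.

    from fastapi import FastAPI, Response, status

    app = FastAPI()


    def _offline_batching_enabled() -> bool:
        # Stand-in for reading the feature server's batching settings.
        return True


    def _buffer_offline_write(payload: dict) -> None:
        # Stand-in for handing the rows to the offline write batcher.
        pass


    @app.post("/push")
    def push(payload: dict, response: Response):
        # In the real endpoint this also depends on the requested push mode
        # (offline / online_and_offline), per the review comment above.
        if _offline_batching_enabled():
            _buffer_offline_write(payload)
            # The offline write is deferred to the batcher rather than performed
            # synchronously, so 202 Accepted signals "accepted but not yet written".
            response.status_code = status.HTTP_202_ACCEPTED
            return {}
        # Unbatched (synchronous) pushes keep returning 200 as before.
        return {}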

@franciscojavierarceo requested a review from Copilot and removed the review request from franciscojavierarceo on December 31, 2025 11:10

Copilot AI (Contributor) left a comment:

Pull request overview

This PR adds configurable batching support for offline writes to the feature server's /push endpoint. The batching mechanism buffers offline writes and flushes them based on either a size threshold or time interval, improving throughput for high-volume offline push operations.

Key Changes:

  • Introduced OfflineWriteBatcher class that manages buffered writes in a background thread (see the sketch after this list)
  • Added configuration options for batch size and interval in BaseFeatureServerConfig
  • Modified /push endpoint logic to separate online and offline writes when batching is enabled
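
The sketch below illustrates the size-or-interval flush behavior described above. It is an assumption about the general shape of OfflineWriteBatcher (constructor arguments, method names, and defaults are illustrative), not the actual implementation.

    import threading
    from typing import Callable, Dict, List

    import pandas as pd


    class IntervalBatcherSketch:
        """Buffers DataFrames per key and flushes when the buffered row count
        reaches batch_size or when batch_interval_seconds elapse."""

        def __init__(self, flush_fn: Callable[[str, pd.DataFrame], None],
                     batch_size: int = 100, batch_interval_seconds: int = 60):
            self._flush_fn = flush_fn
            self._batch_size = batch_size
            self._interval = batch_interval_seconds
            self._lock = threading.Lock()
            self._buffers: Dict[str, List[pd.DataFrame]] = {}
            self._stop = threading.Event()
            # Daemon thread so the feature server can shut down without joining it.
            self._thread = threading.Thread(target=self._run, daemon=True)
            self._thread.start()

        def add(self, key: str, df: pd.DataFrame) -> None:
            with self._lock:
                self._buffers.setdefault(key, []).append(df)
                buffered_rows = sum(len(d) for d in self._buffers[key])
            if buffered_rows >= self._batch_size:
                self._flush(key)  # size threshold reached: flush right away

        def _run(self) -> None:
            # Time-based path: flush everything buffered once per interval.
            while not self._stop.wait(self._interval):
                with self._lock:
                    keys = list(self._buffers)
                for key in keys:
                    self._flush(key)

        def _flush(self, key: str) -> None:
            with self._lock:
                dfs, self._buffers[key] = self._buffers.get(key, []), []
            if dfs:
                self._flush_fn(key, pd.concat(dfs, ignore_index=True))

        def stop(self) -> None:
            self._stop.set()

In the server, flush_fn would be wired to push the concatenated rows to the offline store (e.g. store.push(..., to=PushMode.OFFLINE)).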

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

  • sdk/python/feast/feature_server.py: Implemented OfflineWriteBatcher class and integrated batching logic into the /push endpoint
  • sdk/python/feast/infra/feature_servers/base_config.py: Added three new configuration fields for offline push batching
  • sdk/python/tests/unit/test_feature_server.py: Added comprehensive test coverage for batching behavior across different push modes and configurations
  • docs/reference/feature-store-yaml.md: Documented the new feature_server configuration block with batching options
  • docs/reference/feature-servers/python-feature-server.md: Added user-facing documentation explaining offline write batching functionality


allow_registry_cache = request.allow_registry_cache
transform_on_write = request.transform_on_write

# Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as theres no async for offline store

Copilot AI commented Dec 31, 2025:

Corrected spelling: 'theres' should be 'there's'.

Suggested change:
- # Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as theres no async for offline store
+ # Async currently only applies to online store writes (ONLINE / ONLINE_AND_OFFLINE paths) as there's no async for offline store

fs, enabled: bool = True, batch_size: int = 1, batch_interval_seconds: int = 60
):
"""
Attach a minimal feature_server.offline_push_batching config

Copilot AI commented Dec 31, 2025:

The docstring has inconsistent indentation. The closing triple quotes and the description line should be aligned with the opening triple quotes for standard formatting.

Suggested change
Attach a minimal feature_server.offline_push_batching config
Attach a minimal feature_server.offline_push_batching config

…-dev#5683](feast-dev#5683))

Signed-off-by: Jacob Weinhold <29459386+jfw-ppi@users.noreply.github.com>
@jfw-ppi force-pushed the 5683-configurable-batching-feature-server-push branch from a99a9b0 to 2747405 on January 1, 2026 15:28

@ntkathole (Member) left a comment:

lgtm

@franciscojavierarceo merged commit ce35ce6 into feast-dev:master on Jan 3, 2026
19 checks passed

Successfully merging this pull request may close these issues:

  • Feature Request: Add configurable batching for offline store writes in Feature Server push API