Merged
27 commits
8044220
add cache plan
ThomasWaldmann May 22, 2026
12f9d00
cache implementation
ThomasWaldmann May 22, 2026
408e890
cache: implement policy and max_age expiry
ThomasWaldmann May 25, 2026
9e21900
cache: rename "cache" mode to "writethrough"
ThomasWaldmann May 25, 2026
2a9065b
cache: add "size" to policy, close-time size-based LRU eviction
ThomasWaldmann May 25, 2026
106eba6
Apply latency/bandwidth emulation only to primary backend calls
ThomasWaldmann May 25, 2026
2a55e7d
optimize Store.find method
ThomasWaldmann May 25, 2026
5b0f99d
do not wrap find() with _backend_call
ThomasWaldmann May 26, 2026
c876027
log volume in _stats_updater
ThomasWaldmann May 26, 2026
9e74dba
document defrag and list cache behavior
ThomasWaldmann May 26, 2026
28744c6
remove plan
ThomasWaldmann May 26, 2026
5181562
implement cache_invalidate Store API method and unit tests
ThomasWaldmann May 26, 2026
9a1f0f9
tests: refactor to use public cache_invalidate API
ThomasWaldmann May 26, 2026
260312c
docs: document public cache_invalidate API and its use-cases
ThomasWaldmann May 26, 2026
e54883f
docs: re-format store_caching.rst to max 80 chars wide
ThomasWaldmann May 26, 2026
e39082d
caching: add notes to README and other docs
ThomasWaldmann May 26, 2026
fdd0853
docs: add notes about getting working atime on Linux/Windows
ThomasWaldmann May 27, 2026
d2eaa62
Optimize cache reads by removing strict read-time expiration check
ThomasWaldmann May 27, 2026
6b62e81
Change cache cleanup and eviction to both open- and close-time
ThomasWaldmann May 27, 2026
3dda854
Track cache load/store calls and volumes in Store class stats
ThomasWaldmann May 27, 2026
8562aa4
Replace internal primary backend wrappers in cache tests with cache s…
ThomasWaldmann May 27, 2026
902e6b3
Add cache_delete_calls and primary backend stats to Store and simplif…
ThomasWaldmann May 27, 2026
a848040
Add backend_delete_calls statistics to Store
ThomasWaldmann May 27, 2026
e5f9988
Rearrange stats ordering in Store class and caching documentation
ThomasWaldmann May 27, 2026
c221888
Store: rename cache methods
ThomasWaldmann May 27, 2026
940228a
Store: reorder methods
ThomasWaldmann May 27, 2026
cd67805
tests: simplify test_open_cleans_up_expired_cache_items with cache_de…
ThomasWaldmann May 27, 2026
3 changes: 3 additions & 0 deletions README.rst
@@ -24,6 +24,9 @@ Store features
- supports URLs, like `file:///srv/borgstore` or `https://myserver/path`
- easy to use, high-level `Store` API: create/destroy, open/close, list,
  load/store, delete, move, soft delete/undelete, hash, defrag, ...
- uses a backend to implement the storage
- optionally uses an additional caching backend, with a configurable cache
  policy per namespace
- name nesting / unnesting, recursive directory listing
- statistics collection
- latency/bandwidth emulator
16 changes: 16 additions & 0 deletions docs/backends.rst
@@ -6,6 +6,8 @@ basic operations.

Existing backends are listed below; more might come in the future.

See also :doc:`store_caching` for optional Store-level caching with a secondary backend.

posixfs
-------

@@ -80,6 +82,20 @@ Use storage on a local POSIX filesystem:
    # ...
    store.close()

Note:

When posixfs is used as a caching backend, it must be on a filesystem with
working ``atime`` support; otherwise ``max_age`` and LRU-based ``size`` limits
will not work as expected.

On Linux, this means the filesystem must not be mounted with the ``noatime``
option.

On Windows / NTFS, atime updates are disabled by default and need to be
re-enabled:

::

    fsutil behavior set DisableLastAccess 0  # re-enable (requires admin, reboot)
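
Whether a given cache directory actually records access times can be probed
before relying on ``max_age`` or ``size``. The following is a rough sketch, not
part of borgstore: the helper name ``atime_updates_on_read`` is hypothetical,
and it assumes that backdating atime to before mtime is enough to make
``relatime`` mounts update it on the next read.

```python
import os
import tempfile
import time


def atime_updates_on_read(directory):
    """Return True if reading a file in `directory` updates its atime."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"probe")
        # Backdate atime (well before mtime), so that even "relatime" mounts
        # are expected to update it on the next read.
        old_atime = 1_000_000_000  # arbitrary old timestamp (year 2001)
        os.utime(path, (old_atime, time.time()))
        with open(path, "rb") as f:
            f.read()
        return os.stat(path).st_atime > old_atime
    finally:
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("atime updated on read:", atime_updates_on_read(tmp))
```

A ``False`` result suggests that ``noatime`` (or a disabled last-access update
on NTFS) is in effect for that directory.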

sftp
----

1 change: 1 addition & 0 deletions docs/index.rst
@@ -7,6 +7,7 @@

installation
store
store_caching
backends
servers
changes
33 changes: 30 additions & 3 deletions docs/store.rst
@@ -22,12 +22,39 @@ API can be much simpler:
- defrag: general purpose defragmentation helper (copies blocks to new items)
- quota: return quota limit and usage (-1 if quotas not enabled or not supported)
- stats: API call counters, time spent in API methods, data volume/throughput.
- latency/bandwidth emulator: can emulate higher latency (via BORGSTORE_LATENCY
[us]) and lower bandwidth (via BORGSTORE_BANDWIDTH [bit/s]) than what is
actually provided by the backend.
- latency/bandwidth emulator: see :ref:`store-latency-bandwidth-emulator`.

Store operations (and per-op timing and volume) are logged at DEBUG log level.

See also :doc:`store_caching` for optional Store-level caching with a secondary backend.


.. _store-latency-bandwidth-emulator:

Latency and bandwidth emulator
------------------------------

The Store can emulate slower backend behavior using environment variables:

- ``BORGSTORE_LATENCY``: per-primary-call latency in microseconds (``[us]``).
- ``BORGSTORE_BANDWIDTH``: primary-call bandwidth limit in bits per second
  (``[bit/s]``).

Current behavior with Store caching enabled:

- Emulation applies to **primary backend** operations.
- Emulation does **not** apply to **cache backend** operations.

This means:

- On cache miss paths (for example writethrough/mirror reads that load from the
  primary backend), emulation affects the primary backend calls.
- On cache hit paths, cached reads avoid primary backend load operations and
  therefore do not incur emulated bandwidth delay for the cache backend read.
- Name resolution for Store operations still uses primary backend lookups,
  therefore configured latency can still be visible even when data comes from
  cache.
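
To get a feeling for the numbers, the extra time per primary backend call can
be estimated from the two settings. This is a back-of-the-envelope sketch under
an assumed model (a fixed latency plus transfer time at the configured
bandwidth); ``emulated_delay`` is a hypothetical helper, and the emulator's
actual internals may differ.

```python
import os


def emulated_delay(nbytes, latency_us=0.0, bandwidth_bps=0.0):
    """Estimated extra seconds for one primary backend call moving nbytes."""
    delay = latency_us / 1e6  # microseconds -> seconds
    if bandwidth_bps > 0:
        delay += nbytes * 8 / bandwidth_bps  # bytes -> bits, then bits / (bit/s)
    return delay


# Read the environment variables named in this section (0 means "not set").
latency_us = float(os.environ.get("BORGSTORE_LATENCY", "0"))
bandwidth_bps = float(os.environ.get("BORGSTORE_BANDWIDTH", "0"))
print(emulated_delay(1024**2, latency_us, bandwidth_bps))

# Example: 1 MiB with 50 ms latency (50000 us) and 100 Mbit/s bandwidth.
print(emulated_delay(1024**2, latency_us=50_000, bandwidth_bps=100e6))
```

Under this model, a cache hit would still pay the latency part for name
resolution, but not the bandwidth part for the data itself.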

Keys
----

130 changes: 130 additions & 0 deletions docs/store_caching.rst
@@ -0,0 +1,130 @@
Store caching
=============

The ``Store`` can optionally use a second backend as a local cache for selected
namespaces, which is especially useful when the primary backend is remote,
slower or otherwise more "expensive" than the cache.

Configuration
-------------

- ``cache_url`` or ``cache_backend``: where cached data is stored
- ``cache``: mapping of namespace to cache policy

Each cache policy can be provided either as:

- ``CachePolicy(mode=..., max_age=..., size=...)``
- ``{"mode": ..., "max_age": ..., "size": ...}``

``mode`` accepts ``CacheMode`` values or string aliases:

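- ``CacheMode.C_OFF`` or ``"off"``: bypass the cache completely.
- ``CacheMode.C_MIRROR`` or ``"mirror"``: always read from the primary backend,
  but update the cache after successful primary backend reads and writes.
- ``CacheMode.C_WRITETHROUGH`` or ``"writethrough"``: read-through and
  write-through, i.e. reads are served from the cache on a hit and fall back to
  the primary backend on a miss (which then populates the cache), and writes go
  to both the primary backend and the cache.
  For now, only content-hash addressed namespaces should use this mode, because
  cached items are not revalidated against the primary backend.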

``max_age`` is optional and expressed in seconds since last access. The default
is ``None`` (no age limit).

``size`` is optional and expressed in bytes. It sets a per-namespace cache
size budget enforced by evicting least-recently-used items until the namespace
total size is within the configured budget.
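
The two accepted policy forms can be thought of as being normalized into one
internal representation. The sketch below illustrates that idea with
hypothetical stand-ins (``Mode``, ``Policy``, ``normalize``); it is not
borgstore's actual ``CacheMode`` / ``CachePolicy`` code, and the default mode
used for an empty dict is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):  # hypothetical stand-in for borgstore's CacheMode
    OFF = "off"
    MIRROR = "mirror"
    WRITETHROUGH = "writethrough"


@dataclass(frozen=True)
class Policy:  # hypothetical stand-in for borgstore's CachePolicy
    mode: Mode = Mode.OFF
    max_age: Optional[float] = None  # seconds since last access, None = no limit
    size: Optional[int] = None  # bytes, None = no size budget


def normalize(policy):
    """Accept a Policy or a dict and return a validated Policy."""
    if isinstance(policy, Policy):
        return policy
    fields = dict(policy)
    # Enum lookup accepts both an enum member and its string alias,
    # and raises ValueError for unknown mode names.
    fields["mode"] = Mode(fields.get("mode", Mode.OFF))
    return Policy(**fields)


print(normalize({"mode": "writethrough", "max_age": 3600}))
```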

Example::

    from borgstore.store import Store, CacheMode

    store = Store(
        url="sftp://user@host/repo",
        levels={"data": [2], "meta": [1]},
        cache={
            "data": {
                "mode": "writethrough",
                "max_age": 3600,  # expire 1 hour after last access
                "size": 4 * 1024**3,  # 4 GiB budget for this namespace
            },
            "meta": {"mode": CacheMode.C_MIRROR},
        },
        cache_url="file:///home/user/.cache/borgstore/repo",
    )

Behavior
--------

- Cache keys are identical to primary backend keys (same nesting).
- Soft-deleted items are cached under the same ``.del`` name as primary.
- Soft delete/undelete renames cache entries as well.
- On ``Store.open()`` and ``Store.close()``, cache-enabled namespaces are scanned
  to clean up the cache. Cleanup order per namespace is:

  1. remove expired cache objects when ``max_age`` is configured,
  2. if ``size`` is configured, evict the least-recently-used remaining items
     until the namespace total size is ``<= size``.

  Expired entries are always removed first, even if total size is already below
  the ``size`` limit.
- Cache failures are non-fatal and logged as warnings.
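
The cleanup order described above can be sketched as a small planning function.
This is an illustration only: ``Entry`` and ``plan_cleanup`` are hypothetical
names, and the real implementation works on backend ``ItemInfo`` objects rather
than on this simplified record.

```python
from collections import namedtuple

# Hypothetical record; a real backend reports these values via ItemInfo.
Entry = namedtuple("Entry", "name size atime")


def plan_cleanup(entries, now, max_age=None, size=None):
    """Return the names to remove: expired entries first, then LRU evictions."""
    remove = []
    remaining = []
    for e in entries:
        if max_age is not None and now - e.atime > max_age:
            remove.append(e.name)  # step 1: expired by max_age
        else:
            remaining.append(e)
    if size is not None:
        total = sum(e.size for e in remaining)
        # step 2: evict least recently used (smallest atime) first
        for e in sorted(remaining, key=lambda e: e.atime):
            if total <= size:
                break
            remove.append(e.name)
            total -= e.size
    return remove


entries = [Entry("a", 10, 100), Entry("b", 10, 900), Entry("c", 10, 950)]
print(plan_cleanup(entries, now=1000, max_age=500, size=10))
```

In this example, ``a`` is removed because it expired, then ``b`` is evicted as
the least recently used remaining item, which brings the total down to the
budget.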

Manual cache invalidation
-------------------------

If you need to programmatically clear or invalidate parts of the cache (for
example, to resolve stale objects after primary backend deletes by other
clients, or if cache corruption is suspected), you can use the
``cache_invalidate`` method:

- To invalidate a single item::

      store.cache_invalidate("data/00000000")

- To invalidate all cached items in a specific namespace (e.g. ``"data/"``)::

      store.cache_invalidate("data/")

- To invalidate all cached items across all configured namespaces, pass
  ``ROOTNS``::

      from borgstore.constants import ROOTNS
      store.cache_invalidate(ROOTNS)

Limitations
-----------

- Eviction by ``max_age`` or ``size`` is open-time and close-time only
  (``Store.open()`` / ``Store.close()``), not continuous during
  ``store()``/``load()`` operations.
- No proactive cache validation/revalidation.
- If an object is deleted in the primary backend by another client, the local
  cache will still have a stale object.
- ``max_age`` and LRU-by-``size`` depend on backend ``ItemInfo.atime`` support.
  If ``atime`` is 0 (not implemented):

  - using ``max_age`` would empty the cache on ``Store.open()`` or ``Store.close()``
  - using ``size`` would not work in LRU order, because order can't be
    determined
- If a partial range ``load`` call for an object in a cached namespace causes
  a cache miss, the full object will be read from the primary backend and the
  cache will be populated with the full object.
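
The partial-range behavior in the last point can be illustrated with a toy
model. This is a sketch only: plain dicts stand in for the primary and cache
backends, and ``cached_load`` is a hypothetical function, not the ``Store``
implementation.

```python
def cached_load(name, primary, cache, size=None, offset=0):
    """Read-through load: on a miss, fetch and cache the FULL object."""
    data = cache.get(name)
    if data is None:  # cache miss
        data = primary[name]  # read the full object from the primary backend
        cache[name] = data  # populate the cache with the full object
    end = None if size is None else offset + size
    return data[offset:end]  # only the requested range is returned


primary = {"data/00": b"0123456789"}
cache = {}
print(cached_load("data/00", primary, cache, size=3, offset=2))
print(len(cache["data/00"]))
```

The first partial read therefore costs a full-object transfer from the primary
backend; later reads of any range of the same object can be served from the
cache.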

Statistics
----------

``Store.stats`` includes primary backend and cache statistics:

- ``backend_load_volume``
- ``backend_store_volume``
- ``backend_load_calls``
- ``backend_store_calls``
- ``backend_delete_calls``
- ``cache_disabled``
- ``cache_hits``
- ``cache_misses``
- ``cache_hit_ratio``
- ``cache_errors``
- ``cache_load_volume``
- ``cache_store_volume``
- ``cache_load_calls``
- ``cache_store_calls``
- ``cache_delete_calls``
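
The source does not spell out how ``cache_hit_ratio`` is computed. A plausible
reading, stated here as an assumption, is the fraction of cache lookups that
were hits; ``hit_ratio`` below is a hypothetical helper illustrating that
definition.

```python
def hit_ratio(hits, misses):
    """Fraction of cache lookups served from the cache (0.0 if none happened)."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


# Example with made-up counter values: 3 hits and 1 miss.
print(hit_ratio(3, 1))
```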
3 changes: 2 additions & 1 deletion src/borgstore/backends/_base.py
@@ -11,7 +11,8 @@

from ..constants import MAX_NAME_LENGTH, TMP_SUFFIX, HID_SUFFIX

-ItemInfo = namedtuple("ItemInfo", "name exists size directory")
+# atime is the last read access UNIX timestamp [s] or 0 if not implemented
+ItemInfo = namedtuple("ItemInfo", "name exists size directory atime", defaults=(0,))


def validate_name(name):
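
The ``defaults=(0,)`` argument in the hunk above is what keeps existing
backends working: ``namedtuple`` applies defaults to the rightmost fields, so
only ``atime`` becomes optional. A small standalone demonstration (the field
values are made up for illustration):

```python
from collections import namedtuple

# Same definition as in the diff: one default, applied to the last field.
ItemInfo = namedtuple("ItemInfo", "name exists size directory atime", defaults=(0,))

# A backend that does not report atime can keep constructing ItemInfo as before.
legacy = ItemInfo(name="x", exists=True, size=3, directory=False)
print(legacy.atime)

# A backend with atime support passes it explicitly (UNIX timestamp in seconds).
modern = ItemInfo(name="x", exists=True, size=3, directory=False, atime=1700000000.0)
print(modern.atime)
```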
4 changes: 2 additions & 2 deletions src/borgstore/backends/posixfs.py
@@ -205,7 +205,7 @@ def info(self, name):
             return ItemInfo(name=path.name, exists=False, directory=False, size=0)
         else:
             is_dir = stat.S_ISDIR(st.st_mode)
-            return ItemInfo(name=path.name, exists=True, directory=is_dir, size=st.st_size)
+            return ItemInfo(name=path.name, exists=True, directory=is_dir, size=st.st_size, atime=st.st_atime)

     def load(self, name, *, size=None, offset=0):
         if not self.opened:
@@ -361,7 +361,7 @@ def list(self, name):
                 pass
             else:
                 is_dir = stat.S_ISDIR(st.st_mode)
-                yield ItemInfo(name=p.name, exists=True, size=st.st_size, directory=is_dir)
+                yield ItemInfo(name=p.name, exists=True, size=st.st_size, directory=is_dir, atime=st.st_atime)

     def quota(self) -> dict:
         """Return quota information: limit and usage in bytes. -1 means not set / not tracked."""