Add API to support overriding protocol_config in WrappedFileSystemFlavour #291

@rahuliyer95

I know that WrappedFileSystemFlavour is an internal and experimental class, but I am in a situation where I need to override protocol_config in this class to support my custom UPath, which mimics the behavior of S3Path. Here is a minimal example that reproduces the problem, where I want to create a UPath for the foo protocol:

test.py
import fsspec  # type: ignore [import-untyped]
from fsspec.implementations.arrow import ArrowFSWrapper  # type: ignore [import-untyped]
from fsspec.utils import infer_storage_options  # type: ignore [import-untyped]
from upath import UPath
from upath import registry as upath_registry
from upath._flavour import WrappedFileSystemFlavour
from upath.implementations.cloud import S3Path


class FooFileSystem(ArrowFSWrapper):
    protocol = "foo"

    def __init__(self, *args, **kwargs):
        from pyarrow.fs import S3FileSystem  # type: ignore [import-untyped]

        fs = S3FileSystem()
        super().__init__(fs=fs, **kwargs)

    @classmethod
    def _strip_protocol(cls, path: str) -> str:
        # upstream fsspec has hardcoded `host + path` for s3/s3a we need this for `foo` as well.
        storage_opts = infer_storage_options(path)
        if host := storage_opts.get("host"):
            storage_opts["path"] = host + storage_opts["path"]
        path_without_protocol = str(storage_opts["path"])
        if path_without_protocol.startswith("//"):
            # special case for "hdfs://path" (without the triple slash)
            path_without_protocol = path_without_protocol[1:]
        return path_without_protocol


class FooPath(S3Path):
    pass


fsspec.register_implementation("foo", FooFileSystem)
upath_registry.register_implementation("foo", FooPath)
path = UPath("foo://bar/baz")

This throws the following error:

Traceback (most recent call last):
  File "/Users/rahuliyer/test.py", line 47, in <module>
    path = UPath("foo://bar/baz")
  File "/Users/rahuliyer/.venv/lib/python3.10/site-packages/upath/implementations/cloud.py", line 92, in __init__
    raise ValueError("non key-like path provided (bucket/container missing)")
ValueError: non key-like path provided (bucket/container missing)
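For context, here is a minimal standard-library sketch of how the URL in the example splits into a bucket and a key. It does not use upath or fsspec; `split_bucket_and_key` is a hypothetical helper written only for illustration. The assumption is that the error above arises because, for an unregistered protocol like foo, the netloc part ("bar") is not treated as the bucket.

```python
from urllib.parse import urlsplit


def split_bucket_and_key(url: str) -> tuple[str, str]:
    """Split an S3-style URL into (bucket, key).

    Hypothetical helper for illustration only; not part of upath or fsspec.
    The bucket is taken from the netloc and the key from the path.
    """
    parts = urlsplit(url)
    # Drop the leading slash so the key is relative to the bucket.
    return parts.netloc, parts.path.lstrip("/")


bucket, key = split_bucket_and_key("foo://bar/baz")
print(bucket, key)
```

If the netloc is ignored when the path is parsed, no bucket is left over, which would be consistent with the "bucket/container missing" message.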

The same code works if I override protocol_config in the following manner before initializing the UPath:

WrappedFileSystemFlavour.protocol_config["netloc_is_anchor"] |= {"foo"}
WrappedFileSystemFlavour.protocol_config["supports_empty_parts"] |= {"foo"}
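The two lines above could be wrapped in a small helper so the override lives in one place. This is only a sketch: `add_netloc_protocol` is a hypothetical name, and the dictionary below is a stand-in with made-up contents so the example runs without upath installed. It assumes the values of protocol_config are sets of protocol names, as the `|=` usage above suggests.

```python
def add_netloc_protocol(protocol_config: dict[str, set[str]], protocol: str) -> None:
    """Add `protocol` to the two settings overridden above.

    Hypothetical helper; it mutates the mapping it is given in place.
    """
    for key in ("netloc_is_anchor", "supports_empty_parts"):
        # setdefault keeps existing entries and creates the set if it is missing.
        protocol_config.setdefault(key, set()).add(protocol)


# Stand-in for WrappedFileSystemFlavour.protocol_config (contents are illustrative).
config = {
    "netloc_is_anchor": {"s3", "gs"},
    "supports_empty_parts": {"s3", "gs"},
}
add_netloc_protocol(config, "foo")
print(sorted(config["netloc_is_anchor"]))
```

With upath installed, the call would presumably be `add_netloc_protocol(WrappedFileSystemFlavour.protocol_config, "foo")`. That still touches the internal class, so it tidies the workaround but does not replace a supported API.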

What's the best approach to get this working without having to override the protocol_config of an internal class?
