[Feature Request]: Support LZMA compression in Python I/O SDKs

### What would you like to happen?

LZMA compression is part of the Python standard library (the `lzma` module) but is not one of the compression types supported by the `beam.io.ReadFromText` and `beam.io.WriteToText` `PTransform`s. [openwebtext](https://skylion007.github.io/OpenWebTextCorpus/), for example, is distributed as `.xz` archives. I think this may be a pretty simple change. For example, I hacked up a naive "shim" [here](https://github.com/wrossmorrow/efnlp/blob/main/beam/shims/beam-io-with-lzma-filesystem.py) for use in Dataflow with a custom container, by just overwriting `apache_beam/io/filesystem.py` in `site-packages`. It's working (a) locally, for both decompression and compression (though the output filenames are malformed: the part suffix follows the compression extension), and (b) on the `DataflowRunner`, reading a GCS dump of all the openwebtext `.xz` archives. (Without this I've been having a hell of a time getting any horizontal scaling while reading openwebtext.) It may really be this simple, but I haven't run any Beam tests against these minor changes. I will probably do a bit more research into that myself.
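For context, here is a minimal sketch of the standard-library piece such a change would lean on. It is not the shim linked above and it does not touch Beam at all; it only shows that `lzma` offers incremental compressor and decompressor objects that can be fed data chunk by chunk, which I am assuming is what a file-like wrapper around a compressed stream would need. The helper names `compress_chunks` and `decompress_chunks` are made up for illustration.

```python
import lzma


def compress_chunks(chunks):
    """Incrementally compress an iterable of byte chunks into one .xz stream."""
    compressor = lzma.LZMACompressor(format=lzma.FORMAT_XZ)
    out = [compressor.compress(chunk) for chunk in chunks]
    # flush() emits whatever is still buffered plus the stream footer.
    out.append(compressor.flush())
    return b"".join(out)


def decompress_chunks(data, read_size=16):
    """Incrementally decompress .xz bytes, feeding read_size bytes at a time."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    out = []
    for start in range(0, len(data), read_size):
        out.append(decompressor.decompress(data[start:start + read_size]))
        if decompressor.eof:
            # End of the (single) stream; concatenated streams are not handled here.
            break
    return b"".join(out)


# Hypothetical sample data standing in for lines of a text file.
lines = [b"first line\n", b"second line\n", b"third line\n"]
compressed = compress_chunks(lines)
restored = decompress_chunks(compressed)
```

Note that this sketch handles a single `.xz` stream only; archives made of several concatenated streams would need a fresh decompressor per stream.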



### Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

### Issue Components

- [X] Component: Python SDK
- [ ] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
