Skip to content

ARROW-6883: [C++][Python] Allow writing dictionary deltas#8811

Closed
pitrou wants to merge 2 commits into
apache:masterfrom
pitrou:ARROW-6883-ipc-write-deltas
Closed

ARROW-6883: [C++][Python] Allow writing dictionary deltas#8811
pitrou wants to merge 2 commits into
apache:masterfrom
pitrou:ARROW-6883-ipc-write-deltas

Conversation

@pitrou

@pitrou pitrou commented Dec 1, 2020

Copy link
Copy Markdown
Member
  • Add an ipc::IpcWriteOptions member to govern emission of dictionary deltas.

    If the option is enabled, deltas are detected by checking whether the new
    dictionary starts with the last emitted one for the same field.
    However, for nested dictionaries, deltas are not emitted for the outer dictionary,
    as the read path doesn't support it.

  • Add a stats() method to ipc::StreamDecoder

  • Expose the IPC statistics in Python, and add tests

@github-actions

github-actions Bot commented Dec 1, 2020

Copy link
Copy Markdown

@wesm

wesm commented Dec 3, 2020

Copy link
Copy Markdown
Member

Acknowledged, will review when I can

@pitrou pitrou force-pushed the ARROW-6883-ipc-write-deltas branch from 47a1e91 to 3e11e6b Compare December 7, 2020 10:41
* Add an ipc::IpcWriteOptions member to govern emission of dictionary deltas.

  If the option is enabled, deltas are detected by checking whether the new
  dictionary starts with the last emitted one for the same field.
  However, for nested dictionaries, deltas are not emitted for the outer dictionary,
  as the read path doesn't support it.

* Add a stats() method to ipc::StreamDecoder

* Expose the IPC statistics in Python, and add tests
@pitrou pitrou force-pushed the ARROW-6883-ipc-write-deltas branch from 3e11e6b to 16beb03 Compare December 10, 2020 11:44
@pitrou pitrou force-pushed the ARROW-6883-ipc-write-deltas branch from 16beb03 to be9d921 Compare December 10, 2020 11:48
@pitrou

pitrou commented Dec 14, 2020

Copy link
Copy Markdown
Member Author

cc @lidavidm

@lidavidm

Copy link
Copy Markdown
Member

The Flight changes look good to me (though there's still ARROW-10787).

@wesm wesm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 here. Sorry for the long delay

Comment thread cpp/src/arrow/ipc/options.h Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might help to be explicit to say that otherwise the order of dictionaries may be altered by a passage from sender to receiver (I assume that's what you mean here by "stream compatibility" but others might be scratching their heads)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I simply meant that it's off for compatibility with implementations that don't support delta dictionaries.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants