Skip to content

ARROW-16426: [C++] Add TeeNode to execution engine#13040

Closed
rtpsw wants to merge 7 commits into
apache:masterfrom
rtpsw:ARROW-16426
Closed

ARROW-16426: [C++] Add TeeNode to execution engine#13040
rtpsw wants to merge 7 commits into
apache:masterfrom
rtpsw:ARROW-16426

Conversation

@rtpsw

@rtpsw rtpsw commented May 1, 2022

Copy link
Copy Markdown
Contributor

The existing write node is a consuming one while the proposed tee node is a pass-through one.

@github-actions

github-actions Bot commented May 1, 2022

Copy link
Copy Markdown

@github-actions

github-actions Bot commented May 1, 2022

Copy link
Copy Markdown

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@westonpace westonpace self-requested a review May 2, 2022 21:16
@rtpsw

rtpsw commented May 3, 2022

Copy link
Copy Markdown
Contributor Author

@westonpace, the build failures shows a dependency problem in some of the platforms. I'm not sure how to resolve this.

@westonpace

Copy link
Copy Markdown
Member

Ah, it seems MapNode had not previously ever been used outside of the compute module. file_base.cc is in the datasets module. You will need to add the appropriate exports to MapNode. This is an unfortunate process that seems to be unique to Windows. This should be as simple as...

class ARROW_EXPORT MapNode : public ExecNode {
...

Although exposing things tends to have somewhat infectious consequences. For example, I'm pretty sure you will then also need to expose AtomicCounter since it is referenced by MapNode. However, that should be about it and it makes sense to expose both of those things as they will be generally useful to anyone building ExecNode implementations outside of Arrow.

@westonpace westonpace left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very clever idea. Only a few minor suggestions.

Comment thread cpp/src/arrow/dataset/file_base.cc Outdated
Comment thread cpp/src/arrow/dataset/file_base.cc Outdated
Comment thread cpp/src/arrow/dataset/file_base.cc
@rtpsw

rtpsw commented May 5, 2022

Copy link
Copy Markdown
Contributor Author

@westonpace, the R / AMD64 Windows C++ RTools 40 ucrt64 build error seems unrelated to the change. How to proceed?

@westonpace

Copy link
Copy Markdown
Member

I agree it is unrelated. In the future, if there are build failures and they seem unrelated feel free to just mention something like "build failures seem unrelated" and (assuming its ready) re-request review.

@ursabot

ursabot commented May 11, 2022

Copy link
Copy Markdown

Benchmark runs are scheduled for baseline = d00caa9 and contender = 7bfc732. 7bfc732 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.08% ⬆️0.04%] test-mac-arm
[Finished ⬇️0.71% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.0% ⬆️0.08%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 7bfc7320 ec2-t3-xlarge-us-east-2
[Finished] 7bfc7320 test-mac-arm
[Finished] 7bfc7320 ursa-i9-9960x
[Finished] 7bfc7320 ursa-thinkcentre-m75q
[Finished] d00caa94 ec2-t3-xlarge-us-east-2
[Finished] d00caa94 test-mac-arm
[Finished] d00caa94 ursa-i9-9960x
[Finished] d00caa94 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants