Skip to content

ARROW-15040: [R] Enable write_csv_arrow to take a Dataset or arrow_dplyr_query as input#11971

Closed
thisisnic wants to merge 13 commits into
apache:masterfrom
thisisnic:ARROW-15040_rb_reader
Closed

ARROW-15040: [R] Enable write_csv_arrow to take a Dataset or arrow_dplyr_query as input#11971
thisisnic wants to merge 13 commits into
apache:masterfrom
thisisnic:ARROW-15040_rb_reader

Conversation

@thisisnic

Copy link
Copy Markdown
Member

No description provided.

@github-actions

Copy link
Copy Markdown

@github-actions

Copy link
Copy Markdown

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@jonkeane jonkeane left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is exciting! One comment about types. It also looks like there are some(?) related segfaults in the CI

Comment thread r/R/csv.R Outdated
@paleolimbot

Copy link
Copy Markdown
Member

It may be worth rebasing this and seeing if it still segfaults! I know there were some updates to the Scanner that fixed some segfaults for me when I was writing tests for the Python bridge (from the Python side). Another way to write this test that would be independent of that is to Scan a Table or InMemoryDataset rather than a FileSystemDataset.

I'm also a fan of the "allow a general record batch reader" approach, which might also make this easier to test.

@paleolimbot

Copy link
Copy Markdown
Member

Also linking ARROW-15128 since it seems like a blocker for this PR.

@thisisnic thisisnic force-pushed the ARROW-15040_rb_reader branch from b45f9e6 to 43f08f3 Compare February 1, 2022 12:50
@thisisnic

Copy link
Copy Markdown
Member Author

Thanks for the suggestion @paleolimbot but the segfault is still happening after a rebase.

Comment thread r/R/csv.R
@thisisnic thisisnic force-pushed the ARROW-15040_rb_reader branch from ff62a45 to 01e1ae5 Compare February 1, 2022 18:10
@thisisnic

Copy link
Copy Markdown
Member Author

Have now updated this and removed the blocker ticket which isn't a blocker!

@thisisnic thisisnic requested a review from jonkeane February 2, 2022 13:37
Comment thread r/R/csv.R Outdated
Comment thread r/R/csv.R Outdated
Comment thread r/tests/testthat/test-csv.R Outdated
@thisisnic thisisnic changed the title ARROW-15040: [R] Enable write_csv_arrow to take a RecordBatchReader as input ARROW-15040: [R] Enable write_csv_arrow to take a Dataset or arrow_dplyr_query as input Feb 11, 2022

@nealrichardson nealrichardson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also write a test for accepting RecordBatchReader, and need to update the docs too

Comment thread r/R/csv.R Outdated
Comment thread r/R/csv.R Outdated
Comment thread r/R/csv.R Outdated
thisisnic and others added 2 commits February 11, 2022 15:38
Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com>

@nealrichardson nealrichardson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of final suggestions

Comment thread r/tests/testthat/test-csv.R Outdated
Comment thread r/tests/testthat/test-csv.R Outdated
Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Co-authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Comment thread r/tests/testthat/test-csv.R Outdated
@thisisnic thisisnic closed this in 13045f4 Mar 1, 2022
@ursabot

ursabot commented Mar 1, 2022

Copy link
Copy Markdown

Benchmark runs are scheduled for baseline = 676b49f and contender = 13045f4. 13045f4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.63% ⬆️0.67%] test-mac-arm
[Finished ⬇️0.0% ⬆️3.21%] ursa-i9-9960x
[Finished ⬇️0.3% ⬆️0.68%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants