Skip to content

ARROW-5791: [C++] Fix infinite loop with more the 32768 columns. #4762

Closed
emkornfield wants to merge 6 commits into
apache:masterfrom
emkornfield:csv
Closed

ARROW-5791: [C++] Fix infinite loop with more the 32768 columns. #4762
emkornfield wants to merge 6 commits into
apache:masterfrom
emkornfield:csv

Conversation

@emkornfield

@emkornfield emkornfield commented Jun 30, 2019

Copy link
Copy Markdown
Contributor

But really 32768 columns should be enough for anyone :)

@codecov-io

codecov-io commented Jul 1, 2019

Copy link
Copy Markdown

Codecov Report

Merging #4762 into master will increase coverage by 0.56%.
The diff coverage is 95.45%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4762      +/-   ##
==========================================
+ Coverage   88.39%   88.95%   +0.56%     
==========================================
  Files         908      717     -191     
  Lines      114157    98953   -15204     
  Branches     1418        0    -1418     
==========================================
- Hits       100904    88021   -12883     
+ Misses      12891    10932    -1959     
+ Partials      362        0     -362
Impacted Files Coverage Δ
cpp/src/arrow/csv/parser-test.cc 100% <100%> (ø) ⬆️
cpp/src/arrow/csv/parser.cc 97.11% <75%> (+1.26%) ⬆️
cpp/src/arrow/io/readahead.cc 95.91% <0%> (-1.03%) ⬇️
cpp/src/arrow/util/thread-pool-test.cc 97.66% <0%> (-0.94%) ⬇️
go/arrow/ipc/writer.go
go/arrow/math/uint64_amd64.go
go/arrow/memory/memory_avx2_amd64.go
go/arrow/ipc/file_reader.go
js/src/builder/index.ts
js/src/enum.ts
... and 189 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 175ad65...ab0504c. Read the comment docs.

@TheNeuralBit

Copy link
Copy Markdown
Member

Thanks for tracking this down! Interesting that it's just a magic number and not an overflow. This LGTM, but I guess someone else should take a look.

@emkornfield

Copy link
Copy Markdown
Contributor Author

actually, I should remove code duplication. I also think maybe I allowed for too many columns.

@pitrou pitrou left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

Comment thread cpp/src/arrow/csv/parser.cc Outdated
Comment thread cpp/src/arrow/csv/parser.cc Outdated
@emkornfield

emkornfield commented Jul 1, 2019

Copy link
Copy Markdown
Contributor Author

Just in case its useful run from travis on my branch: https://travis-ci.org/emkornfield/arrow/builds/552739714

@emkornfield

Copy link
Copy Markdown
Contributor Author

@ursabot build

@wesm wesm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@wesm wesm closed this in 8192902 Jul 1, 2019
kou pushed a commit that referenced this pull request Jul 4, 2019
But really 32768 columns should be enough for anyone :)

Author: Micah Kornfield <emkornfield@gmail.com>

Closes #4762 from emkornfield/csv and squashes the following commits:

ab0504c <Micah Kornfield> lower number of columns in test to satisfy ming
8f53a8a <Micah Kornfield> remove test
acfe2d8 <Micah Kornfield> remove cap, make min rows_in_chunk 512
08ddc22 <Micah Kornfield> remove floor duplication
211472a <Micah Kornfield> powers of 2 are better
b91a9e1 <Micah Kornfield> ARROW-5791:  Fix infinite loop with more the 32768 columns.  Cap max columns
wesm pushed a commit that referenced this pull request Jul 13, 2019
But really 32768 columns should be enough for anyone :)

Author: Micah Kornfield <emkornfield@gmail.com>

Closes #4762 from emkornfield/csv and squashes the following commits:

ab0504c <Micah Kornfield> lower number of columns in test to satisfy ming
8f53a8a <Micah Kornfield> remove test
acfe2d8 <Micah Kornfield> remove cap, make min rows_in_chunk 512
08ddc22 <Micah Kornfield> remove floor duplication
211472a <Micah Kornfield> powers of 2 are better
b91a9e1 <Micah Kornfield> ARROW-5791:  Fix infinite loop with more the 32768 columns.  Cap max columns
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants