[SPARK-13530][SQL] Add ShortType support to UnsafeRowParquetRecordReader#11412

Closed
viirya wants to merge 1 commit into apache:master from viirya:add-shorttype-support
viirya commented Feb 27, 2016

Member

JIRA: https://issues.apache.org/jira/browse/SPARK-13530

What changes were proposed in this pull request?

With the vectorized Parquet scanner enabled by default, the unit test ParquetHadoopFsRelationSuite (based on HadoopFsRelationTest) fails because UnsafeRowParquetRecordReader lacks short type support. This patch adds that support.

The exception:

[info] ParquetHadoopFsRelationSuite:
[info] - test all data types - StringType (499 milliseconds)
[info] - test all data types - BinaryType (447 milliseconds)
[info] - test all data types - BooleanType (520 milliseconds)
[info] - test all data types - ByteType (418 milliseconds)
00:22:58.920 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 124.0 (TID 1949)
org.apache.commons.lang.NotImplementedException: Unimplemented type: ShortType
at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.readIntBatch(UnsafeRowParquetRecordReader.java:769)
at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.readBatch(UnsafeRowParquetRecordReader.java:640)
at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader$ColumnReader.access$000(UnsafeRowParquetRecordReader.java:461)
at org.apache.spark.sql.execution.datasources.parquet.UnsafeRowParquetRecordReader.nextBatch(UnsafeRowParquetRecordReader.java:224)
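
The failure mode in the trace above can be illustrated with a small standalone sketch. Parquet has no 16-bit physical type, so short values are stored as INT32 and the reader has to narrow them when it fills a batch. The code below is not the Spark implementation: the class `ShortBatchSketch`, the enum `LogicalType`, and this `readIntBatch` signature are hypothetical names chosen for illustration, and only the general shape (dispatch on the requested type, throw for anything unhandled) is taken from the stack trace.

```java
import java.util.Arrays;

/** Standalone sketch; none of these names are Spark APIs. */
public class ShortBatchSketch {

    /** Hypothetical stand-in for logical types backed by Parquet INT32. */
    enum LogicalType { INT, SHORT, BYTE, DECIMAL }

    /**
     * Converts a batch of physical INT32 values into an array whose element
     * type matches the requested logical type. Types without a branch fall
     * through to the default case and raise an "Unimplemented type" error,
     * which is the situation ShortType was in before this patch.
     */
    static Object readIntBatch(int[] physical, LogicalType type) {
        switch (type) {
            case INT:
                return physical.clone();
            case SHORT: {
                // Narrow each 32-bit value to 16 bits.
                short[] out = new short[physical.length];
                for (int i = 0; i < physical.length; i++) {
                    out[i] = (short) physical[i];
                }
                return out;
            }
            case BYTE: {
                // Narrow each 32-bit value to 8 bits.
                byte[] out = new byte[physical.length];
                for (int i = 0; i < physical.length; i++) {
                    out[i] = (byte) physical[i];
                }
                return out;
            }
            default:
                throw new UnsupportedOperationException("Unimplemented type: " + type);
        }
    }

    public static void main(String[] args) {
        int[] physical = {1, -2, Short.MAX_VALUE};
        short[] shorts = (short[]) readIntBatch(physical, LogicalType.SHORT);
        System.out.println(Arrays.toString(shorts)); // prints [1, -2, 32767]
    }
}
```

Removing the `SHORT` branch from this sketch reproduces the same kind of error as the trace: the request falls through to the default case and throws.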

How was this patch tested?

Without short type support in UnsafeRowParquetRecordReader, the unit test ParquetHadoopFsRelationSuite (based on HadoopFsRelationTest) fails. With this support added, the test passes.

SparkQA commented Feb 27, 2016

Test build #52117 has finished for PR 11412 at commit e923f7d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

viirya commented Feb 27, 2016

Member Author

cc @nongli @rxin

nongli commented Feb 27, 2016

Contributor

lgtm

rxin commented Feb 27, 2016

Contributor

Thanks - merging this into master. It would've been better to have a unit test for this module, rather than relying on integration tests.

asfgit closed this in 3814d0b on Feb 27, 2016
@JoshRosen

Contributor

viirya commented Feb 27, 2016

Member Author

@JoshRosen It looks like the same module but a different problem. I will look into it.

nongli commented Feb 28, 2016

Contributor

@viirya I fixed it with this patch: #11414

viirya commented Feb 28, 2016

Member Author

@nongli Got it. Thanks!

viirya deleted the add-shorttype-support branch on December 27, 2023 18:33