[SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests#30377
MaxGekk wants to merge 18 commits
@cloud-fan @HyukjinKwon May I ask you to take a look at this PR, please?
Kubernetes integration test starting
Kubernetes integration test status success
    }
  }

  test("show partitions of not partitioned table") {
"non-partitioned" sounds a bit more natural.
Thank you. I will address your comment together with the others. @HyukjinKwon @cloud-fan Do you have any comments on this PR?
There are a few places that use "not partitioned":
$ find . -name '*.scala' -print0|xargs -0 grep -i -n 'not partitioned'
./core/src/test/scala/org/apache/spark/rdd/SortingSuite.scala:138: test("get a range of elements in an array not partitioned by a range partitioner") {
./mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala:925: * Note that the term "rating block" is a bit of a misnomer, as the ratings are not partitioned by
./streaming/src/main/scala/org/apache/spark/streaming/dstream/MapWithStateDStream.scala:134: // If the RDD is not partitioned the right way, let us repartition it using the
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:556: // One side of join is not partitioned in the desired way. Need to shuffle one side.
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:590: // One side of join is not partitioned in the desired way. Since the number of partitions of
./sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala:591: // the side that has already partitioned is smaller than the side that is not partitioned,
./sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala:308: test("OverwritePartitions: overwrite all rows if not partitioned") {
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala:130: test("show partitions of not partitioned table") {
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala:137: assert(errMsg.contains("not allowed on a table that is not partitioned"))
./sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala:1982: // not supported since the table is not partitioned
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileIndex.scala:73: /** Schema of the partitioning columns, or the empty schema if the table is not partitioned. */
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala:79: rows += toCatalystRow("Not partitioned", "", "")
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala:533: failAnalysis(s"Insert into a partition is not allowed because $l is not partitioned.")
./sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala:149: // This dataset is not partitioned.
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:462: s"for tables that are not partitioned: $tableIdentWithDB")
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:989: * 1. If the table is not partitioned.
./sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:999: s"SHOW PARTITIONS is not allowed on a table that is not partitioned: $tableIdentWithDB")
./sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala:1755: // not supported since the table is not partitioned
I will change the title of this test, but I am not sure about the other places, so I will leave them as is for now. @janekdb If you want, you can open a PR and fix them where it makes sense.
Test build #131095 has finished for PR 30377 at commit
      .partitionBy("a")
      .format("parquet")
      .mode(SaveMode.Overwrite)
      .saveAsTable("part_datasrc")
This seems to test the DataFrameWriter API, not the SHOW PARTITIONS command.
Ah, the test was already there. Let's keep it then.
  override protected def createDateTable(table: String): Unit = {
    sql(s"""
      |CREATE TABLE $table (price int, qty int)
CREATE TABLE ... USING hive PARTITIONED BY (...) doesn't work?
Let me check that. I just didn't want to change the original test.
I removed the functions from Hive's suite.
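For readers following the question above, the two DDL shapes differ in where the partition columns are declared. The sketch below only builds the two statement strings in plain Scala; the partition columns `year` and `month` and the object `CreateTableDdl` are hypothetical, since the original statement is truncated in this excerpt. Whether the `USING hive` form works in the Hive suite is exactly what was being asked, and this sketch does not settle it.

```scala
// Hypothetical helper: names and partition columns are illustrative only.
object CreateTableDdl {
  // Hive-style syntax: partition columns appear, with their types,
  // only in the PARTITIONED BY clause.
  def hiveStyle(table: String): String =
    s"""
       |CREATE TABLE $table (price int, qty int)
       |PARTITIONED BY (year int, month int)
     """.stripMargin.trim

  // Datasource-style syntax: every column is declared in the column list,
  // and PARTITIONED BY names the partition columns without types.
  def usingStyle(table: String, provider: String): String =
    s"""
       |CREATE TABLE $table (price int, qty int, year int, month int)
       |USING $provider
       |PARTITIONED BY (year, month)
     """.stripMargin.trim
}
```

Calling `CreateTableDdl.usingStyle("t", "hive")` yields the statement shape the reviewer suggested, which could then be passed to `sql(...)` in a suite.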
Test build #131148 has finished for PR 30377 at commit
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #131152 has finished for PR 30377 at commit
Thanks, merging to master!
What changes were proposed in this pull request?
1. Move the SHOW PARTITIONS parsing tests to ShowPartitionsParserSuite.
2. Move the SHOW PARTITIONS tests from HiveCommandSuite to the base test suite v1.ShowPartitionsSuiteBase. This allows the tests to run with and without Hive.

The changes follow the approach of #30287.
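The idea of a shared base suite can be sketched in plain Scala, without Spark or ScalaTest. Everything named here (`ShowPartitionsChecks`, `FakeV1Checks`, `FakeV2Checks`, the partition strings) is hypothetical and is not the code in this PR; it only illustrates how one check, written once in a base trait, runs against several implementations.

```scala
// Hypothetical sketch: these names are illustrative, not Spark's test classes.

// Base trait: holds the checks shared by every implementation under test.
trait ShowPartitionsChecks {
  // Each concrete suite says which implementation it exercises.
  def version: String

  // Each concrete suite supplies its own way of listing partitions.
  def showPartitions(table: String): Seq[String]

  // A shared check, written once and inherited by all concrete suites.
  def checkPartitionsSorted(table: String): Boolean = {
    val parts = showPartitions(table)
    parts == parts.sorted
  }
}

// Stand-in for a v1 implementation.
object FakeV1Checks extends ShowPartitionsChecks {
  val version: String = "V1"
  def showPartitions(table: String): Seq[String] =
    Seq("year=2015/month=1", "year=2015/month=2", "year=2016/month=2")
}

// Stand-in for a v2 implementation.
object FakeV2Checks extends ShowPartitionsChecks {
  val version: String = "V2"
  def showPartitions(table: String): Seq[String] =
    Seq("year=2015/month=1", "year=2016/month=3")
}
```

Adding a further variant (for example a Hive-backed one) would then mean adding one more object that extends the trait, rather than copying the checks.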
Why are the changes needed?
To run the same SHOW PARTITIONS tests for DSv1, Hive DSv1, and DSv2.
Does this PR introduce any user-facing change?
No
How was this patch tested?
By running:
build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *ShowPartitionsSuite"
build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly org.apache.spark.sql.hive.execution.HiveCommandSuite"