[SPARK-8432] [SQL] fix hashCode() and equals() of BinaryType in Row #6876
davies wants to merge 11 commits into
Test build #35112 has finished for PR 6876 at commit
|
Is it the same as java.util.Arrays.hashCode?
Good idea, we should use that.
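For context, a minimal sketch of the problem this exchange is about: on the JVM a byte[] inherits identity-based equals() and hashCode() from Object, so two arrays with the same contents do not compare equal, while java.util.Arrays.equals and java.util.Arrays.hashCode work on the contents. The class and method names below (BinaryValueDemo, sameBytes, hashBytes) are hypothetical and are not taken from the PR.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Spark.
class BinaryValueDemo {
    // Content-based equality for binary values.
    static boolean sameBytes(byte[] a, byte[] b) {
        return Arrays.equals(a, b);
    }

    // Content-based hash: arrays with equal contents get equal hashes.
    static int hashBytes(byte[] a) {
        return Arrays.hashCode(a);
    }

    public static void main(String[] args) {
        byte[] x = {1, 2, 3};
        byte[] y = {1, 2, 3};
        System.out.println(x.equals(y));                   // false: identity comparison
        System.out.println(sameBytes(x, y));               // true: content comparison
        System.out.println(hashBytes(x) == hashBytes(y));  // true
    }
}
```

Any hashCode() that covers a binary field has to be paired with a content-based equals(), otherwise equal rows could land in different hash buckets.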
|
Test build #35114 has finished for PR 6876 at commit
|
|
Test build #35144 has finished for PR 6876 at commit
|
|
Test build #35146 has finished for PR 6876 at commit
|
|
@marmbrus Could you help review this one? |
Existing: can you add some javadoc to this class to explain what it's used for and why it's in Java?
Because Row is a trait, and UnsafeRow and SpecificRow are both written in Java, they cannot inherit the default implementations from Row, so I created BaseRow in Java for them. Now that we have InternalRow, we will clean these up in another PR.
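As a rough illustration of the idea described here, a shared Java base class can supply equals() and hashCode() once so that every Java row implementation inherits them, with byte[] fields compared by content. This is only a sketch under assumptions: SketchBaseRow and SketchArrayRow are hypothetical names, and the code is not Spark's BaseRow.

```java
import java.util.Arrays;

// Hypothetical stand-in for a shared Java base class; not Spark's BaseRow.
abstract class SketchBaseRow {
    abstract int size();
    abstract Object get(int i);

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof SketchBaseRow)) return false;
        SketchBaseRow that = (SketchBaseRow) other;
        if (size() != that.size()) return false;
        for (int i = 0; i < size(); i++) {
            Object a = get(i);
            Object b = that.get(i);
            if (a instanceof byte[] && b instanceof byte[]) {
                // Binary fields are compared by content, not by reference.
                if (!Arrays.equals((byte[]) a, (byte[]) b)) return false;
            } else if (a == null ? b != null : !a.equals(b)) {
                return false;
            }
        }
        return true;
    }

    @Override
    public int hashCode() {
        int result = 37;
        for (int i = 0; i < size(); i++) {
            Object v = get(i);
            int h;
            if (v == null) {
                h = 0;
            } else if (v instanceof byte[]) {
                h = Arrays.hashCode((byte[]) v); // content-based hash
            } else {
                h = v.hashCode();
            }
            result = 37 * result + h;
        }
        return result;
    }
}

// Hypothetical concrete row backed by an Object[].
class SketchArrayRow extends SketchBaseRow {
    private final Object[] values;
    SketchArrayRow(Object... values) { this.values = values; }
    @Override int size() { return values.length; }
    @Override Object get(int i) { return values[i]; }
}
```

With this sketch, two SketchArrayRow instances holding distinct byte[] objects with the same contents compare equal and hash identically.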
|
@davies, thanks for working on this! I'm okay with this approach, but did you consider the alternative, where we instead change the internal type of |
|
@marmbrus We're working toward a more efficient representation in Catalyst, and putting Array[Byte] inside a wrapper doesn't seem to go in that direction. I'd like to go with this approach. |
|
I think using a wrapper might be necessary for efficiency. For example, we will want to reuse the same byte array when reading from something like parquet, instead of needing to allocate one of the exact size each time (think |
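To make the wrapper alternative concrete, here is a minimal sketch of a reusable view over a region of a byte buffer with content-based equals() and hashCode(). It only illustrates the idea raised in this comment and is not what the PR implements; BytesView and its methods are hypothetical names.

```java
// Hypothetical wrapper, not a Spark class: a view over part of a byte buffer.
final class BytesView {
    private byte[] buffer = new byte[0];
    private int offset = 0;
    private int length = 0;

    // Re-point the view at a region of an existing (possibly oversized) buffer,
    // so a reader can reuse one buffer instead of allocating an exact-size array.
    void set(byte[] buffer, int offset, int length) {
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BytesView)) return false;
        BytesView that = (BytesView) o;
        if (length != that.length) return false;
        for (int i = 0; i < length; i++) {
            if (buffer[offset + i] != that.buffer[that.offset + i]) return false;
        }
        return true;
    }

    @Override
    public int hashCode() {
        // Same recurrence as java.util.Arrays.hashCode(byte[]), applied to the region.
        int result = 1;
        for (int i = 0; i < length; i++) {
            result = 31 * result + buffer[offset + i];
        }
        return result;
    }
}
```

One design caveat with a mutable view like this: it should not be kept as a hash-map key while the underlying buffer is being reused, because its hash changes whenever the viewed bytes change.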
|
Test build #35189 timed out for PR 6876 at commit |
Will we change it to isInstanceOf[InternalRow] after #6869?
This is a pretty weird way of indenting. You can do
List(0, 1, Int.MinValue, Int.MaxValue).foreach { d =>
  ...
}
or
for (d <- List(0, 1, Int.MinValue, Int.MaxValue)) {
  ...
}
|
Test build #35225 has finished for PR 6876 at commit
|
|
Test build #35283 has finished for PR 6876 at commit
|
|
Test build #937 has finished for PR 6876 at commit
|
|
Test build #35284 has finished for PR 6876 at commit
|
|
test this please |
|
Test build #939 has finished for PR 6876 at commit
|
|
Test build #35307 has finished for PR 6876 at commit
|
|
Test build #35322 has finished for PR 6876 at commit
|
Should we also update equals for generated code?
Ah I see, genEqual has already handled BinaryType.
|
Test build #946 has finished for PR 6876 at commit
|
Conflicts: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java
|
Test build #35483 has finished for PR 6876 at commit
|
|
Thanks, merging to master. |
Also added more tests in LiteralExpressionSuite