Merge move join by andrewlawhh · Pull Request #191 · mc2-project/opaque-sql

andrewlawhh · 2021-04-02T02:15:50Z

No description provided.



This PR implements the scalar subquery expression, which is triggered whenever a subquery returns a scalar value. Two main problems needed to be solved. First, the scalar subquery expression has to be matched. Spark implements it by wrapping a SparkPlan within the expression, calling executeCollect, and then constructing a literal from the resulting value. This is problematic for us: the value is an intermediate result, so it should not be decrypted by the driver and serialized into an expression. The second problem is therefore supporting an encrypted literal. This PR implements it by serializing the ciphertext into a base64-encoded string and wrapping a Decrypt expression on top of it. That expression is then evaluated in the enclave and returns a literal. Note that, in order to test our implementation, we also implement a Decrypt expression in Scala. It should never be evaluated on the driver side and serialized into a plaintext literal, because Decrypt is designated as a Nondeterministic expression and will therefore always evaluate on the workers.
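The encrypted-literal flow described above can be sketched in a few lines of Python. This is only an illustration of the data flow, not the PR's Scala or enclave code: the `Literal` and `Decrypt` classes, the helper names, and the XOR "cipher" are all hypothetical stand-ins, and the XOR step is not secure.

```python
import base64
from dataclasses import dataclass

# Hypothetical stand-in for enclave encryption: XOR with a fixed key.
# NOT secure; it only makes the data flow visible.
_KEY = b"enclave-only-key"


def _xor(data: bytes) -> bytes:
    return bytes(b ^ _KEY[i % len(_KEY)] for i, b in enumerate(data))


@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Decrypt:
    child: Literal  # holds a base64 string of ciphertext, never plaintext


def encrypt_scalar(value: int) -> bytes:
    """Produce the subquery result as ciphertext (conceptually inside the enclave)."""
    return _xor(str(value).encode("utf-8"))


def wrap_encrypted_scalar(ciphertext: bytes) -> Decrypt:
    """Driver side: embed the ciphertext in the expression without decrypting it."""
    encoded = base64.b64encode(ciphertext).decode("ascii")
    return Decrypt(Literal(encoded))


def eval_in_enclave(expr: Decrypt) -> Literal:
    """Worker/enclave side: decode, decrypt, and return a plaintext literal."""
    ciphertext = base64.b64decode(expr.child.value)
    return Literal(int(_xor(ciphertext).decode("utf-8")))


ciphertext = encrypt_scalar(42)
expr = wrap_encrypted_scalar(ciphertext)   # what the driver serializes
result = eval_in_enclave(expr)             # what the enclave computes
```

The point of the wrapper is that the driver only ever handles `expr`, whose payload is an opaque base64 string; the plaintext literal exists only after evaluation on the worker side.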

* logic decoupling in TPCH.scala for easier benchmarking
* added TPCHBenchmark.scala
* Benchmark.scala rewrite
* done adding all support TPC-H query benchmarks
* changed commandline arguments that benchmark takes
* TPCHBenchmark takes in parameters
* fixed issue with spark conf
* size error handling, --help flag
* add Utils.force, break cluster mode
* comment out logistic regression benchmark
* ensureCached right before temp view created/replaced
* upgrade to 3.0.1
* upgrade to 3.0.1
* 10 scale factor
* persistData
* almost done refactor
* more cleanup
* compiles
* 9 passes
* cleanup
* collect instead of force, sf_none
* remove sf_none
* defaultParallelism
* no removing trailing/leading whitespace
* add sf_med
* hdfs works in local case
* cleanup, added new CLI argument
* added newly supported tpch queries
* function for running all supported tests


This PR is the first of two parts towards making TPC-H 16 work; the other will implement `is_distinct` for aggregate operations. `BroadcastNestedLoopJoin` is Spark's catch-all for non-equi joins. It works by first picking a side to broadcast, then iterating through every possible row combination and checking the non-equi condition against each pair.
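As a rough illustration of the nested-loop strategy just described, here is a minimal Python sketch of an inner join over in-memory rows. It is not Spark's or Opaque's implementation: the function name is hypothetical, rows are plain tuples, and broadcasting the smaller side is an assumption made for the example.

```python
from typing import Callable, List, Tuple

Row = Tuple


def broadcast_nested_loop_join(
    left: List[Row],
    right: List[Row],
    condition: Callable[[Row, Row], bool],
) -> List[Row]:
    """Inner join that checks `condition` against every (left, right) pair."""
    # Assumption for this sketch: broadcast whichever side has fewer rows.
    broadcast_is_left = len(left) <= len(right)
    broadcast, streamed = (left, right) if broadcast_is_left else (right, left)

    out: List[Row] = []
    for s in streamed:            # each streamed row ...
        for b in broadcast:       # ... is compared with every broadcast row
            l, r = (b, s) if broadcast_is_left else (s, b)
            if condition(l, r):   # arbitrary (non-equi) predicate
                out.append(l + r)
    return out


parts = [(1, 10), (2, 20)]        # (id, size)
limits = [(15,)]                  # (max_size,)
joined = broadcast_nested_loop_join(parts, limits, lambda l, r: l[1] < r[0])
```

Because every pair is examined, the cost is proportional to the product of the two input sizes, which is why this strategy is a fallback for conditions that cannot be expressed as equality on join keys.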

…oject#164)

* Add in TPC-H 21
* Add condition processing in enclave code
* Code clean up
* Enable query 18
* WIP
* Local tests pass
* Apply suggestions from code review
  Co-authored-by: octaviansima <34696537+octaviansima@users.noreply.github.com>
* WIP
* Address comments
* q21.sql

Co-authored-by: octaviansima <34696537+octaviansima@users.noreply.github.com>


Andrew Law and others added 30 commits October 1, 2020 18:14

Support for multiple branched CaseWhen

f17d8a8

Interval (mc2-project#116)

366e92c

* add date_add, interval sql still running into issues
* Add Interval SQL support
* uncomment out the other tests
* resolve comments
* change interval equality

Co-authored-by: Eric Feng <fengeric11@berkeley.edu>

Remove partition ID argument from enclaves

c7fcd98

Fix comments

93dbf5e

updates

f357ab2

Merge serialization of ecall string as int

bb4018a

Modifications to integrate crumb, log-mac, and all-outputs_mac, wip

56ace17

Store log mac after each output buffer, add all-outputs-mac to each e…

21bbbfb

…ncryptedblocks wip

Add all_outputs_mac to all EncryptedBlocks once all log_macs have bee…

549566f

…n generated

Almost builds

55ee664

cpp builds

057caec

Use ubyte for all_outputs_mac

db54c44

use Mac for all_outputs_mac

e77f1eb

Hopefully this works for flatbuffers all_outputs_mac mutation, cpp bu…

736b8f6

…ilds

merge

cbb2373

Merge branch 'comp-integrity' of https://github.com/mc2-project/opaque …

0351b5d

…into comp-integrity

Scala builds now too, running into error with union

3002bd3

Stuff builds, error with all outputs mac serialization. this commit u…

dc54741

…ses all_outputs_mac as Mac table

Fixed bug, basic encryption / show works

5be9b7c

All single partition tests pass, multiple partition passes until tpch-9

86fab02

All tests pass except tpch-9 and skew join

8b1a1d1

comment tpch back in

18f45d6

Merge branch 'crumb-path' of https://github.com/chester-leung/opaque i…

123fa1f

…nto comp-integrity

Check same number of ecalls per partition - exception for scanCollect…

bfc06ba

…LastPrimary(?)

First attempt at constructing executed DAG

c818a41

Fix typos

39a4945

Rework graph

c970965

Add log macs to graph nodes

43ccd2e

Construct expected DAG and refactor JobNode.

69fc49e

Refactor construction of executed DAG.

Implement 'paths to sink' for a DAG

35691ff

Andrew Law and others added 29 commits February 8, 2021 16:15

Merge comp-integrity

40e8e13

Merge master

6e60c7c

Merge branch 'expected-dag' of https://github.com/andrewlawhh/opaque …

1321eaa

…into expected-dag

Join update (mc2-project#145)

b4ba2db

Merge join update

375de7f

Integrate new join

8682f22

Add expected operator for sortexec

c21cb7b

Merge comp-integrity with join update

c1adf85

Merge comp-integrity with join update

9391435

Merge join integration with expected dag update

2b37dab

Remove some print statements

8a93c6c

Migrate from Travis CI to Github Actions (mc2-project#156)

c190aae

Upgrade to OE 0.12 (mc2-project#153)

41ea7b9

Update README.md

29da474

Construct expected DAG from dataframe physical plan

b350992

Refactor collect and add integrity checking helper function to Opaque…

20f4749

…OperatorTest

Float expressions (mc2-project#160)

3c28b5f

This PR adds float normalization expressions [implemented in Spark](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala#L170). TPC-H query 2 also passes.
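For context, float normalization exists because IEEE 754 has values that compare as equal (or are all "not a number") yet have different bit patterns, namely `-0.0` versus `0.0` and NaNs with different payloads. The Python sketch below illustrates the idea only; the function names and the chosen canonical NaN bit pattern are assumptions for the example, not code from Spark or this PR.

```python
import math
import struct

# Assumed canonical NaN for this sketch: the quiet NaN with an empty payload.
CANONICAL_NAN = struct.unpack(">d", bytes.fromhex("7ff8000000000000"))[0]


def bits(x: float) -> bytes:
    """Raw big-endian IEEE 754 representation of a double."""
    return struct.pack(">d", x)


def normalize_float(x: float) -> float:
    """Map every NaN to one canonical NaN and -0.0 to 0.0."""
    if math.isnan(x):
        return CANONICAL_NAN
    if x == 0.0:          # true for both 0.0 and -0.0
        return 0.0
    return x


# -0.0 and 0.0 compare equal but differ bitwise until normalized.
same_value = (-0.0 == 0.0)
differ_bitwise = bits(-0.0) != bits(0.0)
normalized_match = bits(normalize_float(-0.0)) == bits(normalize_float(0.0))
```

Normalizing before comparing raw representations matters for operations such as joins and grouping that may compare keys bytewise rather than numerically.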

Remove addExpectedOperator from JobVerificationEngine, add comments

e9b075b

Implement expected DAG construction by doing graph manipulation on da…

dabc178

…taframe field instead of string parsing

Merge

38c9da5

Fix merge errors in the test cases

98bcfdb

Fix merge errors

592ec17

Merge BNLJ into integrity branch

e3e140d

Merge join logic migration into integrity branch

67fd713

Merge join logic migration into integrity branch

29db9e6

andrewlawhh merged commit 697644b into mc2-project:comp-integrity Apr 2, 2021

andrewlawhh merged 84 commits into mc2-project:comp-integrity from andrewlawhh:merge-move-join


6 participants
