[SQL] SPARK-6981: Factor out SparkPlanner and QueryExecution from SQLContext#6122

Closed
evacchi wants to merge 12 commits into apache:master from evacchi:sqlctx-refactoring

Conversation

@evacchi

evacchi commented May 13, 2015

Contributor

Cleaned-up version of PR #5556

Contributor

Shouldn't this be private[sql] now?

Contributor Author

It is protected[sql] in master.

BTW, this PR is geared towards making it easier for third parties (and HiveContext) to add new processing rules without having to subclass SQLContext (see PR #5556 and SPARK-6320).

I would actually advise making these classes public (or at least protected, without the [sql] qualifier).
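
To illustrate the motivation, here is a minimal, Spark-free sketch of a planner whose strategies are passed in as a value, so a third party can prepend a rule without subclassing anything. All names in it (`Planner`, `Strategy`, `BasicStrategy`, `IndexStrategy`, and the toy plan classes) are hypothetical stand-ins, not the classes introduced by this PR.

```scala
// Toy stand-ins for logical and physical plans (not Spark's classes).
sealed trait LogicalPlan
case class Scan(table: String) extends LogicalPlan
case class Filter(condition: String, child: LogicalPlan) extends LogicalPlan

sealed trait PhysicalPlan
case class TableScan(table: String) extends PhysicalPlan
case class FilterExec(condition: String, child: PhysicalPlan) extends PhysicalPlan
case class IndexLookup(table: String, condition: String) extends PhysicalPlan

// A strategy returns candidate physical plans, or Nil when it does not apply.
trait Strategy {
  def apply(plan: LogicalPlan, planner: Planner): Seq[PhysicalPlan]
}

// A top-level planner: strategies are constructor arguments rather than
// members of a class nested inside a context object.
class Planner(val strategies: Seq[Strategy]) {
  def plan(logical: LogicalPlan): PhysicalPlan = {
    // Strategies are tried in order; the first candidate wins.
    val candidates = strategies.iterator.flatMap(_.apply(logical, this))
    if (candidates.hasNext) candidates.next()
    else sys.error(s"No plan for $logical")
  }
}

// Default strategy: plans every node of the toy logical tree.
object BasicStrategy extends Strategy {
  def apply(plan: LogicalPlan, planner: Planner): Seq[PhysicalPlan] = plan match {
    case Scan(table)              => Seq(TableScan(table))
    case Filter(condition, child) => Seq(FilterExec(condition, planner.plan(child)))
  }
}

// A "third-party" strategy that only handles a filter directly over a scan.
object IndexStrategy extends Strategy {
  def apply(plan: LogicalPlan, planner: Planner): Seq[PhysicalPlan] = plan match {
    case Filter(condition, Scan(table)) => Seq(IndexLookup(table, condition))
    case _                              => Nil
  }
}

object PlannerDemo {
  def main(args: Array[String]): Unit = {
    val query = Filter("id = 1", Scan("users"))
    val default = new Planner(Seq(BasicStrategy))
    // Extension by composition: prepend a strategy, no subclass required.
    val extended = new Planner(IndexStrategy +: default.strategies)
    println(default.plan(query))
    println(extended.plan(query))
  }
}
```

The point of the sketch is only the shape of the extension mechanism: once the planner is a standalone class, adding a rule is a matter of building a new strategy list.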

evacchi added 3 commits May 14, 2015 11:33
Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
…refactoring

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@evacchi

evacchi commented May 18, 2015

Contributor Author

If everybody agrees, I think we can restart the Jenkins build; the code is the same as in the other PR (#5556), after all.

@evacchi

evacchi commented May 21, 2015

Contributor Author

@rxin, could you trigger a test build? The code is the same as in PR #5556.

@rxin

rxin commented May 21, 2015

Contributor

Jenkins, test this please.

@SparkQA

SparkQA commented May 21, 2015


Test build #33237 has finished for PR 6122 at commit ac03efe.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait QueryPlanner[PhysicalPlan <: TreeNode[PhysicalPlan]]
    • protected[sql] class QueryExecution(val sqlContext: SQLContext, val logical: LogicalPlan)
    • protected[sql] class SparkPlanner(val sqlContext: SQLContext) extends SparkStrategies
    • protected[sql] class HiveQueryExecution(hiveContext: HiveContext, logicalPlan: LogicalPlan)

@evacchi

evacchi commented May 21, 2015

Contributor Author

Build fails because of this:

[info] spark-sql: found 3 potential binary incompatibilities (filtered 328)
[error]  * class org.apache.spark.sql.SQLContext#SparkPlanner does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$SparkPlanner")
[error]  * method prepareForExecution()org.apache.spark.sql.catalyst.rules.RuleExecutor in class org.apache.spark.sql.SQLContext does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.sql.SQLContext.prepareForExecution")
[error]  * class org.apache.spark.sql.SQLContext#QueryExecution does not have a correspondent in new version
[error]    filter with: ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.SQLContext$QueryExecution")

This is expected, since moving these inner classes out of SQLContext is the purpose of the PR. How should we proceed?
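
One way to proceed, which the MiMa output itself suggests, is to register the three reported problems as accepted exclusions. The sketch below copies the filter expressions verbatim from the log; the enclosing object name is hypothetical, and it assumes the MiMa core library is on the build classpath (Spark keeps such filters in project/MimaExcludes.scala, whose exact layout is not reproduced here).

```scala
import com.typesafe.tools.mima.core._

// Hypothetical holder for the exclusions; in a real build these entries
// would be added to the project's existing list of MiMa filters.
object SqlContextRefactoringExcludes {
  val excludes = Seq(
    // Inner classes removed from SQLContext by this refactoring.
    ProblemFilters.exclude[MissingClassProblem](
      "org.apache.spark.sql.SQLContext$SparkPlanner"),
    ProblemFilters.exclude[MissingClassProblem](
      "org.apache.spark.sql.SQLContext$QueryExecution"),
    // Member that no longer exists on SQLContext.
    ProblemFilters.exclude[MissingMethodProblem](
      "org.apache.spark.sql.SQLContext.prepareForExecution")
  )
}
```

Excluding the problems silences the check but does not restore binary compatibility, so it is only appropriate if breaking these `protected[sql]` members is considered acceptable.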

@evacchi

evacchi commented May 21, 2015

Contributor Author

Possible solution: add shims to preserve binary compatibility as follows:

class SQLContext {
    ...
    @deprecated
    class SparkPlanner extends org.apache.spark.sql.SparkPlanner
    @deprecated
    class QueryExecution extends org.apache.spark.sql.QueryExecution
    @deprecated
    lazy val prepareForExecution = ...
    ...
}

class QueryExecution(sqlContext: SQLContext) {
    ...
    lazy val prepareForExecution = sqlContext.prepareForExecution
    ...
}

etc.

However, I wouldn't do that unless it is really necessary, because it makes extending the non-deprecated classes awkward: the deprecated inner classes and the new top-level classes share the same simple names, so one must be careful with imports.
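
The awkwardness can be reproduced in a small, self-contained sketch. The names `refactored`, `Context`, `ShimDemo`, and `TracingExecution` are hypothetical, as are the deprecation message and version string; only the pattern (a deprecated inner class extending a same-named top-level class) follows the proposal above.

```scala
// Stand-in for the new, top-level home of the refactored class.
object refactored {
  class QueryExecution(val description: String) {
    def explain: String = s"plan for $description"
  }
}

class Context {
  // Shim: keeps the old nested name Context#QueryExecution alive.
  // Inside Context the simple name QueryExecution now means this inner
  // class, so the parent has to be referenced through its full path.
  @deprecated("use refactored.QueryExecution", "shim-demo")
  class QueryExecution(description: String)
    extends refactored.QueryExecution(description)
}

object ShimDemo {
  // Code that extends the non-deprecated class must also qualify the
  // name, otherwise an import of the context's members could silently
  // select the deprecated inner class instead.
  class TracingExecution(description: String)
    extends refactored.QueryExecution(description) {
    override def explain: String = "traced " + super.explain
  }

  def main(args: Array[String]): Unit = {
    val ctx = new Context
    // Old call sites keep compiling (with a deprecation warning) and
    // still produce a value usable wherever the new type is expected.
    val legacy: refactored.QueryExecution = new ctx.QueryExecution("select 1")
    println(legacy.explain)
    println(new TracingExecution("select 1").explain)
  }
}
```

The shim preserves the old class name for existing callers, at the cost of two classes with the same simple name living side by side.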

…refactoring

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@evacchi

evacchi commented May 22, 2015

Contributor Author

Addressed in PR #6356.

@AmplabJenkins


Can one of the admins verify this patch?

@asfgit asfgit closed this in cdc36ee Jul 18, 2015
asfgit pushed a commit that referenced this pull request Sep 14, 2015
…LContext

Alternative to PR #6122; in this case the refactored-out classes are replaced by inner classes with the same name, for backwards binary compatibility.

   * process in a lighter-weight, backwards-compatible way

Author: Edoardo Vacchi <uncommonnonsense@gmail.com>

Closes #6356 from evacchi/sqlctx-refactoring-lite.