[SPARK-10287] [SQL] Fixes JSONRelation refreshing on read path #8469

Closed
yhuai wants to merge 1 commit into apache:master from yhuai:jsonRefresh

Conversation

yhuai commented Aug 26, 2015

Contributor

https://issues.apache.org/jira/browse/SPARK-10287

After porting JSON to HadoopFsRelation, it seems hard to keep the behavior of automatically picking up new files for JSON. This PR removes that behavior, so JSON is consistent with the other sources (ORC and Parquet).

yhuai commented Aug 26, 2015

Contributor Author

@liancheng Maybe it is better to make JSON, Parquet, and ORC consistent instead of fixing JSON's refresh problem.

@liancheng

Contributor

LGTM. We should mention this in the release note and migration guide.

SparkQA commented Aug 27, 2015

Test build #41650 has finished for PR 8469 at commit acec3ca.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

yhuai commented Aug 27, 2015

Contributor Author

I will test it with my partitioned JSON table.

yhuai commented Aug 27, 2015

Contributor Author

It works. I will update the doc.

Contributor Author

In the release notes, we need to add that the JSON data source will not automatically load new files that are created by other applications (i.e., files that are not inserted into the dataset through Spark SQL). [SPARK-10287]
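
To illustrate the behavior described in this note, here is a minimal sketch in plain Python. It is a toy model, not Spark code: `CachedFileRelation`, `files`, `insert`, `refresh`, and `demo` are hypothetical names made up for this example, and the assumption that an insert through the relation triggers a refresh is part of the model rather than something stated in this PR. It only shows the general idea that a relation which caches its file listing will not see files written by another application until it is refreshed.

```python
import os
import tempfile


class CachedFileRelation:
    """Toy stand-in for a file-based relation (hypothetical, not Spark's HadoopFsRelation)."""

    def __init__(self, path):
        self.path = path
        self._files = None  # the file listing is cached after the first read

    def _list_files(self):
        # Scan the directory for JSON files; sorted for a stable result.
        return sorted(f for f in os.listdir(self.path) if f.endswith(".json"))

    def files(self):
        # Read path: reuse the cached listing instead of rescanning the directory.
        if self._files is None:
            self._files = self._list_files()
        return list(self._files)

    def insert(self, name, text):
        # Write path (assumed for this model): writes made through the
        # relation refresh the cached listing afterwards.
        with open(os.path.join(self.path, name), "w") as fh:
            fh.write(text)
        self.refresh()

    def refresh(self):
        # Explicit refresh: rescan the directory and replace the cache.
        self._files = self._list_files()


def demo():
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "a.json"), "w") as fh:
            fh.write('{"id": 1}\n')

        rel = CachedFileRelation(d)
        before = rel.files()

        # Another application drops a file into the directory.
        with open(os.path.join(d, "b.json"), "w") as fh:
            fh.write('{"id": 2}\n')
        stale = rel.files()  # the cached listing does not include b.json

        rel.refresh()
        fresh = rel.files()  # b.json is visible after an explicit refresh

        rel.insert("c.json", '{"id": 3}\n')
        after_insert = rel.files()  # writes through the relation are visible
        return before, stale, fresh, after_insert
```

In this model, the externally written `b.json` only shows up after `refresh()`, while `c.json`, written through `insert`, is visible right away. In Spark itself the corresponding step would be re-reading the path or explicitly refreshing the table; the exact API to use is not covered here.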

SparkQA commented Aug 27, 2015

Test build #41705 has finished for PR 8469 at commit dead685.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 27, 2015

Test build #1698 has finished for PR 8469 at commit dead685.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class LogisticRegressionModel @Since("1.3.0") (
    • class SVMModel @Since("1.1.0") (
    • class GaussianMixtureModel @Since("1.3.0") (
    • class KMeansModel @Since("1.1.0") (@Since("1.0.0") val clusterCenters: Array[Vector])
    • class PowerIterationClusteringModel @Since("1.3.0") (
    • class StreamingKMeansModel @Since("1.2.0") (
    • class StreamingKMeans @Since("1.2.0") (
    • class BinaryClassificationMetrics @Since("1.3.0") (
    • class MulticlassMetrics @Since("1.1.0") (predictionAndLabels: RDD[(Double, Double)])
    • class MultilabelMetrics @Since("1.2.0") (predictionAndLabels: RDD[(Array[Double], Array[Double])])
    • class RegressionMetrics @Since("1.2.0") (
    • class ChiSqSelectorModel @Since("1.3.0") (
    • class ChiSqSelector @Since("1.3.0") (
    • class ElementwiseProduct @Since("1.4.0") (
    • class IDF @Since("1.2.0") (@Since("1.2.0") val minDocFreq: Int)
    • class Normalizer @Since("1.1.0") (p: Double) extends VectorTransformer
    • class PCA @Since("1.4.0") (@Since("1.4.0") val k: Int)
    • class StandardScaler @Since("1.1.0") (withMean: Boolean, withStd: Boolean) extends Logging
    • class StandardScalerModel @Since("1.3.0") (
    • class FPGrowthModel[Item: ClassTag] @Since("1.3.0") (
    • class FreqItemset[Item] @Since("1.3.0") (
    • class FreqSequence[Item] @Since("1.5.0") (
    • class PrefixSpanModel[Item] @Since("1.5.0") (
    • class DenseMatrix @Since("1.3.0") (
    • class SparseMatrix @Since("1.3.0") (
    • class DenseVector @Since("1.0.0") (
    • class SparseVector @Since("1.0.0") (
    • class BlockMatrix @Since("1.3.0") (
    • class CoordinateMatrix @Since("1.0.0") (
    • class IndexedRowMatrix @Since("1.0.0") (
    • class RowMatrix @Since("1.0.0") (
    • class PoissonGenerator @Since("1.1.0") (
    • class ExponentialGenerator @Since("1.3.0") (
    • class GammaGenerator @Since("1.3.0") (
    • class LogNormalGenerator @Since("1.3.0") (
    • case class Rating @Since("0.8.0") (
    • class MatrixFactorizationModel @Since("0.8.0") (
    • abstract class GeneralizedLinearModel @Since("1.0.0") (
    • class IsotonicRegressionModel @Since("1.3.0") (
    • case class LabeledPoint @Since("1.0.0") (
    • class LassoModel @Since("1.1.0") (
    • class LinearRegressionModel @Since("1.1.0") (
    • class RidgeRegressionModel @Since("1.1.0") (
    • class MultivariateGaussian @Since("1.3.0") (
    • case class BoostingStrategy @Since("1.4.0") (
    • class Strategy @Since("1.3.0") (
    • class DecisionTreeModel @Since("1.0.0") (
    • class Node @Since("1.2.0") (
    • class Predict @Since("1.2.0") (
    • class RandomForestModel @Since("1.2.0") (
    • class GradientBoostedTreesModel @Since("1.2.0") (
    • abstract class SetOperation(left: LogicalPlan, right: LogicalPlan) extends BinaryNode
    • case class Union(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)
    • case class Intersect(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)
    • case class Except(left: LogicalPlan, right: LogicalPlan) extends SetOperation(left, right)

yhuai commented Aug 27, 2015

Contributor Author

I am merging it to master and branch 1.5.

asfgit pushed a commit that referenced this pull request Aug 27, 2015
https://issues.apache.org/jira/browse/SPARK-10287

After porting json to HadoopFsRelation, it seems hard to keep the behavior of picking up new files automatically for JSON. This PR removes this behavior, so JSON is consistent with others (ORC and Parquet).

Author: Yin Huai <yhuai@databricks.com>

Closes #8469 from yhuai/jsonRefresh.

(cherry picked from commit b3dd569)
Signed-off-by: Yin Huai <yhuai@databricks.com>
asfgit closed this in b3dd569 Aug 27, 2015