[SPARK-12597] [ML] Use udf to replace callUDF for ML #10544

Closed
yanboliang wants to merge 1 commit into apache:master from yanboliang:spark-12597


@yanboliang

Contributor

callUDF has been deprecated and will be removed in Spark 2.0, so we should replace its uses in ML with udf.
As an initial attempt, I tried wrapping createTransformFunc directly in udf (as illustrated by the following code snippet), but that hits a TypeTag ... NotSerializableException bug that exists in Scala 2.10.

abstract class UnaryTransformer[IN: TypeTag, OUT: TypeTag, T <: UnaryTransformer[IN, OUT, T]]
  extends Transformer with HasInputCol with HasOutputCol with Logging {
  ......
  override def transform(dataset: DataFrame): DataFrame = {
    transformSchema(dataset.schema, logging = true)
    val transformFunc = udf { input: IN => this.createTransformFunc(input) }
    dataset.withColumn($(outputCol), transformFunc(col($(inputCol))))
  }
  ......
}
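
For readers unfamiliar with the udf style, here is a minimal sketch of it on a concrete (non-generic) function, where the TypeTag problem above does not come up. The object UdfSketch, the helper withUpper, and the upper-casing function are hypothetical names chosen for illustration and are not part of this patch; only udf, col, and withColumn are Spark APIs.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical example, not code from this PR.
object UdfSketch {
  // Appends an upper-cased copy of a string column to the dataset.
  def withUpper(dataset: DataFrame, inputCol: String, outputCol: String): DataFrame = {
    // udf derives the result type from the function's TypeTags, so no
    // explicit DataType argument is passed here.
    // The null check keeps null inputs from throwing inside the function.
    val toUpper = udf { s: String => if (s == null) null else s.toUpperCase }
    dataset.withColumn(outputCol, toUpper(col(inputCol)))
  }
}
```

Because the types here are concrete, the compiler materializes the TypeTags at the call site. In UnaryTransformer they come from the IN and OUT context bounds instead, which is where the serialization issue described above appears.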

@SparkQA

SparkQA commented Jan 1, 2016


Test build #48565 has finished for PR 10544 at commit b4c4329.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin

rxin commented Jan 2, 2016

Contributor

This works too. I have a separate pull request that adds a new API for this: #10547

@rxin

rxin commented Jan 2, 2016

Contributor

BTW is transformFunc a public API that custom transformers are supposed to implement? If it is, this is technically an API breaking change you are making.

@yanboliang

Contributor Author

@rxin transformFunc is not a public API, but I think your PR is more concise and I will close my PR.

@yanboliang yanboliang closed this Jan 2, 2016
@yanboliang yanboliang deleted the spark-12597 branch January 2, 2016 09:43