[SPARK-30260][SQL] Spark-Shell throw ClassNotFoundException exception for more than one statement to use UDF jar #26888

Closed
southernriver wants to merge 1 commit into apache:master from southernriverchen:SPARK-30260

@southernriver

Contributor

What changes were proposed in this pull request?

When we start spark-shell and use a UDF in the first statement, it works. For subsequent statements, however, the jar fails to load onto the current classpath and a ClassNotFoundException is thrown. It seems that the class loader used for the first statement differs from the one used for later statements. In spark-shell, the maintained class loader is always IMainsTranslatingClassLoader, whereas for the addJar operation the current class loader is NonClosableMutuableclassLoader. For the first statement, the jar is loaded into the right class loader. For later statements, the function has already been registered in functionRegistry, so the jar is not reloaded into NonClosableMutuableclassLoader. We need to reset the class loader to the active SparkSession's.

The screenshots below show the difference between the first statement and the second.
The first statement uses NonClosableMutuableclassLoader:
image

The second statement uses IMainsTranslatingClassLoader:
image
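
To make the loader mismatch concrete, here is a minimal sketch that does not use Spark at all. It only illustrates the general JVM behaviour: a class can be resolvable through one loader and not through another, and swapping the thread context class loader changes what context-based lookups can see. `LoaderVisibilityDemo`, `AddUdf`, `canLoad`, `withContextLoader`, and `resolveViaContext` are hypothetical names introduced for this illustration. This is not the patch in this PR, and the empty `URLClassLoader` is only a stand-in for a loader that never had the UDF jar added.

```scala
import java.net.{URL, URLClassLoader}

object LoaderVisibilityDemo {
  // Hypothetical stand-in for a UDF class that would normally live in a user jar.
  class AddUdf

  // True if `loader` can resolve `className`, false on ClassNotFoundException.
  def canLoad(loader: ClassLoader, className: String): Boolean =
    try { Class.forName(className, false, loader); true }
    catch { case _: ClassNotFoundException => false }

  // Runs `body` with `loader` as the thread context class loader,
  // then restores the previous context loader.
  def withContextLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread()
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body finally thread.setContextClassLoader(previous)
  }

  // Resolves a class through whatever the current context class loader is.
  def resolveViaContext(className: String): Boolean =
    canLoad(Thread.currentThread().getContextClassLoader, className)

  def main(args: Array[String]): Unit = {
    val udfName = classOf[AddUdf].getName
    // The loader that defined AddUdf can always see it.
    val definingLoader = classOf[AddUdf].getClassLoader
    // No URLs and a null (bootstrap) parent: application classes are invisible here.
    val isolated = new URLClassLoader(Array.empty[URL], null)

    println(s"defining loader sees class: ${canLoad(definingLoader, udfName)}")
    println(s"isolated loader sees class: ${canLoad(isolated, udfName)}")
    println(s"context = isolated: ${withContextLoader(isolated)(resolveViaContext(udfName))}")
    println(s"context = defining: ${withContextLoader(definingLoader)(resolveViaContext(udfName))}")
  }
}
```

Running `LoaderVisibilityDemo.main(Array.empty)` should show the lookup succeeding through the defining loader and failing through the isolated one, which mirrors the first-statement versus later-statement behaviour described above.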

Why are the changes needed?

The problem can be reproduced as follows.

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
org.apache.spark.sql.AnalysisException: No handler for UDF/UDAF/UDTF 'scala.didi.udf.Add': java.lang.ClassNotFoundException: scala.didi.udf.Add; line 1 pos 8
   at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at org.apache.spark.sql.hive.HiveShim$HiveFunctionWrapper.createFunction(HiveShim.scala:251)
   at org.apache.spark.sql.hive.HiveSimpleUDF.function$lzycompute(hiveUDFs.scala:56)
   at org.apache.spark.sql.hive.HiveSimpleUDF.function(hiveUDFs.scala:56)
   at org.apache.spark.sql.hive.HiveSimpleUDF.method$lzycompute(hiveUDFs.scala:60)
   at org.apache.spark.sql.hive.HiveSimpleUDF.method(hiveUDFs.scala:59)
   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType$lzycompute(hiveUDFs.scala:77)
   at org.apache.spark.sql.hive.HiveSimpleUDF.dataType(hiveUDFs.scala:77)
   at org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$makeFunctionExpression$3.apply(HiveSessionCatalog.scala:79)
   at org.apache.spark.sql.hive.HiveSessionCatalog$$anonfun$makeFunctionExpression$3.apply(HiveSessionCatalog.scala:71)
   at scala.util.Try.getOrElse(Try.scala:79)
   at org.apache.spark.sql.hive.HiveSessionCatalog.makeFunctionExpression(HiveSessionCatalog.scala:71)
   at org.apache.spark.sql.catalyst.catalog.SessionCatalog$$anonfun$org$apache$spark$sql$catalyst$catalog$SessionCatalog$$makeFunctionBuilder$1.apply(SessionCatalog.scala:1133)

After fix:

scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+                                                       
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+
 
scala> val res = spark.sql("select  bigdata_test.Add(1,2)").show()
+----------------------+
|bigdata_test.Add(1, 2)|
+----------------------+
|                     3|
+----------------------+

We should resolve this bug.

Does this PR introduce any user-facing change?

No

How was this patch tested?

manual.

@southernriver

Contributor Author

cc @cloud-fan @maropu @dongjoon-hyun

@southernriver

Contributor Author

@AmplabJenkins

@dongjoon-hyun

Member

ok to test

@maropu

maropu commented Dec 15, 2019

Member

Is this related to #23921? It seems #23921 takes a different approach from this one. cc: @HyukjinKwon

@SparkQA

SparkQA commented Dec 15, 2019


Test build #115349 has finished for PR 26888 at commit 7b24673.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon

Member

Yeah, seems a duplicate.

@HyukjinKwon

Member

@southernriver can you explain why/when the class loaders are changed?

@cloud-fan

Contributor

Should IMainTranslatingClassLoader fall back to the Spark context class loader?
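
One way to read the question above is as a delegating loader that tries a primary loader first and falls back to a second one on failure. The sketch below assumes nothing about the real REPL loader. `FallbackDemo`, `FallbackClassLoader`, and `Marker` are hypothetical names, and plain JDK loaders stand in for the REPL and Spark context loaders.

```scala
import java.net.{URL, URLClassLoader}

object FallbackDemo {
  // Hypothetical stand-in for a class that only the fallback loader can see.
  class Marker

  // Standard parent-first delegation: `primary` is consulted first (as the parent).
  // findClass runs only when the parent chain fails, and then asks `fallback`.
  final class FallbackClassLoader(primary: ClassLoader, fallback: ClassLoader)
      extends ClassLoader(primary) {
    override protected def findClass(name: String): Class[_] =
      fallback.loadClass(name) // throws ClassNotFoundException if fallback fails too
  }

  def main(args: Array[String]): Unit = {
    // Primary loader with no URLs and a bootstrap parent: sees only JDK classes.
    val primary = new URLClassLoader(Array.empty[URL], null)
    // Fallback loader: the one that actually defined Marker.
    val fallback = classOf[Marker].getClassLoader
    val loader = new FallbackClassLoader(primary, fallback)
    println(loader.loadClass(classOf[Marker].getName).getName)
  }
}
```

A class resolved through the fallback is still defined by the fallback's own loader, so this kind of wrapper widens visibility without changing which loader owns the class.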

@HeartSaVioR

Contributor

This duplicates #23921, and IMHO #23921 is the clearer fix. As it is quite old and has no tests, I've taken it over and raised #27025.

@dongjoon-hyun

Member

Based on the discussion above, I'll close this PR and SPARK-30260. Thank you, @southernriver and all!

7 participants