From 40ff67a835fa6304cf3fe72e642e4ac6a3af76a8 Mon Sep 17 00:00:00 2001
From: Yik San Chan
Date: Wed, 28 Apr 2021 08:13:45 +0800
Subject: [PATCH 1/7] [hotfix][python][docs] add bundling udfs section

---
 docs/content/docs/dev/python/table/udfs/python_udfs.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/content/docs/dev/python/table/udfs/python_udfs.md b/docs/content/docs/dev/python/table/udfs/python_udfs.md
index 55d87d5cdc683..06770066d0a93 100644
--- a/docs/content/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content/docs/dev/python/table/udfs/python_udfs.md
@@ -30,6 +30,10 @@ User-defined functions are important features, because they significantly extend
 
 **NOTE:** Python UDF execution requires Python version (3.6, 3.7 or 3.8) with PyFlink installed. It's required on both the client side and the cluster side.
 
+## Bundling UDFs
+
+**NOTE:** To run Python UDFs (as well as Pandas UDF) in any non-local mode, it is strongly recommended to bundle your UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define UDFs in a file called `my_udf.py`.
+
 ## Scalar Functions
 
 It supports to use Python scalar functions in Python Table API programs. In order to define a Python scalar function,

From 1a991b8a89c776052ffdf0cd683b6454900b0b1c Mon Sep 17 00:00:00 2001
From: yiksanchan
Date: Tue, 27 Apr 2021 19:16:05 -0700
Subject: [PATCH 2/7] Take suggestion

Co-authored-by: Dian Fu
---
 docs/content/docs/dev/python/table/udfs/python_udfs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/content/docs/dev/python/table/udfs/python_udfs.md b/docs/content/docs/dev/python/table/udfs/python_udfs.md
index 06770066d0a93..ee016092eb8aa 100644
--- a/docs/content/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content/docs/dev/python/table/udfs/python_udfs.md
@@ -32,7 +32,7 @@ User-defined functions are important features, because they significantly extend
 
 ## Bundling UDFs
 
-**NOTE:** To run Python UDFs (as well as Pandas UDF) in any non-local mode, it is strongly recommended to bundle your UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define UDFs in a file called `my_udf.py`.
+To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
 
 ## Scalar Functions
 

From 27162f802d5439fc9507beb719f4076661d441a7 Mon Sep 17 00:00:00 2001
From: Yik San Chan
Date: Wed, 28 Apr 2021 10:17:00 +0800
Subject: [PATCH 3/7] per suggestion

---
 docs/content/docs/dev/python/table/udfs/python_udfs.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/content/docs/dev/python/table/udfs/python_udfs.md b/docs/content/docs/dev/python/table/udfs/python_udfs.md
index ee016092eb8aa..f4782a7eaaa4b 100644
--- a/docs/content/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content/docs/dev/python/table/udfs/python_udfs.md
@@ -30,10 +30,6 @@ User-defined functions are important features, because they significantly extend
 
 **NOTE:** Python UDF execution requires Python version (3.6, 3.7 or 3.8) with PyFlink installed. It's required on both the client side and the cluster side.
 
-## Bundling UDFs
-
-To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
-
 ## Scalar Functions
 
 It supports to use Python scalar functions in Python Table API programs. In order to define a Python scalar function,
@@ -557,3 +553,7 @@ class ListViewConcatTableAggregateFunction(TableAggregateFunction):
     def get_result_type(self):
         return DataTypes.ROW([DataTypes.FIELD("a", DataTypes.STRING())])
 ```
+
+## Bundling UDFs
+
+To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.

From 57608d10dc311491f674c9e0c1f350d127b49c4c Mon Sep 17 00:00:00 2001
From: Yik San Chan
Date: Wed, 28 Apr 2021 10:17:42 +0800
Subject: [PATCH 4/7] newline

---
 docs/content/docs/dev/python/table/udfs/python_udfs.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/content/docs/dev/python/table/udfs/python_udfs.md b/docs/content/docs/dev/python/table/udfs/python_udfs.md
index f4782a7eaaa4b..e6168b89378b3 100644
--- a/docs/content/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content/docs/dev/python/table/udfs/python_udfs.md
@@ -556,4 +556,5 @@ class ListViewConcatTableAggregateFunction(TableAggregateFunction):
 
 ## Bundling UDFs
 
-To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined. Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
+To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined.
+Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.

From 86c54e97d3657b5e572988c8987979f39983590d Mon Sep 17 00:00:00 2001
From: Yik San Chan
Date: Wed, 28 Apr 2021 10:29:45 +0800
Subject: [PATCH 5/7] add zh docs

---
 docs/content.zh/docs/dev/python/table/udfs/python_udfs.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
index c4d5aa11ae835..71d9a09690da0 100644
--- a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
@@ -554,3 +554,11 @@ class ListViewConcatTableAggregateFunction(TableAggregateFunction):
     def get_result_type(self):
         return DataTypes.ROW([DataTypes.FIELD("a", DataTypes.STRING())])
 ```
+
+## 打包 UDFs
+
+To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined.
+Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
+
+如果你在非 local 模式下运行 Python UDFs 和 Pandas UDFs,且 Python UDFs 没有定义在含 `main()` 入口的 Python 主文件中,我们强烈建议你通过 [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files) 配置项将 Python UDF 的定义打包起来。
+否则,如果你将 Python UDFs 定义在名为 `my_udf.py` 的文件中,你可能会遇到 `ModuleNotFoundError: No module named 'my_udf'` 这样的报错。

From 573ea5bcde2249d489c16070d4574ef48faee8dd Mon Sep 17 00:00:00 2001
From: yiksanchan
Date: Tue, 27 Apr 2021 19:58:10 -0700
Subject: [PATCH 6/7] take suggestion

Co-authored-by: Dian Fu
---
 docs/content.zh/docs/dev/python/table/udfs/python_udfs.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
index 71d9a09690da0..cbee67f355965 100644
--- a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
@@ -560,5 +560,5 @@ class ListViewConcatTableAggregateFunction(TableAggregateFunction):
 To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined.
 Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
 
-如果你在非 local 模式下运行 Python UDFs 和 Pandas UDFs,且 Python UDFs 没有定义在含 `main()` 入口的 Python 主文件中,我们强烈建议你通过 [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files) 配置项将 Python UDF 的定义打包起来。
+如果你在非 local 模式下运行 Python UDFs 和 Pandas UDFs,且 Python UDFs 没有定义在含 `main()` 入口的 Python 主文件中,强烈建议你通过 [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files) 配置项指定 Python UDF 的定义。
 否则,如果你将 Python UDFs 定义在名为 `my_udf.py` 的文件中,你可能会遇到 `ModuleNotFoundError: No module named 'my_udf'` 这样的报错。

From 1c2dd4a9e8e93256e53daec1825a8c30a199e0e7 Mon Sep 17 00:00:00 2001
From: Yik San Chan
Date: Wed, 28 Apr 2021 10:59:05 +0800
Subject: [PATCH 7/7] rm

---
 docs/content.zh/docs/dev/python/table/udfs/python_udfs.md | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
index cbee67f355965..432c16fac5cba 100644
--- a/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
+++ b/docs/content.zh/docs/dev/python/table/udfs/python_udfs.md
@@ -557,8 +557,5 @@ class ListViewConcatTableAggregateFunction(TableAggregateFunction):
 
 ## 打包 UDFs
 
-To run Python UDFs (as well as Pandas UDFs) in any non-local mode, it is strongly recommended to bundle your Python UDF definitions using the config option [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files), if your Python UDFs live outside of the file where the `main()` function is defined.
-Otherwise, you may run into `ModuleNotFoundError: No module named 'my_udf'` if you define Python UDFs in a file called `my_udf.py`.
-
 如果你在非 local 模式下运行 Python UDFs 和 Pandas UDFs,且 Python UDFs 没有定义在含 `main()` 入口的 Python 主文件中,强烈建议你通过 [`python-files`]({{< ref "docs/dev/python/python_config" >}}#python-files) 配置项指定 Python UDF 的定义。
 否则,如果你将 Python UDFs 定义在名为 `my_udf.py` 的文件中,你可能会遇到 `ModuleNotFoundError: No module named 'my_udf'` 这样的报错。
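
Note on the failure mode the new section describes: the `ModuleNotFoundError` appears to be an ordinary import-path problem. The sketch below does not use PyFlink at all. It reproduces the same error with the standard library only, assuming a hypothetical `my_udf.py` that defines a hypothetical `add_one` function, and then shows the import succeeding once the file's directory is on `sys.path`. That last step is only an analogy for what shipping the file via `python-files` is meant to achieve on the worker side, not a description of PyFlink's internals.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(module_name):
    """Return the imported module, or the ModuleNotFoundError that was raised."""
    importlib.invalidate_caches()
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        return exc


# Hypothetical UDF module that lives outside the main script's directory.
udf_dir = Path(tempfile.mkdtemp())
(udf_dir / "my_udf.py").write_text("def add_one(x):\n    return x + 1\n")

# The directory is not on the import path yet, so the import fails in the
# same way it would on a worker that never received my_udf.py.
before = try_import("my_udf")

# Adding the directory to sys.path stands in for bundling the file with the
# job (an analogy only; this is not how PyFlink itself is implemented).
sys.path.insert(0, str(udf_dir))
after = try_import("my_udf")

print(type(before).__name__, "->", before)
print(after.add_one(41))
```

In a real job the equivalent step would be to list `my_udf.py` under the `python-files` option linked in the patch, rather than editing `sys.path` by hand.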