From fe419439e4197341d3f81f01cf645197042e0264 Mon Sep 17 00:00:00 2001
From: Hyukjin Kwon
Date: Sun, 27 Jan 2019 13:14:04 +0800
Subject: [PATCH 1/4] Add a note that 'spark.executor.pyspark.memory' is dependent on 'resource'

---
 docs/configuration.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/configuration.md b/docs/configuration.md
index 7d3bbf93ae96..cb830af5e4db 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -192,6 +192,7 @@ of the most common options to set are:
  is added to executor resource requests.
  NOTE: Python memory usage may not be limited on platforms that do not support resource limiting, such as Windows.
+ NOTE: Python memory usage is dependent on Python's 'resource' module; therefore, the behaviors and limitations are inherited.

From 88e2aa10b4be9523dc54fa3b554b4b2ad0a1f8c3 Mon Sep 17 00:00:00 2001
From: Hyukjin Kwon
Date: Sun, 27 Jan 2019 13:30:36 +0800
Subject: [PATCH 2/4] Make the note style consistent

---
 docs/configuration.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index cb830af5e4db..d32f858c9bb1 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -190,9 +190,10 @@ of the most common options to set are:
  and it is up to the application to avoid exceeding the overhead memory space
  shared with other non-JVM processes. When PySpark is run in YARN or Kubernetes, this memory
  is added to executor resource requests.
-
- NOTE: Python memory usage may not be limited on platforms that do not support resource limiting, such as Windows.
- NOTE: Python memory usage is dependent on Python's 'resource' module; therefore, the behaviors and limitations are inherited.
+
+ Note: Python memory usage may not be limited on platforms that do not support resource limiting, such as Windows.
+
+ Note: Python memory usage is dependent on Python's 'resource' module; therefore, the behaviors and limitations are inherited.
@@ -224,7 +225,8 @@ of the most common options to set are:
  stored on disk. This should be on a fast, local disk in your system. It can also be a
  comma-separated list of multiple directories on different disks.
- NOTE: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone), MESOS_SANDBOX (Mesos) or
+
+ Note: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone), MESOS_SANDBOX (Mesos) or
  LOCAL_DIRS (YARN) environment variables set by the cluster manager.

From b24a4ac40d5ebfe0d6f1ac9a2e2880a7b45b6e06 Mon Sep 17 00:00:00 2001
From: Hyukjin Kwon
Date: Mon, 28 Jan 2019 11:14:09 +0800
Subject: [PATCH 3/4] Address comments

---
 docs/configuration.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index d32f858c9bb1..47e388e8caeb 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -191,9 +191,7 @@ of the most common options to set are:
  shared with other non-JVM processes. When PySpark is run in YARN or Kubernetes, this memory
  is added to executor resource requests.
- Note: Python memory usage may not be limited on platforms that do not support resource limiting, such as Windows.
-
- Note: Python memory usage is dependent on Python's 'resource' module; therefore, the behaviors and limitations are inherited.
+ Note: This feature is dependent on Python's `resource` module; therefore, the behaviors and limitations are inherited.
@@ -226,7 +224,7 @@ of the most common options to set are:
  comma-separated list of multiple directories on different disks.
- Note: In Spark 1.0 and later this will be overridden by SPARK_LOCAL_DIRS (Standalone), MESOS_SANDBOX (Mesos) or
+ Note: This will be overridden by SPARK_LOCAL_DIRS (Standalone), MESOS_SANDBOX (Mesos) or
  LOCAL_DIRS (YARN) environment variables set by the cluster manager.

From 7fa33ed39053a98d36502cf8f9237004ee968c69 Mon Sep 17 00:00:00 2001
From: Hyukjin Kwon
Date: Thu, 31 Jan 2019 14:29:21 +0800
Subject: [PATCH 4/4] Address comment

---
 docs/configuration.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/configuration.md b/docs/configuration.md
index 47e388e8caeb..0b35cfeafcfb 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -191,7 +191,9 @@ of the most common options to set are:
  shared with other non-JVM processes. When PySpark is run in YARN or Kubernetes, this memory
  is added to executor resource requests.
- Note: This feature is dependent on Python's `resource` module; therefore, the behaviors and limitations are inherited.
+ Note: This feature is dependent on Python's `resource` module; therefore, the behaviors and
+ limitations are inherited. For instance, Windows does not support resource limiting and actual
+ resource is not limited on MacOS.
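
For readers unfamiliar with the `resource` module that the note refers to, the following is a minimal sketch of how a Python process might cap its own memory with it. It is not the PySpark worker code. The helper names (`mb_to_bytes`, `needs_lowering`, `apply_memory_limit`) are hypothetical, and the choice of `RLIMIT_AS` (address space) as the limited resource is an assumption made for illustration.

```python
try:
    import resource  # Unix-only standard library module; missing on Windows
except ImportError:
    resource = None


def mb_to_bytes(limit_mb):
    """Convert a limit given in MiB to bytes."""
    return limit_mb * 1024 * 1024


def needs_lowering(current_soft, new_limit, infinity):
    """Return True when new_limit is tighter than the current soft limit.

    `infinity` is the platform's "unlimited" marker (resource.RLIM_INFINITY).
    """
    return current_soft == infinity or new_limit < current_soft


def apply_memory_limit(limit_mb):
    """Try to cap this process's address space (hypothetical helper).

    Returns True if a new soft limit was set, False otherwise.
    """
    if resource is None:
        # No `resource` module (for example on Windows): nothing can be limited.
        return False
    new_limit = mb_to_bytes(limit_mb)
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if not needs_lowering(soft, new_limit, resource.RLIM_INFINITY):
        return False
    try:
        # Only the soft limit is changed here; the hard limit is left as-is.
        resource.setrlimit(resource.RLIMIT_AS, (new_limit, hard))
    except (ValueError, OSError):
        # The platform rejected the request (e.g. it exceeds the hard limit).
        return False
    return True
```

Even when `apply_memory_limit` returns True, whether the limit is enforced depends on the operating system, which is the caveat the note above describes for macOS.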