From 5eeb9f09e9cca52ffadfd1f17fe2a8119ced7271 Mon Sep 17 00:00:00 2001 From: Kevin Rutten Date: Thu, 20 Aug 2020 17:25:25 -0700 Subject: [PATCH 1/2] Documentation for pre-start timeout [#174127011](https://www.pivotaltracker.com/story/show/174127011) --- docs/config.md | 6 +++--- docs/runtime.md | 21 ++++++++++++++++++++- 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/docs/config.md b/docs/config.md index fc20848e..a9ccd9c5 100644 --- a/docs/config.md +++ b/docs/config.md @@ -53,9 +53,9 @@ directory of your job. #### `hooks` Schema -| **Property** | **Type** | **Required** | **Description** | -|--------------|----------|--------------|---------------------------------------------------------------------------------------| -| `pre_start` | string | No | The path to an executable to run before starting the main executable of this process. | +| **Property** | **Type** | **Required** | **Description** | +|--------------|----------|--------------|-----------------------------------------------------------------------------------------------------------------------| +| `pre_start` | string | No | The path to an executable to run before starting the main executable of this process. It should complete within 30 seconds. | #### `limits` Schema diff --git a/docs/runtime.md b/docs/runtime.md index 9193dba2..42174785 100644 --- a/docs/runtime.md +++ b/docs/runtime.md @@ -6,7 +6,10 @@ us know so that we can be explicit about the interface and guarantees provided. ## Lifecycle -Your process is started and has an unlimited amount of time to start up. You +Your process is started and has an unlimited amount of time to start up. If +present, a [pre-start script][pre-start] can run before starting to prepare +machine and/or persistent data before your process starting its operation. The +[pre-start script][pre-start] must complete in Monits 30 second timeout. 
You should use a [post-start script][post-start] and a health check if you want your job to only say it has completed deploying after it has started up. You do not need to manage any PID files yourself. @@ -22,6 +25,7 @@ can shutdown within 15 seconds. It is acceptable and supported to terminate your process while running the drain script. However, if you do terminate the process then you should also delete the PID file. +[pre-start]:https://bosh.io/docs/pre-start.html [post-start]:https://bosh.io/docs/post-start.html [drain]:https://bosh.io/docs/drain.html @@ -161,3 +165,18 @@ conditions. It is completely safe (from a correctness perspective, you may still break your service) to run `monit restart` on a job which uses bpm. [monit-mail]: https://lists.nongnu.org/archive/html/monit-general/2012-09/msg00103.html + +### Execution Failed + +In some cases Monit reports the process as execution failed when it is healthy. +This is a [known][execution-failed] race condition that can occur in certain unavoidable circumstances. +This usually occurs if the process start up takes longer then monits timeout. + +bpm provides monit with a successful startup sooner and bpm then monitors the +processes actual startup in its lifecycle avoiding this race condition. + +The [pre-start script][pre-start] runs before the startup and if this script exceeds +the timeout it can hit this race conditon. The pre-start script must complete +quickly. + +[execution-failed]:https://community.pivotal.io/s/article/Deployment-fails-because-monit-reports-job-as-failed?language=en_US From 1e8aed24d8e46b17579bf09063adfa591ccaa25e Mon Sep 17 00:00:00 2001 From: Maya Rosecrance Date: Thu, 24 Sep 2020 11:51:32 -0700 Subject: [PATCH 2/2] Update docs pr - specify the timing difference between bpm's pre-start and any other job's prestart. Only bpm's is timebound to 30 seconds. 
[#174127011](https://www.pivotaltracker.com/story/show/174127011) --- docs/runtime.md | 21 +++++++-------------- 1 file changed, 7 insertions(+), 14 deletions(-) diff --git a/docs/runtime.md b/docs/runtime.md index 42174785..e12029f9 100644 --- a/docs/runtime.md +++ b/docs/runtime.md @@ -7,10 +7,9 @@ us know so that we can be explicit about the interface and guarantees provided. ## Lifecycle Your process is started and has an unlimited amount of time to start up. If -present, a [pre-start script][pre-start] can run before starting to prepare -machine and/or persistent data before your process starting its operation. The -[pre-start script][pre-start] must complete in Monits 30 second timeout. You -should use a [post-start script][post-start] and a health check if you want +present, a bpm [pre-start script][pre-start] can run before your process' pre-start and start. Bpm's +[pre-start script][pre-start] must complete within 30 seconds to avoid a monit timeout. A job's [pre-start][pre-start] +script is not bound by this timeout. You should use a [post-start script][post-start] and a health check if you want your job to only say it has completed deploying after it has started up. You do not need to manage any PID files yourself. @@ -168,15 +167,9 @@ still break your service) to run `monit restart` on a job which uses bpm. ### Execution Failed -In some cases Monit reports the process as execution failed when it is healthy. -This is a [known][execution-failed] race condition that can occur in certain unavoidable circumstances. -This usually occurs if the process start up takes longer then monits timeout. - -bpm provides monit with a successful startup sooner and bpm then monitors the -processes actual startup in its lifecycle avoiding this race condition. - -The [pre-start script][pre-start] runs before the startup and if this script exceeds -the timeout it can hit this race conditon. The pre-start script must complete -quickly. 
+In some cases Monit reports the process as execution failed when the process is actually healthy. +There is a [known][execution-failed] race condition that can occur in certain unavoidable circumstances. +This same failure will also appear if bpm's [pre-start script][pre-start] exceeds monit's timeout when executing. +It is recommended to move time-consuming logic from bpm's pre-start to the job's pre-start script. [execution-failed]:https://community.pivotal.io/s/article/Deployment-fails-because-monit-reports-job-as-failed?language=en_US