0.9.3

@mholt mholt released this 28 Sep 19:00
· 2509 commits to master since this release
v0.9.3
c885edd

This release contains bug fixes, including patches and more tests for bugs introduced in 0.9.2.

If you rely on proxy for load balancing with failover, pay attention to a few changes. We've made improvements that will help debug and eliminate sporadic, long-lasting 502 errors, but doing so required changing how the failure logic works.

Summarized change list:

  • Updated QUIC to newer version
  • import: Glob pattern matching 0 files is no longer an error
  • fastcgi: Fixed persistent connections (disabled by default)
  • fastcgi: Configurable connection pool size parameter
  • proxy: Improved failover load balancing logic
  • proxy: No longer duplicates header fields in ways that could be confusing
  • proxy: New try_duration and try_interval parameters
  • proxy: Fix for IP hash policy when downed hosts come back up
  • Several other bug fixes and new tests

Changes specific to proxy (see PR #1135):

  • fail_timeout now defaults to 0. This means that requests which fail will not count against that host's availability. With a value > 0, request failure counting is enabled, and proxy will remember a failed request for this long. If the number of remembered failures accumulates to max_fails, the backend will be considered down (for everyone) until the failed requests begin to be forgotten.
  • max_fails defaults to 1 as before, but cannot be set to 0. If your network is flaky (almost all are), try a more reasonable value like 5. Remember, once the number of failed requests to a backend reaches this number within the window of fail_timeout, the host will be considered down for all clients until the window shifts ahead.
  • try_duration is a new parameter that specifies how long proxy will check for available hosts. So if a host becomes available within this duration, the request may still succeed. The default is 0, meaning that proxy will not retry when a host initially goes down or no hosts are available. You must set this to a reasonable value > 0 (e.g. 30s) if you want robust redundancy.
  • try_interval specifies how long to wait between attempts to reach an upstream host. This defaults to 250ms. The idea is to avoid spinning the CPU, so if you set this to 0 along with a non-zero try_duration, your CPU may spin until hosts become available again.

Basically: If you want to have proper, redundant load balancing, you must set fail_timeout and try_duration to durations > 0.
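
As a rough illustration of the settings described above, a Caddyfile along these lines would turn on both failure counting and retries. This is only a sketch: the site address and the backend hostnames are placeholders, and the durations are example values rather than recommendations. Tune them for your own network.

```
# Hypothetical site and backends, for illustration only
example.com

proxy / backend1.example.com:8080 backend2.example.com:8080 {
    # Remember each failed request for 10s (default is 0, which disables counting)
    fail_timeout 10s

    # Mark a backend as down once 5 failures are remembered within fail_timeout
    max_fails 5

    # Keep looking for an available backend for up to 30s before giving up
    try_duration 30s

    # Wait 250ms between attempts (the default) to avoid spinning the CPU
    try_interval 250ms
}
```

With fail_timeout left at 0, failed requests are never counted against a backend, and with try_duration left at 0, proxy does not retry at all, so both need a value > 0 for failover to behave as described.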

We may continue to tweak this logic in the future to get the best defaults for as many users as possible.

Thank you to all who contributed to this release!