Improve usage of available DB-Connections for small amounts of delivery-service-pod-replicas #723

@8R0WNI3

Context / Motivation

As we learned, even small bursts of HTTP requests, such as those typically issued when Delivery-Dashboard starts up, often cause Delivery-Service to return HTTP 503 errors. These errors result from Delivery-Service pods timing out while waiting for a free DB connection. This happens especially when there are only a few delivery-service pod replicas, e.g. just a single one.

We currently limit each Delivery-Service pod to a small number of DB connections: 5 regular (plus an extra 10 for bursts) and 2 low-priority. When there are only a few Delivery-Service pods, this leaves many DB connections unused.
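To make the waste concrete, here is a small arithmetic sketch. The per-pod numbers are the ones stated above. The database-wide limit of 100 connections is purely an assumed figure for illustration and not taken from our setup, and `unused_connections` is a hypothetical helper name.

```python
# Per-pod limits as stated in this issue.
REGULAR = 5
BURST_EXTRA = 10
LOW_PRIO = 2


def unused_connections(db_max_connections: int, replicas: int) -> int:
    """Connections the database could serve but no pod is allowed to open."""
    per_pod_max = REGULAR + BURST_EXTRA + LOW_PRIO  # 17 per pod
    return max(0, db_max_connections - replicas * per_pod_max)


# db_max_connections=100 is an assumed value, for illustration only.
print(unused_connections(100, 1))  # 83
print(unused_connections(100, 5))  # 15
```

Under that assumption, a single replica could never use more than 17 of the 100 connections, even in a burst.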

Improvement Proposal

Make the DB connection-pool limits dynamic: configure the maximum allowed number of DB connections overall, and calculate the allowable number of DB connections per delivery-service replica from it. Pods will need to recalculate their allowed DB pool size, either periodically or upon rescaling. Pods might expose their pool sizes (and ideally also their actual usage) via the k8s API to allow for further optimisations.
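A minimal sketch of the per-replica calculation, assuming an even split of the configured maximum. The function name `pool_size_per_replica` and the example numbers are hypothetical. How a pod learns the current replica count (e.g. via the k8s API) is left out here.

```python
def pool_size_per_replica(max_db_connections: int, replicas: int) -> int:
    """Evenly split the configured connection budget across replicas.

    Floor division is used so that the sum over all replicas can never
    exceed the configured maximum.
    """
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    return max_db_connections // replicas


# A pod would call this periodically or whenever the replica count changes.
for replicas in (1, 3, 6):
    print(replicas, pool_size_per_replica(100, replicas))
```

With an assumed budget of 100 connections, this yields 100, 33 and 16 connections per pod for 1, 3 and 6 replicas respectively.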

To avoid over- or under-provisioning, a protocol like the following might be pursued when scaling up the replica count:

  • target usage: 90% of available DB connections
  • assign the remainder (10%) of connections to the new pod
  • calculate the new per-pod quota
  • request all pods whose connection quota is too high to free connections
  • allow the (new) pod with too few connections to increase its pool size
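The steps above could be sketched roughly as follows. This is one possible reading of the protocol, not a worked-out design: it assumes that existing pods together hold at most the target share of the budget, and that after rebalancing the total stays at the target share so the headroom is available again for the next scale-up. All names (`RebalancePlan`, `plan_scale_up`, the pod names) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RebalancePlan:
    initial_quota_new_pod: int  # what the new pod may use right away
    target_quota: int           # per-pod quota after rebalancing
    shrink: dict[str, int]      # pod -> number of connections to free
    grow: dict[str, int]        # pod -> number of connections it may add


def plan_scale_up(
    quotas: dict[str, int],
    max_db_connections: int,
    new_pod: str,
    target_usage_percent: int = 90,
) -> RebalancePlan:
    """Plan quota changes when one new pod joins the existing ones."""
    # Existing pods share the target portion; the remainder is the headroom.
    reserved = max_db_connections * target_usage_percent // 100
    initial = max_db_connections - reserved

    # The new pod starts with the headroom, then quotas are recalculated.
    all_quotas = {**quotas, new_pod: initial}
    target = reserved // len(all_quotas)

    # Pods above the new quota must free connections first; pods below it
    # may grow only afterwards, so the total never exceeds the maximum.
    shrink = {p: q - target for p, q in all_quotas.items() if q > target}
    grow = {p: target - q for p, q in all_quotas.items() if q < target}
    return RebalancePlan(initial, target, shrink, grow)


plan = plan_scale_up({"pod-a": 45, "pod-b": 45}, 100, "pod-c")
print(plan)
```

In this example, `pod-c` starts with 10 connections, the new per-pod quota is 30, `pod-a` and `pod-b` are each asked to free 15 connections, and `pod-c` may then grow by 20.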

Presumably, this will need an additional controller, depending on how customisable the horizontal-pod-autoscaler is.

Metadata

Assignees

No one assigned

    Labels

    area/ipcei (Important Project of Common European Interest), kind/bugfix (Bug)

    Status

    🛠️ Needs Refinement
