Bug description
Deploying DefectDojo to a Kubernetes cluster causes the uWSGI container to consume so much memory that the node kills the pod. The cause is the effectively unbounded file descriptor limit on the node, which uWSGI detects at startup. See unbit/uwsgi#2299 for a description of the issue with uWSGI.
Steps to reproduce
- Deploy the Helm chart to a Kubernetes cluster with nodes running Flatcar Container Linux by Kinvolk 3602.2.1 (Oklo).
- Watch the pod get deployed and, after less than 15 seconds, killed by the node due to OOM.
Expected behavior
Expected the pod to start up and not get OOMKilled by the node.
I built my own container image locally, adding the --max-fd argument to docker/entrypoint-uwsgi.sh, and used that image in my cluster. This resolved the issue.
Deployment method (select with an X)
Environment information
- Kubernetes nodes running:
Kernel Version: 5.15.136-flatcar
OS Image: Flatcar Container Linux by Kinvolk 3602.2.1 (Oklo)
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.21
Kubelet Version: v1.28.3
Kube-Proxy Version: v1.28.3
- DefectDojo version: 2.30.4
Logs
Logs from the defectdojo-django pod
$ k logs defect-dojo-defectdojo-django
Defaulted container "uwsgi" out of: uwsgi, nginx
[13/Feb/2024 08:50:57] INFO [dojo.models:4295] enabling audit logging
/usr/local/lib/python3.11/site-packages/coreapi/codecs/download.py:5: DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
import cgi
System check identified no issues (0 silenced).
*** Starting uWSGI 2.0.23 (64bit) on [Tue Feb 13 08:50:58 2024] ***
compiled with version: 10.2.1 20210110 on 29 January 2024 15:50:06
os: Linux-5.15.136-flatcar #1 SMP Mon Oct 23 16:44:45 -00 2023
nodename: defect-dojo-defectdojo-django
machine: x86_64
clock source: unix
detected number of CPU cores: 4
current working directory: /app
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
*** WARNING: you are running uWSGI without its master process manager ***
your memory page size is 4096 bytes
detected max file descriptor number: 1073741816
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on 0.0.0.0:8081 fd 3
spawned uWSGI http 1 (pid: 13)
uwsgi socket 0 bound to UNIX address /run/defectdojo/uwsgi.sock fd 6
Python version: 3.11.4 (main, Aug 16 2023, 05:31:52) [GCC 10.2.1 20210110]
Python main interpreter initialized at 0x7fb82cac7558
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 405672 bytes (396 KB) for 15 cores
*** Operational MODE: preforking+threaded ***
Note that uWSGI logs "detected max file descriptor number: 1073741816". This very high limit is what causes the container to use a lot of memory.
Running the same deployment locally on my kind cluster, I get:
Defaulted container "uwsgi" out of: uwsgi, nginx
[16/Feb/2024 08:57:04] INFO [dojo.models:4295] enabling audit logging
System check identified no issues (0 silenced).
*** Starting uWSGI 2.0.23 (64bit) on [Fri Feb 16 08:57:05 2024] ***
compiled with version: 11.2.1 20220219 on 05 February 2024 16:57:27
os: Linux-6.5.11-linuxkit #1 SMP PREEMPT Wed Dec 6 17:08:31 UTC 2023
nodename: defect-dojo-defectdojo-django-7774dcb687-gn5wn
machine: aarch64
clock source: unix
detected number of CPU cores: 10
current working directory: /app
detected binary path: /usr/local/bin/uwsgi
!!! no internal routing support, rebuild with pcre support !!!
*** WARNING: you are running uWSGI without its master process manager ***
your memory page size is 4096 bytes
detected max file descriptor number: 1048576
lock engine: pthread robust mutexes
thunder lock: disabled (you can enable it with --thunder-lock)
uWSGI http bound on 0.0.0.0:8081 fd 3
spawned uWSGI http 1 (pid: 17)
uwsgi socket 0 bound to UNIX address /run/defectdojo/uwsgi.sock fd 6
Python version: 3.11.3 (main, May 3 2023, 08:27:37) [GCC 11.2.1 20220219]
Python main interpreter initialized at 0xffffa64d55c0
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 183136 bytes (178 KB) for 4 cores
*** Operational MODE: preforking+threaded ***
[16/Feb/2024 08:57:05] INFO [dojo.models:4295] enabling audit logging
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0xffffa64d55c0 pid: 1 (default app)
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI worker 1 (pid: 1, cores: 2)
spawned uWSGI worker 2 (pid: 18, cores: 2)
Here uWSGI logs "detected max file descriptor number: 1048576". This is much lower and does not result in an OOMKilled event.
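To compare the two environments without reading the uWSGI startup log, the file descriptor limit can be checked directly from a shell inside the container. The following is only a sketch: the helper name fd_limit_exceeds is hypothetical, and the threshold of 1048576 is simply the value seen on the kind cluster above, not a documented safe limit.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given limit is above the threshold.
# "unlimited" is treated as exceeding any threshold.
fd_limit_exceeds() {
  limit="$1"
  threshold="$2"
  if [ "$limit" = "unlimited" ]; then
    return 0
  fi
  [ "$limit" -gt "$threshold" ]
}

# ulimit -n prints the soft limit on open file descriptors for this shell.
current="$(ulimit -n)"
if fd_limit_exceeds "$current" 1048576; then
  echo "file descriptor limit is ${current}; uWSGI may use a lot of memory"
else
  echo "file descriptor limit is ${current}"
fi
```

Run inside the uwsgi container (for example through kubectl exec), this should report a value close to the one uWSGI prints as "detected max file descriptor number".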
Suggestion
Add an option to pass the --max-fd argument with a configurable value in the docker/entrypoint-uwsgi.sh script, so that the limit can be set to a lower value, e.g. 1048576.
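One possible shape for this change is sketched below. It assumes the value comes from an environment variable; the name DD_UWSGI_MAX_FD and the helper max_fd_arg are hypothetical and chosen only for illustration, and the existing uwsgi arguments in the entrypoint are deliberately not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: prints "--max-fd <value>" when DD_UWSGI_MAX_FD (assumed
# variable name) holds a positive integer string, and prints nothing otherwise,
# so the default behaviour is unchanged when the variable is unset.
max_fd_arg() {
  case "${DD_UWSGI_MAX_FD:-}" in
    ''|*[!0-9]*) return 0 ;;  # unset, empty, or not purely digits: add no argument
  esac
  printf '%s %s\n' '--max-fd' "$DD_UWSGI_MAX_FD"
}

# In the entrypoint, the result would be appended to the existing uwsgi call, e.g.:
#   exec uwsgi <existing arguments> $(max_fd_arg)
# The unquoted substitution is intentional so the flag and value split into two words.
```

Leaving the variable unset keeps the current behaviour, so only clusters that hit the high limit would need to set it, for example through the Helm chart's extra environment settings.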