Skip to content

Fix race against systemd#1683

Merged
hqhq merged 1 commit into
opencontainers:masterfrom
vikaschoudhary16:runc-systemd-race
Jan 9, 2018
Merged

Fix race against systemd#1683
hqhq merged 1 commit into
opencontainers:masterfrom
vikaschoudhary16:runc-systemd-race

Conversation

@vikaschoudhary16
Copy link
Copy Markdown
Contributor

  • T0: runc triggers a systemd unit creation asynchronously from here
  • T1: runc then moves ahead and starts creating cgroup paths(.scope directories), here. Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner.
  • T3: systemd execution thread which was invoked at time T0, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the .scope directory which is created at time T1 by runc from here in the code

Fixes openshift/origin#16246

@vikaschoudhary16
Copy link
Copy Markdown
Contributor Author

/cc @mrunalp

@runcom
Copy link
Copy Markdown
Member

runcom commented Jan 8, 2018

Looks ok, /cc @cyphar

@runcom
Copy link
Copy Markdown
Member

runcom commented Jan 8, 2018

You need to sign your commit @vikaschoudhary16

- T0: runc triggers a systemd unit creation asynchronously from [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L298)
- T1: runc then moves ahead and starts creating cgroup paths(.scope directories), [here](https://github.com/opencontainers/runc/blob/master/libcontainer/cgroups/systemd/apply_systemd.go#L348). Kernel creates .scope directory and cgroup.procs file(along with other default files) in the directory automatically, in an atomic manner.
- T3: systemd execution thread which was invoked at time `T0`, is still in the process of unit creation. systemd also trying to create cgroup paths and deletes the `.scope` directory which is created at time `T1` by runc from [here](https://github.com/systemd/systemd/blob/v219/src/shared/cgroup-util.c#L1630) in the code

Signed-off-by: vikaschoudhary16 <choudharyvikas16@gmail.com>
@mrunalp
Copy link
Copy Markdown
Contributor

mrunalp commented Jan 8, 2018

LGTM
We have a similar fix in cri-o for waiting on transient units.

Approved with PullApprove

@hqhq
Copy link
Copy Markdown
Contributor

hqhq commented Jan 9, 2018

LGTM

Approved with PullApprove

@hqhq
Copy link
Copy Markdown
Contributor

hqhq commented Jan 9, 2018

CI issues is addressed in #1682

@hqhq hqhq merged commit 96086e5 into opencontainers:master Jan 9, 2018
vikaschoudhary16 pushed a commit to vikaschoudhary16/kubernetes that referenced this pull request Jan 11, 2018
This fixes a race condition in runc/systemd at container creation time
opencontainers/runc#1683

Signed-off-by: vikaschoudhary16 <vichoudh@redhat.com>
vikaschoudhary16 added a commit to vikaschoudhary16/kubernetes that referenced this pull request Jan 12, 2018
This fixes a race condition in runc/systemd at container creation time
opencontainers/runc#1683

Signed-off-by: vikaschoudhary16 <vichoudh@redhat.com>
openshift-merge-robot added a commit to openshift/origin that referenced this pull request Jan 15, 2018
Automatic merge from submit-queue (batch tested with PRs 18040, 18097, 18098, 18106, 18087).

UPSTREAM: opencontainers/runc: 1683: Fix race against systemd

Thanks to @vikaschoudhary16 for doing all the legwork upstream on this!

Fixes #16246

opencontainers/runc#1683
google/cadvisor#1861
kubernetes/kubernetes#58117

@derekwaynecarr
k8s-github-robot pushed a commit to kubernetes/kubernetes that referenced this pull request Jan 15, 2018
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Bump runc to d5b4a3e

**What this PR does / why we need it**:
This fixes a race condition in runc/systemd at container creation time
opencontainers/runc#1683

**Special notes for your reviewer**:
Depends on google/cadvisor#1861
`pull-kubernetes-verify` is failing because of this dependency only.


**Release note**:

```release-note
None
```
/sig node
/cc @derekwaynecarr @sjenning @aveshagarwal @vishh @dchen1107 @dashpole @jeremyeder
mrunalp added a commit to projectatomic/runc that referenced this pull request Jun 12, 2018
mrunalp added a commit to projectatomic/runc that referenced this pull request Jun 12, 2018
@smileusd
Copy link
Copy Markdown

smileusd commented Oct 18, 2022

@vikaschoudhary16 I still see this issue on runc v1.1.2, containerd 1.6.6, systemd 245
containerd/containerd#7506

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants