-
This is one of the most critical structures in the whole kernel - it contains all information about a running task. We already briefly touched `task_struct` in lessons 2 and we even have implemented our own `task_struct` for the RPi OS, so I assume that by this time you should already have a basic understanding how it is used. Now I want to highlit a few important fields of this struct that are relevant to our discussion.
+
This is one of the most critical structures in the whole kernel - it contains all the information about a running task. We already briefly touched on `task_struct` in lesson 2, and we have even implemented our own `task_struct` for the RPi OS, so I assume that by now you have a basic understanding of how it is used. Now I want to highlight a few important fields of this struct that are relevant to our discussion.
-
*[thread_info](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L525) This is the first field of the `task_struct` and it contains all fields that must be accessed by the low-level architecture assembler code. We have already seen how this happens in lesson 2 and will encounter a few other examples later. [thread_info](https://github.com/torvalds/linux/blob/v4.14/arch/arm64/include/asm/thread_info.h#L39) is architecture specific. In `arm64` case it is a simple structure with a few fields.
+
* [thread_info](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L525) This is the first field of the `task_struct`, and it contains all the fields that must be accessed by the low-level architecture code. We have already seen how this happens in lesson 2 and will encounter a few other examples later. [thread_info](https://github.com/torvalds/linux/blob/v4.14/arch/arm64/include/asm/thread_info.h#L39) is architecture specific. In the `arm64` case, it is a simple structure with a few fields.
```
struct thread_info {
unsigned long flags; /* low level flags */
@@ -19,13 +19,13 @@ This is one of the most critical structures in the whole kernel - it contains al
```
The `flags` field is used very frequently - it contains information about the current task state (whether it is being traced, whether a signal is pending, etc.). All possible flag values can be found [here](https://github.com/torvalds/linux/blob/v4.14/arch/arm64/include/asm/thread_info.h#L79).
* [state](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L528) The current state of the task (whether it is currently running, waiting for an interrupt, exited, etc.). All possible task states are described [here](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L69).
-
*[stack](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L536) When working on RPi OS we have seen that `task_struct` is always kept at the bottom of the task stack, so we can use a pointer to `task_struct` as a pointer to the stack. Kernel stacks have constant size, so finding stack end is also an easy task. I think that the same approach was used in the early versions of the Linux kernel, right now, after the introduction of the [vitually mapped stacks](https://lwn.net/Articles/692208/), `stack` field is used to store a pointer to the kernel stack.
+
* [stack](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L536) When working on the RPi OS, we saw that `task_struct` is always kept at the bottom of the task stack, so we can use a pointer to `task_struct` as a pointer to the stack. Kernel stacks have a constant size, so finding the end of the stack is also easy. I think that the same approach was used in the early versions of the Linux kernel, but now, after the introduction of [virtually mapped stacks](https://lwn.net/Articles/692208/), the `stack` field is used to store a pointer to the kernel stack.
* [thread](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L1108) Another important architecture-specific structure is [thread_struct](https://github.com/torvalds/linux/blob/v4.14/arch/arm64/include/asm/processor.h#L81). It contains all the information (such as [cpu_context](https://github.com/torvalds/linux/blob/v4.14/arch/arm64/include/asm/processor.h#L65)) that is used during a context switch. In fact, the RPi OS implements its own `cpu_context` that is used in exactly the same way as the original one.
* [sched_class and sched_entity](https://github.com/torvalds/linux/blob/v4.14/include/linux/sched.h#L562-L563) These fields are used by the scheduling algorithm - more on them follows.
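
To make the role of the `flags` word in `thread_info` more concrete, here is a minimal sketch of how individual bits of such a word can be set, cleared and tested. Everything prefixed with `demo_` or `DEMO_` is a hypothetical name invented for this example, and the bit numbers are illustrative only; the real `TIF_*` definitions and helper functions live in the kernel sources linked above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative bit numbers only, not copied from the kernel headers. */
enum {
    DEMO_TIF_SIGPENDING   = 0, /* a signal is waiting for the task */
    DEMO_TIF_NEED_RESCHED = 1, /* the task should be rescheduled */
};

/* Cut-down stand-in for thread_info: only the flags word is kept. */
struct demo_thread_info {
    unsigned long flags;
};

/* Build a one-bit mask, checking that the bit fits into the flags word. */
static unsigned long demo_mask(int bit)
{
    assert(bit >= 0 && (size_t)bit < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << bit;
}

static void demo_set_flag(struct demo_thread_info *ti, int bit)
{
    ti->flags |= demo_mask(bit);
}

static void demo_clear_flag(struct demo_thread_info *ti, int bit)
{
    ti->flags &= ~demo_mask(bit);
}

static bool demo_test_flag(const struct demo_thread_info *ti, int bit)
{
    return (ti->flags & demo_mask(bit)) != 0;
}
```

Packing many yes/no properties into a single word keeps the structure small and lets low-level code check several conditions with one load and a mask.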
### Scheduler class
-
In Linux, there is an extendable mechanism that allows each task to use its own scheduling algorithm. This mechanism uses a structure [sched_class](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L1400). You can think about this structure as an interface that defines all methods that a schedules class have to implement. Let's see what kind of methods are defined in the `sched_class` interface. (Not all of the methods are shown, but only those wichs I consider the most important for us)
+
In Linux, there is an extensible mechanism that allows each task to use its own scheduling algorithm. This mechanism uses the [sched_class](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L1400) structure. You can think of this structure as an interface that defines all the methods that a scheduler class has to implement. Let's see what kind of methods are defined in the `sched_class` interface. (Not all of the methods are shown, only those that I consider the most important for us.)
* [enqueue_task](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L1403) is executed each time a new task is added to a scheduler class.
* [dequeue_task](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L1404) is called when a task can be removed from the scheduler.
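
The "interface" idea behind `sched_class` can be sketched in plain C as a table of function pointers. This is a simplified illustration, not the kernel's definition: the `demo_` types, the fixed-size array and the FIFO behaviour are assumptions made for the example, and the method signatures are cut down compared to the real ones.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_TASKS 8

struct demo_task {
    int id;
};

/* Hypothetical run queue: just an array of task pointers. */
struct demo_rq {
    struct demo_task *tasks[DEMO_MAX_TASKS];
    size_t nr_running;
};

/* Cut-down analogue of sched_class: each scheduling policy fills in
 * these pointers with its own implementation. */
struct demo_sched_class {
    void (*enqueue_task)(struct demo_rq *rq, struct demo_task *p);
    void (*dequeue_task)(struct demo_rq *rq, struct demo_task *p);
};

/* FIFO policy: new tasks are appended at the end of the array. */
static void demo_fifo_enqueue(struct demo_rq *rq, struct demo_task *p)
{
    assert(rq->nr_running < DEMO_MAX_TASKS); /* demo only: no overflow handling */
    rq->tasks[rq->nr_running++] = p;
}

/* Remove the task and shift the tail left to close the gap. */
static void demo_fifo_dequeue(struct demo_rq *rq, struct demo_task *p)
{
    for (size_t i = 0; i < rq->nr_running; i++) {
        if (rq->tasks[i] != p)
            continue;
        for (size_t j = i + 1; j < rq->nr_running; j++)
            rq->tasks[j - 1] = rq->tasks[j];
        rq->nr_running--;
        return;
    }
}

static const struct demo_sched_class demo_fifo_class = {
    .enqueue_task = demo_fifo_enqueue,
    .dequeue_task = demo_fifo_dequeue,
};
```

Code that manages tasks only calls through the table, for example `demo_fifo_class.enqueue_task(&rq, &task)`, so a different policy can be plugged in by providing another `demo_sched_class` instance.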
@@ -43,9 +43,9 @@ The principals behind CFS algorithm are very simple:
The Linux scheduler uses another important data structure called a "runqueue", which is described by the [rq](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L667) struct. There is a single instance of a runqueue per CPU. When a new task needs to be selected for execution, the selection is made only from the local runqueue. If needed, however, tasks can be balanced between different `rq` structures.
-
Runqueues are used by all scheduler classes, not only by CFS. All CFS specific information is kept in [cfs_rq](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L420) struct, wich is embedded in the `rq` struct. One important field of the `cfs_rq` struct is called [min_vruntime](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L425) - this is the lowest `vruntime` from all tasks, assigned to a runqueue. `min_vruntime` is assigned to a newly forked task - this ensures that the task will be selected next because CFS always ensures that a task with the smallest `vruntime` is selected. This also ensures that the new task will not be running for an unreasonably long time before it will be preempted.
+
Runqueues are used by all scheduler classes, not only by CFS. All CFS-specific information is kept in the [cfs_rq](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L420) struct, which is embedded in the `rq` struct. One important field of the `cfs_rq` struct is called [min_vruntime](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L425) - this is the lowest `vruntime` among all tasks assigned to a runqueue. `min_vruntime` is assigned to a newly forked task - this ensures that the task will be selected next, because CFS always ensures that the task with the smallest `vruntime` is picked. This approach also ensures that the new task will not run for an unreasonably long time before it is preempted.
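
The relationship between `min_vruntime` and a newly forked task can be sketched as follows. This is a deliberately simplified model that follows the description above; the `demo_` types and functions are hypothetical, the queued tasks are kept in a plain array, and the kernel's actual bookkeeping is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scheduling entity: only vruntime is kept. */
struct demo_entity {
    uint64_t vruntime;
};

/* Hypothetical stand-in for cfs_rq: only min_vruntime is kept. */
struct demo_cfs_rq {
    uint64_t min_vruntime;
};

/* Recompute min_vruntime as the smallest vruntime among queued entities. */
static void demo_update_min(struct demo_cfs_rq *cfs_rq,
                            const struct demo_entity *queued, size_t n)
{
    assert(n > 0);
    uint64_t min = queued[0].vruntime;
    for (size_t i = 1; i < n; i++) {
        if (queued[i].vruntime < min)
            min = queued[i].vruntime;
    }
    cfs_rq->min_vruntime = min;
}

/* A new entity starts at the current minimum, so it is neither far behind
 * nor far ahead of the tasks that are already queued. */
static void demo_place_new(const struct demo_cfs_rq *cfs_rq,
                           struct demo_entity *se)
{
    se->vruntime = cfs_rq->min_vruntime;
}
```

Starting the new task at zero instead would let it monopolize the CPU until its `vruntime` caught up with everyone else's, which is exactly what assigning `min_vruntime` avoids.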
-
All tasks, assigned to a particular runqueue and tracked by CFS are kept in [tasks_timeline](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L430) field of the `cfs_rq` struct. `tasks_timeline` represents a [Red–black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree), which can be used to pick all tasks sorted by there`vruntime` value. Red-black trees have an important property: all operations on it (search, insert, delete) can be done in [O(log n)](https://en.wikipedia.org/wiki/Big_O_notation) time. This means that even if we have thousands of concurrent tasks in the system all scheduler methods still executes very quickly. Another important property of a red-black tree is that for any node in the tree its right child will always have larger `vruntime` value than the parent, and left child's `vruntime` will be always less or equal then the parent's `vruntime`. This has an important implication: the leftmost node is always the one with the smallest `vruntime`.
+
All tasks assigned to a particular runqueue and tracked by CFS are kept in the [tasks_timeline](https://github.com/torvalds/linux/blob/v4.14/kernel/sched/sched.h#L430) field of the `cfs_rq` struct. `tasks_timeline` represents a [Red–black tree](https://en.wikipedia.org/wiki/Red%E2%80%93black_tree), which can be used to pick tasks ordered by their `vruntime` value. Red-black trees have an important property: all operations on them (search, insert, delete) can be done in [O(log n)](https://en.wikipedia.org/wiki/Big_O_notation) time. This means that even if we have thousands of concurrent tasks in the system, all scheduler methods still execute very quickly. Another important property of a red-black tree is that for any node in the tree, its right child will always have a larger `vruntime` value than the parent, and the left child's `vruntime` will always be less than or equal to the parent's `vruntime`. This has an important implication: the leftmost node is always the one with the smallest `vruntime`.
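
The "leftmost node" implication can be illustrated with a small sketch. Note that this is a plain binary search tree without the red-black rebalancing, so it only demonstrates the ordering property, not the O(log n) guarantee; the `demo_` names are hypothetical and are not the kernel's rbtree API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tree node keyed by vruntime. */
struct demo_node {
    uint64_t vruntime;
    struct demo_node *left;
    struct demo_node *right;
};

/* Insert a node: keys less than or equal to the parent go left,
 * larger keys go right. Returns the (possibly new) root. */
static struct demo_node *demo_insert(struct demo_node *root,
                                     struct demo_node *n)
{
    assert(n != NULL);
    n->left = NULL;
    n->right = NULL;

    struct demo_node **link = &root;
    while (*link != NULL) {
        if (n->vruntime <= (*link)->vruntime)
            link = &(*link)->left;
        else
            link = &(*link)->right;
    }
    *link = n;
    return root;
}

/* Follow left children until there are none: this node holds
 * the smallest vruntime in the tree. */
static struct demo_node *demo_leftmost(struct demo_node *root)
{
    if (root == NULL)
        return NULL;
    while (root->left != NULL)
        root = root->left;
    return root;
}
```

With this ordering, picking the next task to run never requires scanning the whole tree: walking down the left edge is enough.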