A brief overview of the Linux scheduler: context switches, priorities and scheduling classes, as well as newer features. It also covers the preemption models in Linux and how to use each one. All the examples are taken from http://www.discoversdk.com
3. Processes and Threads
A process is an instance of a running program.
Multiple instances of the same program can run at the same time.
Program code (the “text section”) is shared in memory between them.
Each process has its own data section, address space, open files and signal handlers.
A thread is a single task within a program.
It belongs to a process and shares that process's data section, address space, open files and signal handlers.
It has its own stack, pending signals and state.
Single-threaded programs are commonly referred to simply as processes.
4. The Kernel and Threads
In Linux 2.6, an explicit notion of processes and threads was introduced to the kernel.
Scheduling is done on a thread-by-thread basis.
The basic object the kernel works with is a task, which is analogous to a thread.
5. [Diagram: Process 123 and Process 124, holding five threads between them (labeled Thread 1 to Thread 4). Each process owns its file descriptors, memory and signal handlers. Each thread has its own stack, state, signal mask and priority.]
7. Linux Priorities
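Real-time priority runs from 0 to 99; a higher value means a higher priority.
Non-real-time processes (real-time priority 0): SCHED_OTHER, SCHED_BATCH, SCHED_IDLE. They are ranked by nice level, from 19 (lowest priority) to -20 (highest priority).
Real-time processes: SCHED_FIFO, SCHED_RR, SCHED_DEADLINE (3.14).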
8. API
— int sched_setscheduler(pid_t pid, int policy, const struct sched_param *param);
— int setpriority(int which, id_t who, int prio);
— int sched_setparam(pid_t pid, const struct sched_param *param);
— int sched_setattr(pid_t pid, struct sched_attr *attr, unsigned int flags);
10. Blocking Threads
— A nonblocking infinite loop in a thread scheduled under the SCHED_FIFO, SCHED_RR, or SCHED_DEADLINE policy will block all lower-priority threads on that CPU forever
— Solution: limit the CPU time available to real-time and deadline processes
— /proc/sys/kernel/sched_rt_period_us: the period that counts as 100% of CPU time, in microseconds (default: 1000000, i.e. 1 second)
— /proc/sys/kernel/sched_rt_runtime_us: how much of each period all real-time and deadline processes on the system may use, in microseconds (default: 950000, i.e. 95% of the period)
11. Preemption
— The Linux kernel is a preemptive operating system
— When a task running in user space is interrupted and the interrupt handler wakes up another task, the woken task can be scheduled as soon as the interrupt handler returns
12. Preemption (continued)
— However, when the interrupt arrives while the task is executing a system call, the system call has to finish before another task can be scheduled.
— By default, the Linux kernel does not do kernel preemption.
— This means the delay before the scheduler can run another task is unbounded
14. CONFIG_PREEMPT_NONE
— Kernel code (interrupts, exceptions, system calls) is never preempted. This is the default behavior in standard kernels.
— Best for systems doing intensive computation, where overall throughput is key.
— Reduces task switching to make the most of the CPU and its caches (fewer context switches).
15. CONFIG_PREEMPT_VOLUNTARY
— Kernel code can preempt itself, but only at points it chooses
— Typically for desktop systems, for quicker application reaction to user input.
— Adds explicit rescheduling points throughout kernel code.
— Minor impact on throughput.
— Used in: Ubuntu Desktop 15.04, Ubuntu Server 14.04
— The rescheduling points are added by calling cond_resched()
16. CONFIG_PREEMPT
— Most kernel code can be involuntarily preempted at any time. When a process becomes runnable, the scheduler no longer has to wait for kernel code (typically a system call) to return before it runs.
— Exception: kernel critical sections (code holding spinlocks). If code holding a spinlock on a uniprocessor system were preempted, the kernel could run another process, which would loop forever if it tried to acquire the same spinlock.
— Typically for desktop or embedded systems with latency requirements in the millisecond range.
17. CONFIG_PREEMPT_RT
— The PREEMPT_RT patch adds a new level of preemption, called CONFIG_PREEMPT_RT_FULL
— This level of preemption replaces kernel spinlocks with mutexes (so-called sleeping spinlocks)
— Instead of providing mutual exclusion by disabling interrupts and preemption, these are ordinary locks: when contention happens, the process blocks and the scheduler selects another one.
— This works well with threaded interrupts, since threads can block while ordinary interrupt handlers cannot.
— Some carefully controlled core kernel spinlocks remain normal spinlocks.
— With CONFIG_PREEMPT_RT_FULL, virtually all kernel code becomes preemptible
— An interrupt can occur at any time; when the interrupt handler returns, the woken-up process can start immediately.