XV6 CPU Scheduling
Why do we need scheduling?
To time-share the CPUs among the processes.
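How does xv6 achieve multiplexing?
Xv6 multiplexes by switching each CPU from one process to another in two situations: 1. The sleep and wakeup mechanism, when a process waits for something (I/O, a child to exit, or the sleep system call). 2. A timer interrupt, which forces a switch away from a process that has been computing for a long period.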
This multiplexing creates an illusion that each process has its own CPU!
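How does the kernel perform a context switch?
Switching from one user process to another takes four steps: a trap from user space into the old process's kernel thread, a context switch to the current CPU's scheduler thread, a context switch to the new process's kernel thread, and a trap return into the new user process.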
How is a context switch implemented?
Preparation:
Xv6 runs the scheduler in a dedicated thread on each CPU, with its own saved registers and stack. A struct (struct context) holds the registers that must be saved for a kernel thread.
Each struct cpu stores the scheduler thread's context (c->context). Each struct proc stores the context of that process's kernel thread (p->context).
A context switch assembly function, swtch(old, new), saves the current registers in old and loads the registers from new.
'ra' is the return address register and 'sp' is the stack pointer. They are followed by the RISC-V callee-saved registers s0 to s11. The last instruction is 'ret': when swtch returns, it returns to the instruction pointed to by the restored ra register, on the stack selected by the restored sp.
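As a rough illustration of the struct described above, the sketch below declares a context with the same fields and checks the byte offsets a RISC-V swtch would use when storing each register. The field layout follows xv6-riscv's struct context; saved_reg_offset is a hypothetical helper added here only to make the layout visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Registers saved across a kernel context switch, modeled on the
// struct context described in these notes (xv6 on RISC-V).
struct context {
  uint64_t ra;  // return address: where swtch's final `ret` jumps to
  uint64_t sp;  // stack pointer: selects the kernel stack to run on

  // callee-saved registers
  uint64_t s0;
  uint64_t s1;
  uint64_t s2;
  uint64_t s3;
  uint64_t s4;
  uint64_t s5;
  uint64_t s6;
  uint64_t s7;
  uint64_t s8;
  uint64_t s9;
  uint64_t s10;
  uint64_t s11;
};

// Byte offset at which register s<n> (n in 0..11) is stored.
// Hypothetical helper for illustration; not part of xv6.
size_t saved_reg_offset(int n) {
  assert(n >= 0 && n <= 11);
  return offsetof(struct context, s0) + (size_t)n * sizeof(uint64_t);
}
```

Only 14 registers are saved because swtch is called like an ordinary C function: the compiler already preserves caller-saved registers around the call, so swtch needs to handle just ra, sp, and the callee-saved set.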
Run time:
Process A calls yield() to give up the CPU.
yield() calls sched(), which calls the context switch function swtch() described above.
swtch saves A's callee-saved registers, together with ra and sp, in A's p->context, which lives in A's struct proc. Anything else A's kernel thread needs is already on A's kernel stack, and A's user registers are in its trapframe. So all of A's state is kept in kernel memory belonging to A.
swtch then restores the scheduler's context and jumps to the point where the scheduler last called swtch. It does not return to A.
The scheduler picks the next RUNNABLE process and runs it.
Eventually the scheduler picks process A again and calls swtch() to resume A exactly where it left off, just after its own call to swtch inside sched().
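The run-time flow can be imitated in user space. The sketch below is not xv6 code: it assumes a POSIX system that still provides getcontext, makecontext, and swapcontext, and uses swapcontext in place of swtch. The names sched_ctx, proc_ctx, yield_cpu, and run_demo are made up for this example.

```c
#define _XOPEN_SOURCE 600
#include <stddef.h>
#include <ucontext.h>

enum { NPROC = 2, STACK_SIZE = 64 * 1024, TRACE_MAX = 32 };

static ucontext_t sched_ctx;            // plays the role of c->context
static ucontext_t proc_ctx[NPROC];      // plays the role of each p->context
static char stacks[NPROC][STACK_SIZE];  // one private stack per "process"
static int done[NPROC];
static int current;                     // index of the running "process"
static char trace[TRACE_MAX];
static int trace_len;

static void record(char c) {
  if (trace_len < TRACE_MAX - 1) trace[trace_len++] = c;
}

// Give up the CPU: save our registers and resume the scheduler where it
// last switched away. Control comes back here when we are rescheduled.
static void yield_cpu(void) {
  swapcontext(&proc_ctx[current], &sched_ctx);
}

static void proc_body(void) {
  int me = current;
  for (int i = 0; i < 2; i++) {
    record((char)('A' + me));
    yield_cpu();
  }
  done[me] = 1;  // returning resumes uc_link, i.e. the scheduler
}

// Runs both "processes" to completion and returns the order they ran in.
const char *run_demo(void) {
  trace_len = 0;
  for (int i = 0; i < NPROC; i++) {
    done[i] = 0;
    getcontext(&proc_ctx[i]);
    proc_ctx[i].uc_stack.ss_sp = stacks[i];
    proc_ctx[i].uc_stack.ss_size = STACK_SIZE;
    proc_ctx[i].uc_link = &sched_ctx;
    makecontext(&proc_ctx[i], proc_body, 0);
  }
  int remaining = NPROC;
  while (remaining > 0) {
    for (int i = 0; i < NPROC; i++) {
      if (done[i]) continue;
      current = i;
      // The scheduler resumes on the next line when the process
      // yields or finishes.
      swapcontext(&sched_ctx, &proc_ctx[i]);
      if (done[i]) remaining--;
    }
  }
  trace[trace_len] = '\0';
  return trace;
}
```

Calling run_demo() returns "ABAB": each process runs until it yields, the scheduler resumes right after its own switch call, and each process later continues from the line after its yield.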
Let's take a closer look at how the scheduler works
Any process that wants to give up the CPU must do the following:
1. Acquire its own process lock (p->lock).
2. Release any other locks it is holding.
3. Update its own state (p->state).
4. Call the C function sched().
yield(), sleep(), and exit() all follow this convention.
Scheduler code
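The scheduler is an infinite for loop that looks for the next RUNNABLE process, marks it RUNNING, sets c->proc, and calls swtch() to run it. Note: when a process later switches back to the scheduler, the scheduler resumes at the line after its swtch() call, where it sets c->proc = 0.
One crucial part is locking. The scheduler acquires p->lock and then switches to the process; the resumed process releases that lock. In the other direction, a process that gives up the CPU acquires its own p->lock before calling sched(), and the scheduler releases it after the switch. This differs from the usual convention, in which the thread that acquires a lock also releases it. It has to be designed this way because p->lock protects state (p->state, p->context, c->proc) that is inconsistent in the middle of swtch, so the lock must stay held across the context switch.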
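The lock handoff is easier to follow in a single-threaded model. In the sketch below, an ordinary function call (run_until_yield, a made-up name) stands in for swtch, and a plain integer flag stands in for p->lock; the asserts in acquire and release check that each side releases the lock the other side took. It is a simplified illustration, not the xv6 scheduler.

```c
#include <assert.h>
#include <stddef.h>

enum procstate { UNUSED, SLEEPING, RUNNABLE, RUNNING };

struct proc {
  int locked;            // stand-in for p->lock (1 = held)
  enum procstate state;
  int pid;
  int runs;              // how many times this process was scheduled
};

struct cpu {
  struct proc *proc;     // process running on this CPU, or NULL
};

void acquire(struct proc *p) { assert(!p->locked); p->locked = 1; }
void release(struct proc *p) { assert(p->locked);  p->locked = 0; }

// Process side: what happens between the scheduler's switch to the
// process and the process's switch back.
static void run_until_yield(struct proc *p) {
  release(p);            // resumed process releases the scheduler's lock
  p->runs++;             // ... the process does its work here ...
  acquire(p);            // yield(): take own lock,
  p->state = RUNNABLE;   // update own state,
                         // then sched() switches back with the lock held
}

// One pass of the scheduler loop over the process table.
// Returns the number of processes that were run.
int scheduler_pass(struct cpu *c, struct proc *table, int n) {
  int ran = 0;
  for (int i = 0; i < n; i++) {
    struct proc *p = &table[i];
    acquire(p);
    if (p->state == RUNNABLE) {
      p->state = RUNNING;
      c->proc = p;
      run_until_yield(p);  // stands in for swtch(&c->context, &p->context)
      c->proc = NULL;      // first thing done after switching back
      ran++;
    }
    release(p);            // scheduler releases the lock the process took
  }
  return ran;
}
```

With a table holding a RUNNABLE, a SLEEPING, and another RUNNABLE process, one pass runs the two RUNNABLE ones, skips the sleeper, and leaves every lock released and c->proc cleared.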
The scheduling code maintains two invariants for each process:
1. If a process is RUNNING, a timer interrupt's yield must be able to switch away from it safely. The CPU registers must hold the process's register values, and c->proc must refer to it.
2. If a process is RUNNABLE, an idle CPU's scheduler must be able to run it safely. Its p->context must hold its registers, no CPU is executing on the process's kernel stack, and no CPU's c->proc refers to it.
Whenever one of these invariants temporarily does not hold, p->lock must be held.
Enabling and disabling interrupts
The return values of cpuid() and mycpu() are fragile: if a timer interrupt causes the thread to yield and later resume on a different CPU, a previously returned value is no longer correct. To avoid this problem, xv6 requires that callers disable interrupts and only re-enable them after they finish using the returned struct cpu.
Similarly, myproc() disables interrupts, invokes mycpu(), fetches the current process pointer (c->proc), and then re-enables interrupts.
The return value of myproc is safe to use even if interrupts are enabled: if a timer interrupt moves the calling process to a different CPU, its struct proc pointer will stay the same.
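The pattern can be sketched with a simulated interrupt flag. The code below assumes a single CPU and models the hardware interrupt-enable bit as a plain variable; push_off and pop_off mirror the xv6 helpers of the same name, which nest so that interrupts are restored only when the outermost caller finishes. The_cpu and interrupts_enabled are names invented for this model.

```c
#include <assert.h>
#include <stddef.h>

struct proc { int pid; };

struct cpu {
  struct proc *proc;  // process running on this CPU, or NULL
  int noff;           // depth of push_off() nesting
  int intena;         // were interrupts on before the first push_off()?
};

static struct cpu the_cpu;          // the single simulated CPU
static int interrupts_enabled = 1;  // stand-in for the hardware enable bit

int intr_get(void) { return interrupts_enabled; }

// Disable interrupts, remembering the previous state on the first call.
void push_off(void) {
  int old = interrupts_enabled;
  interrupts_enabled = 0;
  if (the_cpu.noff == 0) the_cpu.intena = old;
  the_cpu.noff++;
}

// Undo one push_off(); re-enable only at the outermost level, and only
// if interrupts were enabled to begin with.
void pop_off(void) {
  assert(!interrupts_enabled);
  assert(the_cpu.noff >= 1);
  the_cpu.noff--;
  if (the_cpu.noff == 0 && the_cpu.intena) interrupts_enabled = 1;
}

// Only meaningful while interrupts are off in a real kernel.
struct cpu *mycpu(void) { return &the_cpu; }

struct proc *myproc(void) {
  push_off();                // no timer interrupt can move us now
  struct cpu *c = mycpu();
  struct proc *p = c->proc;
  pop_off();
  return p;                  // stays valid even if we migrate later
}
```

Because the calls nest, myproc() can be used both with interrupts enabled and inside a region that has already called push_off(); in the second case interrupts stay off until the outer pop_off().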
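How do processes intentionally interact with each other?
They use sleep and wakeup: sleep(chan, lk) puts the caller to sleep on a wait channel, and wakeup(chan) marks every process sleeping on that channel as RUNNABLE.
How is it used?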
Note: sleep must atomically release s->lock and put the consuming process to sleep. Before it returns to the woken process, sleep acquires s->lock again.
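A user-space analogue of this usage is a counting semaphore built on POSIX threads. In the sketch below, pthread_cond_wait plays the role of sleep (it atomically releases the mutex, blocks, and reacquires the mutex before returning) and pthread_cond_broadcast plays the role of wakeup. This is an analogy under those assumptions, not xv6 code; sem_setup, struct job, and run_producer_consumer are illustrative names.

```c
#include <pthread.h>
#include <stddef.h>

struct semaphore {
  pthread_mutex_t lock;  // plays the role of s->lock
  pthread_cond_t chan;   // plays the role of the sleep channel
  unsigned count;
};

void sem_setup(struct semaphore *s) {
  pthread_mutex_init(&s->lock, NULL);
  pthread_cond_init(&s->chan, NULL);
  s->count = 0;
}

// Producer side: add one unit and wake up any sleepers.
void V(struct semaphore *s) {
  pthread_mutex_lock(&s->lock);
  s->count++;
  pthread_cond_broadcast(&s->chan);  // like wakeup(s)
  pthread_mutex_unlock(&s->lock);
}

// Consumer side: wait until a unit is available, then take it.
void P(struct semaphore *s) {
  pthread_mutex_lock(&s->lock);
  while (s->count == 0)
    pthread_cond_wait(&s->chan, &s->lock);  // like sleep(s, &s->lock)
  s->count--;
  pthread_mutex_unlock(&s->lock);
}

struct job {
  struct semaphore *s;
  int n;
  int consumed;
};

static void *consumer(void *arg) {
  struct job *j = arg;
  for (int i = 0; i < j->n; i++) {
    P(j->s);
    j->consumed++;
  }
  return NULL;
}

// Produces n units while one consumer thread consumes them.
// Returns the number consumed, or -1 on error.
int run_producer_consumer(int n) {
  struct semaphore s;
  sem_setup(&s);
  struct job j = { &s, n, 0 };
  pthread_t t;
  if (pthread_create(&t, NULL, consumer, &j) != 0) return -1;
  for (int i = 0; i < n; i++) V(&s);
  pthread_join(t, NULL);
  return (s.count == 0) ? j.consumed : -1;
}
```

The while loop around the wait matters: the consumer rechecks the condition after every wakeup, because another consumer may have taken the unit first. Checking the count and going to sleep happen under the same lock, so a V that runs in between cannot be missed.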
Sleep and wakeup implementation
A few things:
The caller must hold the condition lock lk when it calls sleep. sleep acquires p->lock before it releases lk, and wakeup needs both locks, so no wakeup is lost.
Before the process goes to sleep, it must hold its own process lock (p->lock), so that it can safely context switch to the scheduler.
The scheduler releases the process lock after the switch.
When the sleeping process is resumed, it releases the process lock, which the scheduler acquired before switching to it.
Before sleep returns, the resumed process reacquires the lk lock, since the caller expects to hold it while it consumes the data without interference.
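The lock ordering in these points can be traced with a small single-threaded model. The sketch below splits sleep into the part before the switch to the scheduler (sleep_begin) and the part after the process is resumed (sleep_end); both names are invented for this model, locks are plain flags, and the scheduler's own acquire and release are left to the caller. It is a simplified illustration of the steps listed above, not the xv6 source.

```c
#include <assert.h>
#include <stddef.h>

enum procstate { SLEEPING, RUNNABLE, RUNNING };

struct lock { int held; };

struct proc {
  struct lock lock;      // stand-in for p->lock
  enum procstate state;
  void *chan;            // channel the process sleeps on, or NULL
};

void acquire(struct lock *l) { assert(!l->held); l->held = 1; }
void release(struct lock *l) { assert(l->held);  l->held = 0; }

// First half of sleep(): everything up to the switch into the scheduler.
// The caller must already hold lk.
void sleep_begin(struct proc *p, void *chan, struct lock *lk) {
  acquire(&p->lock);     // from here on, wakeup() cannot examine p
  release(lk);           // so it is safe to drop the condition lock
  p->chan = chan;
  p->state = SLEEPING;
  // sched() would run here; the scheduler then releases p->lock.
}

// Second half of sleep(): runs once the scheduler has resumed p,
// with p->lock held by the scheduler.
void sleep_end(struct proc *p, struct lock *lk) {
  p->chan = NULL;
  release(&p->lock);     // release the lock taken by the scheduler
  acquire(lk);           // reacquire the condition lock for the caller
}

// Mark every process sleeping on chan as RUNNABLE.
// Returns how many processes were woken.
int wakeup(struct proc *table, int n, void *chan) {
  int woken = 0;
  for (int i = 0; i < n; i++) {
    struct proc *p = &table[i];
    acquire(&p->lock);
    if (p->state == SLEEPING && p->chan == chan) {
      p->state = RUNNABLE;
      woken++;
    }
    release(&p->lock);
  }
  return woken;
}
```

A waker holds lk while it changes the condition and then needs p->lock inside wakeup. Since the sleeper never releases lk until it already holds p->lock, the waker either runs before the sleeper checks the condition or after the sleeper is fully marked SLEEPING.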