📂
build a OS
  • Learn OS with me
  • OS Interfaces
    • OS interfaces
    • I/O and File descriptors
    • Process and Memory
    • Pipes
    • File
  • OS Organization
    • OS Organization
    • Challenge yourself
  • Memory Management
    • XV6 Virtual Memory
    • Page Table
      • Part 1: How to translate address
      • Part 2: Create an Address Space
      • Part 3: How Page Table is used
      • Part 4: Page Fault and Swap
      • Part 5: How to operate on page tables
    • xv6 buddy allocator
      • How to display physical memory
    • Memory Management Walk Through
  • Traps and Interrupts
    • Trap Home Page
      • 系统调用的核心原理
    • What is trapframe
    • What is trampoline
    • Traps from kernel space
    • How fork() works
    • How system calls get into/out of the kernel
    • How exec() works
  • Scheduling
    • XV6 CPU Scheduling
    • How unix pipes work?
    • How does wait(), exit(), kill() work?
  • File System
    • Overview and Disk Layout
    • Buffer Cache
    • Design Inode Layer
    • Inode Content
    • Block Allocator
    • Design a log system for crash recovery
    • Directory Layer
    • Path names
    • File Descriptor Layer
    • FS System Calls
    • XV6 VS Real World
    • Make Xv6 File disk management system
    • Write FS simulator in python
    • How Redirect Shell command works
  • Concurrency
    • Spinlock
    • How linux select work
    • Hardware Support Locking
    • Exercise: Implement atomic counter
    • Locking in Xv6
    • Concurrency in Xv6
    • Exercise: Socket Programming with Event loop
  • Labs
    • Lab 1 Xv6 and Unix utilities
    • Lab 2 Shell
    • Lab 3 Buddy Allocator
    • Lab 4 Lazy
    • Lab 5 Copy-on-Write Fork for xv6
    • Lab 6 RISC-V assembly
    • Lab 6 Uthread: switching between threads
    • Lab 6 Alarm
    • Lab 7 Lock
    • Lab 8 File System: Large Files
    • Lab 8 File System: Symbolic links
    • Lab 9 mmap
    • Lab 10 Networking Part 1
    • Lab 10 Networking Part 2
  • Hardware, Device, Assembly
    • RISC-V assembly
    • Assembly: Access and Store information in Memory
    • Start xv6 and the first process
    • Why first user process loads another program?
    • What does kernel.ld do in XV6?
    • XV6 Device Driver
Powered by GitBook
On this page
  • The problem
  • Task
  • The solution
  • 心得

Was this helpful?

  1. Labs

Lab 5 Copy-on-Write Fork for xv6

Your task is to implement copy-on-write fork in the xv6 kernel

The problem

The fork() system call in xv6 copies all of the parent process’s user-space memory into the child. If the parent is large, copying can take a long time. In addition, the copies often waste memory; in many cases neither the parent nor the child modifies a page, so that in principle they could share the same physical memory. The inefficiency is particularly clear if the child calls exec(), since exec() will throw away the copied pages, probably without using most of them. On the other hand, if both parent and child use a page, and one or both writes it, a copy is truly needed.

Task

Your task is to implement copy-on-write fork in the xv6 kernel

The solution

The goal of copy-on-write (COW) fork() is to defer allocating and copying physical memory pages for the child until the copies are actually needed, if ever.

COW fork() creates just a pagetable for the child, with PTEs for user memory pointing to the parent’s physical pages. COW fork() marks all the user PTEs in both parent and child as not writable.

Modify uvmcopy to do so. fork calls this function to allocate child memory.

// Given a parent process’s page table, copy
// its memory into a child’s page table.
// Copies both the page table and the
// physical memory.
// returns 0 on success, -1 on failure.
// frees any allocated pages on failure.
int
uvmcopy(pagetable_t old, pagetable_t new, uint64 sz)
{
  pte_t *pte;
  uint64 pa, I;
  uint flags;
  for(I = 0; I < sz; I += PGSIZE){
    if((pte = walk(old, I, 0)) == 0)
      panic(“uvmcopy: pte should exist”);
    if((*pte & PTE_V) == 0)
      panic(“uvmcopy: page not present”);
    pa = PTE2PA(*pte);
    flags = PTE_FLAGS(*pte);
    // Record the page is COW mapping.
    flags |= PTE_RSW;
    // clear PTE_W in the PTEs of both child and parent*
    flags &= (~PTE_W);

    // map the parent’s physical pages into the child
    if(mappages(new, I, PGSIZE, (uint64)pa, flags) != 0){
      //kfree(mem);
      goto err;
    }
    // Bump the reference count*
    add_ref((void*)pa);
    // Remove parent page table mapping.
    uvmunmap(old, I, PGSIZE, 0);
    // Re-add the mapping with write bit cleared flags.
    if (mappages(old, I, PGSIZE, pa, flags) != 0) {
      goto err;
    }
  }

When either process tries to write one of these COW pages, the CPU will force a page fault. The kernel page-fault handler detects this case, allocates a page of physical memory for the faulting process, copies the original page into the new page, and modifies the relevant PTE in the faulting process to refer to the new page, this time with the PTE marked writeable. When the page fault handler returns, the user process will be able to write its copy of the page.

trap.c usertrap handler:
else if (r_scause() == 15) {
// Modify usertrap() to recognize page faults.
// When a page-fault occurs on a COW page, allocate a new page with kalloc(),
// copy the old page to the new page,
// and install the new page in the PTE with PTE_W set.

  // Get physial page address and correct flags.
  uint64 start_va = PGROUNDDOWN(r_stval());
  pte_t *pte;
  pte = walk(p->pagetable, start_va, 0);
  if (pte == 0) {
    printf(“page not found\n”);
    p->killed = 1;
    goto end;
  }
  if ((*pte & PTE_V) && (*pte & PTE_U) && (*pte & PTE_RSW)) {
    uint flags = PTE_FLAGS(*pte);
    // +Write, -COW
    flags |= PTE_W;
    flags &= (~PTE_RSW);

    char *mem = kalloc();
    char *pa = (char *)PTE2PA(*pte);
    memmove(mem, pa, PGSIZE);
    uvmunmap(p->pagetable, start_va, PGSIZE, 0);
    // decrement old pa ref count.
    dec_ref((void*)pa);
    if (mappages(p->pagetable, start_va, PGSIZE, (uint64)mem, flags) != 0) {
      p->killed = 1;
      printf(“sometthing is wrong in mappages in trap.\n”);
    }
  }
  ...

COW fork() makes freeing of the physical pages that implement user memory a little trickier. A given physical page may be referred to by multiple processes’ page tables, and should be freed only when the last reference disappears.

void add_ref(void *pa) {
  int index = get_ref_index(pa);
  if (index == -1) {
    return;
  }
  refc[index] = refc[index] + 1;
}

void dec_ref(void *pa) {
  int index = get_ref_index(pa);
  if (index == -1) {
    return;
  }
  int cur_count = refc[index];
  if (cur_count <= 0) {
    panic(“def a freed page!”);
  }
  refc[index] = cur_count - 1;
  if (refc[index] == 0) {
    // we need to free page
    kfree(pa);
  }
}

Replace all the places calling kfree with dec_ref.

Modify kalloc to call add_ref.

Modify copyout() to use the same scheme as page faults when it encounters a COW page.

// Copy from kernel to user.
// Copy len bytes from src to virtual address dstva in a given page table.
// Return 0 on success, -1 on error.
int
copyout(pagetable_t pagetable, uint64 dstva, char *src, uint64 len)
{
  uint64 n, va0, pa0;
  while(len > 0){
    va0 = PGROUNDDOWN(dstva);
    if(va0 >= MAXVA) {
      return -1;
    }

    pte_t *pte;
    pte = walk(pagetable, va0, 0);
    if (pte == 0) {
      return -1;
    }

    if ((*pte & PTE_V) == 0) {
      return -1;
    }
    if ((*pte & PTE_U) == 0) {
      return -1;
    }

    if (*pte & PTE_RSW) {
      pa0 = PTE2PA(*pte);
      uint flags = PTE_FLAGS(*pte);
      // +Write, -COW
      flags |= PTE_W;
      flags &= (~PTE_RSW);

      char *mem = kalloc();
      memmove(mem, (void*)pa0, PGSIZE);
      uvmunmap(pagetable, va0, PGSIZE, 0);
      dec_ref((void*)pa0);
      if (mappages(pagetable, va0, PGSIZE, (uint64)mem, flags) != 0) {
        panic(“sometthing is wrong in mappages in trap.\n”);
      }
    }

Reference:

Result

心得

unix给用户端提供了很多接口。提供了方便但是谁也无法保证用户程序不会滥用。复制fork导致多余内存的浪费必须要解决,否则一个不断复制的程序可以耗尽资源。所以大师们设计出copy on write的方案, trade了一点点性能换来了资源的节约。

PreviousLab 4 LazyNextLab 6 RISC-V assembly

Last updated 5 years ago

Was this helpful?

https://pdos.csail.mit.edu/6.828/2019/labs/cow.html