📂
build a OS
  • Learn OS with me
  • OS Interfaces
    • OS interfaces
    • I/O and File descriptors
    • Process and Memory
    • Pipes
    • File
  • OS Organization
    • OS Organization
    • Challenge yourself
  • Memory Management
    • XV6 Virtual Memory
    • Page Table
      • Part 1: How to translate address
      • Part 2: Create an Address Space
      • Part 3: How Page Table is used
      • Part 4: Page Fault and Swap
      • Part 5: How to operate on page tables
    • xv6 buddy allocator
      • How to display physical memory
    • Memory Management Walk Through
  • Traps and Interrupts
    • Trap Home Page
      • 系统调用的核心原理
    • What is trapframe
    • What is trampoline
    • Traps from kernel space
    • How fork() works
    • How system calls get into/out of the kernel
    • How exec() works
  • Scheduling
    • XV6 CPU Scheduling
    • How unix pipes work?
    • How does wait(), exit(), kill() work?
  • File System
    • Overview and Disk Layout
    • Buffer Cache
    • Design Inode Layer
    • Inode Content
    • Block Allocator
    • Design a log system for crash recovery
    • Directory Layer
    • Path names
    • File Descriptor Layer
    • FS System Calls
    • XV6 VS Real World
    • Make Xv6 File disk management system
    • Write FS simulator in python
    • How Redirect Shell command works
  • Concurrency
    • Spinlock
    • How linux select work
    • Hardware Support Locking
    • Exercise: Implement atomic counter
    • Locking in Xv6
    • Concurrency in Xv6
    • Exercise: Socket Programming with Event loop
  • Labs
    • Lab 1 Xv6 and Unix utilities
    • Lab 2 Shell
    • Lab 3 Buddy Allocator
    • Lab 4 Lazy
    • Lab 5 Copy-on-Write Fork for xv6
    • Lab 6 RISC-V assembly
    • Lab 6 Uthread: switching between threads
    • Lab 6 Alarm
    • Lab 7 Lock
    • Lab 8 File System: Large Files
    • Lab 8 File System: Symbolic links
    • Lab 9 mmap
    • Lab 10 Networking Part 1
    • Lab 10 Networking Part 2
  • Hardware, Device, Assembly
    • RISC-V assembly
    • Assembly: Access and Store information in Memory
    • Start xv6 and the first process
    • Why first user process loads another program?
    • What does kernel.ld do in XV6?
    • XV6 Device Driver
Powered by GitBook
On this page
  • Example
  • Linux Implementation
  • Summary

Was this helpful?

  1. Concurrency

How linux select work

PreviousSpinlockNextHardware Support Locking

Last updated 5 years ago

Was this helpful?

select, pselect, FD_CLR, FD_ISSET, FD_SET, FD_ZERO - synchronous I/O multiplexing

int select(int nfds, fd_set *readfds, fd_set *writefds,
                  fd_set *exceptfds, struct timeval *timeout);

void FD_CLR(int fd, fd_set *set);
int  FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);

select() and pselect() allow a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become “ready” for some class of I/O operation (e.g., input possible).

The timeout argument specifies the interval that select() should block waiting for a file descriptor to become ready. The call will block until either:

  • a file descriptor becomes ready;

  • the call is interrupted by a signal handler; or

  • the timeout expires.

Example

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int
main(void)
{
   fd_set rfds;
   struct timeval tv;
   int retval;

   /* Watch stdin (fd 0) to see when it has input. */
   FD_ZERO(&rfds);
   FD_SET(0, &rfds);

   /* Wait up to five seconds. */
   tv.tv_sec = 5;
   tv.tv_usec = 0;

   retval = select(1, &rfds, NULL, NULL, &tv);
   /* Don't rely on the value of tv now! */

   if (retval == -1)
       perror(“select()”);
   else if (retval)
       printf(“Data is available now.\n”);
       /* FD_ISSET(0, &rfds) will be true. */
   else
       printf(“No data within five seconds.\n”);

   exit(EXIT_SUCCESS);
}

Linux Implementation

Code is from linux 2.6.

This is the select system call entry.

SYSCALL_DEFINE5(select, int, n, fd_set __user *, inp, fd_set __user *, outp,
        fd_set __user *, exp, struct timeval __user *, tvp)

The core logic

int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
               fd_set __user *exp, struct timespec *end_time)

Allocate bit maps, all fds from 3 registered fd sets have a bit presented.

    /*
     * We need 6 bitmaps (in/out/ex for both incoming and outgoing),
     * since we used fdset we need to allocate memory in units of
     * long-words. 
     */
    size = FDS_BYTES(n);
    bits = stack_fds;
    if (size > sizeof(stack_fds) / 6) {
        /* Not enough space in on-stack array; must use kmalloc */
        ret = -ENOMEM;
        bits = kmalloc(6 * size, GFP_KERNEL);
        if (!bits)
            goto out_nofds;
    }
    fds.in      = bits;
    fds.out     = bits +   size;
    fds.ex      = bits + 2*size;
    fds.res_in  = bits + 3*size;
    fds.res_out = bits + 4*size;
    fds.res_ex  = bits + 5*size;

    if ((ret = get_fd_set(n, inp, fds.in)) ||
        (ret = get_fd_set(n, outp, fds.out)) ||
        (ret = get_fd_set(n, exp, fds.ex)))
        goto out;
    zero_fd_set(n, fds.res_in);
    zero_fd_set(n, fds.res_out);
    zero_fd_set(n, fds.res_ex);

    ret = do_select(n, &fds, end_time);

Loop and fetch each file, check its state, mark the bit if qualified. int do_select(int n, fd_set_bits *fds, struct timespec *end_time)

file = fget_light(i, &fput_needed);
if (file) {
    f_op = file->f_op;
    mask = DEFAULT_POLLMASK;
    if (f_op && f_op->poll) {
        wait_key_set(wait, in, out, bit);
        mask = (*f_op->poll)(file, wait);
    }
    fput_light(file, fput_needed);
    if ((mask & POLLIN_SET) && (in & bit)) {
        res_in |= bit;
        retval++;
        wait = NULL;
    }
    if ((mask & POLLOUT_SET) && (out & bit)) {
        res_out |= bit;
        retval++;
        wait = NULL;
    }
    if ((mask & POLLEX_SET) && (ex & bit)) {
        res_ex |= bit;
        retval++;
        wait = NULL;
    }
}

Each file will be set state accordingly after I/O. The states are defined as:

#define POLLIN_SET (POLLRDNORM | POLLRDBAND | POLLIN | POLLHUP | POLLERR)
#define POLLOUT_SET (POLLWRBAND | POLLWRNORM | POLLOUT | POLLERR)
#define POLLEX_SET (POLLPRI)

Summary

select code checks every single file, and return results. We have to use FD_ISSET(FD, &rfds) to check which fd is actually ready.

If a lot of fds we are monitoring, the cost is very heavy.

select(2) - Linux manual page