# I/O and File descriptors

A fd is a small integer representing a kernel-managed object that a process may read/write. A process obtains a fd by opening a file, directory, device, pipe, or socket. In unix, a file has different types that what it currently represents.

Every process has a private space of file descriptors starting at zero. Xv6 kernel uses fd as an index into a per-process table.

By convention, fd 1(standard input), fd 2(standard output), fd3 (standard error). `xv6 shell` ensure the 3 fds are open when starting up.

`read` and `write` system calls advance the file by offset (how much it reads or writes), so next time, the read/write starts from the new offset.

One interesting aspect is program like `cat` does not know whether it is reading from file, console, socket, or pipe. It does not know whether it is printing to a console, or a file. The use of fds and the conventions allows some simple implementation of user programs.

`close` releases fd, making it free for reuse. **A newly allocated fd is always the lowest-numbered unused descriptor of the current process.** Unix shell utilizes this feature with `fork`, to implement smart I/O redirection, and implement `pipe`.

### I/O redirection

`fork` copies parent’s file descriptor tables, with its memory. The system call `exec` replaces the calling process’ memory but **preserves its file table.**

**Example how command cat < input.txt works**

```c
char *argv[2];
argv[0] = “cat”;
argv[1] = 0;
if(fork() == 0) {
  close(0);
  open(“input.txt”, O_RDONLY);
  exec(“cat”, argv);
}
```

After the child closes file descriptor 0, open is guaranteed to use that file descriptor for the newly opened input.txt: 0 will be the smallest available file descriptor. Cat then executes with file descriptor 0 (standard input) referring to input.txt

### `fork` copies fd table, but file offset is shared.

The underlying file offset is shared between parent and child.

```c
if(fork() == 0) {
  write(1, “hello “, 6);
  exit(0);
} else {
  wait(0);
  write(1, “world\en”, 6);
}
```

`dup` system call is similar. Duplicates an existing file descriptor, returning a new fd that refers to the same I/O object. Both file descriptors share the an offset.

```c
fd = dup(1);
write(1, “hello “, 6);
write(fd, “world\en”, 6);
```

#### Do you know why the file offset is shared?

`fd` is basically an index pointing to an actual file object. The file is shared between parent, and child. After `fork` or `dup`, the reference count of the file is incremented.

#### Fun Fun Command

`ls existing-file non-existing-file > tmp1 2>&1` The `2>&1` tells the shell to give the command a file descriptor 2 that is a duplicate of descriptor 1. Both the name of the existing file and the error message for the non-existing file will show up in the file tmp1.

### Summary

File descriptors are a powerful abstraction, because they hide the details of what they are connected to: a process writing to file descriptor 1 may be writing to a file, to a device like the console, or to a pipe.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://xiayingp.gitbook.io/build_a_os/os-interfaces/i-o-and-file-descriptors.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
