Design a log system for crash recovery
A crash in the middle of a file system operation can leave the on-disk state inconsistent.
For example, depending on the order of disk writes, a crash may leave an inode referencing a content block that is marked free, or leave a content block that is allocated but not referenced by any inode.
Log Header
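The header block contains an array of sector numbers, one for each logged block, and the count of logged blocks. The count is either zero, indicating that the log holds no transaction, or non-zero, indicating that the log contains a complete committed transaction with that many logged blocks.
XV6 writes the header block when a transaction commits, but not before. The header write is therefore the commit point: a crash before it leaves the count at zero and the logged blocks are ignored, while a crash after it leaves a non-zero count and recovery replays the logged blocks.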
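The commit rule above can be illustrated with a minimal sketch. The struct layout follows the description of the header (a count plus an array of sector numbers); the `toydisk` type, the helper names, and the `BSIZE` and `LOGSIZE` values are assumptions made for illustration, not a copy of the code in log.c.

```c
#include <assert.h>
#include <string.h>

#define BSIZE   512   /* assumed block size in bytes */
#define LOGSIZE 30    /* assumed log capacity in blocks */

/* Header block contents: a count plus the home sector of each logged block. */
struct logheader {
  int n;               /* 0 = no transaction, >0 = committed transaction */
  int block[LOGSIZE];  /* block[i] is the home sector of logged block i */
};

/* Toy "disk" (hypothetical): the header block followed by the log data blocks. */
struct toydisk {
  struct logheader head;
  unsigned char logdata[LOGSIZE][BSIZE];
};

/* Step 1: copy the modified blocks into the log area.
 * The on-disk count is not touched, so a crash here leaves no transaction. */
void write_log(struct toydisk *d, const struct logheader *mem,
               unsigned char data[][BSIZE])
{
  assert(mem->n >= 0 && mem->n <= LOGSIZE);
  for (int i = 0; i < mem->n; i++)
    memcpy(d->logdata[i], data[i], BSIZE);
}

/* Step 2: write the header. This single write is the commit point. */
void write_head(struct toydisk *d, const struct logheader *mem)
{
  d->head = *mem;
}

/* Recovery replays exactly as many blocks as the on-disk count says. */
int blocks_to_replay(const struct toydisk *d)
{
  return d->head.n;
}
```

In this sketch a crash between `write_log` and `write_head` leaves `blocks_to_replay` at zero, so the partially written log is simply ignored.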
Group Commit
Group commit means committing several transactions together. The logging system commits only when no file system system calls are in progress.
Log Disk Space
XV6 dedicates a fixed amount of disk space to the log. The total number of blocks written by the system calls in a transaction must fit in that space. This has two consequences:
1. No single system call may write more distinct blocks than the log can hold.
2. A system call must wait to begin until the log has enough free space for its writes.
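The two rules can be expressed as a single admission check. The following is a sketch that assumes each system call reserves a fixed worst-case number of blocks; the name `log_has_room` and the constant values are illustrative assumptions.

```c
#include <assert.h>

#define MAXOPBLOCKS 10                 /* assumed max blocks one system call may write */
#define LOGSIZE     (MAXOPBLOCKS * 3)  /* assumed log capacity in blocks */

/* Returns 1 if one more system call may start now, 0 if it must wait.
 * logged:      blocks already recorded in the log
 * outstanding: system calls currently running inside the transaction
 * Every running call, plus the new one, is charged its worst case. */
int log_has_room(int logged, int outstanding)
{
  assert(logged >= 0 && outstanding >= 0);
  return logged + (outstanding + 1) * MAXOPBLOCKS <= LOGSIZE;
}
```

Reserving the worst case up front means a system call that has been admitted never runs out of log space halfway through.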
Special Case: write
XV6’s write system call breaks up large writes into multiple smaller writes that fit in the log.
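A minimal sketch of that splitting loop follows. `MAX_CHUNK`, `write_one_txn`, and `txn_count` are hypothetical names chosen for illustration; in the real system each chunk would be wrapped in its own begin_op and end_op pair.

```c
#include <assert.h>

#define MAX_CHUNK 1536  /* assumed largest write, in bytes, that fits in one transaction */

int txn_count;  /* number of transactions used so far (illustration only) */

/* Hypothetical stand-in for one logged write of len bytes at offset off. */
int write_one_txn(int off, int len)
{
  (void)off;
  txn_count++;
  return len;
}

/* Write n bytes starting at off, one transaction per chunk.
 * Returns the number of bytes written. */
int chunked_write(int off, int n)
{
  int done = 0;
  assert(n >= 0);
  while (done < n) {
    int len = n - done;
    if (len > MAX_CHUNK)
      len = MAX_CHUNK;            /* cap each piece so it fits in the log */
    int r = write_one_txn(off + done, len);
    if (r != len)
      break;                      /* stop on a short write */
    done += r;
  }
  return done;
}
```

One consequence of this design is that a large write is not atomic as a whole: only each individual chunk is.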
Data structure
Code in log.c
Pattern for using the log in a system call
begin_op
Waits (sleeps) while the log is committing, or while the log does not have enough free space to admit another system call. Otherwise, it increments the count of outstanding system calls in the log metadata and returns.
ilock
Lock the given inode. Reads the inode from disk if necessary.
This step is essential: an in-memory inode obtained before locking may not yet have had its contents read from disk.
iunlock
Unlock the given inode by releasing the sleep lock.
end_op
Called at the end of each FS system call. Decrements the count of outstanding system calls and commits if this was the last outstanding operation.
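The begin_op and end_op bookkeeping described above can be sketched as a single-threaded toy model. The `toylog` struct and the `toy_` function names are assumptions for illustration; the real code uses a lock and sleeps where this sketch asserts.

```c
#include <assert.h>

/* Toy log metadata (hypothetical field names). */
struct toylog {
  int outstanding;  /* FS system calls currently inside the transaction */
  int committing;   /* 1 while a commit is in progress */
  int commits;      /* commits performed so far (illustration only) */
};

/* Start of an FS system call: join the current transaction. */
void toy_begin_op(struct toylog *lg)
{
  assert(!lg->committing);  /* the real code would sleep here instead */
  lg->outstanding++;
}

/* End of an FS system call: the last one out triggers the commit. */
void toy_end_op(struct toylog *lg)
{
  assert(lg->outstanding > 0);
  lg->outstanding--;
  if (lg->outstanding == 0) {
    lg->committing = 1;
    lg->commits++;          /* stand-in for writing the log and the header */
    lg->committing = 0;
  }
}
```

Two overlapping calls in this model produce a single commit, which is the group commit behavior: the first `toy_end_op` only decrements the counter, and the second one commits for both.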
Fun Part
A) Linux ext3 adds this kind of logging (journaling) on top of ext2. See "Journaling the Linux ext2fs Filesystem".
B) Logging is similar to the logging and transaction design used in databases.