The document discusses the Linux kernel buffer cache. It describes the structure of buffer headers and the buffer pool. It outlines 5 scenarios for retrieving a buffer, including if the block is found in the hash queue, a free buffer is available, or if a delayed write buffer needs to be written first. It also covers reading and writing blocks to disk using functions like bread(), breada(), bwrite(), and brelse(). The advantages of the buffer cache in reducing disk access and ensuring integrity are presented.
1 of 28
More Related Content
Unix ch03-03(2)
1. Chap 3. The Buffer Cache
1. Buffer Headers
2. Structure of the Buffer Pool
3. Scenarios for Retrieval of a Buffer
4. Reading and Writing Disk Blocks
5. Advantages and Disadvantages of
the Buffer Cache
2. Why Buffer cache
To reduce the frequency of Disk Access
Slow disk access time
User program
User level
libraries
System call interface
File system Process ipc
Kernel level Buffer cache Control schedular
character block Subsystem mm
Device driver
Hardware control
Hardware level hardware
Figure 2.1 block diagram of the System Kernel
3. Buffer Headers
Buffer
Memory array
Contains data from disk
Buffer header
Identifies the buffer
Has
Device number
損 Logical file system num not a physical device num
Block number
Pointer to a data array for the buffer
Memory array : Buffer header = 1 : 1
4. States of buffer
Locked
Buffer header
Buffer header
Valid
Delayed write Dev num
Block num
Reading or writing
status
Waiting for free ptr to data area
ptr to previous buf
on hash queue
ptr to next buf
on the hash queue
ptr to previous buf
on free list
ptr to next buf
on free list
Figure 3.1 buffer header
5. Structure of the Buffer Pool
Free List
Doubly linked circular list
Every buffer is put on the free list when boot
When insert
Push the buffer in the head of the list ( error case )
Push the buffer in the tail of the list ( usually )
When take
Choose first element
Hash Queue
Separate queues : each Doubly linked list
When insert
Hashed as a function device num, block num
6. Scenarios for Retrieval of a Buffer
To allocate a buffer for a disk block
Use getblk()
Has Five scenarios
algorithm getblk else{
Input: file system number if ( there are no buffers on free list)
block number { /*scenario 4*/
Output: locked buffer sleep(event any buffer becomes free)
that can now be used for block continue;
{ }
while(buffer not found) remove buffer from free list;
{ if(buffer marked for delayed write)
if(block in hash queue){ /*scenario 5*/ { /*scenario 3*/
if(buffer busy){ asynchronous write buffer to disk;
sleep(event buffer becomes free) continue;
continue; }
} /*scenario 2*/
mark buffer busy; /*scenario 1*/ remove buffer from old hash queue;
remove buffer from free list; put buffer onto new hash queue;
return buffer; return buffer;
} }
}}
7. 1st Scenario
Block is in hash queue, not busy
Choose that buffer
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
Freelist header
(a) Search for block 4 on first hash queue
8. 1st Scenario
After allocating
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
Freelist header
(b) Remove block 4 from free list
9. 2nd Scenario
Not in the hash queue and exist free
buff.
Choose one buffer in front of free list
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
Freelist header
(a) Search for block 18 Not in the cache
10. 2nd Scenario
After allocating
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10 18
Blkno 3 mod 4 35 99
Freelist header
(b) Remove first block from free list, assign to 18
11. 3rd Scenario
Not in the hash queue and there exists delayed
write buffer in the front of free list
Write delayed buffer async. and choose next
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
delay
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
delay
Freelist header
(a) Search for block 18, delayed write blocks on free list
12. 3rd Scenario
After allocating
Blkno 0 mod 4 28 64
Blkno 1 mod 4 17 5 97
writing
Blkno 2 mod 4 98 50 10 18
Blkno 3 mod 4 3 35 99
writing
Freelist header
(b) Writing block 3, 5, reassign 4 to 18
13. 4th Scenario
Not in the hash queue and no free buffer
Wait until any buffer free and re-do
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
Freelist header
(a) Search for block 18, empty free list
14. 4th Scenario
Process A Process B
Cannot find block b
on the hash queue
No buffer on free list
Sleep Cannot find block b
on hash queue
No buffers on free list
Sleep
Somebody frees a buffer: brelse
Takes buffer from free list
Assign to block b
Figure 3.10 Race for free buffer
15. 4th Scenario
What to remind
When process release a buffer wake all
process waiting any buffer cache
16. 5th Scenario
Block is in hash queue, but busy
Wait until usable and re-do
Blkno 0 mod 4 28 4 64
Blkno 1 mod 4 17 5 97
Blkno 2 mod 4 98 50 10
Blkno 3 mod 4 3 35 99
busy
Freelist header
(a) Search for block 99, block busy
17. 5th Scenario
Process A Process B Process C
Allocate buffer
to block b
Find block b
Lock buffer
on hash queue
Initiate I/O
Buffer locked, sleep Sleep waiting for
Sleep until I/O done
any free buffer
( scenario 4 )
I/O done, wake up
brelse(): wake up others
Get buffer previously
assigned to block b
reassign buffer to buffer b
Buffer does not contain
block b
start search again
time
18. Reading Disk Blocks
Use bread()
If the disk block is in buffer cache
How to know the block is in buffer cache
Use getblk()
Return the data without disk access
Else
Calls disk driver to schedule a read request
Sleep
When I/O complete, disk controller interrupts the
Process
Disk interrupt handler awakens the sleeping process
Now process can use wanted data
19. Reading Disk Blocks
algorithm
algorithm
Algorithm bread
Algorithm bread
Input: file system block number
Input: file system block number
Output: buffer containing data
Output: buffer containing data
{{
get buffer for block(algorithm getblk);
get buffer for block(algorithm getblk);
if(buffer data valid)
if(buffer data valid)
return buffer;
return buffer;
initiate disk read;
initiate disk read;
sleep(event disk read complete);
sleep(event disk read complete);
return(buffer);
return(buffer);
}}
20. Reading Disk Blocks
In linux
In linux Block device file
Block device file
High-level block
High-level block
block_read() Device handler
Device handler
block_write()
bread()
breada()
getblk() ll_rw_block()
Buffer cache Low-level block
RAM Buffer cache Low-level block
Device handler
Device handler DISK
Figure 13-3 block device handler architecture for buffer I/O operation
in Understanding the Linux Kernel
21. Reading Disk Blocks
Read Ahead
Improving performance
Read additional block before request
Use breada()
Algorithm
Algorithm
Algorithm breada
Algorithm breada
Input: (1) file system block number for immediate read
Input: (1) file system block number for immediate read
(2) file system block number for asynchronous read
(2) file system block number for asynchronous read
Output: buffer containing data for immediate read
Output: buffer containing data for immediate read
{{
if (first block not in cache){
if (first block not in cache){
get buffer for first block(algorithm getblk);
get buffer for first block(algorithm getblk);
if(buffer data not valid)
if(buffer data not valid)
initiate disk read;
initiate disk read;
}}
22. Reading disk Block
Algorithm-cont
Algorithm-cont
if second block not in cache){
if second block not in cache){
get buffer for second block(algorithm getblk);
get buffer for second block(algorithm getblk);
if(buffer data valid)
if(buffer data valid)
release buffer(algorithm brelse);
release buffer(algorithm brelse);
else
else
initiate disk read;
initiate disk read;
}}
if(first block was originally in cache)
if(first block was originally in cache)
{{
read first block(algorithm bread)
read first block(algorithm bread)
return buffer;
return buffer;
}}
sleep(event first buffer contains valid data);
sleep(event first buffer contains valid data);
return buffer;
return buffer;
}}
23. Writing disk Block
Synchronous write
the calling process goes the sleep awaiting I/O completion and
releases the buffer when awakens.
Asynchronous write
the kernel starts the disk write. The kernel release the buffer
when the I/O completes
Delayed write
The kernel put off the physical write to disk until buffer
reallocated
Look Scenario 3
Relese
Use brelse()
24. Writing Disk Block
algorithm
algorithm
Algorithm bwrite
Algorithm bwrite
Input: buffer
Input: buffer
Output: none
Output: none
{{
Initiate disk write;
Initiate disk write;
if (I/O synchronous){
if (I/O synchronous){
sleep(event I/O complete);
sleep(event I/O complete);
relese buffer(algorithm brelse);
relese buffer(algorithm brelse);
}}
else if (buffer marked for delayed write)
else if (buffer marked for delayed write)
mark buffer to put at head of free list;
mark buffer to put at head of free list;
}}
25. Release Disk Block
algorithm
algorithm
Algorithm brelse
Algorithm brelse
Input: locked buffer
Input: locked buffer
Output: none
Output: none
{{
wakeup all process; event,
wakeup all process; event,
waiting for any buffer to become free;
waiting for any buffer to become free;
wakeup all process; event,
wakeup all process; event,
waiting for this buffer to become free;
waiting for this buffer to become free;
raise processor execution level to allow interrupts;
raise processor execution level to allow interrupts;
if( buffer contents valid and buffer not old)
if( buffer contents valid and buffer not old)
enqueue buffer at end of free list;
enqueue buffer at end of free list;
else
else
enqueue buffer at beginning of free list
enqueue buffer at beginning of free list
lower processor execution level to allow interrupts;
lower processor execution level to allow interrupts;
unlock(buffer);
unlock(buffer);
}}
26. Advantages and Disadvantages
Advantages
Allows uniform disk access
Eliminates the need for special alignment of user buffers
by copying data from user buffers to system buffers,
Reduce the amount of disk traffic
less disk access
Insure file system integrity
one disk block is in only one buffer
Disadvantages
Can be vulnerable to crashes
When delayed write
requires an extra data copy
When reading and writing to and from user processes
27. What happen to buffer until now
Allocated buffer Using getblk() 5 scenarios
Mark busy Preserving integrity
manipulate Using bread, breada, bwrite
release Using brelse algorithm
28. Reference
LINUX KERNEL INTERNALS
Beck, Bohme, Dziadzka, Kunitz, Magnus, Verworner
The Design of the Unix operating system
Maurice j.bach
Understanding the LINUX KERNEL
Bovet, cesati
In linux
Buffer_head : include/linux/fs.h
Bread : fs/buffer.c
Brelse : include/linux/fs.h