SimpleDB (Part 2): Memory Management
Contents
(Note: This post is part of the SimpleDB series. Please check the first post if you haven’t already)
Intro
Accessing data from disk is dramatically slower than from RAM - by a factor of 100,000. That’s why memory management matters.
For any data processing task, it’s the rate limiting factor.
In this post we’ll focus on implementing key parts of the BufferManager
to show how it helps minimize disk access. For the full implementation, see this branch.
How does a DB engine try to minimize disk access? When a client requests data, the goal is for it to already be in memory, bypassing the slow disk retrieval.
This sounds like a cache!
Buffer: Your DB’s memory
In SimpleDB, we’re going to implement a buffer to hold the content of a block in memory. It also contains information like if the buffer is pinned, what block it is assigned to, etc.
But first, let’s look at the struct of our buffer.
The main things to see here are contents
which contains the page’s data, and block
which is a reference to the BlockId
. pins
refers to if the buffer’s contents are currently in used by any client. We will use txnum
and lsn
in later chapter when handling transaction management.
Now that we understand the structure of the buffer, what should it do?
If it holds page contents in memory, what happens if a client needs to make changes to the page? Clients change page contents, prompting a write to disk for permanence. But our goal is to minimize disk access, which includes disk writes.
Our buffer strategy to minimize disk writes involves only two scenarios where we would write to disk:
- when the page is getting replaced because the buffer is getting pinned to another block
- when the recovery manager needs to write the page to disk to guard against possible system crash
We’ll look at two important functions for the buffer, which are really read
and write
.
First, assign_to_block
, which reads the contents of the specified block into the contents of the buffer.
Notice how we first flush the existing data. If the buffer had existing page content and it was already modified (dirty), then it needs to be written to disk before it gets assigned a new page.
Once the existing page is flushed to disk, then it will take a new block, and read its contents into the buffer.
Now let’s look at flush
.
- First, we check
self.txnum >= 0
. This is our way of indicating if the buffer is involved in a transaction and has modified content - If there is a a block in the buffer, we have the file manager write our buffer’s contents to the block on disk
- Finally, we change the buffer’s
txnum
to -1 to indicate that our buffer has already been written to disk
For the full implementation of BufferPage
and tests, see this.
With our buffer implementation complete, we need a way to manage multiple buffers efficiently. This is where the buffer manager comes in.
Buffer Management strategies
The buffer manager organizes a pool of buffers where some buffers might be pinned (in use by a client) or unpinned (free to take).
When a client requests data from the buffer manager, let’s see what might happen:
- contents of the block are in a buffer
- page is pinned
- page is unpinned
- contents of the block are not in the buffer
- all buffers in the buffer pool are pinned (client is put on wait list)
- there exists at least one unpinned buffer in the buffer pool
If the contents of the block are in a buffer, then it’s simple. We can just reuse the page.
If the contents of the block are not in the buffer and there’s at least one unpinned buffer in the pool, then we can use that buffer and pin it with our block.
So that means the only complicated scenario is when all the buffers in the pool are pinned. If they are all pinned, we need to choose a buffer to replace.
There are 4 potential replacement strategies:
- naive
- Choose the first unpinned buffer. But then you run into the problem of not evenly using the buffer pool.
- FIFO (first in first out)
- Choose the buffer that has been in the pool the longest (oldest ‘add’ time). The problem with this is that frequently used buffers are often among the ones that are in the pool longest.
- LRU (least recently used)
- Choose the buffer that has not been used for the longest period. This is an effective general strategy since it avoids replacing commonly used pages.
- clock:
- Starts scanning at the page after the previous replacement, and chooses the first unpinned page it finds. Attempts to use the buffers as evenly as possible. Since frequently used buffers will usually be pinned, it has a higher chance of skipping over those.
Let’s move onto the implementation. First, we’ll define the struct.
Our buffer_pool
is a vector of buffers. num_available
refers to the number of buffers that are unpinned. And max_time
refers to how long a client will have to wait for an available buffer before receiving an error.
As for the most important functions for the BufferManager
, they are pin
and unpin
.
pin
will pin a buffer to the specified block, potentially waiting max_time
until a buffer becomes available. If no buffer is available after that time period, then an error is thrown.
try_to_pin
will pin a buffer to the specified block. But if there’s already a buffer assigned to the block, then that buffer is used. Otherwise, it’ll try to choose an unpinned buffer.
choose_unpinned_buffer
is where we define the replacement strategy. As mentioned above, there are 4 we could potentially use. For SimpleDB, we’ll go with the naive approach.
This simply looks for the first unpinned buffer (which does not evenly use all the buffers in the pool).
Finally, we’re going to implement unpin
.
It’ll call the unpin
function on buffer, which decrements the pin count. If the buffer’s pin count goes to 0, it means that there are no active clients using that buffer. If that’s the case, we can increment the number of available buffers by 1. This function shows the clear relationship between the number of available/unpinned buffers and the action of unpinning.
With these two functions implemented, pin
and unpin
, we’ve got the basics down for our buffer manager.
For the full implementation and tests, see this.
Summary
In this second part of our SimpleDB series, we looked at memory management, specifically focusing on minimizing disk access. We implemented two main components: the buffer and the buffer manager.
The buffer holds block content in memory and handles page operations. We made it minimize disk writes only when necessary: during page replacement or when the recovery manager requires it.
The buffer manager handles a pool of buffers and the complex logic around buffer replacement and pinning. We implemented two crucial functions: pin
and unpin
.
If you’d like to see the full implementation along with the tests, visit the repo here.
Have some thoughts on this post? Reply with an email.