SimpleDB (Part 1): File Manager


Every time you query a database, a complex series of actions begins behind the scenes. I’d like to peek behind the curtain and understand how databases work internally.

Recently I’ve been reading Edward Sciore’s Database Design and Implementation. In this series, I’ll explore how databases work internally by building a Rust implementation of SimpleDB.

What we’ll cover

In this post in particular, we’ll build the foundation of a database system by implementing two core components: file management and page handling.

Please see the repo for the full implementation.

Note: I am a beginner in Rust, so if you see anything that needs improvement, please let me know.

Database storage

There are two ways a database system could potentially access data. If you think of it like a library:

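In a block-level interface, there is the concept of a block, which is mapped to several sectors of the disk. In order to modify the disk, the client reads a block into a page of memory, modifies the bytes in that page, and writes the page back to the block.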

On the other hand, a file-level interface is a higher level abstraction. The client views the file as a sequence of bytes, with no notion of a block. You can also read/write any number of bytes starting at any position in the file.

Most database engines use a compromise. They store all their data in one or more OS files and treat each file as a raw ‘disk’. The database engine accesses each ‘disk’ using logical file blocks. A logical block reference tells you where the block is with respect to the file, but not where the block is on the disk. A physical block reference, by comparison, tells you where the block is on the disk.

The OS takes on the responsibility of mapping the logical block reference to the corresponding physical block. This gives us the best of both worlds: the convenience of file operations with the precision of block-level control.
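
To make the idea of a logical block concrete, here is a minimal sketch that uses only the standard library. The helper names block_offset, write_block, and read_block are made up for illustration and are not part of SimpleDB. The point is only that a logical block number turns into a byte offset within the file, and the OS handles everything below that.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Byte offset of a logical block within its file.
/// (Hypothetical helper, named here for illustration.)
fn block_offset(block_number: u64, block_size: usize) -> u64 {
    block_number * block_size as u64
}

/// Writes one block. Assumes `data` is exactly one block long,
/// so its length doubles as the block size.
fn write_block(file: &mut File, block_number: u64, data: &[u8]) -> io::Result<()> {
    file.seek(SeekFrom::Start(block_offset(block_number, data.len())))?;
    file.write_all(data)
}

/// Reads one block of `block_size` bytes into a fresh buffer.
fn read_block(file: &mut File, block_number: u64, block_size: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; block_size];
    file.seek(SeekFrom::Start(block_offset(block_number, block_size)))?;
    file.read_exact(&mut buf)?;
    Ok(buf)
}
```

With a block size of 400, block 2 starts at byte 800 of the file. Nothing in this code knows which disk sectors those bytes live on.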

Implementing core components

Database interface

First, let’s create our main database interface.

Here is the test case that we want to pass. We just want to test that the path we pass in exists and is a directory. Note that we’re using 400 as the block size and 8 as the buffer size because Sciore recommends this for learning purposes. Real-world database systems use much larger numbers.

tests/db_tests.rs
use crate::simpledb::SimpleDB;
use tempfile::TempDir;
#[test]
fn test_simpledb_creation() {
    // TempDir creates a fresh directory and removes it when dropped
    let temp_dir = TempDir::new().unwrap();
    let temp_path = temp_dir.path();
    let _db = SimpleDB::new(temp_path, 400, 8).unwrap();
    assert!(temp_path.exists());
    assert!(temp_path.is_dir());
}

Our SimpleDB struct will provide the entry point for all database interactions.

src/db.rs
use crate::file::FileManager;
use std::path::Path;
pub struct SimpleDB {
    file_manager: FileManager,
}
impl SimpleDB {
    pub const BLOCK_SIZE: usize = 400;
    pub const BUFFER_SIZE: u32 = 8;
    pub const LOG_FILE: &'static str = "simpledb.log";
    // buffer_size is accepted here but not used by anything in this part
    pub fn new(
        dirname: impl AsRef<Path>,
        block_size: usize,
        buffer_size: u32,
    ) -> std::io::Result<Self> {
        let file_manager = FileManager::new(dirname, block_size)?;
        Ok(SimpleDB { file_manager })
    }
    pub fn file_manager(&self) -> &FileManager {
        &self.file_manager
    }
}

Managing files

The FileManager is our bridge to the operating system. It handles three key responsibilities:

  1. Creating and managing the database directory
  2. Tracking open files
  3. Reading blocks of data into a Page and writing a Page back to a block

Here’s the basic structure.

src/file/manager.rs
use std::{
    collections::HashMap,
    fs::{self, File, OpenOptions},
    io::{self, Read, Seek, SeekFrom, Write},
    path::{Path, PathBuf},
    sync::Mutex,
};
use crate::file::{BlockId, Page};
pub struct FileManager {
    db_directory: PathBuf,
    block_size: usize,
    is_new: bool,
    // Open file handles, keyed by file name
    open_files: Mutex<HashMap<String, File>>,
}

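When creating a new FileManager, we need to create the database directory if it doesn’t exist yet, delete any leftover temporary files (those whose names start with "temp"), and start with an empty table of open files.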

src/file/manager.rs
impl FileManager {
    pub fn new(db_directory: impl AsRef<Path>, block_size: usize) -> io::Result<Self> {
        let db_directory = db_directory.as_ref().to_path_buf();
        // The database is new if its directory does not exist yet
        let is_new = !db_directory.exists();
        if is_new {
            fs::create_dir_all(&db_directory)?;
        }
        // Clean up temp files; failures here are deliberately ignored
        if let Ok(entries) = fs::read_dir(&db_directory) {
            for entry in entries.flatten() {
                let filename = entry.file_name();
                if filename.to_string_lossy().starts_with("temp") {
                    let _ = fs::remove_file(entry.path());
                }
            }
        }
        Ok(Self {
            db_directory,
            block_size,
            is_new,
            open_files: Mutex::new(HashMap::new()),
        })
    }
}

Note that we’re also using Mutex to provide thread-safe access to the open_files HashMap. The FileManager might be accessed from multiple threads in the application, so Mutex ensures that only one thread can access the HashMap at any one time.
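
Here is a small standalone sketch of that pattern. OpenCounts is a hypothetical stand-in for the FileManager: it maps a file name to a counter instead of a real File, which keeps the example free of disk access. It shows the property we rely on, namely that a method taking &self can still update the map, because the lock hands out exclusive access to one thread at a time.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for FileManager's table of open files.
struct OpenCounts {
    counts: Mutex<HashMap<String, u32>>,
}

impl OpenCounts {
    fn new() -> Self {
        Self { counts: Mutex::new(HashMap::new()) }
    }

    /// Takes &self, yet mutates the map through the lock guard.
    fn touch(&self, filename: &str) {
        let mut counts = self.counts.lock().unwrap();
        *counts.entry(filename.to_string()).or_insert(0) += 1;
    }

    fn count(&self, filename: &str) -> u32 {
        self.counts.lock().unwrap().get(filename).copied().unwrap_or(0)
    }
}

/// Calls `touch` from several threads at once. Scoped threads let us
/// borrow `registry` without wrapping it in an Arc.
fn touch_from_threads(registry: &OpenCounts, threads: u32, per_thread: u32) {
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    registry.touch("test.dat");
                }
            });
        }
    });
}
```

Because every update happens while the lock is held, no increments are lost even when several threads hit the same key. The trade-off is that threads take turns, so the lock should be held for as short a time as possible.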

Working with Blocks and Pages

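To understand how data is stored and retrieved, we need to understand two concepts: a BlockId, which identifies a block by its file name and its block number within that file, and a Page, which holds the contents of one block in memory.

Here’s how they work together: the FileManager reads the block named by a BlockId into a Page, the client gets and sets values in the Page, and the FileManager writes the Page back to the block.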

Implementing BlockId

src/file/block.rs
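pub struct BlockId {
    filename: String,
    number: u64,
}
impl BlockId {
    pub fn new(filename: impl Into<String>, number: u64) -> Self {
        Self {
            filename: filename.into(),
            number,
        }
    }
    // Accessors used by FileManager::read and FileManager::write
    pub fn file_name(&self) -> &str {
        &self.filename
    }
    pub fn number(&self) -> u64 {
        self.number
    }
}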

Implementing Page

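The Page will have the following functions: new and from_bytes to create a page, get_int and set_int to read and write a 32-bit integer at a byte offset, and contents to expose the underlying buffer to the FileManager. The getters and setters for bytes and strings are omitted here.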

src/file/page.rs
use std::convert::TryInto;
pub struct Page {
    buffer: Vec<u8>,
}
impl Page {
    pub fn new(block_size: usize) -> Self {
        Self {
            buffer: vec![0; block_size],
        }
    }
    pub fn from_bytes(bytes: Vec<u8>) -> Self {
        Self { buffer: bytes }
    }
    // Reads a big-endian i32 starting at `offset`
    pub fn get_int(&self, offset: usize) -> i32 {
        let bytes = &self.buffer[offset..offset + 4];
        i32::from_be_bytes(bytes.try_into().unwrap())
    }
    // Writes `value` as four big-endian bytes starting at `offset`
    pub fn set_int(&mut self, offset: usize, value: i32) {
        let bytes = value.to_be_bytes();
        self.buffer[offset..offset + 4].copy_from_slice(&bytes);
    }
    // Returns a mutable slice for writing
    pub(crate) fn contents(&mut self) -> &mut [u8] {
        &mut self.buffer[..]
    }
    // pub fn get_bytes
    // pub fn set_bytes
    // pub fn get_string
    // pub fn set_string
}
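
For the four functions left as comments above, here is one way they could look. This is a sketch under an assumption of mine, not necessarily what the repo does: each value is stored as a 4-byte big-endian length followed by the bytes themselves, and strings are stored as their UTF-8 bytes. The Page below is a standalone copy so that the example compiles on its own.

```rust
use std::convert::TryInto;

/// Standalone stand-in for the post's Page.
pub struct Page {
    buffer: Vec<u8>,
}

impl Page {
    pub fn new(block_size: usize) -> Self {
        Self { buffer: vec![0; block_size] }
    }

    /// Assumed layout: 4-byte big-endian length, then the bytes.
    /// Panics if the value does not fit in the page.
    pub fn set_bytes(&mut self, offset: usize, bytes: &[u8]) {
        let len = (bytes.len() as u32).to_be_bytes();
        let start = offset + 4;
        self.buffer[offset..start].copy_from_slice(&len);
        self.buffer[start..start + bytes.len()].copy_from_slice(bytes);
    }

    /// Reads the length prefix, then returns that many bytes after it.
    pub fn get_bytes(&self, offset: usize) -> &[u8] {
        let prefix = &self.buffer[offset..offset + 4];
        let len = u32::from_be_bytes(prefix.try_into().unwrap()) as usize;
        let start = offset + 4;
        &self.buffer[start..start + len]
    }

    /// Stores the string's UTF-8 bytes using the layout above.
    pub fn set_string(&mut self, offset: usize, value: &str) {
        self.set_bytes(offset, value.as_bytes());
    }

    /// Invalid UTF-8 is replaced rather than treated as an error.
    pub fn get_string(&self, offset: usize) -> String {
        String::from_utf8_lossy(self.get_bytes(offset)).into_owned()
    }
}
```

The length prefix is what lets get_bytes know where a value ends, since the page itself is just a flat array of bytes. A value of n bytes therefore takes up n + 4 bytes of the page.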

Now that we have our BlockId and Page implementations, we have the building blocks to finish the read and write functions in our FileManager.

Reading data

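We want read to look up the block’s file, seek to the block’s offset (block number times block size), and fill the page’s buffer with the block’s contents.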

src/file/manager.rs
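impl FileManager {
    // ...
    pub fn read(&self, block: &BlockId, page: &mut Page) -> io::Result<()> {
        // Assumed here: files are opened on first use and cached in open_files
        let mut open_files = self.open_files.lock().unwrap();
        if !open_files.contains_key(block.file_name()) {
            let path = self.db_directory.join(block.file_name());
            let file = OpenOptions::new().read(true).write(true).create(true).open(path)?;
            open_files.insert(block.file_name().to_string(), file);
        }
        let file = open_files.get_mut(block.file_name()).unwrap();
        // Seek to the start of the block and fill the page's buffer from it
        let offset = block.number() * self.block_size as u64;
        file.seek(SeekFrom::Start(offset))?;
        file.read_exact(page.contents())?;
        Ok(())
    }
}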

Writing data

And the write function does something similar, but writes to the file using the page’s buffer.

src/file/manager.rs
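impl FileManager {
    // ...
    pub fn write(&self, block: &BlockId, page: &mut Page) -> io::Result<()> {
        // Assumed here: files are opened on first use and cached in open_files
        let mut open_files = self.open_files.lock().unwrap();
        if !open_files.contains_key(block.file_name()) {
            let path = self.db_directory.join(block.file_name());
            let file = OpenOptions::new().read(true).write(true).create(true).open(path)?;
            open_files.insert(block.file_name().to_string(), file);
        }
        let file = open_files.get_mut(block.file_name()).unwrap();
        // The offset is a product (block number times block size), not a bitwise AND
        let offset = block.number() * self.block_size as u64;
        file.seek(SeekFrom::Start(offset))?;
        file.write_all(page.contents())?;
        // Ask the OS to flush the file's data to disk before returning
        file.sync_data()?;
        Ok(())
    }
}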

Testing read and write

Now we can write a test to make sure that read/write work as we expect.

src/file/manager.rs
// ...
#[cfg(test)]
mod tests {
    use super::*;
    use tempfile::TempDir;
    // Returns the TempDir too, so the directory outlives the test body
    fn setup() -> (TempDir, FileManager) {
        let temp_dir = TempDir::new().unwrap();
        let fm = FileManager::new(temp_dir.path(), 400).unwrap();
        (temp_dir, fm)
    }
    #[test]
    fn test_read_write_basic() {
        let (_temp_dir, fm) = setup();
        let block = BlockId::new("test.dat".to_string(), 0);
        // Write some data
        let mut write_page = Page::new(400);
        write_page.contents()[0..5].copy_from_slice(b"hello");
        fm.write(&block, &mut write_page).unwrap();
        // Read it back
        let mut read_page = Page::new(400);
        fm.read(&block, &mut read_page).unwrap();
        assert_eq!(&read_page.contents()[0..5], b"hello");
    }
}

With this, we now have a working FileManager that interacts with the OS file system, and a Page that holds the contents of each block of our ‘disk’ (file).

What we’ve built

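In this first part, we’ve implemented these fundamental building blocks: the SimpleDB entry point, a FileManager that reads and writes blocks, a BlockId that names a block, and a Page that holds a block’s contents in memory.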

Next

The next chapter will deal with transaction management.

