Abser, from techcats (技术猫). techcats uses Yuque (语雀) as its public knowledge platform.

How Does BoltDB Write Its Data?

A Ha!

Here are three questions that came up while I was reading the BoltDB source code. I'll walk through the tests we ran to dig into the core of BoltDB's write mechanism.

Code link: yhyddr/quicksilver

First

  • First, my mentor asked me: "Does BoltDB create a temporary file when it starts a read/write transaction?"

According to the quickstart, all data is stored in a single file on disk.

So the question is: what happens to that file when we store or view data?

My mentor was thinking of the temporary-file approach to changing a file: make a copy of the file, apply the updates to the copy, and when the operations finish, replace the original file with the copy to complete the store/commit.

Let's check whether BoltDB uses this method.

FileSystem Notify

My mentor told me how to watch files or directories with a simple Go program.

It uses the fsnotify/fsnotify package on GitHub.

I wrote a small Go program that logs changes to the files in the current directory.

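package main

import (
  "log"

  "github.com/fsnotify/fsnotify"
)

func main() {
  // hang is never written to, so receiving from it blocks main forever.
  hang := make(chan bool)
  watcher, err := fsnotify.NewWatcher()
  if err != nil {
    log.Fatal(err)
  }
  defer watcher.Close()

  // Watch the current directory.
  if err := watcher.Add("./"); err != nil {
    log.Fatal(err)
  }

  go func() {
    for {
      select {
      case e, ok := <-watcher.Events:
        if !ok {
          return // the watcher was closed
        }
        // Log the operation and the affected file name.
        log.Println(e.Op.String(), e.Name)
      case err, ok := <-watcher.Errors:
        if !ok {
          return // the watcher was closed
        }
        log.Println(err)
      }
    }
  }()

  <-hang
}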

Build the code above with go build and we get fsnotify (that is the name I gave my binary).

It tells us what happens in the current directory. If you know how to write Go code, you can change it to watch any file or directory you like.

I'll show how it works.

Update Boltdb file

Next we need to write code that stores some data. This updates the BoltDB file and shows whether a temporary file is used to replace the original file. As a simple key-value database, BoltDB evidently should not need access to system directories, only the current directory and the database file.

We wrote the code below to insert data:

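package main

import (
  "encoding/binary"
  "log"
  "os"
  "time"

  bolt "go.etcd.io/bbolt"
)

// InsertNum controls how many records are inserted. It is a uint64 so that
// it can be compared with the value returned by NextSequence.
var InsertNum uint64 = 15000

func main() {
  hang := make(chan bool)
  db, err := bolt.Open("./data.db", 0600, nil)
  if err != nil {
    log.Fatal(err)
  }

  go func() {
    times := time.Now()
    for {
      // Each Update call runs one read/write transaction.
      err := db.Update(func(tx *bolt.Tx) error {
        b, err := tx.CreateBucketIfNotExists([]byte("cats"))
        if err != nil {
          return err
        }

        // NextSequence returns an auto-incrementing integer for the bucket.
        num, err := b.NextSequence()
        if err != nil {
          return err
        }
        log.Println(num)
        byteid := make([]byte, 8)
        binary.BigEndian.PutUint64(byteid, num)

        // Use the same 8 bytes as both key and value.
        if err := b.Put(byteid, byteid); err != nil {
          return err
        }

        if num == InsertNum {
          log.Println(time.Since(times))
          // Exiting here ends the process before this last transaction commits.
          os.Exit(0)
        }
        return nil
      })
      if err != nil {
        log.Fatal(err)
      }
    }

  }()

  <-hang
}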

Note that the variable InsertNum controls how many records are inserted.

Then we run it. (Make sure to start fsnotify first.)

We inserted 900 records and it took 20 s. The watcher shows that only the data.db file is written. So we seem to have the answer to the first question: No!

In addition, we could also get the answer from the source code.

Second

Once we knew the answer to the first question, my mentor quickly asked the second.

  • How much data can a BoltDB file store, or how large can the file be?

We want to know how much.

Let's find out from the source code. BoltDB declares these sizes at the top of its Go files.

From there we learn that minFillPercent and maxFillPercent are 0.1 and 1.0. In db.go we also see that the BoltDB page size is determined by the OS page size.

This means a page may be only partly used.

Importantly, maxMmapStep = 1 << 30 is the 1 GB step used for remapping.

We already know there is no temporary file, which suggests that BoltDB holds its temporary data in memory and accesses the file through mmap.

Using mmap means your BoltDB file should not be larger than the memory space available for mapping.
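
To show what a 1 GB remapping step means in practice, here is a rough sketch of how the mapped size could grow with the file. It is modeled on my reading of the mmap sizing logic in db.go, so treat the details as assumptions: the function name mmapSizeSketch is mine, the 32 KB starting size is recalled from the source, and the real code also enforces a platform-specific maximum that I leave out.

```go
package main

import "fmt"

// maxMmapStep is the constant discussed above: the largest step used when
// remapping (1 GB).
const maxMmapStep = 1 << 30

// mmapSizeSketch (hypothetical name) returns the size to map for a file of
// the given size. Assumed behavior: double from 32 KB up to 1 GB, then grow
// in 1 GB steps, always keeping a multiple of the page size.
func mmapSizeSketch(size, pageSize int64) int64 {
	// Below 1 GB: pick the smallest power of two (at least 32 KB) that fits.
	for i := uint(15); i <= 30; i++ {
		if size <= 1<<i {
			return 1 << i
		}
	}

	// Above 1 GB: round up to the next multiple of maxMmapStep.
	sz := size
	if r := sz % maxMmapStep; r > 0 {
		sz += maxMmapStep - r
	}

	// Keep the result aligned to the page size.
	if sz%pageSize != 0 {
		sz = (sz/pageSize + 1) * pageSize
	}
	return sz
}

func main() {
	const pageSize = 4096 // assumed OS page size
	for _, size := range []int64{1, 9 << 20, 26 << 20, (1 << 30) + 1} {
		fmt.Printf("file size %d -> mapped size %d\n", size, mmapSizeSketch(size, pageSize))
	}
}
```

Under these assumptions the mapping is always at least as large as the file, which is why the size of the file is bounded by how much space the process can map.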

Third

From the questions above we learned a lot about BoltDB. But my mentor still had one more question:

  • In a node, the author sets a 50% capacity limit, beyond which the node is split into two nodes. Does our data file then use only 50% of the space that a BoltDB database file holds?

This limit comes from the node spill strategy, which keeps the size of a node from overflowing the OS page size.
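
The 50% limit and the minFillPercent and maxFillPercent bounds from the second question fit together roughly as sketched here. This is modeled on how I understand the split code, not copied from it; the function name splitThreshold and the 4096-byte page size are assumptions for illustration.

```go
package main

import "fmt"

// Bounds quoted earlier in this article.
const (
	minFillPercent = 0.1
	maxFillPercent = 1.0
)

// splitThreshold (hypothetical name) returns how many bytes of a page a node
// may fill before the rest is moved to a new node. The fill percent is
// clamped to [minFillPercent, maxFillPercent] first.
func splitThreshold(pageSize int, fillPercent float64) int {
	if fillPercent < minFillPercent {
		fillPercent = minFillPercent
	} else if fillPercent > maxFillPercent {
		fillPercent = maxFillPercent
	}
	return int(float64(pageSize) * fillPercent)
}

func main() {
	const pageSize = 4096 // assumed OS page size
	// With the 50% limit, a node is split once it holds about half a page.
	fmt.Println(splitThreshold(pageSize, 0.5))
}
```

With a 4096-byte page and a 50% limit the threshold is 2048 bytes, so a freshly split page starts out about half full.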

Answering that question requires knowing the disk layout of a BoltDB file, which I'll explain in detail in my next article. Of course, we know the space ratio should not exceed 0.5.

For now I'll just give you some data.

I first inserted 100000 records, each with an auto-increment key and a value equal to the key.

1 -> 100000: 977784 bytes in total, because BoltDB stores data in the file as bytes.

We found that the data file used 8.5 MB of space.

Then I inserted 100000 more.

100000 -> 200000: the file expanded to 25.2 MB when I inserted roughly the 110000th record.

In the end the file was still 25.2 MB when it held 200000 records.
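
As a quick check of the first measurement, the space ratio can be computed from the two numbers above. This is only arithmetic on my figures: I assume that "8.5 MB" means 8.5 × 1024 × 1024 bytes and that 977784 bytes is the raw size of the keys and values.

```go
package main

import "fmt"

// utilization returns the fraction of the file taken up by raw key/value bytes.
func utilization(payloadBytes, fileBytes float64) float64 {
	return payloadBytes / fileBytes
}

func main() {
	const payload = 977784.0           // raw bytes for records 1 -> 100000
	const fileSize = 8.5 * 1024 * 1024 // assumed meaning of "8.5 MB"
	fmt.Printf("space ratio: %.2f\n", utilization(payload, fileSize))
}
```

Under these assumptions the ratio is roughly 0.11, well below 0.5.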

Source: How boltdb Write its Data? · Yuque (语雀)