B-trees

were originally designed to work on direct access secondary storage (Magnetic

disk). In a computer system the memory capacity is consists of two parts: primary

memory (uses memory chips) and secondary storage (based on magnetic disk). When

dealing with large datasets, it is often impossible to maintain a whole dataset

in the primary memory. Instead, a small part of the dataset is maintained in

the primary memory and the remaining data is read from a secondary storage 1.

B-tree

is a data structure used for sorting huge datasets for fast retrieval from a

disk. A typical B-tree has a single-digit height, even with millions of entries

2. Which means

that few disk accesses are needed to find the place where data is stored in the

tree. At most two accesses are required to search for any node. “From a

practical point of view, B-trees, therefore, guarantee an access time of less

than 10 ms even for extremely large datasets.” —Dr. Rudolf Bayer,

inventor of the B-tree

1. Description

B-trees

save time by using nodes with many branches (children). Unlike binary tree, in

which each node has only two children. When a node has many children a record

can be found easily by passing through fewer nodes than if there are two

children per node. See the example in Figure 1.

The example

in Figure 1 shows that the red node in a binary tree has depth of three while

the corresponding node in the B-tree has depth of one. Clearly, B-trees locates

desired records faster, assuming that all system parameters are identical. The

difference in depth between binary tree and B-tree is greater in a practical

database than in the example in Figure 1. Real world B-trees height depends on

the number of records in the database.

The

tradeoff is that the decision process at each node in B-tree is more

complicated compared with a binary tree. A complex program is required to

execute B-tree’s operations, but this program runs fast because it is stored in

RAM.

1.1 Definition

A

B-tree of order m (the maximum number of children for each node) is a

self-balancing search tree which satisfies the following properties:

· Every node has at most m children.

· A non-leaf node with k children contains k-1 keys.

· The root has at least two children if it is not a leaf node.

· Every non-leaf node (except root) has at least m?2 children.

· All leaves appear in the same level.

Figure 2: A B-tree of order

four.

Figure

2 shows a B-tree of order four. Each node contains up to three keys, and

internal nodes have up to four children 3.

The height

of an n-key b-tree T of height h with a minimum degree t greater than or equal

to 2,

, For n

The

worst case height is O(log n). Since the branches of a B-tree can be large

compared to other balanced trees, the base of the algorithm tends to be large. But

the number of visited nodes during a search tends to be smaller than required

by other trees. Although this is does not affect the worst case height, B-tree

tends to has smaller heights than other trees 4.

1.2

Basic operations

on B-trees

Conventions:

·

The root of a B-tree is in the main memory (it is

not required to perform Disk-Read operation on the root).

·

Any node passed as a parameter must have had a Disk-Read

operation performed on them.

·

Top down algorithm, starts at the root of the tree.

1.2.1 Search

Searching

a B-tree is similar to searching a binary tree but instead of making two way branching

decision at a node, must make multi-way branching decision depending on the

number of a node’s children. To search for a node in a B-tree, a linear search

must be performed on the values of a node. After finding the value which is greater than or equal to the desired

value, the child pointer to the immediate left of that value is followed. If

all values are less than the desired value, the rightmost child pointer is

followed. The search is terminated as soon as the desired node is found 5.

Inputs: is a pointer

to the root node of a sub-tree.

is a key to be searched in the sub-tree.

is a child pointer 1.

o B-Tree-Search pseudo-code

B-Tree-Search(x,

k)

1 i ? 1

2 while i ? nx and k > keyix

3 do i ? i + 1

4 if i ? nx and k == keyix

5 then return (x, i)

6 if leafx

7 then return NIL

8 else Disk-Read(cix)

9

return B-Tree-Search(cix, k)

Lines 1-3 : Using a linear

search procedure, find the smallest i

such

that k ? keyix, or else they set i to nx

+ 1

Lines 4-5 : Check if the key

is discovered, then return it.

Lines 6-9 : Either end the

search unsuccessfully if x is

a leaf or recurs to search the appropriate sub-tree of x, after performing necessary Disk-Read operation on that child 6.

o B-Tree-Search Running

Time

The number of disk pages accessed by B-Tree-Search

is ?(h) = (logt n), where h is the height of

the B-tree and n is the number of keys in the B-tree. is O(logt n). The while loop time in lines

2-3 for each node is O(t). The running time is O(th)

= O(t logt n) 6.

o B-Tree-Search Examples ( Consider the B-tree in Figure 2 page )

Example 1 : Search for key = 7.

By applying

B-Tree-Search:

Starts at

the root (17), is key = 17? Is key > 17? No, go to left sub-tree.

Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.

Is key = 9? Is key > 9? No, go to sub-tree left of 9.

Next node (7,8), is key = 7? Yes, found.

Example 2 : Search for key = 11.

By applying B-Tree-Search:

Starts at the root (17), is key = 17? Is key

> 17? No, go to left sub-tree.

Next node (4,9), is key = 4? Is key > 4? Yes, move to the next key.

Is key = 9? Is key > 9? Yes, go to the right sub-tree (there is no

more keys in the node to check).

Next node (12), is key = 12? Is key >12? No, the left sub-tree is a

leaf. Which means that 11 is not in the tree.

1.2.2 Create

B-Tree-Create operation builds a B-tree by first creating a new empty root node and

then call B-Tree-Insert to add new keys. B-Tree-Create and B-Tree-Insert

operations are both using Allocate-Node which allocates

one disk page to be used as a new node in O(1) time. When a node is created by Allocate-Node it does not require Disk-Read,

because there is no useful information stored on the disk for that node yet.

o B-Tree-Create pseudo-code

B-Tree-Create(T)

1

x ? Allocate-Node()

2

leafx ? TRUE

3

nx ? 0

4

Disk-Write(x)

5 rootT ?

x

o B-Tree-Create Running

Time is O(1).

1.2.3 Split a node

When

a node becomes full it is necessary to perform split operation. B-Tree-Split-Child moves the median key of node x up into its parent y (it has to be non-full node), where x is

the ith child of y. All keys in x right

of the median key are moved to the new node. The keys left of the median key

remain in the original node x.

The new node becomes the right child of the median key that was

moved to its parent y. The original node x

becomes the left child of the median key that was moved to its

parent y 5.

Inputs: is a non-full

internal node.

is an index.

is

the ith child of x and is the node being split.

Node y originally has 2t – 1 children but is

reduced to t – 1 children by this operation.

is the new child of x.

o B-Tree-Split-Child pseudo-code

B-Tree-Split-Child(x,

i, y)

1

z ? Allocate-Node()

2

leafz ? leafy

3

nz ? t – 1

4

for j ? 1 to t – 1

5

do

keyjz ? keyj+ty

6

if not leafy

7

then

for j ? 1 to t

8

do

cjz ? cj+ty

9

ny ? t – 1

10

for j ? nx + 1 downto i + 1

11

do

cj+1x ? cjx

12

ci+1 ? z

13

for j ? nx downto i

14

do

keyj+1x ? keyjx

15

keyix ? keyty

16

nx ? nx + 1

17

Disk-Write(y)

18

Disk-Write(z)

19 Disk-Write(x)

Lines 1-8 : Create a new node z and give it the larger t –

1 keys and corresponding t children of y.

Line 9 : Adjusts the key count for y.

Lines 10-16 : Insert z as a child of x. To separate y from z,

move the median key from y up to x, and adjust the

key count for x.

Lines 17-19 : write out all modified disk page 6.

o B-Tree-Split-Child Running

Time is ?(t).