OpenBook-V2 Deep Dive—How the limit orders are stored in the form of a tree
OpenBook is a decentralized CLOB on Solana and in this series of blog posts we're going to deep dive into the codebase to figure out how the incoming orders are stored.
So it’s been over a few weeks since I’ve been spending time on this project and figuring out how it works under the hood, but apparently there are no documentations, guides, and absolutely no articles online on going into the technical aspects of this project. So obviously I decided to make one and get more better at rust and finance.
In this particular blog post, I won’t be going into how the order matching works but rather how the orders are stored in the form of a critbit/binary tree, which is the most complex part of the codebase.
Here’s a final drawing I’ve done on the the tree which is filled with orders.
Every market on the openbook dex has two accounts of the struct “BookSide” which holds the tree root and it’s child nodes for both the bid and ask sides.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
pub struct Orderbook<'a> {
pub bids: RefMut<'a, BookSide>,
pub asks: RefMut<'a, BookSide>,
}
#[account(zero_copy)]
pub struct BookSide {
pub roots: [OrderTreeRoot; 2],
pub reserved_roots: [OrderTreeRoot; 4],
pub reserved: [u8; 256],
pub nodes: OrderTreeNodes,
}
#[zero_copy]
pub struct OrderTreeNodes {
pub order_tree_type: u8, // OrderTreeType, but that's not POD
pub padding: [u8; 3],
pub bump_index: u32,
pub free_list_len: u32,
pub free_list_head: NodeHandle,
pub reserved: [u8; 512],
pub nodes: [AnyNode; MAX_ORDERTREE_NODES],
}
The nodes array consists of either leaf or inner nodes , where the leaf represents the actual limit order.
(Removed the other parmas which are not needed for the post)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
pub struct LeafNode {
pub tag: u8,
pub key: u128,
pub owner: Pubkey,
pub quantity: i64,
}
pub struct InnerNode {
pub tag: u8,
pub prefix_len: u32,
pub key: u128,
pub children: [NodeHandle; 2],
}
Every node is assigned a key by encoding the price data and the sequence number which is a value that keeps incrementing.
1
2
3
4
fn new_node_key(price_data: u64, seq_num: u64) -> u128 {
let upper = (price_data as u128) << 64;
upper | (seq_num as u128)
}
Eg: $35 with a sequence number of 1
The price is shifted 64 bits to the left and the sequence number is encoded in the end.
The key: 1000110000000000000000000000000000000000000000000000000000000000000001
The ordertree.rs
consists of the main functions which performs the operations on the tree.
The insert() method modifies the nodes array and returns a handle.
Now we’ll be going into the main function of the tree which is the insert_leaf() method which is what structures the tree and really a confusion the lead me to write this post.
Okay before that let’s see how xoring and leading zeroes is used to compare which value is greater.
4 -> 100
5 -> 101
6 -> 110
7 -> 111
Let’s compare 4 and 5 , if we xor the value is 001
The count of leading zeroes is 2 and the next bit is used to see which of the value is greater. Or just remove the matching values and compare the next bit, if it has 1 then the particular value is greater.
4 - 000
5 - 001 <- Greater
Now 5 and 6
5 - 101
6 - 110
Checking:
5 - 001
6 - 010
The next bit right after the leading zeroes of the xor’d value of 6 is 1 , making it the greater value.
Now let’s get into the insert_leaf() function:
- First it checks if the root node is empty.
1
2
3
4
5
6
7
8
9
10
let mut parent_handle: NodeHandle = match root.node() {
Some(h) => h,
None => {
// create a new root if none exists
let handle = self.insert(new_leaf.as_ref())?;
root.maybe_node = handle;
root.leaf_count = 1;
return Ok((handle, None));
}
};
The insert() method creates a new leaf node and appends it to the nodes array and returns a handle.
- Now if the root exists and we append a value.
There is a loop in the function which sets the parent of where our new price data should be a child of.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
loop {
// require if the new node will be a child of the root
let parent_contents = *self.node(parent_handle).unwrap();
let parent_key = parent_contents.key().unwrap();
let shared_prefix_len: u32 = (parent_key ^ new_leaf.key).leading_zeros();
match parent_contents.case() {
None => unreachable!(),
Some(NodeRef::Inner(inner)) => {
let keep_old_parent = shared_prefix_len >= inner.prefix_len;
if keep_old_parent {
let (child, crit_bit) = inner.walk_down(new_leaf.key);
stack.push((parent_handle, crit_bit));
parent_handle = child;
continue;
};
}
_ => (),
};
This specific line was what confused me for literal 3 days until I figured it out:
let keep_old_parent = shared_prefix_len >= inner.prefix_len
The shared prefix len is the count of current leading zeroes of the new_leaf and the parent.
Before that here’s a simple script that helps us understand why this is a thing.
(Change the 25 and experiment)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
fn main() {
let order_one_num = 25;
let mut order_two_num = 0;
for i in (0..100).rev() {
if i == order_one_num {
println!("----------");
continue;
}
order_two_num = i;
let order_one = new_node_key(order_one_num, 1);
let order_two = new_node_key(order_two_num, 1);
let shared_prefix_len1: u32 = (order_one ^ order_two).leading_zeros();
println!("The shared prefix len {} & {}: {}", order_one_num, order_two_num, shared_prefix_len1);
let crit_bit_mask: u128 = 1u128 << (127 - shared_prefix_len1);
let new_leaf_crit_bit = (crit_bit_mask & order_two) != 0;
let _old_parent_crit_bit = !new_leaf_crit_bit;
if new_leaf_crit_bit {
//println!("The bigger is order two") ;
} else {
//println!("The bigger is order one");
}
}
}
Which outputs:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
The shared prefix len 25 & 49: 58
The shared prefix len 25 & 48: 58
The shared prefix len 25 & 47: 58
The shared prefix len 25 & 46: 58
The shared prefix len 25 & 45: 58
The shared prefix len 25 & 44: 58
The shared prefix len 25 & 43: 58
The shared prefix len 25 & 42: 58
The shared prefix len 25 & 41: 58
The shared prefix len 25 & 40: 58
The shared prefix len 25 & 39: 58
The shared prefix len 25 & 38: 58
The shared prefix len 25 & 37: 58
The shared prefix len 25 & 36: 58
The shared prefix len 25 & 35: 58
The shared prefix len 25 & 34: 58
The shared prefix len 25 & 33: 58
The shared prefix len 25 & 32: 58
The shared prefix len 25 & 31: 61
The shared prefix len 25 & 30: 61
The shared prefix len 25 & 29: 61
The shared prefix len 25 & 28: 61
The shared prefix len 25 & 27: 62
The shared prefix len 25 & 26: 62
----------
The shared prefix len 25 & 24: 63
The shared prefix len 25 & 23: 60
The shared prefix len 25 & 22: 60
The shared prefix len 25 & 21: 60
The shared prefix len 25 & 20: 60
The shared prefix len 25 & 19: 60
The shared prefix len 25 & 18: 60
The shared prefix len 25 & 17: 60
The shared prefix len 25 & 16: 60
The shared prefix len 25 & 15: 59
The shared prefix len 25 & 14: 59
The shared prefix len 25 & 13: 59
The shared prefix len 25 & 12: 59
The shared prefix len 25 & 11: 59
The shared prefix len 25 & 10: 59
The shared prefix len 25 & 9: 59
The shared prefix len 25 & 8: 59
The shared prefix len 25 & 7: 59
The shared prefix len 25 & 6: 59
The shared prefix len 25 & 5: 59
The shared prefix len 25 & 4: 59
The shared prefix len 25 & 3: 59
The shared prefix len 25 & 2: 59
The shared prefix len 25 & 1: 59
The shared prefix len 25 & 0: 59
As we can see that for values greater than 25 the values keeps on decreasing and for the values less than 25 the values keeps on increasing.
THIS IS EXTREMELY IMPORTANT.
Aight let’s leave that for a bit.
Here we have this line:
1
let (child, crit_bit) = inner.walk_down(new_leaf.key)
Which is:
1
2
3
4
5
fn walk_down(&self, search_key: u128) -> (NodeHandle, bool) {
let crit_bit_mask = 1u128 << (127 - self.prefix_len);
let crit_bit = (search_key & crit_bit_mask) != 0;
(self.children[crit_bit as usize], crit_bit)
}
Bigger picture: Say our inner node has the children 6 and 7 , the question is where should 4 be placed? It should be 6 and here’s how:
Like what is happening here?
The confusing part of this function is that how does using the new leaf value for the existing children doing anything?
And here’s the answer:
1
2
3
4
5
6
7
8
>>> bin(4)
'0b100'
>>> bin(5)
'0b101'
>>> bin(6)
'0b110'
>>> bin(7)
'0b111'
Say our inner node is of 5 and 6 , the shared prefix len is 1.
Now say the $4 order comes in and it matched the condition and went into the node , take the shared prefix len and check the next bit value.
For 4 (100) after the first bit it’s 0 , so 4 is going with the 0th child which is 5
Now 7 (111) after the first bit it’s 1, so 4 is going with the 1st child which is 6.
Now let’s go back to our insert_leaf function.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
let crit_bit_mask: u128 = 1u128 << (127 - shared_prefix_len);
let new_leaf_crit_bit = (crit_bit_mask & new_leaf.key) != 0;
let old_parent_crit_bit = !new_leaf_crit_bit;
let new_leaf_handle = self.insert(new_leaf.as_ref())?;
let moved_parent_handle = match self.insert(&parent_contents) {
Ok(h) => h,
Err(e) => {
self.remove(new_leaf_handle).unwrap();
return Err(e);
}
};
let new_parent: &mut InnerNode = cast_mut(self.node_mut(parent_handle).unwrap());
*new_parent = InnerNode::new(shared_prefix_len, new_leaf.key);
new_parent.children[new_leaf_crit_bit as usize] = new_leaf_handle;
new_parent.children[old_parent_crit_bit as usize] = moved_parent_handle;
let new_leaf_expiry = new_leaf.expiry();
let old_parent_expiry = parent_contents.earliest_expiry();
new_parent.child_earliest_expiry[new_leaf_crit_bit as usize] = new_leaf_expiry;
new_parent.child_earliest_expiry[old_parent_crit_bit as usize] = old_parent_expiry;
It’s creating a new Inner node as such that the right will have the greatest value and the left having the least, meaning if we keep on going to one particular side (from children to children) will hit the highest of the leafs (on either side). Left and right depends on the asks or bids.
Now I’ll explain the whole thing with an example which will make the whole thing clear.
(Assume the sequence number is 1 for all.)
An order of $5 comes in.
- Currently the root node is empty.
- A new leaf is created with $5 as the value and added to the nodes array.
- Order for $4 comes in, a new inner node is created with 4 and 5 as the children and 4 being the key for the inner node.
- Order for $6 comes in, The shared prefix len for 4 and 5 is 63 and for 4 and 6 is 62 , so here the inner prefix len is greater than the shared , meaning there is no going inside one of the children, a new inner node is created with $6 being the greater child and the other child being the inner node consisting of 4 and 5. The new inner node being the root with 6 as the key value.
- Another order for 7 comes in , inner prefix len for 6 and 4 is 62 and for 6 and 7 is 63 , here the shared prefix len is greater than the inner prefix len , so we go inside one of the the node , here 7 is greater so we go to the right side where there is just 6 as the leaf node.
- A new inner node is created(replaced) there with 6 and 7 being the children.
(See the image at the starting of the post to get a visual intuition.)
So unless we hit a place where the inner prefix len is greater we keep on going into the children.
For increasing:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// $25
// $27
// $28
// $29
// $31
// $32
// $34
// $35
// $37
// $45
// $25 and $40 | $25 and $27
// 58 62
// $27 and $40 | $27 and $28
// 58 61
// $28 and $40 | $28 and $29
// 58 63
// $29 and $40 | $29 and $31
// 58 62
// $31 and $40 | $31 and $32
// 58 58
// $32 and $40 | $32 and $34
// 60 62
// $34 and $40 | $34 and $35
// 60 63
// $35 and $40 | $35 and $37
// 60 61
// $37 and $40 | $37 and $45
// 62 60
We’re going to add $40 and we can see that it breaks at the node of $45 where the value becomes greater.
For decreasing:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// $25
// $21
// $18
// $15
// $9
// $8
// $5
// $3
// $25 and $4 | $25 and $21
// 59 60
// $21 and $4 | $21 and $18
// 59 61
// $18 and $4 | $18 and $15
// 59 59
// $15 and $4 | $15 and $9
// 60 61
// $9 and $4 | $9 and $8
// 60 63
// $8 and $4 | $8 and $5
// 60 60
// $5 and $4 | $5 and $3
// 63 61
//
Inserting $4 and we can see that it breaks at $3.
If you wanna directly experiment with the tree use this repo: https://github.com/pvnotpv/openbook-v2-order-tree