Now that we know that we want to store our packets in a hash tree, we still have to choose what exactly we are going to implement. This post explains the different parameters we have to consider to make that choice.
What’s our memory?
A WaDeD board has five types of memory we can access, ranging from the smallest to the largest:
- MCU backup registers (80 bytes): these are used to save the registers when entering standby mode. Reading and writing is extremely fast, but this is not multi-purpose memory, and it is not addressable.
- MCU internal EEPROM (4 KB ranging from 0x0808 0000 to 0x0808 0FFF): non-volatile memory, addressable by octet. Writing is slow (up to 3.94 ms), reading is fast. Should be used only for data that we want to keep over the long run and not update frequently.
- MCU internal RAM (16 KB ranging from 0x2000 0000 to 0x2000 3FFF): volatile memory, addressable by octet. Reading and writing is fast. This is our main RAM, and, with the exception of the backup registers, the fastest memory we can access.
- MCU internal flash (128 KB ranging from 0x0800 0000 to 0x0801 FFFF): non-volatile memory, addressable by page. Reading and writing are slow. In normal use, this is where we will store the firmware.
- External FRAM (256 KB): non-volatile memory, addressable by octet over the SPI2 bus. The FRAM can work with a frequency up to 40MHz, so it is only limited by the working frequency of the MCU, which is set at 1MHz. This means access in microseconds, a thousand times faster than flash or EEPROM, but a thousand time slower than RAM.
The RAM is therefore the best memory to use for frequent R/W operations, followed by the FRAM. It is not reasonable to expect frequent writing on the flash and EEPROM, so the normal use of the software should treat those zones as read-only.
What does a tree weight?
It is obvious that we cannot stock the messages in the RAM: we need to store them on non-volatile memory, and there is not enough space to store a satisfactory amount of them. Therefore, it is a no-brainer that they will go to the FRAM. Storing the tree raises more questions: how much does the tree “weighs” anyway?
If well constructed, the nodes of the tree need not store a pointer to their sons (think of a heap for instance), so the internal nodes only need to store, for instance, 8 bytes, if we use 8 bytes hashes as messages fingerprints. With a message size of about 180 bytes, we can store 1024 of them on the FRAM while keeping enough space for the rest of the application, so our tree must have 1024 leaves. The leaves are special, because they do not contain the same things as an internal node: they either contain a whole bucket (less flexible, more economic) or the address of a bucket (more memory consuming, but allows to specify the kind of bucket when inserting the message in the tree). In the later hypothesis, each leave weighs 11 bytes, since the FRAM uses three-octets addresses. The leaves take up a total memory space of 11264 bytes, around 11kB.
With a fixed number of sons in our tree, we can use either a binary tree (with 1023 internal nodes) or a 4-sons tree (with 381 internal nodes) (we could use a 512-sons tree, but that would not be very efficient). That is, the internal part of the tree weighs about 8kB if it is a binary tree, and about 3kB a 4-sons tree (8184 and 2728 octets, to be exact). The total tree weighs either 19 or 14 kB.
How many sons should we have?
What are the pros and cons of a binary tree versus a 4-sons tree? Size is a parameter, as we just saw. But by storing less information, the 4-sons tree is also less precise. When updating a part of the tree, the affected subtree is considerably larger than for a binary tree. It means that in the case two trees do not differ a lot (which, as we saw, is when the hash tree structure comes in handy), we need to make more hash comparisons to identify the leaves that differ. Since hash are what are exchanged through radio during the comparison phase of the transaction, a hash comparison is a costly operation.
In the same time, we saw that FRAM accesses are a thousand times slower than RAM accesses. This means that if we can store something in the RAM, we have a strong incentive to do so. It is not reasonable to store 11 kB of leaves in the 16 kB RAM, but we can consider storing the internal tree. 8 kB is a lot, so storing a 4-sons tree in RAM is easier than storing a binary tree. Can we afford storing a binary tree? A 4-sons tree? Neither? This will need experimentation, and might need last minute adjustments as the firmware becomes more complex and uses more RAM.