Overview
The AVL Heap Library provides a high-performance, memory-efficient implementation of an AVL tree stored in a heap-allocated array. Unlike traditional pointer-based AVL trees, this implementation uses array indexing for navigation, resulting in better cache locality and reduced memory overhead.
The AVL Heap requires a small, fixed-size data structure with two balance bits embedded in that structure; in return, it saves 2 64-bit pointers per data item over traditional AVL trees while keeping its data items in sorted order. The smallest data structure (i.e. 1 64-bit word minus the 2 balance bits) should be as fast as a traditional AVL tree. As the data structure grows to more 64-bit words, AVL Heap insertion and deletion slows down since copying/moving up to one data structure per tree level (AVL Heap) is slower than changing one pointer per tree level (AVL tree).
I provided the idea, then Claude.ai's Sonnet 4.5 created this library and almost all of the documentation. I have not reviewed Claude's code yet, so beware -- there may be bugs. A heap can encode different types of trees; I'm guessing that red-black trees, quad trees, etc. can also be encoded as heaps with similar-to-the-AVL-Heap advantages and disadvantages.
Memory Efficient
Arena-based allocation with on-demand memory commitment. Balance bits embedded in data.
Cache Friendly
Array-based storage provides excellent cache locality compared to pointer-chasing.
Configurable
Choose which bits store balance information based on your data alignment requirements.
Multi-Word
Support for elements with multiple 64-bit words for complex data structures.
Features
Core Capabilities
- O(log n) operations for insert, delete, and search
- Arena memory management with lazy page commitment
- Configurable balance bits (top or bottom 2 bits of any word)
- Multi-word elements for storing complex data
- Key-only or key-value storage modes
- In-order, pre-order, post-order traversals
- Detailed statistics and memory usage reporting
Files
Balance Bit Configurations
| Configuration | Use Case | Constraint | Usable Range |
|---|---|---|---|
BALANCE_TOP_BITS |
Integer keys, counters, IDs | key < 262 | 0 to 4,611,686,018,427,387,903 |
BALANCE_BOTTOM_BITS |
Pointers, aligned addresses | 4-byte aligned (bottom 2 bits = 0) | Any aligned value |
API Reference
Data Types
typedef enum {
BALANCE_TOP_BITS, /* Use bits 62-63 (MSB) */
BALANCE_BOTTOM_BITS /* Use bits 0-1 (LSB) */
} BalanceBitLocation;
typedef struct AVLHeap AVLHeap;
Initialization
AVLHeap* avlh_init(
const char *name, /* Descriptive name for debugging */
size_t initial_capacity, /* Initial array size */
size_t element_words, /* Words per element (min 1) */
size_t max_memory, /* Max arena size (0 = 1 GiB) */
size_t balance_word_idx, /* Which word has balance bits */
BalanceBitLocation balance_location /* TOP or BOTTOM bits */
);
Core Operations
/* Add element */
int avlh_add(AVLHeap *avlh, uint64_t key, const uint64_t *data);
/* Check existence */
int avlh_find(AVLHeap *avlh, uint64_t key);
/* Get data pointer */
uint64_t* avlh_find_element(AVLHeap *avlh, uint64_t key);
/* Delete element */
int avlh_delete(AVLHeap *avlh, uint64_t key);
/* Free memory */
void avlh_free(AVLHeap *avlh);
Traversal & Statistics
void avlh_print_inorder(AVLHeap *avlh);
void avlh_print_preorder(AVLHeap *avlh);
void avlh_print_postorder(AVLHeap *avlh);
void avlh_stats(AVLHeap *avlh);
size_t avlh_get_size(AVLHeap *avlh);
Usage Examples
Example 1: Integer Keys (TOP bits)
/* Create heap for user IDs */
AVLHeap *ids = avlh_init(
"UserIDs",
16, /* initial capacity */
1, /* 1 word per element */
0, /* default memory limit */
0, /* balance bits in word 0 */
BALANCE_TOP_BITS /* use top 2 bits */
);
/* Add some user IDs */
avlh_add(ids, 100, NULL);
avlh_add(ids, 200, NULL);
avlh_add(ids, 150, NULL);
/* Check if ID exists */
if (avlh_find(ids, 150)) {
printf("User 150 found!\n");
}
/* Clean up */
avlh_free(ids);
Example 2: Key-Value Pairs (TOP bits)
/* Create heap for employee salaries */
AVLHeap *salaries = avlh_init(
"Salaries",
16, 2, 0, /* 2 words: key + salary */
0, BALANCE_TOP_BITS
);
/* Add employee data */
uint64_t salary1[] = {75000};
uint64_t salary2[] = {120000};
avlh_add(salaries, 1001, salary1); /* Employee 1001: $75k */
avlh_add(salaries, 1005, salary2); /* Employee 1005: $120k */
/* Retrieve salary */
uint64_t *sal = avlh_find_element(salaries, 1005);
if (sal) {
printf("Employee 1005 earns: $%llu\n", sal[0]);
}
avlh_free(salaries);
Example 3: Pointer Addresses (BOTTOM bits)
/* Create heap for memory addresses */
AVLHeap *ptrs = avlh_init(
"Pointers",
16, 1, 0,
0, BALANCE_BOTTOM_BITS /* use bottom 2 bits */
);
/* Add aligned addresses (must be 4-byte aligned) */
avlh_add(ptrs, 0x1000, NULL); /* OK: aligned */
avlh_add(ptrs, 0x2000, NULL); /* OK: aligned */
avlh_add(ptrs, 0x1001, NULL); /* FAIL: not aligned */
avlh_free(ptrs);
BALANCE_TOP_BITS: Keys must be less than 262
BALANCE_BOTTOM_BITS: Keys must be 4-byte aligned (bottom 2 bits = 0)
Advanced Configuration
Multi-Word Elements
You can store multiple 64-bit words per element and place the balance bits in any word:
/* Balance in first word */
AVLHeap *h1 = avlh_init("Config1", 16, 3, 0, 0, BALANCE_TOP_BITS);
/* Balance in last word */
AVLHeap *h2 = avlh_init("Config2", 16, 3, 0, 2, BALANCE_TOP_BITS);
/* Balance in middle word with bottom bits */
AVLHeap *h3 = avlh_init("Config3", 16, 3, 0, 1, BALANCE_BOTTOM_BITS);
uint64_t data[] = {1111, 2222};
avlh_add(h1, 100, data);
avlh_add(h2, 200, data);
avlh_add(h3, 0x1000, data); /* Must be aligned */
Memory Management
The library uses an arena-based memory allocator with on-demand page commitment:
- Initial reservation: 1 GiB virtual address space (customizable)
- Initial commitment: 128 KiB physical memory
- Automatic growth: Doubles committed memory as needed
- Zero-copy reallocation: New allocations from the same arena
The arena allocator minimizes system calls and fragmentation. Virtual address space is cheap; physical memory is committed only as needed. This allows heaps to grow efficiently without frequent reallocation.
Performance Characteristics
Time Complexity
| Operation | Average Case | Worst Case |
|---|---|---|
| Insert | O(log n) | O(log n) |
| Delete | O(log n) | O(log n) |
| Search | O(log n) | O(log n) |
| Traversal | O(n) | O(n) |
Space Complexity
- Per element: element_words × 8 bytes
- No pointer overhead: Balance bits embedded in data
- No node allocation: Contiguous array storage
- Cache efficient: Sequential memory access patterns
Advantages vs. Traditional AVL Trees
Better Cache Locality
Array-based storage keeps nodes close together, reducing cache misses
Lower Memory Overhead
No pointer storage, balance bits embedded in data words
Simpler Memory Management
Single arena allocation instead of per-node malloc/free
Predictable Performance
No malloc/free during operations, just index arithmetic
This implementation excels when elements have unused bits for balance storage and when cache locality matters more than worst-case memory usage. For sparse trees with many unused array slots, a traditional pointer-based AVL tree may be more space-efficient.