LFU implemented in very basic (no interior mutability) Rust.
Not so long ago a paper on O(1) LFU design (http://dhruvbird.com/lfu.pdf) was pretty popular in caching world. While LFU isn't (on it's own) that good of a general cache algorithm, people do like bragging about implementing a O(1) algorithm :)
The original paper outlines the algorithm using doubly linked list. That's both interesting as a problem for learning Rust (because doubly linked lists are deemed hard) syntax and a counter example of how to implement something in Rust. I came to that conclusion comparing version with doubly linked lists that requires use of interior mutability pattern in Rust. That in turn means runtime vs. compile time checking and generally makes for worse performance overall due to simple problems such as poor compiler-level optimizations.
So as an exercise in writing idiomatic performant Rust that's easy to read, argue about and still correct I added this version.
The design principles are fairly simple:
- store data about keys and how frequent they are using
FrequencyNodestruct - store said structs in a
Vec<FrequencyNode> - at the moment frequency simply maps to index in that list
- values for keys are stored in a
HashMapusingItemstruct Itemitself simply contains the current index ofFrequencyNodestoring the key andBytesfield for data
Assuming Rust's HashMap get, insert and remove operations are O(1) this boils down to question of what's the added complexity coming from storing frequency data in Vec:
getmeans we need to grab theItemfromHashMap, pull key fromVec<FrequencyNode>, assign it to nextVec<FrequencyNode>and incrment it'sindexsetis simply insert intoHashMapand add key toVec<FrequencyNode>at index 0- only worrying point at the moment is adding new
FrequencyNode>since that's O(m) where m isVec.len()
Well, it's written using Vec, HashMap and tokio Bytes. Thus it's limited to being oportunistically (expected/amortized) O(1) rather than guaranteed O(1).
Complexities for Vec and HashMap are sourced from here
At minimum cache has to provide two key functions:
- get - internally this means one
HashMap.get,Vec.get_mutandVec.pushoperations - insert - internally this means
HashMap.getandVec.get_mutandVec.pushoperations
with Vec.get being O(1) and HashMap.get being expected O(1) the only problem is Vec.push which is amortized O(1) in this case.
To read if HashMap makes sense you can read on it here