# Cache: We are Terrible Magicians

ccv uses an application-wide transparent cache to de-duplicate matrix computations.
In the following chapters, I will try to outline how that works, and expose you
to the inner-working of ccv’s core functionalities.

## Initial Signature

**ccv_make_matrix_immutable** computes the SHA-1 hash on matrix raw data, and will
use the first 64-bit as the signature for that matrix.

## Derived Signature

Derived signature is computed from the specific operation that is going to perform.
For example, matrix A and matrix B used to generate matrix C through operation X.
C’s signature is derived from A, B and X.

## A Radix-tree LRU Cache

ccv uses a custom radix-tree implementation with generation information. It imposes
a hard limit on memory usage of 64 MiB, you can adjust this value if you like.
The custom radix-tree data structure is specifically designed to satisfy our 64-bit
signature design. If compile with jemalloc, it can be both fast and memory-efficient.

## Garbage Collection

The matrix signature is important. For every matrix that is freed with **ccv_matrix_free**
directive, it will first check the signature. If it is a derived signature,
**ccv_matrix_free** won’t free that matrix to OS immediately, instead, it will put
that matrix back to the application-wide cache. Sparse matrix, matrix without
signature / with initial signature will be freed immediately.

## Shortcut

For operation X performed with matrix A and B, it will first generate the derived
signature. The signature will be searched in the application-wide cache in hope
of finding a result matrix. If such matrix C is found, the operation X will take
a shortcut and return that matrix to user. Otherwise, it will allocate such matrix,
set proper signature on it and perform the operation honestly.

After finish this, I found that it may not be the most interesting bit of ccv.
But still, hope you found it otherwise :-)

comments powered by