Overview
As a backend software engineer you have likely used Redis. Many people attribute Redis' speed solely to its in-memory design, but there are several additional factors that contribute to its high performance. This article summarizes the main reasons Redis is fast.
1. In-memory storage
Redis is an in-memory database, so compared with disk-based databases it avoids disk I/O overhead. Disk databases must read data into memory before operating on it, which is limited by disk I/O. In contrast, in-memory databases keep data directly in RAM, removing that bottleneck.
The table below shows the performance gap between reading from memory and reading from various storage devices.
| Device | Read speed | Relative to mechanical HDD |
|---|---|---|
| Mechanical hard drive | 0.1 GB/s | 1x (baseline) |
| Solid-state drive | 1.3 GB/s | 13x |
| Memory | 30 GB/s | 300x |
| L3 cache | 190 GB/s | 1,900x |
| L2 cache | 200 GB/s | 2,000x |
| L1 cache | 800 GB/s | 8,000x |
2. Efficient storage structures
Redis stores all key-value pairs in a hash table, enabling fast lookup from key to value. The hash table is essentially an array whose elements are hash buckets, and each bucket holds entries that point to the actual key and value objects. Because values can be of different types (strings, lists, hashes, and so on), an entry stores pointers (memory references) rather than the values themselves; the value pointer is followed to reach the concrete object, even when that object is a collection. Since this one table indexes every key, it is often called the global hash table.
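A minimal sketch in C, assuming a simple chained hash table: the names below (hashTable, entry, hash_key, ht_insert, ht_lookup) are illustrative and do not match Redis's actual dict implementation, but they show the key idea that buckets hold entries whose key and value fields are pointers to separately allocated objects.

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative chained hash table whose entries hold pointers to the
 * key and value objects, not the objects themselves. */
typedef struct entry {
    void *key;            /* pointer to the key object */
    void *value;          /* pointer to the value object (string, list, hash, ...) */
    struct entry *next;   /* chaining resolves hash collisions within a bucket */
} entry;

typedef struct hashTable {
    entry **buckets;      /* array of bucket heads */
    size_t size;          /* number of buckets, kept as a power of two */
    size_t used;          /* number of stored entries */
} hashTable;

/* Toy hash function, for illustration only. */
static size_t hash_key(const char *key, size_t size) {
    size_t h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h & (size - 1);   /* size is a power of two */
}

/* Insert maps the key to a bucket and prepends a new entry. */
void ht_insert(hashTable *ht, char *key, void *value) {
    size_t idx = hash_key(key, ht->size);
    entry *e = malloc(sizeof(entry));
    e->key = key;
    e->value = value;
    e->next = ht->buckets[idx];
    ht->buckets[idx] = e;
    ht->used++;
}

/* Lookup walks the chain in one bucket: O(1) on average. */
void *ht_lookup(hashTable *ht, const char *key) {
    size_t idx = hash_key(key, ht->size);
    for (entry *e = ht->buckets[idx]; e != NULL; e = e->next)
        if (strcmp((const char *)e->key, key) == 0)
            return e->value;
    return NULL;
}
```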
3. Single-threaded model avoids context switches
Running the core command processing in a single thread eliminates context-switch overhead and reduces CPU consumption caused by thread contention. There are no locks to acquire or release in most operations, and no deadlock scenarios that would degrade performance.
4. I/O multiplexing to handle concurrency
Redis uses an epoll-based model for I/O multiplexing on Linux; similar mechanisms exist on other platforms, and Java NIO exposes the same idea through its Selector API. Before epoll there were the older select and poll interfaces. With epoll, a single thread can efficiently monitor and serve many concurrent connections.
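The sketch below is a bare-bones Linux epoll loop, not Redis's own event loop; it omits error handling and command parsing, and the port number is arbitrary. It shows the core pattern: one thread registers every socket with epoll and only services the descriptors that epoll_wait reports as ready.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

#define MAX_EVENTS 1024

int main(void) {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(6380);                 /* illustrative port */
    bind(listen_fd, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_fd, 128);

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listen_fd };
    epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event events[MAX_EVENTS];
    for (;;) {
        /* Block until at least one registered fd is ready. */
        int n = epoll_wait(epfd, events, MAX_EVENTS, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            if (fd == listen_fd) {
                /* New connection: register the client socket too. */
                int client = accept(listen_fd, NULL, NULL);
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = client };
                epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
            } else {
                /* Client sent data: read it; a real server would parse a command. */
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof(buf));
                if (r <= 0) { close(fd); continue; }
                write(fd, buf, r);               /* echo back as a stand-in for a reply */
            }
        }
    }
}
```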
5. Incremental rehashing
If the hash table's bucket array never grew, hash collisions would increase as more keys are inserted and lookups would slow down. To address this, Redis enlarges the hash table (typically doubling its size). Resizing requires moving every entry to a new position, and doing all of that copying at once would block the Redis thread and stall request handling.
To avoid long pauses, Redis performs incremental rehashing. By default Redis maintains two hash tables: hash table 1 and hash table 2. Initially new keys are stored in hash table 1 and hash table 2 has no allocated space. As data grows, Redis starts rehashing:
- Allocate a larger space for hash table 2, for example twice the size of hash table 1
- Remap and copy entries from hash table 1 to hash table 2 incrementally
- Free the space used by hash table 1 once migration completes
Because step 2 can involve a large amount of copying, doing it all at once would block the Redis thread and prevent it from serving other requests. Incremental rehashing spreads the work across operations to avoid long pauses.
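A minimal sketch of the idea, reusing the hypothetical hashTable, entry, and hash_key definitions from the earlier sketch; Redis's real incremental rehash (dictRehash in dict.c) is more involved than this illustration.

```c
/* Simplified incremental rehash over two tables plus a rehash index.
 * Assumes the hashTable, entry, and hash_key definitions shown earlier. */
typedef struct dict {
    hashTable ht[2];   /* ht[0] = current table, ht[1] = larger target table */
    long rehashidx;    /* index of next bucket to migrate; -1 = not rehashing */
} dict;

/* Start rehashing: allocate a table roughly twice as large. */
void dict_start_rehash(dict *d) {
    size_t new_size = d->ht[0].size * 2;
    d->ht[1].buckets = calloc(new_size, sizeof(entry *));
    d->ht[1].size = new_size;
    d->ht[1].used = 0;
    d->rehashidx = 0;
}

/* Migrate a single bucket. Calling a step like this during normal key
 * operations spreads the work out instead of blocking all at once. */
void dict_rehash_step(dict *d) {
    if (d->rehashidx == -1) return;              /* not rehashing */

    /* Skip empty buckets, then move one chain to the new table. */
    while ((size_t)d->rehashidx < d->ht[0].size &&
           d->ht[0].buckets[d->rehashidx] == NULL)
        d->rehashidx++;

    if ((size_t)d->rehashidx < d->ht[0].size) {
        entry *e = d->ht[0].buckets[d->rehashidx];
        while (e) {
            entry *next = e->next;
            size_t idx = hash_key((const char *)e->key, d->ht[1].size);
            e->next = d->ht[1].buckets[idx];     /* re-link into the new table */
            d->ht[1].buckets[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].buckets[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Migration finished: free the old bucket array and swap tables. */
    if (d->ht[0].used == 0) {
        free(d->ht[0].buckets);
        d->ht[0] = d->ht[1];
        d->ht[1] = (hashTable){0};
        d->rehashidx = -1;
    }
}
```

Invoking dict_rehash_step from each insert, lookup, or delete spreads the migration cost evenly across normal traffic, which is the point of the incremental approach.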
6. Cached timestamp
When obtaining system timestamps in typical applications, developers often call System.currentTimeMillis() or new Date().getTime() without a second thought. Each such call goes to the operating system and is relatively expensive. For a single-threaded server like Redis, issuing a system call for the timestamp on every operation would add up. Redis therefore caches the current timestamp and refreshes it periodically with a timer, so getting the time is just a read of the cached value rather than a system call per operation.
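A sketch of the caching pattern in C. The refresher here runs on a separate thread purely to keep the example self-contained; Redis itself refreshes its cached clock on its main thread (for example from its periodic server cron), and all names below are illustrative.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

/* Cached millisecond timestamp, refreshed periodically instead of
 * calling gettimeofday() on every operation. */
static _Atomic long long cached_time_ms;

/* Read the real clock once. */
static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Background refresher standing in for a periodic timer job. */
static void *clock_updater(void *arg) {
    (void)arg;
    for (;;) {
        atomic_store(&cached_time_ms, now_ms());
        usleep(1000);            /* refresh roughly every millisecond */
    }
    return NULL;
}

/* Hot path: code that needs "now" just reads the cached value. */
long long cached_now_ms(void) {
    return atomic_load(&cached_time_ms);
}

int main(void) {
    pthread_t tid;
    atomic_store(&cached_time_ms, now_ms());
    pthread_create(&tid, NULL, clock_updater, NULL);

    /* Example: an expiry check reads the cached clock, not the system clock. */
    long long deadline = cached_now_ms() + 50;
    while (cached_now_ms() < deadline)
        ;                        /* busy-wait only to demonstrate the cheap read */
    printf("cached clock advanced past deadline\n");
    return 0;
}
```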
