Bhushitha Hashan

Posted on Mar 18

Speed vs Truth: Understanding Redis the Way Engineers Actually Do

#systemdesign #redis #softwaredevelopment #distributedsystems

Late Afternoon, Real System, Real Confusion

The office was quieter than usual. Most people had already left, but one corner still had life in it.

A whiteboard covered in boxes, arrows, and half-erased notes.

And in front of it, Arjun.

A few weeks into his internship, he had reached that stage where things no longer looked simple… but also didn’t fully make sense yet.

Behind him, leaning on the desk with a coffee mug that had clearly been refilled too many times, stood Maya, the senior systems engineer.

She watched him stare at the diagram for a while before speaking.

“Alright,” she said, calm and direct. “You’ve been staring at that same box for five minutes. What’s bothering you?”

Arjun didn’t turn immediately. He pointed at the whiteboard.

A box labeled:

Redis

“I get that this is for speed,” he said slowly. “Like… we put it in front of the database so things don’t get slow.”

Maya nodded. “Good. That’s the surface-level answer. Keep going.”

Arjun hesitated, then turned.

“But it feels… fake.”

Maya raised an eyebrow. “Fake?”

“Yeah,” he said. “Like it’s not real storage. It’s just… temporary. So why are we trusting it at all?”

Maya smiled.

“Good,” she said. “Now we can actually start.”

What Redis Actually Is (Explained Like You Mean It)

Maya walked to the whiteboard and drew two boxes.

[ Redis ] -------- [ Database ]
   (RAM)             (Disk)

She tapped the first box.

“Redis lives in memory. RAM. That’s why it’s fast.”

Then she tapped the second.

“The database lives on disk. That’s why it’s slower, but reliable.”

Arjun nodded. “Yeah, I get that part.”

“No,” Maya said, shaking her head slightly. “You understand the words. Not the implication.”

She turned back and wrote:

Redis = Speed, Not Permanence

“Everything about Redis is optimized for one thing: responding fast,” she continued. “Not guaranteeing your data will exist forever.”

Arjun frowned slightly. “So… it can lose data?”

“It will lose data,” Maya corrected.

She started listing on the board:

Memory limit → eviction
Crash → possible data loss
Restart → partial recovery

Then she turned.

“So if you think of Redis as your database, you’ve already made a mistake.”

The First Mental Breakthrough

Arjun crossed his arms, thinking.

“So then what is it actually?”

Maya didn’t answer immediately. Instead, she erased a small section and drew this:

User → App → Redis → Database

“Most requests stop here,” she said, pointing to Redis.

Then she added:

User → App → Redis ❌ (miss) → Database → Redis → User

“This is what happens when Redis doesn’t have the data.”

She stepped back.

“Redis is not the source of truth,” she said.
“It’s a temporary, fast copy of reality.”

Arjun repeated it quietly.

“…temporary copy.”

“Exactly.”

Why We Even Need Redis

Arjun turned back to the board.

“Okay, but why not just make the database faster?”

Maya laughed softly.

“Everyone asks that at some point.”

She drew two timelines.

RAM access  → ~100 nanoseconds
Disk access → microseconds (SSD) to milliseconds (spinning disk)

Then she circled them.

“This difference is massive,” she said. “You don’t ‘optimize’ your way out of physics.”

She drew another diagram:

Without Redis:
User → App → Database (every request)

Then:

With Redis:
User → App → Redis (most requests)
                ↓
             Database (rare)

“Redis exists so your database doesn’t collapse under load.”
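
To put a rough number on "most requests" versus "rare", here is a small arithmetic sketch. The traffic figure and hit percentage are made-up assumptions for illustration, and `database_load` is a hypothetical helper.

```python
def database_load(requests_per_second, hit_percent):
    """Requests per second that still reach the database after caching."""
    return requests_per_second * (100 - hit_percent) // 100


without_cache = database_load(10_000, 0)    # every request hits the database
with_cache = database_load(10_000, 95)      # only the misses do
```

Under these assumed numbers, a 95% hit rate cuts database traffic from 10,000 to 500 requests per second.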

Where Things Start Getting Dangerous

Arjun nodded, following along.

“Okay… so we cache stuff. Makes sense.”

Maya tilted her head slightly.

“That works when data doesn’t change often,” she said. “But what about things that change constantly?”

She wrote:

page views
likes
active users

Then added:

“Where do those updates happen?”

Arjun answered quickly.

“In Redis… because it’s fast.”

Maya nodded.

“And now you’ve just moved the problem.”

The Fear Kicks In

She wrote on the board:

views:article:123 → 10,482

“This number keeps increasing,” she said.

Arjun nodded.

Then she asked:

“What happens if Redis runs out of memory?”

Arjun paused.

“…it deletes something?”

“Yes, if an eviction policy is configured. Otherwise it starts rejecting writes.”

“And if that key gets deleted?”

“…we lose the count.”

Maya crossed her arms.

“Now you see the problem.”
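
The scenario can be reproduced with a toy cache. This is only a sketch: `TinyLRUCache` is a hypothetical class that mimics least-recently-used eviction with a key limit, whereas a real Redis server evicts based on memory and only when its `maxmemory-policy` allows it.

```python
from collections import OrderedDict


class TinyLRUCache:
    """A toy stand-in for a size-limited cache with LRU-style eviction."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()

    def incr(self, key):
        # Increment a counter, creating it at 0 if it is missing.
        self.data[key] = self.data.get(key, 0) + 1
        self.data.move_to_end(key)           # mark as most recently used
        self._evict_if_full()
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        self._evict_if_full()

    def _evict_if_full(self):
        while len(self.data) > self.max_keys:
            self.data.popitem(last=False)    # drop the least recently used key


cache = TinyLRUCache(max_keys=2)
for _ in range(3):
    cache.incr("views:article:123")          # counter reaches 3
cache.set("session:a", "x")
cache.set("session:b", "y")                  # cache is full, the counter is evicted
restarted = cache.incr("views:article:123")  # silently starts again from 1
```

No error is raised anywhere; the counter simply restarts, which is what makes this failure easy to miss.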

The Second Mental Shift: Not All Data Is Equal

Arjun leaned back slightly.

“So we shouldn’t store important stuff there.”

“Exactly,” Maya said. “But let’s define ‘important’ properly.”

She split the board into two sections.

Left Side

Critical Data

payments
orders
balances

She underlined it.

“Lose this, and your system is broken.”

Right Side

High-Speed Data

analytics
counters
sessions

“Lose a bit of this?” she shrugged. “System still works.”

“Now the rules change,” she said.

Two Ways to Write Data

Maya drew two flows.

1. Write-Through

App → Database → Redis

“Safe,” she said. “Slower, but correct.”

2. Write-Behind

App → Redis → (later) Database

“Fast,” she said. “But risky.”

Arjun looked at both.

“So we just pick one?”

Maya shook her head.

“No. We use both. Based on the data.”
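
A minimal sketch of the two write paths, assuming in-memory dictionaries in place of Redis and the database. The function names and the `pending` buffer are hypothetical and exist only for this illustration.

```python
cache = {}      # stands in for Redis
database = {}   # stands in for the durable store
pending = []    # write-behind buffer of (key, value) pairs not yet persisted


def write_through(key, value):
    """Persist first, then refresh the cache."""
    database[key] = value
    cache[key] = value


def write_behind(key, value):
    """Update the cache now and remember to persist later."""
    cache[key] = value
    pending.append((key, value))


def flush():
    """Persist everything buffered by write_behind, in order."""
    while pending:
        key, value = pending.pop(0)
        database[key] = value


write_through("order:1", "paid")                          # critical data
write_behind("views:article:123", 10483)                  # high-speed data
visible_before_flush = "views:article:123" in database    # False until flush runs
flush()
```

Between `write_behind` and `flush`, the new value exists only in the cache and the buffer. If both are lost in that window, so is the write.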

The Trade-Off Nobody Escapes

She turned to him.

“You can’t have perfect speed and perfect safety at the same time.”

Arjun nodded slowly.

“Yeah… that makes sense.”

The Data Loss Gap (This One Matters)

Maya drew a timeline.

Time →
[ Redis updated ] ---- (delay) ---- [ DB updated ]

“This gap,” she said, tapping the space in between, “is where things can go wrong.”

Arjun leaned forward.

“If the system crashes there…”

“…you lose data,” Maya finished.

So We Add a Queue

She erased part of the board and drew a new flow:

App:
  → Update Redis
  → Push event → Queue

Then:

Worker:
  → Read Queue
  → Update Database

Arjun looked at it.

“So now even if Redis crashes…”

“The queue still has the data,” Maya said.

“And if the worker crashes?”

“It resumes from the queue.”

Arjun nodded.

“…okay, that’s solid.”
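
The app and worker flows might look like the sketch below. It rests on several assumptions: a `deque` stands in for a durable queue, dictionaries stand in for Redis and the database, and each event carries the increment itself rather than pointing back at the cached value. All names are hypothetical.

```python
from collections import deque

cache = {}          # stands in for Redis
database = {}       # stands in for the durable store
queue = deque()     # stands in for a durable message queue


def record_view(article_id):
    """App side: bump the fast counter and enqueue an event for later."""
    key = f"views:article:{article_id}"
    cache[key] = cache.get(key, 0) + 1
    queue.append({"key": key, "delta": 1})


def run_worker():
    """Worker side: drain the queue and apply each event to the database."""
    while queue:
        event = queue[0]       # peek, do not remove yet
        database[event["key"]] = database.get(event["key"], 0) + event["delta"]
        queue.popleft()        # remove only after the write succeeded


for _ in range(3):
    record_view(123)
cache.clear()                  # simulate the cache losing everything
run_worker()
```

Because the event is removed only after the database write, a crash between those two steps would replay the event on restart. That is safer than dropping it, but it means the same update can be applied twice.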

When Things Go Wrong (And They Will)

Maya didn’t respond immediately.

Instead, she tilted her head slightly.

“Solid… but not perfect,” she said.

Arjun frowned. “What do you mean?”

“Walk it through again,” Maya said.

Arjun looked back at the board.

“Okay… we write to Redis, push to the queue…”

He paused.

“What if Redis evicts the key before the worker runs?”

Maya nodded. “Keep going.”

“And then… what if the worker fails before writing to the database?”

Now he stopped completely.

“…then the data never reaches the database.”

Maya crossed her arms.

“And Redis already lost it.”

They both looked at the board.

“So we still lose data,” Arjun said quietly.

“Exactly.”

Maya uncapped the marker again.

She drew it out slowly.

1. Write → Redis
2. Push → Queue
3. Redis evicts key ❌
4. Worker fails ❌
5. Data never reaches DB

She stepped back.

“This,” she said, “is the kind of failure that doesn’t crash your system.”

Arjun frowned.

“…it just loses data.”

“Exactly.”

The Subtle Problem With “Safe Enough”

Maya crossed her arms.

“The queue reduces risk,” she said. “But it doesn’t eliminate it.”

Arjun nodded slowly.

“So what do we do?”

Three Things Real Systems Add

Maya held up three fingers.

“Once you reach this level, you start thinking about three things.”

1. Idempotent Writes

“If the worker retries the same update twice,” she said, “nothing should break.”

She wrote:

Bad:
INCR views → duplicates possible

Better:
SET views = value with version

“You design your writes so repeating them is safe.”
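
One way to read "SET views = value with version" is a compare-before-write rule. The sketch below assumes each update carries a monotonically increasing version number; the storage layout and the `apply_versioned_set` helper are hypothetical.

```python
database = {}   # key -> (version, value); stands in for a real table


def apply_versioned_set(key, version, value):
    """Apply the write only if it is newer than what is stored."""
    current = database.get(key)
    if current is not None and current[0] >= version:
        return False             # stale or duplicate: ignore it
    database[key] = (version, value)
    return True


first = apply_versioned_set("views:article:123", 7, 10482)
retry = apply_versioned_set("views:article:123", 7, 10482)   # same message again
older = apply_versioned_set("views:article:123", 6, 10481)   # arrives late
```

Replaying the same update, or receiving an older one late, leaves the stored value unchanged.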

2. Assume the Queue Lies

Arjun blinked. “The queue… lies?”

Maya smiled slightly.

“Not maliciously. But it might deliver the same message twice. Or later than expected.”

She underlined it.

Always assume at-least-once delivery

“Your system must handle duplicates.”
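
If updates have to stay as increments, a common alternative is to deduplicate by message ID. A rough sketch, assuming every message carries a unique `id` field and that the set of processed IDs is kept somewhere durable (here it is just an in-memory set, which would not survive a restart):

```python
database = {"views:article:123": 0}
processed_ids = set()   # in a real system this would need to be durable too


def handle(message):
    """Apply an increment once per message id, however often it is delivered."""
    if message["id"] in processed_ids:
        return False                        # duplicate delivery, skip it
    database[message["key"]] += message["delta"]
    processed_ids.add(message["id"])
    return True


deliveries = [
    {"id": "evt-1", "key": "views:article:123", "delta": 1},
    {"id": "evt-2", "key": "views:article:123", "delta": 1},
    {"id": "evt-1", "key": "views:article:123", "delta": 1},   # redelivered
]
results = [handle(m) for m in deliveries]
```

Three deliveries arrive, but only two distinct events are counted.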

3. Retry Properly

“What if the worker fails?” she continued.

“Retry?” Arjun said.

“Yes. But not blindly.”

She wrote:

Retry with delay
Exponential backoff
Dead-letter queue (failed forever)

“If you don’t handle failures properly,” she said, “they disappear quietly.”
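
The three items on the board can be combined into one small retry loop. This is a sketch under assumptions, not a production worker: `process_with_retry`, its default parameters, and the list used as a dead-letter queue are all hypothetical, and sleeping is optional so the example runs instantly.

```python
dead_letter_queue = []   # messages that failed on every attempt


def process_with_retry(message, handler, max_attempts=4, base_delay=0.5, sleep=None):
    """Try handler(message); back off exponentially, then dead-letter it."""
    delays = []
    for attempt in range(max_attempts):
        try:
            handler(message)
            return True, delays
        except Exception:
            if attempt == max_attempts - 1:
                break                            # out of attempts
            delay = base_delay * (2 ** attempt)  # 0.5, 1.0, 2.0, ...
            delays.append(delay)
            if sleep is not None:
                sleep(delay)                     # e.g. pass time.sleep here
    dead_letter_queue.append(message)            # keep it visible, do not drop it
    return False, delays


calls = {"count": 0}


def flaky_write(message):
    # Fails twice, then succeeds (simulates a brief outage).
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database unavailable")


def broken_write(message):
    raise ConnectionError("database unavailable")


ok, ok_delays = process_with_retry({"id": "evt-1"}, flaky_write)
second_ok, second_delays = process_with_retry({"id": "evt-2"}, broken_write)
```

The first message succeeds on its third attempt. The second exhausts its attempts and lands in `dead_letter_queue`, where someone can inspect it instead of it vanishing.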

The Final Realization

Arjun leaned back.

“So even the ‘safe’ design isn’t actually safe.”

Maya nodded.

“Nothing is perfectly safe,” she said. “You just keep reducing risk.”

Updating the Mental Model

Maya walked back to the board one last time and added a small note next to the system:

[ Redis ] → fast but fragile  
[ Queue ] → safer, but needs retries  
[ Database ] → final truth

Then underneath it, she wrote:

Design for failure, not success

Arjun stared at the board.

“…this got complicated fast.”

“Yeah,” she said. “Welcome to distributed systems.”

But There’s Still a Subtle Problem

Maya smiled slightly.

“But there’s always one more problem.”

She drew:

1. Redis loses key
2. User requests data
3. App checks DB
4. DB is slightly outdated

Arjun’s eyes narrowed.

“So the user sees old data.”

“Exactly.”

Read Repair (The Quiet Fix)

Maya added one more layer:

Cache miss →
  Fetch DB →
  Check queue →
  Merge updates →
  Return →
  Update Redis

Arjun blinked.

“So reading actually fixes the data?”

“Yep.”
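
A simplified sketch of that read path follows. It assumes the pending events can be inspected cheaply, which many real queues do not allow, so treat the `queue` list and `read_with_repair` as hypothetical stand-ins rather than a recipe.

```python
cache = {}                                   # stands in for Redis (key was lost)
database = {"views:article:123": 10480}      # slightly behind
queue = [                                    # events not yet applied by the worker
    {"key": "views:article:123", "delta": 1},
    {"key": "views:article:456", "delta": 1},
    {"key": "views:article:123", "delta": 1},
]


def read_with_repair(key):
    """On a miss, merge the database value with pending queue events."""
    if key in cache:
        return cache[key]
    value = database.get(key, 0)
    value += sum(e["delta"] for e in queue if e["key"] == key)
    cache[key] = value                       # repair the cache
    return value


repaired = read_with_repair("views:article:123")
```

The read returns the database value plus the two pending increments and writes that merged value back into the cache. It cannot recover updates that never reached the queue.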

Everything Comes Together

Maya stepped back and rewrote the system cleanly:

          [ Redis ]
              ↓
User → App → Cache Layer
              ↓
          [ Queue ]
              ↓
          [ Database ]

She pointed at each part.

“Redis gives you speed.”

“Queue gives you safety.”

“Database gives you truth.”

The Final Understanding

Arjun looked at the board for a long moment.

Then he said quietly:

“So Redis isn’t the system.”

Maya smiled.

“It’s part of the system.”

He nodded.

“And the database…”

“…is reality,” she finished.

The One Sentence That Matters

Maya picked up the marker one last time and wrote:

Move fast in Redis, commit safely through queues, trust the database.

Arjun read it once.

Then again.

This time, the diagram didn’t feel confusing.

It felt… inevitable.

If You Walk Away Remembering This

Maya capped the marker and turned.

“Before you leave,” she said, “tell me what you learned.”

Arjun didn’t hesitate this time.

Redis is fast but temporary
Data there can disappear
Important data goes to the database first, then Redis is updated
Fast-changing data can go to Redis first, with a queue
Queues make things safer, but not perfect
Everything is a trade-off

Maya nodded.

“Good. Now you’re not just using Redis anymore.”

Arjun looked back at the board one last time.

“It’s not just about making it fast,” he said.

Maya shook her head slightly.

“It’s about knowing where it can break.”

She picked up her coffee.

“That’s when you start thinking like a systems engineer.”

Top comments (2)

Andre Cytryn • Mar 18

the write-behind + queue pattern is the part most tutorials skip. one gotcha worth calling out: when the queue worker fails to write to the database and the key has already been evicted from redis, you need idempotent writes and at-least-once delivery semantics, otherwise a crash at exactly the wrong moment still loses data silently. the read repair step you mentioned at the end helps catch stale reads, but it doesn't help with data that was never flushed at all. have you run into queue retry strategies in practice for this kind of setup?

Bhushitha Hashan • Mar 19

Yeah that’s a really good callout.
I actually went back and added a section to the article after reading your comment, covering exactly that failure window where Redis evicts the key and the worker fails before the DB write.
And you’re right, in that case the data never makes it anywhere durable, so read repair can’t help at all.
From what I understand so far, that’s where things like idempotent writes, assuming at-least-once delivery, and proper retry strategies (backoff / DLQ, etc.) start becoming necessary. I haven’t implemented a full retry system in practice yet, but that’s exactly what I’m planning to experiment with next.
Appreciate you pointing it out; it definitely pushed both the article and my understanding a level deeper.