DEV Community

Posted on • Originally published at qiita.com

[Side A] Completely Defending Python from OOM Kills: The BytesIO Trap and D-MemFS 'Hard Quota' Design Philosophy

From the Author:
Recently, I introduced D-MemFS on Reddit. The response was overwhelming, confirming that memory management and file I/O performance are truly universal challenges for developers everywhere. This series is my response to that global interest.

🧭 About this Series: The Two Sides of Development

To provide a complete picture of this project, I’ve split each update into two perspectives:

  • Side A (Practical / from Qiita): Implementation details, benchmarks, and technical solutions.
  • Side B (Philosophy / from Zenn): The development war stories, AI-collaboration, and design decisions.

Introduction

If you write in-memory processing in Python, you will eventually encounter this kind of failure:

Killed

Or on Windows, the process simply vanishes without a word. It's an OOM (Out of Memory) kill. Both io.BytesIO and dict will expand limitlessly until memory runs out. The process disappears without you even knowing "where" or "why" it crashed—this is one of the most troublesome pitfalls of Python in-memory processing.

In this article, I will dig into how the Hard Quota design of D-MemFS solves this problem, right from its core design philosophy.

The Problem: BytesIO and dict Swell Limitlessly

First, let's clarify the problem.

from io import BytesIO

buf = BytesIO()

# It won't stop no matter how much you write
# It continues to succeed until physical memory runs out
for i in range(100_000):
    buf.write(b"x" * 10_000)

print(buf.tell())  # 1000000000 bytes (about 0.93 GiB)

This write does not fail. It stubbornly continues succeeding until the OS kills the process.

The same applies to dict.

vfs: dict[str, bytes] = {}
for i in range(100_000):
    vfs[f"file_{i}.bin"] = b"x" * 10_000
# No errors, even after about 1 GB (0.93 GiB) has piled up

Soft Quotas Are Not Enough

An approach like "checking the size and warning after writing" is called a soft quota. But this has a fundamental flaw—the data has already been written.

# Pseudo-implementation of a soft quota (A bad example)
MAX_BYTES = 100 * 1024 * 1024  # 100 MiB
total = 0

def soft_write(buf: BytesIO, data: bytes) -> None:
    global total
    buf.write(data)        # <- Writes first
    total += len(data)
    if total > MAX_BYTES:  # <- Notices after writing
        raise MemoryError("quota exceeded")  # Too late

The moment the threshold is exceeded, the memory has already been consumed. Furthermore, rolling back the written data after throwing an exception is not easy.
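
To make the flaw concrete, here is a small self-contained check. The `soft_write` helper mirrors the bad example above, and the tiny 1 KiB limit is only an illustrative assumption. Even though the exception is raised, the rejected data is already sitting in the buffer.

```python
from io import BytesIO

MAX_BYTES = 1024  # deliberately tiny limit, for illustration only


def soft_write(buf: BytesIO, data: bytes, limit: int = MAX_BYTES) -> None:
    """Soft quota: write first, then compare the buffer size to the limit."""
    buf.write(data)
    if buf.tell() > limit:
        raise MemoryError("quota exceeded")


buf = BytesIO()
soft_write(buf, b"x" * 1000)       # within the limit
try:
    soft_write(buf, b"x" * 1000)   # pushes the buffer over the limit
except MemoryError:
    pass

size_after_error = len(buf.getvalue())
print(size_after_error)  # 2000: the "rejected" bytes are already stored
```

Undoing this would require truncating the buffer by hand in every error path, which is exactly the rollback burden described above.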

D-MemFS's Hard Quota Design

The D-MemFS quota operates on a Central Bank model. Before a write is executed, it checks the remaining quota balance and immediately rejects the write if there isn't enough.

write(data) is called
    ↓
Requests a reservation of len(data) bytes from the Quota Manager
    ↓
Is the balance sufficient?
    YES -> Decreases balance and executes write
    NO  -> raises MFSQuotaExceededError (the write never happens)

Data is never written. The file is not polluted, and you can catch the exception and continue processing.
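
The reserve-then-write flow can be sketched in a few lines of plain Python. This is not D-MemFS's actual implementation: `QuotaManager`, `QuotaBuffer`, and `QuotaExceededError` are hypothetical names invented for this illustration, and the real library may structure things differently.

```python
import threading


class QuotaExceededError(Exception):
    """Hypothetical stand-in for an error such as MFSQuotaExceededError."""


class QuotaManager:
    """Holds a shared byte budget; callers reserve bytes before writing."""

    def __init__(self, max_bytes: int) -> None:
        self._remaining = max_bytes
        self._lock = threading.Lock()

    def reserve(self, size: int) -> None:
        # The balance check and the deduction happen in one critical section.
        with self._lock:
            if size > self._remaining:
                raise QuotaExceededError(
                    f"need {size} bytes, only {self._remaining} left"
                )
            self._remaining -= size

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._remaining


class QuotaBuffer:
    """Append-only buffer that reserves quota before touching its data."""

    def __init__(self, quota: QuotaManager) -> None:
        self._quota = quota
        self._data = bytearray()

    def write(self, data: bytes) -> int:
        self._quota.reserve(len(data))  # raises before any byte is stored
        self._data += data
        return len(data)

    def size(self) -> int:
        return len(self._data)


quota = QuotaManager(max_bytes=1024)
buf = QuotaBuffer(quota)
buf.write(b"x" * 1000)
try:
    buf.write(b"x" * 1000)
    rejected = False
except QuotaExceededError:
    rejected = True

print(rejected, buf.size(), quota.remaining)  # True 1000 24
```

Because the reservation fails before the buffer is touched, there is nothing to roll back: the buffer still holds exactly the first 1000 bytes.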

Code Example: Actually Using the Quota

Basic Quota Settings and Exception Handling

from dmemfs import MemoryFileSystem, MFSQuotaExceededError

# 10 MiB Hard Quota
mfs = MemoryFileSystem(max_quota=10 * 1024 * 1024)
mfs.mkdir("/data")

def safe_write(mfs: MemoryFileSystem, path: str, data: bytes) -> bool:
    """Can continue processing even if writing fails"""
    try:
        with mfs.open(path, "wb") as f:
            f.write(data)
        return True
    except MFSQuotaExceededError as e:
        print(f"[Warning] Skipped writing to {path} due to quota excess: {e}")
        return False

# Success Case
safe_write(mfs, "/data/small.bin", b"x" * (1 * 1024 * 1024))   # 1 MiB → OK

# Failure Case (Exceeds quota)
safe_write(mfs, "/data/big.bin", b"x" * (20 * 1024 * 1024))    # 20 MiB → Skipped

# The file is not polluted (opening 'wb' leaves an empty file, but no data was written)
st = mfs.stat("/data/big.bin")
print(st["size"])  # 0

Processing While Checking the Remaining Quota

from dmemfs import MemoryFileSystem, MFSQuotaExceededError

QUOTA = 64 * 1024 * 1024  # 64 MiB
mfs = MemoryFileSystem(max_quota=QUOTA)
mfs.mkdir("/chunks")

def process_stream(stream, chunk_size: int = 4 * 1024 * 1024):
    """Reads a stream into memory by chunks"""
    chunk_index = 0
    written_paths = []

    for chunk in iter(lambda: stream.read(chunk_size), b""):
        path = f"/chunks/chunk_{chunk_index:04d}.bin"
        try:
            with mfs.open(path, "wb") as f:
                f.write(chunk)
            written_paths.append(path)
            chunk_index += 1
        except MFSQuotaExceededError:
            print(f"Quota reached: Kept up to {chunk_index} chunks")
            break

    return written_paths

Node Count Limit: MFSNodeLimitExceededError

You can also set a limit on the number of files (nodes). This helps to quickly detect bugs that cause the file count to explode.

from dmemfs import MemoryFileSystem, MFSNodeLimitExceededError

# Max 100 files
mfs = MemoryFileSystem(max_nodes=100)
mfs.mkdir("/logs")

for i in range(200):
    try:
        with mfs.open(f"/logs/entry_{i:04d}.log", "xb") as f:
            f.write(f"log entry {i}\n".encode())
    except MFSNodeLimitExceededError:
        print(f"Node limit reached: Stopped at {i} files")
        break

Storage Backends: SequentialMemoryFile and RandomAccessMemoryFile

D-MemFS has two types of storage backends.

SequentialMemoryFile (Sequential)

Implemented internally as a chain of byte sequences (list[bytes]).

  • Fast appending and reading from the beginning
  • Slow random access (needs to traverse chunks)
  • High memory efficiency (fewer allocations)

RandomAccessMemoryFile (Random Access)

Implemented internally as a bytearray.

  • Fast seek + read/write
  • Because it pre-allocates buffers during writing, doing only sequential writing might result in wasted memory.

Auto-Promotion

When default_storage="auto" (the default), it observes the file access pattern and automatically switches backends.

File created -> Starts as SequentialMemoryFile
    ↓
Random access (seek) is detected
    ↓
Automatically promoted to RandomAccessMemoryFile

# You can also explicitly pin the backend
from dmemfs import MemoryFileSystem

mfs_seq = MemoryFileSystem(default_storage="sequential")    # Always sequential
mfs_ra  = MemoryFileSystem(default_storage="random_access") # Always random access
mfs_auto = MemoryFileSystem(default_storage="auto")         # Auto (default)

promotion_hard_limit: Suppressing Promotion of Giant Files

Auto-promotion entails copying data into a bytearray upon random access. If this happens with an extremely large file, memory usage temporarily doubles.

By setting promotion_hard_limit, files exceeding this size are not promoted automatically.

from dmemfs import MemoryFileSystem

# Files 64 MiB or larger will not automatically promote
mfs = MemoryFileSystem(
    max_quota=512 * 1024 * 1024,
    promotion_hard_limit=64 * 1024 * 1024,
)

In enterprise batch processing and data pipelines that handle massive data, this parameter acts as a safety net against memory spikes. Being able to strictly control the upper limit of memory usage also pairs well with Kubernetes memory limits and CI resource constraints, which ties directly into operational stability.
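
The promotion behavior described in this section can be modeled with a toy class. This is only a sketch under stated assumptions: `PromotingFile` and its methods are hypothetical names, not D-MemFS internals, and treating the limit as "promote only when size is at most the limit" is my assumption about the boundary.

```python
from typing import List, Optional, Union


class PromotingFile:
    """Toy file: a list of chunks that may be promoted to one bytearray."""

    def __init__(self, promotion_hard_limit: Optional[int] = None) -> None:
        self._store: Union[List[bytes], bytearray] = []
        self._size = 0
        self._limit = promotion_hard_limit  # None means "always promote"

    @property
    def backend(self) -> str:
        return "random_access" if isinstance(self._store, bytearray) else "sequential"

    def append(self, data: bytes) -> None:
        if isinstance(self._store, bytearray):
            self._store += data
        else:
            self._store.append(data)
        self._size += len(data)

    def read_at(self, offset: int, length: int) -> bytes:
        """Random-access read; promotes first unless the file is too large."""
        if isinstance(self._store, list):
            if self._limit is not None and self._size > self._limit:
                return self._read_from_chunks(offset, length)
            # Promotion copies all chunks into one buffer, so the chunks and
            # the copy coexist briefly: this is the temporary memory spike.
            self._store = bytearray(b"".join(self._store))
        return bytes(self._store[offset:offset + length])

    def _read_from_chunks(self, offset: int, length: int) -> bytes:
        """Slow path: walk the chunks without building a full copy."""
        out = bytearray()
        pos, end = 0, offset + length
        for chunk in self._store:
            nxt = pos + len(chunk)
            if nxt > offset and pos < end:
                out += chunk[max(0, offset - pos):end - pos]
            pos = nxt
            if pos >= end:
                break
        return bytes(out)


small = PromotingFile(promotion_hard_limit=8)
large = PromotingFile(promotion_hard_limit=4)
for f in (small, large):
    f.append(b"abcd")
    f.append(b"efgh")

print(small.read_at(2, 4), small.backend)  # b'cdef' random_access
print(large.read_at(2, 4), large.backend)  # b'cdef' sequential
```

Both files return the same bytes; the oversized one simply stays on the slower chunk-walking path instead of paying for a full copy.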

Memory Accounting: What is Included in the Quota

The quota tracks more than just pure data bytes.

Quota consumption = Bytes of Actual Data + Chunk Overhead

Since SequentialMemoryFile retains data in chunks, a small amount of per-chunk header information is counted as overhead. Because of this, when configuring "Quota = 10 MiB", the memory tracked by the quota stays within 10 MiB, and the space available for actual data is slightly less because of the overhead.

This design prioritizes the guarantee that "the quota is absolutely never exceeded".
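
As a rough worked example of this accounting, the arithmetic below assumes a per-chunk overhead of 64 bytes and 64 KiB chunks. Both numbers are hypothetical values chosen for illustration; the article does not state D-MemFS's real figures.

```python
CHUNK_OVERHEAD = 64  # hypothetical per-chunk bookkeeping cost, in bytes


def quota_charge(chunk_sizes: list[int], overhead: int = CHUNK_OVERHEAD) -> int:
    """Bytes charged against the quota: payload plus per-chunk overhead."""
    return sum(chunk_sizes) + overhead * len(chunk_sizes)


def max_payload(quota: int, chunk_size: int, overhead: int = CHUNK_OVERHEAD) -> int:
    """Largest payload that fits when written in equal-sized chunks."""
    per_chunk = chunk_size + overhead
    full_chunks = quota // per_chunk
    leftover = quota - full_chunks * per_chunk
    tail = max(0, leftover - overhead)  # one final partial chunk, if it fits
    return full_chunks * chunk_size + tail


quota = 10 * 1024 * 1024
payload = max_payload(quota, chunk_size=64 * 1024)
print(payload, quota - payload)  # 10475520 10240
```

Under these assumed numbers, about 10 KiB of a 10 MiB quota goes to bookkeeping, so the payload always lands slightly below the configured limit rather than above it.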

Thread-Safe Atomic Operations

Quota updates are handled atomically under locks.

from dmemfs import MemoryFileSystem
import threading

mfs = MemoryFileSystem(max_quota=10 * 1024 * 1024)
mfs.mkdir("/concurrent")

errors = []

def writer(thread_id: int):
    for i in range(50):
        try:
            path = f"/concurrent/t{thread_id}_f{i}.bin"
            with mfs.open(path, "xb") as f:
                f.write(b"x" * (100 * 1024))  # 100 KiB each
        except Exception as e:
            errors.append(e)

threads = [threading.Thread(target=writer, args=(i,)) for i in range(10)]
for t in threads: t.start()
for t in threads: t.join()

# Excess requests over the quota yield exceptions, but the file system isn't broken
quota_errors = [e for e in errors if "quota" in str(e).lower()]
print(f"Quota exceeded: {len(quota_errors)} times (Normal behavior)")
print("FS Corruption: None")

If the two steps of "checking the quota and writing" are separated, a race condition could occur where another thread cuts in between the check and the write to exhaust the quota. In D-MemFS, this verification and reservation are executed under a single RW lock, completely eliminating this conflict.
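
The effect of doing the check and the reservation under one lock can be demonstrated with the standard library alone. The `Budget` class below is a hypothetical illustration and uses a plain `threading.Lock` rather than an RW lock, so it shows the principle rather than D-MemFS's actual locking.

```python
import threading


class Budget:
    """Shared byte budget with an atomic check-and-reserve step."""

    def __init__(self, total: int) -> None:
        self.remaining = total
        self._lock = threading.Lock()

    def try_reserve(self, size: int) -> bool:
        # No other thread can run between the check and the deduction.
        with self._lock:
            if size > self.remaining:
                return False
            self.remaining -= size
            return True


budget = Budget(total=1000)
results = [0] * 8  # each thread writes only to its own slot


def worker(idx: int) -> None:
    granted = 0
    for _ in range(100):
        if budget.try_reserve(10):
            granted += 1
    results[idx] = granted


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_granted = sum(results) * 10
print(total_granted, budget.remaining)  # 1000 0
```

Eight threads request 8000 bytes in total, yet exactly 1000 bytes are granted and the balance never goes negative, regardless of how the threads interleave.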

A World With Hard Quotas vs. Without

| | No Quota (BytesIO / dict) | D-MemFS Hard Quota |
|---|---|---|
| Behavior on memory exceedance | Process is OOM killed | MFSQuotaExceededError is raised |
| Detection timing | Unnoticed until OS kills it | Detected instantly before write |
| Catching the exception | Impossible (SIGKILL) | Recoverable with try/except |
| Rollback | Impossible | Unnecessary since write hasn't happened |
| File integrity | May be corrupted | Guaranteed |
| Logging / Monitoring | Often lost | Can be logged as an exception |

Quota Configuration Guidelines

Here are practical guidelines regarding what values to set.

import psutil  # third-party package: pip install psutil

def recommended_quota() -> int:
    """
    Example of using a certain percentage of available memory as a quota.
    In production, a fixed value is more predictable.
    """
    available = psutil.virtual_memory().available
    return int(available * 0.25)  # 25% of available memory

# Rule of thumb for actual use cases
QUOTAS = {
    "unit_test":        32 * 1024 * 1024,   # 32 MiB — For testing
    "ci_pipeline":     256 * 1024 * 1024,   # 256 MiB — CI Pipeline
    "batch_processing": 2 * 1024 * 1024 * 1024,  # 2 GiB — Batch processing
}

Basic Principle: Estimate the worst-case input size and set the quota to 1.5 to 2 times that amount. If that exceeds the total memory budget of the process, reconsider the design.
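
This rule of thumb is easy to encode as a guard at startup. The helper below is a hypothetical sketch; `pick_quota` and its parameter names are my own and are not part of D-MemFS.

```python
def pick_quota(
    worst_case_bytes: int,
    process_budget_bytes: int,
    headroom: float = 1.5,
) -> int:
    """Return worst-case size times headroom, or fail if it cannot fit."""
    if not 1.5 <= headroom <= 2.0:
        raise ValueError("headroom should be between 1.5 and 2.0")
    quota = int(worst_case_bytes * headroom)
    if quota > process_budget_bytes:
        # The design itself needs rethinking: the data does not fit.
        raise ValueError(
            f"quota {quota} exceeds process budget {process_budget_bytes}"
        )
    return quota


MIB = 1024 * 1024
quota = pick_quota(worst_case_bytes=100 * MIB, process_budget_bytes=512 * MIB)
print(quota // MIB)  # 150
```

Failing loudly here, before any data is processed, surfaces a sizing problem at deploy time instead of in the middle of a batch run.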

Memory Guard: Detecting Physical Memory Depletion in Advance

While a Hard Quota manages the "budget within the virtual FS," there remains another problem—when the set quota exceeds the physical memory of the host machine.

For example, even if you set max_quota=4GiB, if the machine only has 2 GiB of free memory, the OS will execute an OOM kill before reaching the quota. Hard quotas alone cannot prevent this.

The Memory Guard introduced in v0.3.0 addresses these "OOMs occurring outside the quota."
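
A check of this kind at initialization might look like the following sketch. It is not the library's code: `check_quota_against_memory` is a hypothetical helper, the available-memory source is injected as a callable so the example runs without extra packages (in practice something like `psutil.virtual_memory().available` could be passed in), and raising `MemoryError` in the "raise" case is my assumption.

```python
import warnings
from typing import Callable


def check_quota_against_memory(
    max_quota: int,
    get_available: Callable[[], int],
    action: str = "raise",
) -> bool:
    """Return True if max_quota fits in available memory, else raise or warn."""
    available = get_available()
    if max_quota <= available:
        return True
    message = f"max_quota={max_quota} exceeds available memory {available}"
    if action == "raise":
        raise MemoryError(message)  # assumed exception type, for illustration
    warnings.warn(message, ResourceWarning)
    return False


GIB = 1024 ** 3
two_gib_free = lambda: 2 * GIB  # fixed stand-in for a real memory probe

print(check_quota_against_memory(1 * GIB, two_gib_free))  # True
try:
    check_quota_against_memory(4 * GIB, two_gib_free)
except MemoryError as exc:
    print("rejected:", exc)
```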

3 Modes

| Mode | Behavior |
|---|---|
| "none" | No checks (Default, backward compatible) |
| "init" | Checks if max_quota exceeds available memory at FS initialization |
| "per_write" | Checks physical memory balance on every write (interval specifiable) |

from dmemfs import MemoryFileSystem

# Detect insufficient memory upon initialization (Recommended)
mfs = MemoryFileSystem(
    max_quota=4 * 1024 * 1024 * 1024,     # 4 GiB
    memory_guard="init",
    memory_guard_action="raise",           # If "warn", yields ResourceWarning
)

# Check per write (For stricter use cases)
mfs = MemoryFileSystem(
    max_quota=4 * 1024 * 1024 * 1024,
    memory_guard="per_write",
    memory_guard_action="warn",
    memory_guard_interval=1.0,             # Check interval in seconds
)

The Relationship Between Hard Quotas and Memory Guard

It might be easier to understand with an analogy of a house.

  • Hard Quota = The area of a room. A limit on how much baggage you can place.
  • Memory Guard = The building's load-bearing limit check. Confirming whether the building can withstand that weight in the first place.

Only when both are present is the safety of in-memory processing truly complete.

Design Ingenuity of "per_write" Mode

Since the "per_write" mode queries the OS for physical memory balance every time, there are concerns about performance impact. To address this, the memory_guard_interval parameter can control the check interval. The default is 1 second—if 1 second hasn't passed since the last check, it uses the cached value.
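
The interval-based caching idea can be sketched independently of D-MemFS. `CachedMemoryProbe` is a hypothetical class written for this illustration; the probe and the clock are injected so that the behavior is deterministic and needs no third-party package.

```python
import time
from typing import Callable, Optional


class CachedMemoryProbe:
    """Calls an expensive probe at most once per interval, else uses a cache."""

    def __init__(
        self,
        probe: Callable[[], int],
        interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._probe = probe
        self._interval = interval
        self._clock = clock
        self._last_checked: Optional[float] = None
        self._cached = 0

    def available(self) -> int:
        now = self._clock()
        expired = (
            self._last_checked is None
            or now - self._last_checked >= self._interval
        )
        if expired:
            self._cached = self._probe()  # the only place the OS is queried
            self._last_checked = now
        return self._cached


calls = []


def fake_probe() -> int:
    calls.append(1)
    return 1000 - len(calls)  # pretend memory shrinks on every real query


ticks = iter([0.0, 0.5, 1.2])  # fake timestamps for three consecutive writes
probe = CachedMemoryProbe(fake_probe, interval=1.0, clock=lambda: next(ticks))
readings = [probe.available() for _ in range(3)]
print(readings, len(calls))  # [999, 999, 998] 2
```

Three writes trigger only two real queries; the second write, 0.5 seconds after the first, reuses the cached value.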

# Secures safety while maintaining performance even with high-frequency writes
mfs = MemoryFileSystem(
    max_quota=1 * 1024 * 1024 * 1024,
    memory_guard="per_write",
    memory_guard_action="raise",
    memory_guard_interval=2.0,   # Checks every 2 seconds
)

The Hard Quota guarantees that usage "absolutely never exceeds the quota", and the Memory Guard guarantees that the process "won't keep running while physical memory is lacking". This dual defense is the complete picture of D-MemFS's OOM countermeasures.

Behavior in free-threaded Python (GIL=0)

In the free-threaded build (python3.13t) available from Python 3.13 onwards, there is no GIL, so thread races are more likely to surface. D-MemFS has been tested in GIL=0 environments (369 tests × 3 OS × 3 Python versions), and quota atomicity is guaranteed by explicit locks regardless of the GIL.
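
When running such tests, it can help to confirm that the interpreter really is running without the GIL. The small sketch below uses `sys._is_gil_enabled()` (present in Python 3.13 and later) and the `Py_GIL_DISABLED` build variable; `gil_status` is a hypothetical helper name, and on older interpreters it simply reports the GIL as enabled.

```python
import sys
import sysconfig


def gil_status() -> str:
    """Report whether the GIL is active in the running interpreter."""
    probe = getattr(sys, "_is_gil_enabled", None)
    if probe is None:
        # Interpreters before 3.13 have no such function and always use the GIL.
        return "enabled"
    return "enabled" if probe() else "disabled"


# True only for builds compiled with free-threading support.
free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))

print(f"free-threaded build: {free_threaded_build}, GIL: {gil_status()}")
```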

# Testing in free-threaded Python
python3.13t -c "
from dmemfs import MemoryFileSystem
import threading

mfs = MemoryFileSystem(max_quota=5 * 1024 * 1024)
mfs.mkdir('/test')

def worker(n):
    for i in range(100):
        try:
            with mfs.open(f'/test/w{n}_{i}.bin', 'xb') as f:
                f.write(b'x' * 10240)
        except Exception:
            pass

threads = [threading.Thread(target=worker, args=(i,)) for i in range(20)]
for t in threads: t.start()
for t in threads: t.join()
print('Completed (No crashes)')
"

Conclusion

OOM is a failure that is immensely difficult to debug. Stack traces are rarely left behind, and it's hard to identify which code is the cause. By "proactively rejecting writes that don't fit in the budget," Hard Quotas convert this problem into a catchable exception.

D-MemFS's quota design is based on the philosophy of "No Surprises." Memory usage will never exceed the configured limit, exceptions can be handled with try/except, and the integrity of the file system is always maintained.

If you have ever experienced an OOM failure in in-memory processing, please do give it a try.

pip install D-MemFS

🔗 Links & Resources

If you find this project interesting, a ⭐ on GitHub would be the best way to support my work!
