fapilog
Benchmark Report

Your sinks can be slow.
Your app shouldn't be.

fapilog has been tested against structlog and loguru — each using their recommended async configuration — with a 300ms network sink. Here's what happened.

0.4 ms
p50 at 1,000 RPS: median request latency with a 300ms sink
59 ms
p99 at 1,000 RPS (structlog: 24 s; loguru: error)
99.9%
ERROR + CRITICAL preserved under heavy load shedding

The test

A FastAPI application writes 5 structured log events per request to a simulated network sink with 300ms base latency (typical for HTTP-based log destinations like Loki or CloudWatch). k6 generates constant-rate request traffic for 60 seconds at four load levels: 10, 100, 1,000, and 3,000 RPS. We measure throughput, request latency (p50 and p99), and event preservation.

Each library uses its recommended async configuration: structlog's await logger.ainfo() (thread pool offload), loguru's enqueue=True (background drain thread), and fapilog's out-of-the-box production preset. Same app, same sink, same load — no tuning for anyone.

At 10 requests per second

Light load — ~50 log events per second. Even here, architectural differences show.

Throughput (events written/sec): fapilog 55 · structlog 53 · loguru 5
p99 request latency (lower is better): fapilog 5 ms · structlog 1.1 s · loguru 50 s
p50 request latency (lower is better): fapilog 1 ms · structlog 584 ms · loguru 12 s
fapilog: 5 ms p99 request latency (p50: 1 ms)
  • Actual RPS: 10
  • Throughput: 55 ev/s
  • Events written: 3,006
  • Integrity: 99.9%
  • Memory peak: 101 MB
structlog: 1.1 s p99 request latency (p50: 584 ms)
  • Actual RPS: 10
  • Throughput: 53 ev/s
  • Events written: 2,954
  • Integrity: 99.8%
  • Memory peak: 99 MB
loguru: 50 s p99 request latency (p50: 12 s)
  • Actual RPS: 1
  • Throughput: 5 ev/s
  • Events written: 324
  • Integrity: 99.7%
  • Memory peak: 97 MB

At just 10 RPS, fapilog responds in about 1 ms at p50, while structlog's await ainfo() adds roughly 600 ms of latency per request as each call waits on the sink write. loguru's single drain thread can't keep up even at this load: p50 latency exceeds 12 seconds as the queue grows without bound. All three libraries preserve events at this level, but the latency gap is already orders of magnitude.

At 100 requests per second

Moderate load — ~500 log events per second. The gap widens dramatically.

Throughput (events written/sec): fapilog 126 · structlog 56 · loguru 5
p99 request latency (lower is better): fapilog 6 ms · structlog 2.5 s · loguru 60 s
p50 request latency (lower is better): fapilog 1 ms · structlog 2.2 s · loguru 31 s
fapilog: 6 ms p99 request latency (p50: 1 ms)
  • Actual RPS: 100
  • Throughput: 126 ev/s
  • Events written: 6,912
  • Integrity: 100%
  • Memory peak: 140 MB
structlog: 2.5 s p99 request latency (p50: 2.2 s)
  • Actual RPS: 9
  • Throughput: 56 ev/s
  • Events written: 3,197
  • Integrity: 99.9%
  • Memory peak: 101 MB
loguru: 60 s p99 request latency (p50: 31 s)
  • Actual RPS: 0
  • Throughput: 5 ev/s
  • Events written: 365
  • Integrity: 100%
  • Memory peak: 97 MB

structlog's await ainfo() offloads each log write to a thread pool via run_in_executor. It helps — the event loop isn't directly blocked — but each request still awaits the thread to finish. Under a 300ms sink, the default thread pool saturates quickly, pushing p99 to 2.5 seconds. loguru's enqueue=True uses a single background thread that drains at ~5 events/sec with a 300ms sink, causing unbounded backpressure and 60-second p99 latency.
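The thread-pool coupling is easy to demonstrate in isolation. The sketch below is illustrative, not structlog's internals: `slow_sink_write` is a hypothetical stand-in for the 300ms sink, and the point is that awaiting `run_in_executor` frees the event loop for *other* requests while still tying *this* request's latency to the sink write.

```python
import asyncio
import time

def slow_sink_write(event: str) -> None:
    # Hypothetical stand-in for a 300ms network sink write.
    time.sleep(0.3)

async def handle_request() -> float:
    # The event loop stays free while the thread runs, but THIS request
    # still awaits the thread before it can send its response.
    start = time.perf_counter()
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, slow_sink_write, "request completed")
    return time.perf_counter() - start

elapsed = asyncio.run(handle_request())
print(f"request latency: {elapsed:.2f}s")  # roughly the sink round-trip
```

Under load the default executor's thread count becomes the ceiling: once every thread is occupied by a 300ms write, further requests queue for a slot.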

At 1,000 requests per second

~5,000 log events per second. loguru can't complete the test. structlog is 408x slower.

Throughput (events written/sec): fapilog 172 · structlog 69 · loguru error
p99 request latency (lower is better): fapilog 59 ms · structlog 24 s · loguru error
p50 request latency (lower is better): fapilog 0 ms · structlog 16 s · loguru error
fapilog: 59 ms p99 request latency (p50: 0 ms)
  • Actual RPS: 999
  • Throughput: 172 ev/s
  • Events written: 9,467
  • Integrity: 3%
  • Memory peak: 222 MB
structlog: 24 s p99 request latency (p50: 16 s)
  • Actual RPS: 12
  • Throughput: 69 ev/s
  • Events written: 4,794
  • Integrity: 100%
  • Memory peak: 122 MB
loguru: server unresponsive (no metrics recorded)

At 1,000 RPS, loguru's server errors out entirely — unable to serve any requests. structlog manages only 12 actual RPS with a 24 s p99 latency as its thread pool is overwhelmed. fapilog serves 999 actual RPS with a 59 ms p99 and sub-millisecond p50, activating load shedding to protect request latency.

p50 stays under 1ms at every load level. The p99 tail at high RPS reflects the ~1% of requests that arrive during backpressure transitions — queue resizing, worker scaling, or filter swaps. The median request always completes in under a millisecond.

At 3,000 requests per second

~15,000 log events per second. Beyond the production preset's design point.

Throughput (events written/sec): fapilog 110 · structlog error · loguru error
p99 request latency (lower is better): fapilog 70 ms · structlog error · loguru error
p50 request latency (lower is better): fapilog 0 ms · structlog error · loguru error
fapilog: 70 ms p99 request latency (p50: 0 ms)
  • Actual RPS: 2,996
  • Throughput: 110 ev/s
  • Events written: 6,046
  • Integrity: 1%
  • Memory peak: 122 MB
structlog: server unresponsive (no metrics recorded)
loguru: server unresponsive (no metrics recorded)

At 3,000 RPS, both structlog and loguru error out; their servers become completely unresponsive. fapilog still serves 2,996 actual RPS with a 70 ms p99 and sub-millisecond p50, so request latency remains excellent. Event preservation, however, drops sharply as the default queue overflows. This is the expected behavior when the log event rate exceeds the preset's design point: fapilog prioritizes application responsiveness over event completeness.

This exceeds the production preset's out-of-the-box capacity. The default queue and worker settings are designed for up to ~5,000 log events/sec. At 15,000 ev/s, protected-level preservation degrades. Users should tune max_queue_size, protected_queue_size, and sink_concurrency to match their workload.
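A tuning pass might look like the sketch below. The setting names (max_queue_size, protected_queue_size, sink_concurrency) come from this report; how they are passed to fapilog, and the specific values, are assumptions for illustration, not documented API or recommended numbers.

```python
# Hypothetical tuning sketch — setting names from the report, exact API
# surface and values assumed.
tuned_settings = {
    "profile": "production",
    # Size the queue for the burst you need to absorb: the default
    # design point is ~5,000 ev/s, so a 15,000 ev/s workload needs
    # substantially more headroom.
    "max_queue_size": 100_000,
    # Reserve dedicated capacity for ERROR/CRITICAL so shedding INFO
    # never touches protected levels.
    "protected_queue_size": 10_000,
    # More concurrent sink writes drain the queue faster: with a 300ms
    # sink, 64 in-flight writes sustain roughly 64 / 0.3 ≈ 213 ev/s.
    "sink_concurrency": 64,
}
```

The trade-off is memory: a deeper queue holds more pending events at peak, so size it against your container's memory budget.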

Not just faster — richer

fapilog events average 439 bytes. structlog and loguru average ~130 bytes. The extra bytes aren't bloat — they're operational context that the other libraries don't provide.

fapilog — ~440 bytes/event
{
  "timestamp": "2026-02-07T16:56:09.854Z",
  "level": "INFO",
  "message": "Request completed",
  "diagnostics": {
    "host": "prod-web-01",
    "pid": 86909,
    "python": "3.11.10",
    "service": "api"
  },
  "context": {
    "message_id": "d594...37ca"
  },
  "data": {
    "method": "POST",
    "path": "/api/v1/orders",
    "status_code": 200,
    "correlation_id": "b38f...55d6",
    "latency_ms": 0.62
  }
}
structlog / loguru — ~130 bytes/event
{
  "level": "info",
  "message": "Request completed",
  "method": "POST",
  "path": "/api/v1/orders",
  "status_code": 200,
  "correlation_id": "b38f...55d6"
}



No timestamp. No host. No PID.
No service name. No message ID.
No structured envelope.

fapilog automatically enriches every event with runtime diagnostics (host, PID, Python version, service name), per-event message IDs, ISO timestamps, and a structured envelope that separates operational metadata from business data. All of this is built-in — zero custom code.

On top of this, fapilog's production preset enables three native redactors — URL credential stripping, field masking, and regex-based pattern matching — recursively scanning every log event for sensitive data. structlog and loguru have no equivalent built-in capability. In these benchmarks, fapilog is performing more work per event than the other two libraries.

To get the same metadata and safety from structlog or loguru, you'd write custom processors, context managers, formatters, and redaction logic. fapilog does 3x more work per event, writes 3x more bytes to the sink, runs three redactors on every event, and is still 408x faster.
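To make the gap concrete, here is a minimal sketch of the kind of enrichment function you would have to hand-write (as a structlog processor or a loguru patcher) to approximate fapilog's built-in envelope. Field names mirror the example event above; this is an illustration, not a drop-in equivalent.

```python
import json
import os
import socket
import sys
import time
import uuid

def enrich(event_dict: dict) -> dict:
    """Hand-rolled enrichment approximating fapilog's built-in envelope:
    diagnostics, message ID, ISO timestamp, and a metadata/data split."""
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()) + "Z",
        "level": event_dict.pop("level", "INFO"),
        "message": event_dict.pop("message", ""),
        "diagnostics": {
            "host": socket.gethostname(),
            "pid": os.getpid(),
            "python": "%d.%d.%d" % sys.version_info[:3],
        },
        "context": {"message_id": str(uuid.uuid4())},
        "data": event_dict,  # remaining keys become business data
    }

event = enrich({"level": "INFO", "message": "Request completed",
                "method": "POST", "path": "/api/v1/orders", "status_code": 200})
print(json.dumps(event, indent=2))
```

And this still covers only the metadata: redaction (credential stripping, field masking, pattern matching) would be a second, larger piece of custom code.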

Not all events are equal

At 1,000 RPS with a slow sink, fapilog can't write every event. But it chooses what to shed. With protected_levels, ERROR and CRITICAL events get priority retention while INFO is shed first.

Info: 12% preserved (6,247 / 53,962)
Warning: 7% preserved (199 / 2,978)
Error: 99.9% preserved (2,407 / 2,409)
Critical: 100% preserved (614 / 614)
99.9% of ERROR and CRITICAL events preserved while shedding 88% of INFO traffic to protect request latency. When the queue fills, fapilog's priority-aware queue evicts lower-priority events to make room for protected levels.

The workload emits a realistic level mix: 90% INFO, 5% WARNING, 4% ERROR, 1% CRITICAL. At 100 RPS, fapilog preserves 100% of all levels — no shedding is needed. At 1,000 RPS, the pipeline sheds INFO and WARNING events while protecting every ERROR and CRITICAL. The other libraries have no shedding mechanism — they simply error out under load.
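The idea behind priority-aware eviction can be sketched in a few lines. This is a conceptual model, not fapilog's actual implementation: a bounded queue that, when full, evicts the lowest-priority unprotected event to make room for higher-priority arrivals.

```python
PRIORITY = {"CRITICAL": 3, "ERROR": 2, "WARNING": 1, "INFO": 0}
PROTECTED = {"ERROR", "CRITICAL"}

class ShedQueue:
    """Bounded queue that sheds low-priority events first, so ERROR and
    CRITICAL survive. Conceptual sketch, not fapilog's implementation."""
    def __init__(self, maxsize: int):
        self.maxsize = maxsize
        self.events = []   # list of (level, message)
        self.shed = 0

    def put(self, level: str, message: str) -> bool:
        if len(self.events) < self.maxsize:
            self.events.append((level, message))
            return True
        # Full: find the lowest-priority unprotected event to evict.
        victim = min(
            (i for i, (lvl, _) in enumerate(self.events) if lvl not in PROTECTED),
            key=lambda i: PRIORITY[self.events[i][0]],
            default=None,
        )
        if victim is None or PRIORITY[level] <= PRIORITY[self.events[victim][0]]:
            self.shed += 1   # incoming event is lowest priority: drop it
            return False
        self.shed += 1       # evict the victim, keep the new event
        self.events[victim] = (level, message)
        return True

q = ShedQueue(maxsize=4)
for lvl in ["INFO", "INFO", "INFO", "INFO", "ERROR", "CRITICAL"]:
    q.put(lvl, "event")
levels = [lvl for lvl, _ in q.events]
```

In the walk-through above, the four INFO events fill the queue, and the later ERROR and CRITICAL events each evict an INFO rather than being dropped.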

Why the difference

Thread-based async isn't the same as purpose-built async.

fapilog — async pipeline
1
Request arrives
2
queue.put() — non-blocking
<1ms
3
Response sent immediately
4
Async workers drain queue with 64 concurrent writes
async
structlog — thread pool
1
Request arrives
2
await run_in_executor()
~300ms
3
Event loop free but request waits for thread
4
Thread pool saturates under load — requests queue for slots
queued

structlog's await ainfo() offloads the write to a thread, freeing the event loop. But each request still awaits completion of that thread: the response doesn't return until the 300ms write finishes. At scale, the thread pool becomes the bottleneck. loguru's enqueue=True uses a single background drain thread that processes writes sequentially; with a 300ms sink that caps drain at a few events per second, so the queue grows without bound.

fapilog decouples completely: the log event goes into an async queue in under 1ms, the response returns immediately, and a pool of async workers drains to the sink with 64 concurrent writes. Under pressure, adaptive backpressure scales workers and sheds low-priority traffic rather than blocking requests.
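The decoupled pattern described above can be sketched with stdlib asyncio. This is a minimal model of the architecture, not fapilog's code: the "request handler" enqueues without blocking, and a worker pool drains the queue with concurrency capped by a semaphore.

```python
import asyncio

async def sink_write(event: str) -> None:
    await asyncio.sleep(0.05)  # stand-in for a slow network sink

async def worker(queue, sem, written):
    while True:
        event = await queue.get()
        if event is None:          # sentinel: shut this worker down
            queue.task_done()
            break
        async with sem:            # cap concurrent in-flight sink writes
            await sink_write(event)
        written.append(event)
        queue.task_done()

async def main() -> int:
    queue = asyncio.Queue(maxsize=1000)
    sem = asyncio.Semaphore(64)    # up to 64 concurrent writes
    written = []
    workers = [asyncio.create_task(worker(queue, sem, written))
               for _ in range(8)]
    # "Request handler": enqueue is non-blocking, so the response could
    # be sent immediately after put_nowait returns.
    for i in range(100):
        queue.put_nowait(f"event-{i}")
    for _ in workers:
        queue.put_nowait(None)     # one sentinel per worker
    await asyncio.gather(*workers)
    return len(written)

count = asyncio.run(main())
```

The handler's cost is one `put_nowait` call regardless of sink latency; only the drain rate, not request latency, depends on the sink.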

What “recommended” means

Each library's documented, first-party mechanism for non-blocking log writes.

fapilog

0 lines changed — config preset
# built-in preset
profile = "production"

# enables:
#   async queue + workers
#   concurrent sink writes
#   protected ERROR/CRITICAL
#   3 native redactors

structlog

~1 line per log call site
# replace every log call:
- logger.info("msg", **kw)
+ await logger.ainfo("msg", **kw)

# requires async context
# each call site must change

loguru

1 line changed
# add enqueue to sink:
logger.add(
    sink,
    enqueue=True
)

# background drain thread
# single-threaded writes

Methodology

Reproducible, automated, verified.

Environment

Platform: Darwin
Python: 3.11.10
Framework: FastAPI + Uvicorn
Load generator: k6
Duration: 60s per test
Logs per request: 5

Sink simulation

Profile: network_typical
Base latency: 300ms
Jitter: ±100ms
Failure rate: 0.1%

Simulates HTTP-based log destinations like Loki, CloudWatch, or Datadog.
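A sink with these parameters can be sketched as below. This is an illustrative model of the profile (300ms base, ±100ms jitter, 0.1% failure), not the benchmark harness's actual code; function and field names are made up for the example.

```python
import asyncio
import random

async def simulated_network_sink(event: dict, rng: random.Random) -> None:
    """network_typical profile: 300ms base latency, ±100ms jitter,
    0.1% transient failure rate (sketch of the simulated sink)."""
    await asyncio.sleep(0.3 + rng.uniform(-0.1, 0.1))
    if rng.random() < 0.001:
        raise ConnectionError("simulated transient sink failure")

async def main() -> float:
    rng = random.Random(42)  # seeded for reproducibility
    loop = asyncio.get_running_loop()
    start = loop.time()
    await simulated_network_sink({"message": "Request completed"}, rng)
    return loop.time() - start

elapsed = asyncio.run(main())
```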

Verification

Every log event carries a unique correlation ID and sequence number. A post-test harness validates event count, uniqueness, JSON structure integrity, and per-level preservation rates.
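A verification pass of this shape can be sketched as follows. The field names (`message_id`, `level`) are assumptions for the example; the real harness's schema and checks may differ.

```python
import json

def verify(lines):
    """Post-test checks: JSON structure integrity, event-ID uniqueness,
    and per-level counts (sketch; field names assumed)."""
    seen = set()
    by_level = {}
    for line in lines:
        event = json.loads(line)                 # structure integrity
        assert event["message_id"] not in seen, "duplicate event"
        seen.add(event["message_id"])
        by_level[event["level"]] = by_level.get(event["level"], 0) + 1
    return by_level

emitted = [
    '{"message_id": "a1", "level": "INFO"}',
    '{"message_id": "a2", "level": "ERROR"}',
    '{"message_id": "a3", "level": "ERROR"}',
]
counts = verify(emitted)
```

Comparing these per-level counts against the number of events each request actually emitted yields the preservation rates reported above.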

Fairness

Each library runs in an isolated process with the identical FastAPI app, sink, and load profile. k6 uses constant-arrival-rate to maintain steady request pressure regardless of server response time. All libraries use their recommended async configuration.

Generated by fapilog-bm · Raw data