fapilog
Benchmark Report

Your sinks can be slow.
Your app shouldn't be.

fapilog has been tested against structlog and loguru — each using their recommended async configuration — with a 300ms network sink. Here's what happened.

0.4 ms
p50 at 1,000 RPS: median request latency with a 300ms sink
59 ms
p99 at 1,000 RPS (structlog: 24 s; loguru: error)
99.9%
ERROR + CRITICAL preserved under heavy load shedding

The test

A FastAPI application writes 5 structured log events per request to a simulated network sink with 300ms base latency (typical for HTTP-based log destinations like Loki or CloudWatch). k6 generates constant-rate request traffic for 60 seconds at four load levels: 10, 100, 1,000, and 3,000 RPS. We measure throughput, request latency (p50 and p99), and event preservation.

Each library uses its recommended async configuration: structlog's await logger.ainfo() (thread pool offload), loguru's enqueue=True (background drain thread), and fapilog's out-of-the-box production preset. Same app, same sink, same load — no tuning for anyone.

At 10 requests per second

Light load — ~50 log events per second. Even here, architectural differences show.

Throughput (events written/sec): fapilog 55 · structlog 53 · loguru 5
p99 request latency (lower is better): fapilog 5 ms · structlog 1.1 s · loguru 50 s
p50 request latency (lower is better): fapilog 1 ms · structlog 584 ms · loguru 12 s
fapilog: 5 ms p99 request latency (p50: 1 ms)
  • Actual RPS: 10
  • Throughput: 55 ev/s
  • Events written: 3,006
  • Integrity: 99.9%
  • Memory peak: 101 MB
structlog: 1.1 s p99 request latency (p50: 584 ms)
  • Actual RPS: 10
  • Throughput: 53 ev/s
  • Events written: 2,954
  • Integrity: 99.8%
  • Memory peak: 99 MB
loguru: 50 s p99 request latency (p50: 12 s)
  • Actual RPS: 1
  • Throughput: 5 ev/s
  • Events written: 324
  • Integrity: 99.7%
  • Memory peak: 97 MB

At just 10 RPS, fapilog responds in about 1 ms at p50, while structlog's await ainfo() adds roughly 600 ms of latency per request as each call waits on the sink write. loguru's single drain thread can't keep up even at this load: p50 latency exceeds 12 seconds as the queue grows without bound. All three libraries preserve events at this level, but the latency gap is already orders of magnitude.

At 100 requests per second

Moderate load — ~500 log events per second. The gap widens dramatically.

Throughput (events written/sec): fapilog 126 · structlog 56 · loguru 5
p99 request latency (lower is better): fapilog 6 ms · structlog 2.5 s · loguru 60 s
p50 request latency (lower is better): fapilog 1 ms · structlog 2.2 s · loguru 31 s
fapilog: 6 ms p99 request latency (p50: 1 ms)
  • Actual RPS: 100
  • Throughput: 126 ev/s
  • Events written: 6,912
  • Integrity: 100%
  • Memory peak: 140 MB
structlog: 2.5 s p99 request latency (p50: 2.2 s)
  • Actual RPS: 9
  • Throughput: 56 ev/s
  • Events written: 3,197
  • Integrity: 99.9%
  • Memory peak: 101 MB
loguru: 60 s p99 request latency (p50: 31 s)
  • Actual RPS: 0
  • Throughput: 5 ev/s
  • Events written: 365
  • Integrity: 100%
  • Memory peak: 97 MB

structlog's await ainfo() offloads each log write to a thread pool via run_in_executor. It helps — the event loop isn't directly blocked — but each request still awaits the thread to finish. Under a 300ms sink, the default thread pool saturates quickly, pushing p99 to 2.5 seconds. loguru's enqueue=True uses a single background thread that drains at ~5 events/sec with a 300ms sink, causing unbounded backpressure and 60-second p99 latency.
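The thread-pool coupling is easy to demonstrate in isolation. The sketch below is illustrative, not structlog's internals: `slow_sink_write` is a hypothetical stand-in for the 300ms sink, and the point is that awaiting `run_in_executor` frees the event loop for *other* requests while still tying *this* request's latency to the sink write.

```python
import asyncio
import time

def slow_sink_write(event: str) -> None:
    # Hypothetical stand-in for a 300ms network sink write.
    time.sleep(0.3)

async def handle_request() -> float:
    # The event loop stays free while the thread runs, but THIS request
    # still awaits the thread before it can send its response.
    start = time.perf_counter()
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, slow_sink_write, "request completed")
    return time.perf_counter() - start

elapsed = asyncio.run(handle_request())
print(f"request latency: {elapsed:.2f}s")  # roughly the sink round-trip
```

Under load the default executor's thread count becomes the ceiling: once every thread is occupied by a 300ms write, further requests queue for a slot.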

At 1,000 requests per second

~5,000 log events per second. loguru can't complete the test. structlog is 408x slower.

Throughput (events written/sec): fapilog 172 · structlog 69 · loguru error
p99 request latency (lower is better): fapilog 59 ms · structlog 24 s · loguru error
p50 request latency (lower is better): fapilog 0 ms · structlog 16 s · loguru error
fapilog: 59 ms p99 request latency (p50: 0 ms)
  • Actual RPS: 999
  • Throughput: 172 ev/s
  • Events written: 9,467
  • Integrity: 3%
  • Memory peak: 222 MB
structlog: 24 s p99 request latency (p50: 16 s)
  • Actual RPS: 12
  • Throughput: 69 ev/s
  • Events written: 4,794
  • Integrity: 100%
  • Memory peak: 122 MB
loguru: server unresponsive (no metrics recorded)

At 1,000 RPS, loguru's server errors out entirely — unable to serve any requests. structlog manages only 12 actual RPS with a 24 s p99 latency as its thread pool is overwhelmed. fapilog serves 999 actual RPS with a 59 ms p99 and sub-millisecond p50, activating load shedding to protect request latency.

p50 stays under 1ms at every load level. The p99 tail at high RPS reflects the ~1% of requests that arrive during backpressure transitions — queue resizing, worker scaling, or filter swaps. The median request always completes in under a millisecond.

At 3,000 requests per second

~15,000 log events per second. Beyond the production preset's design point.

Throughput (events written/sec): fapilog 110 · structlog error · loguru error
p99 request latency (lower is better): fapilog 70 ms · structlog error · loguru error
p50 request latency (lower is better): fapilog 0 ms · structlog error · loguru error
fapilog: 70 ms p99 request latency (p50: 0 ms)
  • Actual RPS: 2,996
  • Throughput: 110 ev/s
  • Events written: 6,046
  • Integrity: 1%
  • Memory peak: 122 MB
structlog: server unresponsive (no metrics recorded)
loguru: server unresponsive (no metrics recorded)

At 3,000 RPS, both structlog and loguru error out; their servers become completely unresponsive. fapilog still serves 2,996 actual RPS with a 70 ms p99 and sub-millisecond p50, so request latency remains excellent. Event preservation, however, drops sharply as the default queue overflows. This is the expected behavior when the log event rate exceeds the preset's design point: fapilog prioritizes application responsiveness over event completeness.

This exceeds the production preset's out-of-the-box capacity. The default queue and worker settings are designed for up to ~5,000 log events/sec. At 15,000 ev/s, protected-level preservation degrades. Users should tune max_queue_size, protected_queue_size, and sink_concurrency to match their workload.
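A tuning pass might look like the sketch below. The setting names (max_queue_size, protected_queue_size, sink_concurrency) come from this report; how they are passed to fapilog, and the specific values, are assumptions for illustration, not documented API or recommended numbers.

```python
# Hypothetical tuning sketch — setting names from the report, exact API
# surface and values assumed.
tuned_settings = {
    "profile": "production",
    # Size the queue for the burst you need to absorb: the default
    # design point is ~5,000 ev/s, so a 15,000 ev/s workload needs
    # substantially more headroom.
    "max_queue_size": 100_000,
    # Reserve dedicated capacity for ERROR/CRITICAL so shedding INFO
    # never touches protected levels.
    "protected_queue_size": 10_000,
    # More concurrent sink writes drain the queue faster: with a 300ms
    # sink, 64 in-flight writes sustain roughly 64 / 0.3 ≈ 213 ev/s.
    "sink_concurrency": 64,
}
```

The trade-off is memory: a deeper queue holds more pending events at peak, so size it against your container's memory budget.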

Not just faster — richer

fapilog events average 439 bytes. structlog and loguru average ~130 bytes. The extra bytes aren't bloat — they're operational context that the other libraries don't provide.

fapilog — ~440 bytes/event
{
  "timestamp": "2026-02-07T16:56:09.854Z",
  "level": "INFO",
  "message": "Request completed",
  "diagnostics": {
    "host": "prod-web-01",
    "pid": 86909,
    "python": "3.11.10",
    "service": "api"
  },
  "context": {
    "message_id": "d594...37ca"
  },
  "data": {
    "method": "POST",
    "path": "/api/v1/orders",
    "status_code": 200,
    "correlation_id": "b38f...55d6",
    "latency_ms": 0.62
  }
}
structlog / loguru — ~130 bytes/event
{
  "level": "info",
  "message": "Request completed",
  "method": "POST",
  "path": "/api/v1/orders",
  "status_code": 200,
  "correlation_id": "b38f...55d6"
}



No timestamp. No host. No PID.
No service name. No message ID.
No structured envelope.

fapilog automatically enriches every event with runtime diagnostics (host, PID, Python version, service name), per-event message IDs, ISO timestamps, and a structured envelope that separates operational metadata from business data. All of this is built-in — zero custom code.

On top of this, fapilog's production preset enables three native redactors — URL credential stripping, field masking, and regex-based pattern matching — recursively scanning every log event for sensitive data. structlog and loguru have no equivalent built-in capability. In these benchmarks, fapilog is performing more work per event than the other two libraries.

To get the same metadata and safety from structlog or loguru, you'd write custom processors, context managers, formatters, and redaction logic. fapilog does 3x more work per event, writes 3x more bytes to the sink, runs three redactors on every event, and is still 408x faster.
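To make the gap concrete, here is a minimal sketch of the kind of enrichment function you would have to hand-write (as a structlog processor or a loguru patcher) to approximate fapilog's built-in envelope. Field names mirror the example event above; this is an illustration, not a drop-in equivalent.

```python
import json
import os
import socket
import sys
import time
import uuid

def enrich(event_dict: dict) -> dict:
    """Hand-rolled enrichment approximating fapilog's built-in envelope:
    diagnostics, message ID, ISO timestamp, and a metadata/data split."""
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S", time.gmtime()) + "Z",
        "level": event_dict.pop("level", "INFO"),
        "message": event_dict.pop("message", ""),
        "diagnostics": {
            "host": socket.gethostname(),
            "pid": os.getpid(),
            "python": "%d.%d.%d" % sys.version_info[:3],
        },
        "context": {"message_id": str(uuid.uuid4())},
        "data": event_dict,  # remaining keys become business data
    }

event = enrich({"level": "INFO", "message": "Request completed",
                "method": "POST", "path": "/api/v1/orders", "status_code": 200})
print(json.dumps(event, indent=2))
```

And this still covers only the metadata: redaction (credential stripping, field masking, pattern matching) would be a second, larger piece of custom code.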

Not all events are equal

At 1,000 RPS with a slow sink, fapilog can't write every event. But it chooses what to shed. With protected_levels, ERROR and CRITICAL events get priority retention while INFO is shed first.

Info: 12% preserved (6,247 / 53,962)
Warning: 7% preserved (199 / 2,978)
Error: 99.9% preserved (2,407 / 2,409)
Critical: 100% preserved (614 / 614)
99.9% of ERROR and CRITICAL events preserved while shedding 88% of INFO traffic to protect request latency. When the queue fills, fapilog's priority-aware queue evicts lower-priority events to make room for protected levels.

The workload emits a realistic level mix: 90% INFO, 5% WARNING, 4% ERROR, 1% CRITICAL. At 100 RPS, fapilog preserves 100% of all levels — no shedding is needed. At 1,000 RPS, the pipeline sheds INFO and WARNING events while protecting every ERROR and CRITICAL. The other libraries have no shedding mechanism — they simply error out under load.
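The idea behind priority-aware eviction can be sketched in a few lines. This is a conceptual model, not fapilog's actual implementation: a bounded queue that, when full, evicts the lowest-priority unprotected event to make room for higher-priority arrivals.

```python
PRIORITY = {"CRITICAL": 3, "ERROR": 2, "WARNING": 1, "INFO": 0}
PROTECTED = {"ERROR", "CRITICAL"}

class ShedQueue:
    """Bounded queue that sheds low-priority events first, so ERROR and
    CRITICAL survive. Conceptual sketch, not fapilog's implementation."""
    def __init__(self, maxsize: int):
        self.maxsize = maxsize
        self.events = []   # list of (level, message)
        self.shed = 0

    def put(self, level: str, message: str) -> bool:
        if len(self.events) < self.maxsize:
            self.events.append((level, message))
            return True
        # Full: find the lowest-priority unprotected event to evict.
        victim = min(
            (i for i, (lvl, _) in enumerate(self.events) if lvl not in PROTECTED),
            key=lambda i: PRIORITY[self.events[i][0]],
            default=None,
        )
        if victim is None or PRIORITY[level] <= PRIORITY[self.events[victim][0]]:
            self.shed += 1   # incoming event is lowest priority: drop it
            return False
        self.shed += 1       # evict the victim, keep the new event
        self.events[victim] = (level, message)
        return True

q = ShedQueue(maxsize=4)
for lvl in ["INFO", "INFO", "INFO", "INFO", "ERROR", "CRITICAL"]:
    q.put(lvl, "event")
levels = [lvl for lvl, _ in q.events]
```

In the walk-through above, the four INFO events fill the queue, and the later ERROR and CRITICAL events each evict an INFO rather than being dropped.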

Why the difference

Thread-based async isn't the same as purpose-built async.

fapilog — async pipeline
1
Request arrives
2
queue.put() — non-blocking
<1ms
3
Response sent immediately
4
Async workers drain queue with 64 concurrent writes
async
structlog — thread pool
1
Request arrives
2
await run_in_executor()
~300ms
3
Event loop free but request waits for thread
4
Thread pool saturates under load — requests queue for slots
queued

structlog's await ainfo() offloads the write to a thread, freeing the event loop. But each request still awaits completion of that thread: the response doesn't return until the 300ms write finishes. At scale, the thread pool becomes the bottleneck. loguru's enqueue=True uses a single background drain thread that processes writes sequentially; with a 300ms sink that caps drain at a few events per second, so the queue grows without bound.

fapilog decouples completely: the log event goes into an async queue in under 1ms, the response returns immediately, and a pool of async workers drains to the sink with 64 concurrent writes. Under pressure, adaptive backpressure scales workers and sheds low-priority traffic rather than blocking requests.
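The decoupled pattern described above can be sketched with stdlib asyncio. This is a minimal model of the architecture, not fapilog's code: the "request handler" enqueues without blocking, and a worker pool drains the queue with concurrency capped by a semaphore.

```python
import asyncio

async def sink_write(event: str) -> None:
    await asyncio.sleep(0.05)  # stand-in for a slow network sink

async def worker(queue, sem, written):
    while True:
        event = await queue.get()
        if event is None:          # sentinel: shut this worker down
            queue.task_done()
            break
        async with sem:            # cap concurrent in-flight sink writes
            await sink_write(event)
        written.append(event)
        queue.task_done()

async def main() -> int:
    queue = asyncio.Queue(maxsize=1000)
    sem = asyncio.Semaphore(64)    # up to 64 concurrent writes
    written = []
    workers = [asyncio.create_task(worker(queue, sem, written))
               for _ in range(8)]
    # "Request handler": enqueue is non-blocking, so the response could
    # be sent immediately after put_nowait returns.
    for i in range(100):
        queue.put_nowait(f"event-{i}")
    for _ in workers:
        queue.put_nowait(None)     # one sentinel per worker
    await asyncio.gather(*workers)
    return len(written)

count = asyncio.run(main())
```

The handler's cost is one `put_nowait` call regardless of sink latency; only the drain rate, not request latency, depends on the sink.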

What “recommended” means

Each library's documented, first-party mechanism for non-blocking log writes.

fapilog

0 lines changed — config preset
# built-in preset
profile = "production"

# enables:
#   async queue + workers
#   concurrent sink writes
#   protected ERROR/CRITICAL
#   3 native redactors

structlog

~1 line per log call site
# replace every log call:
- logger.info("msg", **kw)
+ await logger.ainfo("msg", **kw)

# requires async context
# each call site must change

loguru

1 line changed
# add enqueue to sink:
logger.add(
    sink,
    enqueue=True
)

# background drain thread
# single-threaded writes

Methodology

Reproducible, automated, verified.

Environment

Platform: Darwin
Python: 3.11.10
Framework: FastAPI + Uvicorn
Load generator: k6
Duration: 60s per test
Logs per request: 5

Sink simulation

Profile: network_typical
Base latency: 300ms
Jitter: ±100ms
Failure rate: 0.1%

Simulates HTTP-based log destinations like Loki, CloudWatch, or Datadog.
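A sink with these parameters can be sketched as below. This is an illustrative model of the profile (300ms base, ±100ms jitter, 0.1% failure), not the benchmark harness's actual code; function and field names are made up for the example.

```python
import asyncio
import random

async def simulated_network_sink(event: dict, rng: random.Random) -> None:
    """network_typical profile: 300ms base latency, ±100ms jitter,
    0.1% transient failure rate (sketch of the simulated sink)."""
    await asyncio.sleep(0.3 + rng.uniform(-0.1, 0.1))
    if rng.random() < 0.001:
        raise ConnectionError("simulated transient sink failure")

async def main() -> float:
    rng = random.Random(42)  # seeded for reproducibility
    loop = asyncio.get_running_loop()
    start = loop.time()
    await simulated_network_sink({"message": "Request completed"}, rng)
    return loop.time() - start

elapsed = asyncio.run(main())
```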

Verification

Every log event carries a unique correlation ID and sequence number. A post-test harness validates event count, uniqueness, JSON structure integrity, and per-level preservation rates.
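A verification pass of this shape can be sketched as follows. The field names (`message_id`, `level`) are assumptions for the example; the real harness's schema and checks may differ.

```python
import json

def verify(lines):
    """Post-test checks: JSON structure integrity, event-ID uniqueness,
    and per-level counts (sketch; field names assumed)."""
    seen = set()
    by_level = {}
    for line in lines:
        event = json.loads(line)                 # structure integrity
        assert event["message_id"] not in seen, "duplicate event"
        seen.add(event["message_id"])
        by_level[event["level"]] = by_level.get(event["level"], 0) + 1
    return by_level

emitted = [
    '{"message_id": "a1", "level": "INFO"}',
    '{"message_id": "a2", "level": "ERROR"}',
    '{"message_id": "a3", "level": "ERROR"}',
]
counts = verify(emitted)
```

Comparing these per-level counts against the number of events each request actually emitted yields the preservation rates reported above.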

Fairness

Each library runs in an isolated process with the identical FastAPI app, sink, and load profile. k6 uses constant-arrival-rate to maintain steady request pressure regardless of server response time. All libraries use their recommended async configuration.

Generated by fapilog-bm · Raw data