Architecting for Concurrency: Wolverine's Approach to Shared Resources

by Brad Jolicoeur
04/09/2026

You've got a race condition. Two handlers read the same inventory value, both decrement it, both save. Classic read-check-write pattern. Your instinct is to reach for a distributed lock—Redis, Postgres advisory locks, whatever—because it feels safe. It maps to the mutex you learned in school. It's familiar.

I learned early in my distributed systems work that this pattern creates more problems than it solves.

The lock works fine in testing. It works fine in staging. Then you hit production load and discover that distributed locks don't just prevent race conditions—they create new failure modes that are worse than the original bug. Lock contention escalates. Timeouts cascade. The thundering herd after lock release creates spikes that trigger circuit breakers. The irony: the more you need the lock to hold under load, the more it hurts you.

While researching Wolverine, I discovered an approach I hadn't been aware of, one that rendered the saga in my Heisenbug Hunting article unnecessary. It wasn't just a code tweak; it was a built-in capability Wolverine already had waiting.

This article is about why distributed locks fail under load, and how Wolverine's concurrency primitives—optimistic concurrency and partitioned sequential messaging—solve the problem architecturally instead of through runtime coordination.

Why Locks Fail Under Load

Let's start with what actually happens when you introduce a distributed lock into a high-throughput async system.

You've got 50 concurrent ReserveInventory messages arriving for the same SKU. You add a Redis lock keyed on ItemId. The first handler acquires the lock, processes the message, decrements inventory, saves, releases the lock. The other 49 handlers wait.
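To make the failure modes concrete, here's roughly what that lock wrapper looks like. This is an illustrative sketch using StackExchange.Redis, not a recommendation; the key scheme, polling interval, and TTL are arbitrary, and `ReserveInventory` is the same inventory command record shown later in the article:

```csharp
using StackExchange.Redis;

public class LockedInventoryHandler
{
    private readonly IDatabase _redis;

    public LockedInventoryHandler(IDatabase redis) => _redis = redis;

    public async Task Handle(ReserveInventory command)
    {
        var lockKey = $"inventory-lock:{command.ItemId}";
        var owner = Guid.NewGuid().ToString(); // token proving we own the lock

        // This loop is where the other 49 handlers live: polling Redis,
        // burning network round-trips, waiting their turn
        while (!await _redis.LockTakeAsync(lockKey, owner, TimeSpan.FromSeconds(5)))
        {
            await Task.Delay(50);
        }

        try
        {
            // read-check-write on inventory happens here, "protected" by the lock
        }
        finally
        {
            // Releases only if we still hold the lock. If the 5s TTL expired
            // mid-work, another handler already took it and the race
            // you were preventing happened anyway.
            await _redis.LockReleaseAsync(lockKey, owner);
        }
    }
}
```

Every problem described below is visible in those dozen lines: the polling loop, the TTL guess, the release-time stampede.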

This is where the problems start:

1. Lock contention escalates non-linearly with concurrent requests

With 2 concurrent requests, one waits. With 10, nine wait. With 50, forty-nine wait—and they're not just waiting for the first handler to finish. They're competing with each other to acquire the lock when it's released. The more load you add, the longer each handler spends blocked, which means more handlers pile up waiting. It's a non-linear degradation curve. At some concurrency threshold, your throughput collapses entirely.
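A back-of-envelope model makes the curve visible. If k requests contend for one lock and each holds it for t milliseconds, the average request waits behind (k − 1)/2 others and the last request waits behind all k − 1:

```csharp
// Average and worst-case wait behind a single lock, ignoring acquisition
// overhead and the thundering-herd cost (so these numbers are floors)
static double AvgWaitMs(int concurrent, double holdMs)
    => (concurrent - 1) / 2.0 * holdMs;

static double WorstWaitMs(int concurrent, double holdMs)
    => (concurrent - 1) * holdMs;

// 50 concurrent requests, 50 ms of work each:
// AvgWaitMs(50, 50)   -> 1225 ms average wait
// WorstWaitMs(50, 50) -> 2450 ms for the last request in line
```

A 50 ms handler now has a multi-second tail before it even starts its own work.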

2. The thundering herd on lock release

When the lock releases, all 49 waiting handlers wake up simultaneously and compete to acquire it. One wins. Forty-eight go back to sleep. This creates a burst of CPU and network activity every time the lock releases—exactly the kind of spike that triggers circuit breakers or overwhelms connection pools. Instead of smooth sustained throughput, you get jagged spikes of contention.

3. Timeout brittleness

You set a lock timeout—say, 5 seconds—because you can't risk a crashed process holding the lock forever. But now you've got a new failure mode: what happens if processing legitimately takes 6 seconds under load? The lock expires, another handler acquires it, and now you've got two handlers running concurrently on the same resource. The race condition you were trying to prevent just happened anyway, but now it's harder to debug because it only occurs when processing is slow.

Alternatively, you skip the timeout and rely on heartbeats or lock ownership verification. Now you've introduced distributed coordination logic that has to be correct under all failure scenarios. A network partition between the handler and Redis can leave the lock held indefinitely, blocking all other requests.

4. Under load, locks make your system slower

Here's the kicker: distributed locks force serialization and blocking. Every handler pays the latency cost of acquiring the lock, waiting for the lock, and releasing the lock—even when there's no actual contention. You've added a coordination round-trip to every single message, whether it needs it or not.

At low concurrency, this overhead is negligible. At high concurrency, it's catastrophic. You wanted to prevent race conditions. Instead, you've built a system that degrades gracefully until it hits a concurrency threshold, then falls off a cliff.

Here's what that looks like in practice. Your p99 SLO is 200ms. Your handler normally completes in 50ms. Under load, 49 other requests are queued ahead of yours—each holding the lock long enough to do their work. Now your handler waits 2-3 seconds for a lock to release. You haven't changed any business logic. You haven't introduced a slow query or an external API call. You just added a lock. And now you're blowing your SLO on every single request, not because your code is slow, but because it's waiting in line. Backend SLOs are measured in milliseconds. Distributed locks under load are measured in seconds. That gap is your SLO breach.

The more you need the lock to hold—the more concurrent requests you're handling—the worse the lock performs. That's backwards.

The Problem Being Solved

Before we talk about solutions, let's be clear about the problem. Here's the classic race condition in a Wolverine message handler for inventory reservation:

using Marten;
using Wolverine;

namespace TicketingSystem;

public record ReserveInventory(string ItemId, int Quantity, string OrderId);

public class InventoryHandler
{
    public async Task Handle(ReserveInventory command, IDocumentSession session)
    {
        var item = await session.LoadAsync<InventoryItem>(command.ItemId);
        
        if (item == null)
            throw new InvalidOperationException($"Item {command.ItemId} not found");
        
        if (item.Available >= command.Quantity)
        {
            item.Available -= command.Quantity;
            session.Store(item);
            // Wolverine's Marten transactional middleware commits the session
            // after Handle() returns, so no explicit SaveChangesAsync() here
        }
        else
        {
            throw new InvalidOperationException("Insufficient inventory");
        }
    }
}

public class InventoryItem
{
    public string Id { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public int Available { get; set; }
}

The race is right there: Load → check Available → decrement → Store. When two handlers run concurrently on the same ItemId, they both load Available = 10, both decrement to 9, both save 9. You just oversold by one unit.

The reflex is to add a lock around this entire block. But there's a better way.

Fix 1: Optimistic Concurrency with IVersioned

Marten supports optimistic concurrency control out of the box. Instead of preventing concurrent access with a lock, you detect conflicting updates and retry with fresh state.

Here's the idiomatic Wolverine pattern:

using Marten;
using Wolverine;
using Wolverine.Marten;

public record ReserveInventory(string ItemId, int Quantity, string OrderId);

public class InventoryItem : IVersioned
{
    public string Id { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public int Available { get; set; }
    public Guid Version { get; set; }  // Marten's IVersioned contract: stamped with a new Guid on every update
}

public class InventoryHandler
{
    // Wolverine runs Validate() before Handle(); returning Stop halts processing
    public static HandlerContinuation Validate(ReserveInventory command, InventoryItem item)
        => item.Available >= command.Quantity
            ? HandlerContinuation.Continue
            : HandlerContinuation.Stop; // insufficient inventory: stop this message

    // [Entity(Required = true)] tells Wolverine to auto-load InventoryItem
    // by matching command.ItemId to InventoryItem.Id
    public static IMartenOp Handle(
        ReserveInventory command, 
        [Entity(Required = true)] InventoryItem item)
    {
        item.Available -= command.Quantity;
        // Return the operation—Wolverine/Marten execute it
        return MartenOps.Store(item);
    }
}

Let me break down why this works:

IVersioned interface: When InventoryItem implements IVersioned, Marten stamps a new Guid into the Version property on every update and verifies the stored version hasn't changed during the transaction. If two handlers load the same version and both try to save, the second save throws a ConcurrencyException.

Wolverine's automatic retry: When Marten throws ConcurrencyException, Wolverine catches it and retries the handler automatically with fresh state. The retried handler loads the updated inventory value (9 instead of 10) and makes the correct decision. The race condition resolves itself through retries.

[Entity(Required = true)]: Wolverine loads the entity automatically by matching command.ItemId to InventoryItem.Id. No IDocumentSession.LoadAsync() boilerplate. Pure dependency injection at the messaging layer.

IMartenOp return type: The handler returns the operation to perform rather than executing it directly. Wolverine and Marten execute the operation, manage the transaction, and handle concurrency exceptions. The handler stays pure—it just describes what should happen.
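Depending on your Wolverine version and defaults, you may want the retry behavior spelled out rather than implied. Here's a sketch using Wolverine's error-handling policies—the `opts.Policies.OnException<T>().RetryTimes(...)` form is an assumption, so check your version's API for the exact shape:

```csharp
using Marten.Exceptions;
using Wolverine;

builder.Host.UseWolverine(opts =>
{
    // When Marten detects a version conflict, re-run the handler with
    // fresh state instead of sending the message to the dead-letter queue
    opts.Policies.OnException<ConcurrencyException>().RetryTimes(3);
});
```

With an explicit policy like this, the retry budget is visible in one place instead of buried in framework defaults.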

When to use optimistic concurrency

This approach works beautifully for low to medium contention scenarios:

  • You've got concurrent requests, but they're spread across many different resources
  • Retries are infrequent enough that they don't create a retry storm
  • The cost of occasionally re-executing handler logic is acceptable

It breaks down when you've got high contention on a small set of hot resources:

  • Flash sales where hundreds of requests target the same SKU within seconds
  • Reservation systems where many users compete for limited inventory
  • Scenarios where retries themselves create enough load to trigger cascading failures

At high contention, optimistic concurrency degrades into a retry storm. Every handler conflicts, every handler retries, and you're burning CPU re-executing the same logic over and over. You're not blocking threads like with locks, but you're still wasting resources.

That's where partitioned sequential messaging comes in.

Fix 2: Partitioned Sequential Messaging

Instead of detecting conflicts at the handler level (optimistic concurrency) or preventing them with locks (distributed locking), you can eliminate the race condition structurally by ensuring messages for the same resource never run concurrently in the first place.

This is what partitioned sequential messaging does.

The key insight: don't coordinate at the handler level, coordinate at the routing level.

Messages with the same partition key (like ItemId) route to the same local queue and process sequentially. Messages with different partition keys route to different queues and process in parallel. The race condition becomes architecturally impossible—not because you're detecting it or preventing it, but because the system doesn't allow same-resource messages to run concurrently.

Here's what it looks like in Wolverine:

using Wolverine;

// Marker interface for commands that operate on inventory items
public interface IInventoryCommand
{
    string ItemId { get; }
}

public record ReserveInventory(string ItemId, int Quantity, string OrderId) 
    : IInventoryCommand;

public record AdjustInventory(string ItemId, int Delta, string Reason)
    : IInventoryCommand;

// In Program.cs / host setup
var builder = WebApplication.CreateBuilder(args);

builder.Host.UseWolverine(opts =>
{
    // Configure message partitioning
    opts.MessagePartitioning
        .ByMessage<IInventoryCommand>(x => x.ItemId)  // Partition key
        .PublishToPartitionedLocalMessaging("inventory", 4, topology =>
        {
            topology.MessagesImplementing<IInventoryCommand>();
        });
});

var app = builder.Build();
app.Run();

Let me break down what's happening:

IInventoryCommand interface: This marker interface requires an ItemId property. Any command operating on inventory implements this interface. Now you've got a contract that says "all inventory commands have an ItemId that can be used for partitioning."

ByMessage<IInventoryCommand>(x => x.ItemId): This tells Wolverine to hash messages by their ItemId and route them to a specific partition. All messages with ItemId = "SKU-42" go to the same partition. Messages with ItemId = "SKU-99" go to a different partition.

PublishToPartitionedLocalMessaging("inventory", 4, ...): This creates 4 parallel local queue channels. Wolverine hashes the ItemId to determine which channel handles that message. Within each channel, messages are processed sequentially. Across channels, messages run in parallel.

The result: If 50 concurrent ReserveInventory messages arrive for "SKU-42", they all route to the same channel and process one at a time. If those same 50 messages are spread across 10 different SKUs, they distribute across the 4 channels and run in parallel. Same-resource operations are serialized. Different-resource operations are parallelized.
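The routing decision is just consistent hashing over the partition key. Wolverine's actual hash is an implementation detail and will differ, but the idea reduces to something like this (FNV-1a used here purely for illustration):

```csharp
public static class PartitionRouter
{
    // Maps a partition key to one of N sequential queues.
    // string.GetHashCode() is randomized per process in modern .NET,
    // so a real router needs a deterministic hash like FNV-1a.
    public static int PartitionFor(string itemId, int partitionCount)
    {
        uint hash = 2166136261;
        foreach (var ch in itemId)
        {
            hash = (hash ^ ch) * 16777619;
        }
        return (int)(hash % (uint)partitionCount);
    }
}

// Every "SKU-42" message maps to the same partition and runs sequentially;
// "SKU-99" hashes to its own partition and runs in parallel.
```

The important property is determinism: the same ItemId always lands on the same queue, so ordering per resource is guaranteed without any runtime coordination.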

The handler code stays exactly the same—it's the same clean code we showed in the optimistic concurrency section:

public class InventoryHandler
{
    public static HandlerContinuation Validate(ReserveInventory command, InventoryItem item)
        => item.Available >= command.Quantity
            ? HandlerContinuation.Continue
            : HandlerContinuation.Stop; // insufficient inventory: stop this message

    public static IMartenOp Handle(
        ReserveInventory command, 
        [Entity(Required = true)] InventoryItem item)
    {
        item.Available -= command.Quantity;
        return MartenOps.Store(item);
    }
}

No saga state. No coordination logic. No distributed locks. The handler expresses the business rule: validate, decrement, store. Wolverine's partitioning infrastructure ensures messages for the same ItemId never run concurrently, so the race condition can't happen.

Why this scales better than locks

Partitioned sequential messaging gives you natural backpressure and graceful degradation:

No blocking: Messages queue in Wolverine's durable inbox. They don't block threads—they wait their turn in the partition. A slow handler for "SKU-42" doesn't block handlers for "SKU-99". Failures are isolated per partition.

No thundering herd: When a message finishes, the next message in that partition's queue starts processing. There's no sudden wake-up of 49 competing handlers. Throughput is smooth and predictable.

Tunable parallelism: You control the partition count (4 in the example above). More partitions = more parallelism, but also more chance of hash collisions sending unrelated items to the same partition. Tune based on your workload. If you've got 1000 SKUs and most traffic hits 10 of them, 4–8 partitions might be perfect. If traffic is evenly distributed, 16–32 partitions might give better throughput.

Observability: Wolverine tracks queue depth per partition. You can see exactly which ItemId values are hot and whether partitions are balanced. If partition 3 has 5000 queued messages and partition 1 has 50, you know your partitioning strategy needs tuning (or you've got a legitimate hot SKU that's under heavy load).

Graceful degradation under load: As concurrency increases, messages queue in their respective partitions. Latency increases linearly with queue depth, but throughput stays consistent. You don't hit a concurrency cliff where the system suddenly stops processing messages because everyone's blocked waiting for locks.

Why this is simpler than sagas

You might be thinking: "Can't I use a saga to coordinate access to inventory?" Absolutely—sagas are a legitimate approach for managing contention. I've used them successfully in systems where I needed to orchestrate complex multi-step workflows.

Sagas excel at complex stateful coordination: multi-step order fulfillment, payment processing with retries and compensations, long-running processes that span multiple services. They're designed to track state across multiple steps and coordinate decisions based on that state.

But for simple resource serialization—the "make sure inventory operations for the same SKU run in sequence" problem—partitioning is cleaner:

Saga state overhead: Every saga instance is persisted to Marten. You're reading and writing saga state on every message, even when the saga doesn't track meaningful workflow state—just coordination. That's database I/O you don't need for simple serialization.

Simpler mental model: Partitioning is a routing rule, not a workflow construct. "Messages for the same item run in order" is easier to reason about than "create a saga instance per item that coordinates handler execution."

Less code: Partitioning is a configuration block in Program.cs. Sagas require defining saga state, saga handlers, correlation logic, and cleanup rules. For resource serialization, that's additional ceremony.

Sagas are the right tool when you need stateful coordination across complex workflows. Partitioning is the right tool when you need sequential execution per resource. Choose based on the problem you're solving.

When to Use Which

Here's how I think about choosing between these approaches:

Use optimistic concurrency (IVersioned) when:

  • You've got low to medium contention across many resources
  • Retries are infrequent and acceptable
  • You want the simplest possible solution with minimal configuration
  • You're already using Marten and Wolverine defaults

Use partitioned sequential messaging when:

  • You've got high contention on a small set of hot resources (flash sales, limited inventory)
  • You need predictable throughput under sustained load
  • You want structural guarantees that same-resource operations never run concurrently
  • You're willing to tune partition count based on your workload

Use distributed locks when:

  • You're coordinating across multiple services that don't share a message broker
  • You need a lock that spans operations outside Wolverine's control (e.g., calling an external API that isn't idempotent)
  • You're doing one-time initialization or leader election

Notice I didn't say "never use locks." There are legitimate use cases—particularly for one-time coordination or leader election. But for high-throughput message processing where you're worried about race conditions on shared resources, locks are almost never the right answer.

The Mental Model Shift

The hardest part isn't the code—it's the thinking.

When you come from a synchronous background, your instinct is to prevent concurrent access. Mutex, semaphore, lock—these are the tools you learned. They work beautifully in a single-process, multi-threaded environment where lock acquisition is cheap and failure modes are well-understood.

In a distributed async system, that mental model breaks down. Lock acquisition is a network round-trip. Lock ownership verification requires heartbeats or TTLs. A crashed process can hold a lock indefinitely unless you've built sophisticated ownership tracking. The very act of preventing concurrent access introduces distributed coordination, which is inherently fragile.

The shift is this: stop thinking about preventing concurrent access. Start thinking about designing the system so contention can't happen structurally.

Partitioned sequential messaging does that. You're not coordinating at runtime—you're routing at message ingestion time. By the time a handler runs, the system has already ensured that no other handler for the same resource can run concurrently. The race condition isn't prevented. It's impossible.

That's what "designing concurrency into the architecture" looks like in practice.

And when you can't partition (maybe you've got operations that span multiple resources), optimistic concurrency gives you a fallback that retries intelligently instead of blocking. You're still not preventing access—you're detecting conflicts and resolving them through retries.

Locks should be the last resort, not the first instinct.

Conclusion

I've spent years managing contention in distributed systems through sagas, upserts, and deferred message patterns. Those approaches worked, but they required careful design and ongoing tuning. When I saw the failure modes that distributed locks create—contention escalation, timeout brittleness, cascading failures under load—it reinforced why I'd learned to avoid them.

What Wolverine offered wasn't just a different implementation—it was a more elegant architectural approach. Don't coordinate access. Design the system so coordination isn't necessary.

Wolverine's partitioned sequential messaging does that. Same-resource messages route to the same queue and run sequentially. Different-resource messages run in parallel. The race condition can't happen because the architecture prevents it structurally.

For cases where partitioning doesn't fit, optimistic concurrency with IVersioned gives you retry-based conflict resolution without blocking. It's not perfect under high contention, but it's far more resilient than distributed locks.

The next time you're tempted to add a Redis lock around a message handler, stop and ask: "Can I partition this?" If the answer is yes—and it usually is—you'll build a system that scales gracefully instead of falling off a cliff under load.

For a deeper dive on Wolverine's partitioning options and configuration, see the official partitioning documentation.

And if you want to see how to hunt down these kinds of race conditions in the first place—how to make intermittent bugs reproducible through stress testing and chaos engineering—check out the companion post: Heisenbug Hunting in Async .NET Systems.

Because the best fix in the world doesn't matter if you can't find the bug first.
