Reliability and architecture
Connection isolation
Each broker alias owns one RabbitMQConnectionManager, which caches independent producer and consumer connections per thread. Producer channels enable publisher confirms.
The separation prevents three common problems: publishing from a handler re-entering the consumer I/O loop, heartbeat frames colliding with concurrent operations, and a consumer reconnect destroying a healthy publisher.
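The per-thread, per-role caching described above can be sketched roughly as follows. This is an illustration only, not the RabbitMQConnectionManager implementation: the class name PerThreadConnections, the connect factory, and the drop method are all hypothetical.

```python
import threading


class PerThreadConnections:
    """Illustrative cache: one producer and one consumer connection per thread."""

    ROLES = ("producer", "consumer")

    def __init__(self, connect):
        # `connect` is a hypothetical factory: it takes a role name and
        # returns a new connection object for that role.
        self._connect = connect
        self._local = threading.local()

    def _cache(self):
        # threading.local gives every thread its own attribute namespace,
        # so each thread sees its own dictionary here.
        cache = getattr(self._local, "connections", None)
        if cache is None:
            cache = self._local.connections = {}
        return cache

    def get(self, role):
        if role not in self.ROLES:
            raise ValueError(f"unknown role: {role!r}")
        cache = self._cache()
        if role not in cache:
            cache[role] = self._connect(role)
        return cache[role]

    def drop(self, role):
        # Forget only the named role for the calling thread, so a consumer
        # reconnect leaves that thread's producer connection in place.
        self._cache().pop(role, None)
```

Because the two roles never share a connection object, dropping the consumer entry after a failure does not disturb the publisher that a handler on the same thread may be using.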
Delivery guarantees
Flask-RMQ provides at-least-once processing building blocks, not exactly-once delivery:
- persistent messages plus durable queues survive normal broker restarts;
- confirms report whether the broker accepted a publication;
- mandatory=True reports unroutable messages;
- a process can still crash after applying a side effect but before acking, causing redelivery.
Consumers must therefore be idempotent. Put a message/event ID in the payload, insert it into a table with a unique constraint in the same transaction as the business change, and skip duplicate IDs.
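A minimal sketch of that deduplication pattern, using the standard-library sqlite3 module so it runs without a broker. The table names, the payload fields (event_id, account, amount), and the function names are assumptions made for illustration; a real handler would use the application's own database and schema.

```python
import sqlite3


def create_schema(db):
    # event_id is the primary key, so a second insert of the same ID fails.
    db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    db.execute(
        "CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)"
    )
    db.execute("INSERT INTO balances VALUES ('acct-1', 0)")
    db.commit()


def handle_deposit(db, payload):
    """Apply a deposit once per event_id.

    Returns True if the change was applied, False for a duplicate delivery.
    """
    try:
        # `with db` wraps both statements in one transaction: it commits on
        # success and rolls back if either statement raises.
        with db:
            db.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)",
                (payload["event_id"],),
            )
            db.execute(
                "UPDATE balances SET amount = amount + ? WHERE account = ?",
                (payload["amount"], payload["account"]),
            )
        return True
    except sqlite3.IntegrityError:
        # The event ID was already recorded: skip the business change.
        return False
```

In either case the handler can return normally so the message is acked; a redelivered message costs one failed insert and nothing else.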
Transactional outbox
Publishing to RabbitMQ and committing a relational database transaction are not one atomic operation. For important domain events:
- write the business change and an outbox row in one DB transaction;
- a dispatcher publishes pending rows;
- mark a row delivered only after publisher confirmation;
- retry pending rows with stable event IDs;
- deduplicate at consumers.
This is also the correct model for cross-broker fan-out. Exchanges route only within one broker; publication to broker A and broker B cannot be atomic.
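The outbox steps listed above might look like this in outline. The sketch uses sqlite3 and a caller-supplied publish callable standing in for a confirmed publish; the schema, the order.placed routing key, and all function names are hypothetical and not part of Flask-RMQ.

```python
import json
import sqlite3
import uuid


def create_outbox_schema(db):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    db.execute(
        "CREATE TABLE outbox ("
        " event_id TEXT PRIMARY KEY,"
        " routing_key TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " delivered INTEGER NOT NULL DEFAULT 0)"
    )
    db.commit()


def place_order(db, item):
    """Write the business row and its outbox row in one transaction."""
    event_id = str(uuid.uuid4())  # stable ID, reused on every retry
    with db:
        cur = db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        body = json.dumps(
            {"event_id": event_id, "order_id": cur.lastrowid, "item": item}
        )
        db.execute(
            "INSERT INTO outbox (event_id, routing_key, body) VALUES (?, ?, ?)",
            (event_id, "order.placed", body),
        )
    return event_id


def dispatch_pending(db, publish):
    """Publish pending rows; mark each delivered only after confirmation.

    `publish(routing_key, body)` is assumed to return True only when the
    broker confirmed the message. Returns the number of rows delivered.
    """
    rows = db.execute(
        "SELECT event_id, routing_key, body FROM outbox"
        " WHERE delivered = 0 ORDER BY rowid"
    ).fetchall()
    delivered = 0
    for event_id, routing_key, body in rows:
        if not publish(routing_key, body):
            continue  # row stays pending and is retried on the next pass
        with db:
            db.execute(
                "UPDATE outbox SET delivered = 1 WHERE event_id = ?", (event_id,)
            )
        delivered += 1
    return delivered
```

If the dispatcher crashes between a confirmed publish and the UPDATE, the row is published again on the next pass with the same event_id, which is why consumer-side deduplication remains necessary.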
Dead letters
Configure QueueConfig.dead_letter_exchange and dead_letter_routing_key, declare the DLX and failure queue in topology, and alert on failure-queue depth. When a handler raises, the message is nacked without requeue, so the broker dead-letters it instead of redelivering it. Do not cycle poison messages automatically: fix the underlying issue first, then replay the failed messages with an explicit tool.
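At the broker level, dead-lettering is driven by two optional queue arguments, x-dead-letter-exchange and x-dead-letter-routing-key. The helper below is a hypothetical sketch of how two settings like the ones above could map onto those arguments; how Flask-RMQ performs this translation internally is not shown here, and the function name and example values are assumptions.

```python
def dead_letter_arguments(dead_letter_exchange, dead_letter_routing_key=None):
    """Build RabbitMQ queue arguments that enable dead-lettering.

    The returned dict is the kind of value a client passes as the
    `arguments` of a queue declaration (for example pika's
    channel.queue_declare(queue=..., durable=True, arguments=...)).
    """
    if not dead_letter_exchange:
        raise ValueError("a dead-letter exchange name is required")
    arguments = {"x-dead-letter-exchange": dead_letter_exchange}
    if dead_letter_routing_key is not None:
        # Without this key the broker reuses the message's original routing key.
        arguments["x-dead-letter-routing-key"] = dead_letter_routing_key
    return arguments


# Hypothetical names for a work queue whose failures land in "orders.failed".
work_queue_arguments = dead_letter_arguments("orders.dlx", "orders.failed")
```

The DLX and the failure queue bound to it still have to be declared separately; the arguments only tell the broker where rejected messages from the work queue should go.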
Shutdown and health
SIGTERM and SIGINT stop all consumers through one shared event. Each session polls for at most one second at a time, and reconnect waits are interruptible, so shutdown is noticed promptly. Under an orchestrator, set a termination grace period longer than the longest normal handler, and ensure handlers enforce timeouts on their own external I/O.
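The shutdown behaviour described above can be approximated with a single threading.Event. This is a sketch under assumptions, not the library's code: poll_once stands in for whatever processes broker I/O for up to the given timeout, and all function names are hypothetical.

```python
import signal
import threading

stop_event = threading.Event()


def install_signal_handlers(event=stop_event):
    """Route SIGTERM and SIGINT to one shared stop event.

    Must be called from the main thread, as signal.signal requires.
    """
    def _request_stop(signum, frame):
        event.set()

    signal.signal(signal.SIGTERM, _request_stop)
    signal.signal(signal.SIGINT, _request_stop)


def consume_loop(poll_once, event=stop_event, poll_timeout=1.0):
    """Run until the stop event is set.

    `poll_once(timeout)` is a hypothetical callable that handles broker I/O
    for at most `timeout` seconds, so the event is rechecked about once per
    second between polls.
    """
    while not event.is_set():
        poll_once(poll_timeout)


def wait_before_reconnect(delay, event=stop_event):
    """Wait up to `delay` seconds; return True if shutdown was requested."""
    # Event.wait returns as soon as the event is set, unlike time.sleep.
    return event.wait(delay)
```

A handler that is already running is not interrupted by the event, which is why the grace period has to cover the longest normal handler.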