- How do queues work and help the system scale?
- What are the different queueing strategies that are available for developers?
- Messaging Queues
- Task Queues
- When should we use queues in distributed applications?
- Asynchronous processing
- Event-based communications
- Decoupling microservices
- Improving fault tolerance and retries
- Delayed processing
- What are the common pitfalls and architectural challenges of queues?
- Fault Tolerance and Retries
- Idempotent Actions
- What are some queueing services that we can use?
- 1. RabbitMQ
- 2. Apache Kafka
- 3. Amazon SQS
- 4. Google Pub/Sub
- Wrapping up
- How Amplication Fits in
When building mission-critical applications that manage large workloads, we must ensure our systems can handle the load. In some use cases, we can handle the demand with autoscaling policies on the infrastructure, so that it scales up or down based on demand.
However, there are certain use cases where this just isn't enough. For example, consider the following use case:
- A user wishes to manage a content portfolio of all the articles a person has written.
- This user then inputs a set of URLs (100 URLs) onto the system.
- The system then visits these URLs and scrapes certain information needed to build the portfolio. This can include the: Article Name, Published Date, Outline, Reading Time, and Banner Image.
- The system then stores it in an internal database.
A simple flowchart of this process is depicted below:
Figure: A flowchart of the content portfolio workflow
On its own, this process seems quite simple. But this can cause a lot of issues on a large scale:
- The system cannot simultaneously handle thousands or millions of requests when each request may include hundreds of URLs.
- We cannot predict if a URL is working or broken. Therefore, there's a high chance of errors that may break the entire process and make retries very hard to manage.
- The URL we visit may have high latency, slowing the overall process.
This approach will not scale and is not resilient enough to handle high demand.
How do queues work and help the system scale?
This is where messaging queues come into the picture.
A queue is a linear data structure that processes requests on a first-in-first-out basis. Queues play a huge role in distributed systems because they enable asynchronous communication between components. For example, consider the scenario shown below:
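To make the first-in-first-out behavior concrete, here is a minimal sketch in Python (the language and the example URLs are illustrative, not part of the article's system):

```python
from collections import deque

# A queue processes items in first-in-first-out (FIFO) order.
queue = deque()

# Producer side: enqueue requests as they arrive.
for url in ["https://a.example", "https://b.example", "https://c.example"]:
    queue.append(url)

# Consumer side: dequeue items in exactly the order they were added.
processed = []
while queue:
    processed.append(queue.popleft())

print(processed)  # the first URL in is the first URL out
```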
Figure: A synchronous workflow of the web scraper
If we build the portfolio import scenario we discussed before as a synchronous process, the system follows a linear-scaling model. This means that if URLInputLambda processes 100 invocations at the same time, ScraperLambda will then be invoked synchronously to scrape the data, one URL after the other.
This wrongly assumes that all requests take the same time and succeed. When a single URL fails, handling retries for that URL becomes very tough.
To resolve that, we should distribute the workflow and use a message queue to handle each task separately and asynchronously. It will look something like this:
Figure: Refactoring the architecture with a messaging queue
As shown above, we are using Apache Kafka as the messaging queue. Each URL will be pushed into the queue as a single event from the URLInputLambda. Next, the ScraperLambda gets each event from the queue and processes it.
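Kafka specifics aside, the handoff can be sketched with Python's standard `queue` module; `url_input` and `scraper` are stand-ins for the two Lambdas, not the article's actual implementation:

```python
import queue
import threading

events = queue.Queue()  # stands in for the Kafka topic
results = []

def url_input(urls):
    """Producer: push each URL onto the queue as its own event, then return."""
    for url in urls:
        events.put(url)
    return 200  # acknowledge immediately; scraping happens later

def scraper():
    """Consumer: pull events one at a time and process them independently."""
    while True:
        url = events.get()
        if url is None:  # sentinel: no more events
            break
        results.append(f"scraped:{url}")

worker = threading.Thread(target=scraper)
worker.start()

status = url_input(["https://a.example", "https://b.example"])
events.put(None)  # signal shutdown
worker.join()

print(status, results)
```

The producer returns as soon as the events are enqueued; the consumer drains the queue at its own pace.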
By bringing this architectural change, we immediately gain these benefits:
- The system is now responsive and highly scalable: Once the messages are fed into the queue, the URLInputLambda can complete the request and return a 200 status code to the user, indicating that the messages have been accepted into the system. The queue can then take its own time and process the messages asynchronously. This means the forward-facing services can handle millions of requests without issues, as the workflow is done asynchronously.
- The system is now decoupled: By introducing the messaging queue, we have decoupled the system services. The two Lambda functions no longer scale linearly with each other, as the queue manages the flow of requests. If needed, we can run more instances of the processor while keeping minimal instances of the other services.
- The system is now resilient with improved fault tolerance: We can handle broken URLs by setting up a Dead Letter Queue and pushing messages to it after X failed retries. This ensures no data is lost and that a single failure does not disrupt the entire system, improving resiliency.
What are the different queueing strategies that are available for developers?
Well, now that we've got a rough idea of how powerful a queue is in a distributed system, it's essential to understand the different types of queues that we can use in our system:
Messaging Queues
We may use these standard queues for simple and linear use cases. A messaging queue's responsibility is to store messages in a queue-based structure and let components send and receive messages through it.
Task Queues
A task queue is a specialized form of message queue that explicitly handles task distribution and processing. Consumers pick up tasks from the queue and execute them independently. This lets us integrate proper flow control and ensure our system scales non-linearly; task queues are often powered by event sourcing.
Pub/Sub
This type of queueing strategy is sometimes treated as a building block of an event-driven architecture: a central publisher pushes events to subscribers, which receive them through an event-streaming-based approach. One pitfall of this approach is that it operates on a fire-and-forget basis, meaning that if a subscriber fails to process an event, that event is lost forever. However, we can build fault tolerance around the subscriber to handle such errors. To learn more about using Pub/Sub models in microservices, look into this in-depth article.
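A toy broker illustrates the fire-and-forget pitfall; the class and subscriber names here are hypothetical, chosen only for this sketch:

```python
class Broker:
    """Minimal pub/sub broker: delivers each event to every subscriber, fire-and-forget."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:
            try:
                callback(event)
            except Exception:
                # Fire-and-forget: the broker does not retry or store the event,
                # so a failing subscriber simply loses it.
                pass

def failing_subscriber(event):
    raise RuntimeError("subscriber is down")

received = []
broker = Broker()
broker.subscribe(received.append)      # healthy subscriber
broker.subscribe(failing_subscriber)   # subscriber that always fails

broker.publish("article-published")
print(received)  # the healthy subscriber got the event; the failing one lost it forever
```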
When should we use queues in distributed applications?
This is one of the most important questions when working with a queue: "When should we use it?"
We should typically use a queue to accomplish some flow control or asynchronous processing in our application while enhancing application scalability, reliability, and resiliency.
Below are some best practices we should consider when using a queue.
Asynchronous processing
Queues let parts of our application perform operations asynchronously. For example, as discussed before, if you were to pass 100 URLs at a given time and 1,000 users performed this operation simultaneously, the system resources would be exhausted in one go, as all of the time-consuming requests are processed immediately.
However, by introducing a queue, we can acknowledge the user with an "OK" message while the system takes its time processing each request asynchronously, without linearly scaling its resources. This lets you adopt architectural patterns such as Queue-Based Load Leveling.
Event-based communications
Queues are sometimes treated as event hubs: components push data onto a queue, and other services subscribe to it, poll for data, and begin processing. This lets us decouple parts of the application and enforce isolated, independent processing.
Decoupling microservices
Queues are used in microservices to decouple services from each other, so services communicate through queues rather than calling each other directly.
This ensures better scalability of each service and reduces coupling between services, making it easier to modify or replace individual services without affecting others.
Improving fault tolerance and retries
Another reason to use a message queue is to ensure that messages are not lost. By default, queues can retry, or store failed messages through Dead Letter Queues. You can configure the Dead Letter Queue to let your system automatically push a failed message onto a special queue after X number of failed retries.
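The retry-then-dead-letter flow can be sketched like this; `MAX_RETRIES`, the queues, and the failing URL are all illustrative stand-ins for what a real broker configures for you:

```python
from collections import deque

MAX_RETRIES = 3  # the "X number of failed retries"
main_queue = deque()
dead_letter_queue = deque()

def scrape(url):
    if "broken" in url:  # simulate a URL that always fails
        raise ValueError(f"cannot reach {url}")
    return f"scraped:{url}"

def process_queue():
    results = []
    # Each message carries its own attempt counter.
    while main_queue:
        url, attempts = main_queue.popleft()
        try:
            results.append(scrape(url))
        except ValueError:
            if attempts + 1 >= MAX_RETRIES:
                # Park the message for later inspection instead of losing it.
                dead_letter_queue.append(url)
            else:
                main_queue.append((url, attempts + 1))  # retry later
    return results

main_queue.extend([("https://ok.example", 0), ("https://broken.example", 0)])
results = process_queue()
print(results, list(dead_letter_queue))
```

The failed message ends up in the dead letter queue after three attempts, while the healthy one is processed normally.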
Delayed processing
Certain queueing services like Amazon SQS support delayed processing. This lets your queue deliver messages to consumers only after a delay of a set number of seconds, which is beneficial when you want your consumers to cope with the demand.
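One way to picture delayed delivery is as a visibility timestamp: a message is only handed to consumers once its delay has elapsed. The class below is a simulation of that idea, not the SQS API:

```python
import heapq

class DelayQueue:
    """Messages become visible to consumers only after their delay elapses."""

    def __init__(self):
        self._heap = []  # min-heap ordered by visibility time

    def send(self, message, now, delay_seconds=0):
        heapq.heappush(self._heap, (now + delay_seconds, message))

    def receive(self, now):
        """Return the next visible message, or None if nothing is ready yet."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[1]
        return None

q = DelayQueue()
q.send("immediate", now=0)
q.send("later", now=0, delay_seconds=30)

print(q.receive(now=0))   # "immediate"
print(q.receive(now=10))  # None — "later" is still hidden
print(q.receive(now=30))  # "later"
```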
What are the common pitfalls and architectural challenges of queues?
But messaging queues are not always the right solution for our application. They have their own set of pitfalls and challenges:
Fault Tolerance and Retries
Though we said fault tolerance is offered in queues, using it correctly can often be challenging. Most of the time, we have to create a separate queue to act as the dead letter queue and manually configure it to behave the way we want.
Idempotent Actions
Another challenge with queues is ensuring that a message is processed only once. By default, most queues use at-least-once delivery, meaning some messages may be handled more than once. So, it's essential to implement message deduplication using a deduplication ID, which lets the queue prevent a message with the same deduplication ID from being pushed to a consumer over a certain period.
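A sketch of deduplication-ID tracking over a time window; the 300-second window mirrors the 5-minute window SQS FIFO queues use, but the class itself is an illustration, not a real broker:

```python
DEDUP_WINDOW_SECONDS = 300  # e.g. SQS FIFO queues use a 5-minute window

class DedupQueue:
    """Drops messages whose deduplication ID was already seen within the window."""

    def __init__(self):
        self.messages = []
        self.seen = {}  # dedup_id -> time the message was first accepted

    def send(self, dedup_id, body, now):
        first_seen = self.seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False  # duplicate within the window: silently dropped
        self.seen[dedup_id] = now
        self.messages.append(body)
        return True

q = DedupQueue()
print(q.send("url-1", "scrape https://a.example", now=0))    # True: accepted
print(q.send("url-1", "scrape https://a.example", now=10))   # False: duplicate
print(q.send("url-1", "scrape https://a.example", now=400))  # True: window expired
```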
What are some queueing services that we can use?
Below are some popular queue services and technologies we can use:
1. RabbitMQ
RabbitMQ is a robust and highly configurable open-source message broker that implements the Advanced Message Queuing Protocol (AMQP).
- Supports various messaging patterns, including point-to-point, publish-subscribe, request-reply, and more.
- Rich feature set with advanced routing, message acknowledgment, message durability, and priority queues.
- High reliability and fault tolerance with support for clustering and message replication.
- Integrates well with multiple programming languages and platforms.
- Slightly complex setup and configuration compared to other queuing systems.
2. Apache Kafka
Apache Kafka is a distributed event streaming platform designed for high-throughput, real-time data processing and event-driven architectures.
- Scalable and high throughput with support for parallel processing and partitioning.
- Provides strong durability and fault tolerance through data replication across brokers.
- Real-time data streaming and processing capabilities.
- Supports event replay, allowing consumers to go back and reprocess past events.
- Efficiently handles large-scale data streams and has low-latency characteristics.
- Slightly more complex to set up and manage compared to traditional message queues.
- It may not be the best fit for point-to-point messaging or request-reply patterns.
3. Amazon SQS
Amazon SQS is a fully managed message queue service that provides a reliable and scalable solution for asynchronous messaging between distributed components and microservices.
- Fully managed service with high availability, durability, and automatic scaling.
- Supports two types of queues: Standard Queue for high throughput and FIFO Queue for ordered, exactly-once processing.
- Super easy deduplication rules configuration for FIFO queues.
- Support for dead-letter queues.
- Messages are limited to 256KB in size; anything bigger will require implementing a solution that leverages S3 buckets.
- Limitations on the number of in-flight messages and FIFO throughput.
4. Google Pub/Sub
Google Pub/Sub is a fully managed, highly scalable messaging service that enables real-time messaging and event-driven architectures.
- Scales automatically based on demand and offers low-latency message delivery.
- Offers fine-grained access controls for securing messages and topics.
- Limited features compared to more advanced event streaming platforms like Apache Kafka.
Wrapping up
Messaging queues have become the industry standard for building highly scalable, reliable, available, and resilient distributed systems. Their ability to decouple components, handle asynchronous processing, and provide fault tolerance has made them vital to modern application architectures.
However, it's essential to understand the challenges associated with messaging queues. Proper handling of fault tolerance, idempotent actions, and message deduplication is crucial to avoid potential pitfalls.
Considering these aspects, we can build systems that efficiently handle complex and demanding workloads!
How Amplication Fits in
Adopting asynchronous communication models can improve your microservices architecture's scalability, reliability, and performance. But building a scalable microservice architecture requires a lot of planning and boilerplate, scaffolding, and repetitive coding.
Amplication is a code generator for backend services that generates all the repetitive parts of microservices architecture, including communication between services using message brokers with all the best practices and industry standards.
You can build the foundation of your backend services with Amplication in minutes and focus on the business value of your product.