
Scaling based on Amazon Simple Queue Service (SQS) is a commonly used design pattern. At AWS Professional Services, we have recently used a variant of this pattern to achieve highly parallel computation for larger customers. In fact, any use case with a tree-like set of entities can use this pattern. It's useful in a workflow where all nodes between the root and leaves must be processed rapidly and in parallel. Some use cases would be database parent-child records or graph relationships.

Let us explore a couple of example scenarios, both hypothetical (COVID contact tracing) and real (processing of existing Amazon Simple Storage Service (S3) content). Imagine that an application has collected the following tree-like COVID contact data, where 'A' has come in contact with 'B', 'C' and 'D'. Then these people came in contact with a different (or the same) set of people.

This will result in the simultaneous processing of entities, regardless of whether an entity is a leaf node or parent to some other node.

An efficient algorithm to traverse an inverted tree and process nodes in parallel can be applied to many common needs. This is true whether the nodes are rows in a relational database, nodes in a graph database, or S3 prefixes. In each such scenario, a compute operation is performed for each node in the tree. Recursion is the ideal solution for discovery and processing of such nodes.

In a real-life scenario for an AWS customer, hundreds of petabytes of existing S3 content in named buckets were copied and processed. The data lake content spanned thousands of "subfolders" (prefixes) nested several dozens of layers deep. The leaf directories had thousands of objects each. The goal was to achieve massive parallelization of these prefixes during copying and processing. The situation demanded running many instances of Python or Java AWS SDK thread pools in parallel, to rapidly process non-overlapping prefixes. It was important to achieve this while retaining control over the number of parallel nodes, the ability to report progress, and the ability to retry failures.

The Design Pattern: Recursive Amazon ECS Scaling Using SQS and Custom Metrics

Processing starts when the first message (representing the root, or entity 'A' in Figure 1) is posted to the SQS queue; see Figure 2.

Figure 2. Architecture: Recursive Scaling using Amazon SQS and Amazon ECS Fargate cluster

Each ECS task processes one named node at a time. It enumerates other nodes, especially the ones that the current one is the parent of. It then recursively posts back more messages to the SQS queue, one for each such child node. This prompts the automatic scaling mechanism to spin up more ECS tasks because the queue size has changed. A recalculated value of the custom metric, "Backlog per Task", which refers to the number of unread messages divided by the number of ECS tasks, is posted to Amazon CloudWatch by either an AWS Lambda function or a command-line interface (CLI) command running every minute. Each new task instance helps drain the queue and continues posting more messages upon finding child nodes in the tree. This continues until there are no nodes left. There are several ways to publish custom metrics and auto-scale; two are shown in Figure 2.

Considerations for Design Pattern Usage

1. Defining a format and schema for the SQS messages. JSON is a good portable option. A message should identify a single node for processing. For S3 content processing, this can be the prefix. For a graph or relational database, this can be the id or primary key.

2. Development of Docker Container Image and Deployment to ECS.
   a. The process running inside the container can, in a nearly continuous loop with sleeps, read the SQS queue.
   b. We can pass parameters like the queue URL as task environment variables.
   c. The ECS task should be created with a Task IAM Role (not to be confused with the Execution Role) that permits access to the queue and other needed resources.

On reading a message from the queue, the ECS task can discover whether the node it is processing is a parent node with more child nodes, or a leaf node.
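
As an illustration of the recursive pattern, here is a minimal Python sketch of one worker draining a queue: it reads a JSON message naming a single node, processes that node, and posts one new message per child. Everything named here is an assumption made for illustration: the `InMemoryQueue` class stands in for SQS so the sketch runs without AWS access, the `{"node": ...}` message shape is one possible schema, and the tree below 'B', 'C' and 'D' is invented. In a real ECS task, the send and receive steps would presumably be boto3 SQS calls such as `send_message`, `receive_message` and `delete_message`, and many tasks would run this loop in parallel.

```python
import json
from collections import deque

# Hypothetical tree: 'A' -> 'B', 'C', 'D' follows the contact-tracing example;
# the deeper levels are invented for illustration.
TREE = {
    "A": ["B", "C", "D"],
    "B": ["E", "F"],
    "C": [],
    "D": ["G"],
    "E": [],
    "F": [],
    "G": [],
}


class InMemoryQueue:
    """Stand-in for an SQS queue so the sketch runs without AWS access."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


def make_message(node_id):
    # One message identifies exactly one node (an S3 prefix, a primary key, ...).
    return json.dumps({"node": node_id})


def list_children(node_id):
    # Stand-in for enumerating child prefixes or child records.
    return TREE.get(node_id, [])


def process_one(queue, processed):
    """Handle a single message; return False when the queue is empty."""
    body = queue.receive()
    if body is None:
        return False
    node_id = json.loads(body)["node"]
    processed.append(node_id)  # the per-node compute operation would go here
    for child in list_children(node_id):
        queue.send(make_message(child))  # recursion happens through the queue
    return True


def drain(queue):
    processed = []
    while process_one(queue, processed):
        pass
    return processed


queue = InMemoryQueue()
queue.send(make_message("A"))  # processing starts with the root message
order = drain(queue)
print(order)
```

A leaf node is simply one for which `list_children` returns nothing, so no new messages are posted and the queue eventually empties.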

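
The "Backlog per Task" metric can be sketched in Python as a small calculation. This is a sketch under stated assumptions rather than the authors' implementation: the namespace and metric name are hypothetical, and treating zero running tasks as one (to avoid dividing by zero) is a design choice made here. In a Lambda function, the queue depth would typically come from the SQS `ApproximateNumberOfMessages` attribute, and the datum would be published with the CloudWatch `put_metric_data` call.

```python
def backlog_per_task(unread_messages, running_tasks):
    """Unread messages divided by the number of ECS tasks."""
    # Assumption: with no running tasks, report the whole backlog.
    return unread_messages / max(running_tasks, 1)


def build_metric_datum(unread_messages, running_tasks):
    """Build one datum in the shape accepted by CloudWatch put_metric_data."""
    return {
        "MetricName": "BacklogPerTask",  # hypothetical metric name
        "Value": backlog_per_task(unread_messages, running_tasks),
        "Unit": "Count",
    }


# With boto3 this would be published roughly as:
#   cloudwatch.put_metric_data(Namespace="Example/RecursiveScaling",  # hypothetical
#                              MetricData=[datum])
datum = build_metric_datum(unread_messages=120, running_tasks=4)
print(datum["Value"])
```

As child messages are posted, the backlog per task rises and scaling adds tasks; as the queue drains, the value falls back toward zero.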