Google Cloud Tasks: Next-Level Task Execution for Modern Applications

Introduction

Efficient task management is vital in modern distributed and scalable cloud environments. Google Cloud Tasks offers a managed solution that simplifies the distribution and execution of tasks across various components of your application. In this article, we will explore the key features of Google Cloud Tasks and demonstrate how to leverage them using Node.js code snippets.

What is Google Cloud Tasks?

Google Cloud Tasks is a fully managed task distribution service that allows you to reliably enqueue and execute tasks. It provides features such as task queuing, scheduling, retries, and rate limiting, making it an ideal choice for building scalable and responsive applications.

Getting Started with Google Cloud Tasks

To start using Google Cloud Tasks, follow these steps:

Step 1: Enable the Cloud Tasks API

Ensure that you have enabled the Cloud Tasks API in your Google Cloud project. You can do this through the Google Cloud Console or by using the gcloud command-line tool.

gcloud services enable cloudtasks.googleapis.com

Step 2: Create a Task Queue

A task queue is a container for your tasks. Create a task queue by specifying a name and other optional parameters such as maximum task attempts, rate limits, and worker constraints.
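As a minimal sketch of what those parameters look like, the helper below builds the request object that the Node.js client's createQueue call accepts. The helper name, the QueueOptions interface, and the project, location, and queue values are illustrative assumptions rather than part of this article's code. It only builds the request, so it runs without credentials.

```typescript
// Options this sketch supports (the interface name is illustrative).
interface QueueOptions {
  maxAttempts: number;             // total attempts per task, including the first
  maxDispatchesPerSecond: number;  // rate limit for the whole queue
  maxConcurrentDispatches: number; // how many tasks may be in flight at once
}

// Builds a request object in the shape expected by CloudTasksClient.createQueue().
function buildCreateQueueRequest(
  projectId: string,
  location: string,
  queueId: string,
  options: QueueOptions
) {
  const parent = `projects/${projectId}/locations/${location}`;
  return {
    parent,
    queue: {
      name: `${parent}/queues/${queueId}`,
      rateLimits: {
        maxDispatchesPerSecond: options.maxDispatchesPerSecond,
        maxConcurrentDispatches: options.maxConcurrentDispatches,
      },
      retryConfig: { maxAttempts: options.maxAttempts },
    },
  };
}

// Hypothetical project and location values.
const queueRequest = buildCreateQueueRequest("my-project", "us-central1", "first-batch-queue", {
  maxAttempts: 5,
  maxDispatchesPerSecond: 10,
  maxConcurrentDispatches: 5,
});
console.log(queueRequest.queue.name);

// With the official client, the request would be sent roughly like this:
//   import { CloudTasksClient } from "@google-cloud/tasks";
//   const [queue] = await new CloudTasksClient().createQueue(queueRequest);
```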

Step 3: Enqueue Tasks

Enqueue tasks to the task queue by specifying the request method, URL, body, and any other optional parameters that fit your needs. The payload can contain any data necessary for task execution. The body is a raw byte field, so it is base64-encoded when sent over the JSON API; with the Node.js client you can pass either a Buffer or a base64-encoded string.
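Here is a minimal sketch of building such a task. The helper name and the TaskInput interface are assumptions made for illustration; the resulting object follows the shape the Node.js client's createTask call accepts. The JSON payload is serialized and base64-encoded before it goes into the body field.

```typescript
// Input for this sketch (the interface name is illustrative).
interface TaskInput {
  url: string;                          // handler endpoint that will receive the task
  data: Record<string, unknown>;        // JSON-serializable payload
  method?: "POST" | "PUT" | "PATCH";    // methods that may carry a body
}

// Builds a request object in the shape expected by CloudTasksClient.createTask().
function buildCreateTaskRequest(queuePath: string, input: TaskInput) {
  // Serialize the payload to JSON, then base64-encode it for the body field.
  const body = Buffer.from(JSON.stringify(input.data), "utf8").toString("base64");
  return {
    parent: queuePath,
    task: {
      httpRequest: {
        httpMethod: input.method ?? "POST",
        url: input.url,
        headers: { "Content-Type": "application/json" },
        body,
      },
    },
  };
}

// Hypothetical queue path and URL.
const taskRequest = buildCreateTaskRequest(
  "projects/my-project/locations/us-central1/queues/first-batch-queue",
  { url: "https://example.com/create-first-batch", data: { operationType: "batch", value: 20 } }
);
console.log(taskRequest.task.httpRequest.body);

// With the official client, the request would be sent roughly like this:
//   const [task] = await new CloudTasksClient().createTask(taskRequest);
```

Cloud Tasks decodes the body before dispatching, so the handler receives the original JSON text rather than the base64 string.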

Step 4: Task Handler

Implement a task handler that processes the tasks. This could be a separate route or function that receives the tasks, extracts the payload, and performs the necessary actions.

Note that the handler must return a 2xx status code (for example, 200). Any other status code, or a timeout, tells Cloud Tasks that the execution failed, and it will keep retrying according to your queue's retry configuration.
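The sketch below shows that contract in a framework-agnostic way: parse the body, run the work, and map success to 200 and any failure to 500. The function and type names are assumptions for illustration. In a real service you would call this from your HTTP route and write the returned status to the response, and the processing function would usually be asynchronous.

```typescript
type TaskData = Record<string, unknown>;

interface HandlerResult {
  status: number;   // HTTP status to send back to Cloud Tasks
  message: string;
}

// Parses the task body and runs the processing function.
// A 2xx status acknowledges the task; anything else asks Cloud Tasks to retry.
function handleTaskRequest(
  rawBody: string,
  processTask: (data: TaskData) => void
): HandlerResult {
  try {
    const data = JSON.parse(rawBody) as TaskData;
    processTask(data);
    return { status: 200, message: "ok" };
  } catch (err) {
    // Covers both malformed JSON and errors thrown while processing.
    return { status: 500, message: String(err) };
  }
}

const okResult = handleTaskRequest('{"operationType":"batch","value":20}', (data) => {
  console.log("processing", data);
});
console.log(okResult.status); // 200

const failedResult = handleTaskRequest("not json", () => {});
console.log(failedResult.status); // 500
```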

That's it, basically.

If you want to get information about a task or delete it, the client library provides getTask and deleteTask methods, both of which take the task's full resource name.
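A minimal sketch of those two operations follows. To keep it runnable without credentials, it is written against a small TaskClientLike interface, which is an assumption of this sketch and is intended to mirror the getTask and deleteTask methods of CloudTasksClient. The helper names and path values are illustrative as well.

```typescript
// Minimal structural view of the two client methods used here (illustrative).
interface TaskClientLike {
  getTask(request: { name: string }): Promise<unknown[]>;
  deleteTask(request: { name: string }): Promise<unknown[]>;
}

// Builds the full resource name that identifies a task.
function buildTaskPath(
  projectId: string,
  location: string,
  queueId: string,
  taskId: string
): string {
  return `projects/${projectId}/locations/${location}/queues/${queueId}/tasks/${taskId}`;
}

// Fetches a task; the client resolves to an array whose first element is the task.
async function getTaskInfo(client: TaskClientLike, taskPath: string): Promise<unknown> {
  const [task] = await client.getTask({ name: taskPath });
  return task;
}

// Deletes a task so that it will not be dispatched.
async function removeTask(client: TaskClientLike, taskPath: string): Promise<void> {
  await client.deleteTask({ name: taskPath });
}

// Usage with the official client would look roughly like this:
//   const client = new CloudTasksClient(); // from "@google-cloud/tasks"
//   const path = buildTaskPath("my-project", "us-central1", "first-batch-queue", "first-batch-task");
//   const task = await getTaskInfo(client, path);
//   await removeTask(client, path);
```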

Let's go through a simple example of how to use the Cloud Tasks functions we created. Cloud Tasks has many use cases (I cover them later in this article), but to keep things simple, let's imagine that we have to run a sequence of batch jobs.

We can have a BatchService whose functions call the Cloud Tasks functions that we created.

import { ITaskService } from "./TaskService";

const baseUrl = "http://whatever-your-base-url-is";

export interface IObject {
    [key: string]: any;
}

export class BatchService {

    constructor (private taskService: ITaskService) {}

    async createFirstBatchTask() {
        //create queue
        const queueName = await this.taskService.createTaskQueue('first-batch-queue');

        //add a task to the queue you created above
        await this.taskService.createTask(queueName, {
            taskName: "first-batch-task",
            url: `${baseUrl}/create-first-batch`,
            data: { //whatever data you want to send or pass
                operationType: "batch",
                value: 20
            }
        });
    }

    async processFirstBatchTask(data: IObject) {
        console.log(data); //{operationType: "batch", value:20}
    }
}

The example above is fairly self-explanatory. The first function, createFirstBatchTask, creates a task queue (the container) and then enqueues a task into it, passing the data to send and the URL that will process it. The HTTP method is not set in this call, so it is presumably chosen inside the task service.

The second function, processFirstBatchTask, is the handler: it receives the task's data and does whatever work it needs to with it.

The full code can be seen here: https://github.com/SirPhemmiey/cloud-task-tutorial

If you have used Google Cloud Pub/Sub before, you're probably wondering about the difference between Cloud Pub/Sub and Cloud Tasks, just like I did before I started using Cloud Tasks. The truth is, they are both powerful services provided by GCP, but they serve different purposes and have distinct characteristics.

Among the differences between the two, the core one is how messages are handled and invoked: implicitly with Pub/Sub and explicitly with Cloud Tasks.

What Do Implicit and Explicit Invocation Mean?

Implicit: In this case, the publisher has no control over the delivery of the message. Pub/Sub aims to decouple publishers of events and subscribers to those events. Publishers do not need to know anything about their subscribers.

Explicit: By contrast, Cloud Tasks is aimed at explicit invocation where the publisher retains full control of execution. The publisher can tell how the message should be delivered, when the message should be delivered and what to pass in the message. Full control.

Another benefit of Cloud Tasks is that you can pause and resume a queue from the Cloud Console or the gcloud CLI to stop and start task processing, much like you can with Google Cloud Scheduler jobs.

  1. Detailed Comparison of Cloud Tasks and Pub/Sub

    Advanced Features and Use Cases of Cloud Tasks

    Google Cloud Tasks offers several advanced features and use cases:

    1. Task Scheduling: You can schedule a task to be executed at a specific time in the future. Set the schedule time when enqueuing the task, and Google Cloud Tasks dispatches it at that time. For instance, you may want to send an email a few weeks after sign-up, when a trial period ends. Without Cloud Tasks, you'd typically run a cron job that queries for sign-up dates and sends the emails that are due. Cloud Tasks saves you that query!

    2. Task Retries and Acknowledgment: Google Cloud Tasks automatically retries failed tasks based on the queue's retry configuration, such as maximum attempts and minimum and maximum backoff. A task is acknowledged when the handler returns a 2xx response, at which point it is removed from the queue; any other response counts as a failure and triggers a retry.

    3. Ordering and Prioritization: Cloud Tasks does not expose a per-task priority field, and the order in which tasks are executed is not guaranteed. To make sure important tasks are handled promptly, a common approach is to put them in a dedicated queue with a higher dispatch rate.

    4. Monitoring and Insights: Google Cloud Tasks provides visibility into task execution with built-in monitoring and logging. You can access metrics, logs, and error information to track the performance and health of your task processing.

    5. Point-to-Point Communication: Make asynchronous calls between two microservices.

    6. Traffic Control: Limit the dispatch rate so that the load on your workers stays under control, for example when pushing asynchronous image-processing jobs or calling an API that enforces a maximum request rate.
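To make the scheduling feature from item 1 concrete, here is a minimal sketch that computes the scheduleTime field for a delayed task. The helper name, the one-week delay, and the URL are assumptions for illustration. The field takes a timestamp expressed in seconds since the Unix epoch.

```typescript
// Returns a scheduleTime value that is `delaySeconds` after `now`.
function buildScheduleTime(delaySeconds: number, now: Date = new Date()): { seconds: number } {
  if (!Number.isFinite(delaySeconds) || delaySeconds < 0) {
    throw new RangeError("delaySeconds must be a non-negative number");
  }
  return { seconds: Math.floor(now.getTime() / 1000) + Math.floor(delaySeconds) };
}

// A task that should be dispatched one week from now (hypothetical URL).
const oneWeekInSeconds = 7 * 24 * 60 * 60;
const scheduledTask = {
  httpRequest: { httpMethod: "POST", url: "https://example.com/send-trial-email" },
  scheduleTime: buildScheduleTime(oneWeekInSeconds),
};
console.log(scheduledTask.scheduleTime.seconds);

// The object would then be passed as the `task` field of a createTask request.
```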

Limitations of Google Cloud Tasks

As much as Google Cloud Tasks helps with efficient task management in the cloud, it does have limitations, some of which I don't like and hope will be removed in the near future. There are several, but I'll highlight the most important ones to keep in mind:

  1. Limited task payload size: Google Cloud Tasks imposes a limit on the size of the task payload, which is currently set at 1MB. So, if your tasks require larger payloads, you may need to consider alternative solutions or split the payload across multiple tasks.

  2. Task retention period: Tasks in Google Cloud Tasks have a limited retention period, which is currently set at 31 days. This means that any task added to a queue must be executed within 31 days. If a task is not processed within this period, it will be automatically deleted. So, you need to ensure your tasks are processed in a timely manner to avoid losing any important data.

  3. Task execution time limits: Google Cloud Tasks imposes a deadline on each task attempt. For HTTP targets the default is 10 minutes, and it can be raised to at most 30 minutes per task. If your tasks require longer execution times, you'll need to consider other mechanisms or split the work into multiple tasks.

  4. Queue recreation: If you delete a queue, you must wait 7 days before creating a queue with the same name again. I dislike this limitation because it forces me to think carefully about how I name my queues.

  5. Queue dispatch rate: This refers to the maximum rate at which tasks can be dispatched from a queue. You can dispatch at most 500 tasks per second from a single queue, so if you need a higher rate, it's best to use multiple queues.

  6. Task de-duplication window: You can create many tasks with different names in a queue, but once a named task has been executed or deleted, you have to wait about 1 hour before you can use the same name again.

  7. Maximum schedule time for a task: This is the maximum amount of time in the future that a task can be scheduled. If you try to schedule a task to run more than 30 days from the current date, the request fails with an error. This is arguably the limitation I dislike the most.
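For limitation 5, one way to spread load over multiple queues is to pick a queue deterministically from a task key. The sketch below is only an illustration: the hash function, the helper name, and the queue naming scheme are assumptions, and the queues themselves would still have to be created up front.

```typescript
// Maps a task key to one of `shardCount` queue IDs, e.g. "batch-0" to "batch-3".
function queueIdForKey(key: string, baseName: string, shardCount: number): string {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  // Simple 32-bit string hash; the same key always lands on the same queue.
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return `${baseName}-${hash % shardCount}`;
}

// Hypothetical usage: spread image jobs across four queues.
console.log(queueIdForKey("image-12345", "batch", 4));
```

With four queues that each dispatch up to 500 tasks per second, the combined ceiling is about 2,000 tasks per second, provided the keys are spread evenly.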

    It's important to consider these limitations when evaluating Google Cloud Tasks for your specific use case. While it is a powerful task queuing service, understanding its constraints will help you make informed decisions and plan accordingly for your application requirements.
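Some of these limits can be checked before a task is enqueued. The following sketch validates the payload size and the schedule time against the values quoted above. The constants and the helper name are assumptions, the size check only approximates the full task size, and the numbers should be confirmed against the current quotas before you rely on them.

```typescript
// Limits as quoted in this article; treat them as assumptions and verify them.
const MAX_TASK_BYTES = 1024 * 1024;                  // about 1 MB
const MAX_SCHEDULE_AHEAD_SECONDS = 30 * 24 * 60 * 60; // 30 days

// Returns a list of human-readable problems; an empty list means no limit was hit.
function checkTaskLimits(
  payload: Record<string, unknown>,
  runAt: Date,
  now: Date = new Date()
): string[] {
  const problems: string[] = [];

  // Approximate the task size by the size of the serialized payload.
  const payloadBytes = Buffer.byteLength(JSON.stringify(payload), "utf8");
  if (payloadBytes > MAX_TASK_BYTES) {
    problems.push(`payload is ${payloadBytes} bytes, above the ${MAX_TASK_BYTES}-byte limit`);
  }

  const secondsAhead = (runAt.getTime() - now.getTime()) / 1000;
  if (secondsAhead > MAX_SCHEDULE_AHEAD_SECONDS) {
    problems.push("schedule time is more than 30 days in the future");
  }
  return problems;
}

// A small payload scheduled one day ahead passes both checks.
const tomorrow = new Date(Date.now() + 24 * 60 * 60 * 1000);
console.log(checkTaskLimits({ operationType: "batch", value: 20 }, tomorrow));
```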

Conclusion

Google Cloud Tasks simplifies the management of distributed tasks in your applications. Its powerful features, such as task queuing, scheduling, retries, and rate limiting, make it an excellent choice for building scalable and reliable systems. In this article, we covered the basics of using Google Cloud Tasks and demonstrated how to create task queues, enqueue tasks, and handle them using a task handler in Node.js. We also looked at the advanced features, use cases, and limitations of Cloud Tasks, as well as the differences between Cloud Pub/Sub and Cloud Tasks, to help you make an informed decision. By leveraging Google Cloud Tasks, you can focus on your application's business logic while relying on a fully managed service to handle task distribution and execution efficiently.

Reference

https://medium.com/google-cloud/cloud-tasks-or-pub-sub-8dcca67e2f7a