Building More Reliable Workflows With Events

Every application starts off with running some basic code in the background. Common use cases include sending email notifications, handling subscription payment webhooks, or periodically fetching data from third party APIs.

Inngest's webhook support has evolved significantly: it now supports form-urlencoded and multipart data, plus a management API for programmatic webhook creation and version control.

At some point, most applications will be required to run tasks in the background that are critical to the core functionality or business value of the product. This is when applications need to get serious and consider more robust solutions as failed tasks can represent lost revenue.

Application-critical tasks are everywhere and might include:

Ingesting data from user uploads or third party APIs, transforming or validating data and sending user notifications
User actions that require propagation across several databases, systems or external integrations
Transaction processing including payments sent and received

All of these tasks must be reliable and fault-tolerant. Failures in these situations can lead to data inconsistencies or broken states for users.

What is needed

As a task gets more complex, it can often be split into multiple independent steps or phases. Many folks use the terms "workflows" or "step functions" to describe these more complex tasks, where logic is often required to glue the steps together.

Developers or operators need confidence that their critical workflows are running consistently, reliably, and durably. Quite often, one part of a workflow may be more susceptible to failure due to a flaky external API or rate limiting. If this one part of the code fails, it should be independently retried, avoiding the need to restart a job from the beginning.
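To make the idea of independently retried steps concrete, here is a minimal sketch. It is not Inngest's implementation: `runStep`, `StepState`, and the retry limit are hypothetical names chosen for illustration, and the steps are synchronous only to keep the example short. Each step's result is saved under an ID, so a failing step is retried on its own and completed steps are never re-run.

```typescript
// Hypothetical sketch of per-step retries; not Inngest's actual implementation.
type StepState = Map<string, unknown>;

// Runs `fn` up to `maxAttempts` times. A result already saved under `id` is reused.
function runStep<T>(state: StepState, id: string, fn: () => T, maxAttempts = 3): T {
  if (state.has(id)) {
    return state.get(id) as T; // Completed earlier: skip the work entirely.
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = fn();
      state.set(id, result); // Save the result so this step is not repeated later.
      return result;
    } catch (err) {
      lastError = err; // Only this step is retried, not the steps before it.
    }
  }
  throw lastError;
}

const state: StepState = new Map();
let validateCalls = 0;
let enrichCalls = 0;

const isValid = runStep(state, "validate", () => {
  validateCalls++;
  return true;
});

// Simulates a flaky external API that fails twice before succeeding.
const enriched = runStep(state, "enrich", () => {
  enrichCalls++;
  if (enrichCalls < 3) throw new Error("flaky external API");
  return "enriched";
});
```

In this sketch the "validate" step runs once even though the "enrich" step needed three attempts, which is the behavior the paragraph above asks for.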

Lastly, observability is crucial to understanding errors and results from each stage of your task. Status and logs should be grouped by type of workflow and individual workflow runs.
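As a rough illustration of that grouping, the sketch below organizes step logs by workflow type and then by run. The `StepLog` shape and its field names are assumptions made for this example, not a format that Inngest defines.

```typescript
// Hypothetical log record; the field names are assumptions for illustration.
interface StepLog {
  workflow: string; // Type of workflow, e.g. "contacts-import"
  runId: string; // One execution of that workflow
  step: string;
  status: "succeeded" | "failed" | "retrying";
}

// Groups logs first by workflow type, then by individual run.
function groupLogs(logs: StepLog[]): Map<string, Map<string, StepLog[]>> {
  const byWorkflow = new Map<string, Map<string, StepLog[]>>();
  for (const log of logs) {
    const runs = byWorkflow.get(log.workflow) ?? new Map<string, StepLog[]>();
    const entries = runs.get(log.runId) ?? [];
    entries.push(log);
    runs.set(log.runId, entries);
    byWorkflow.set(log.workflow, runs);
  }
  return byWorkflow;
}

const grouped = groupLogs([
  { workflow: "contacts-import", runId: "run-1", step: "validate", status: "succeeded" },
  { workflow: "contacts-import", runId: "run-1", step: "enrich", status: "retrying" },
  { workflow: "contacts-import", runId: "run-2", step: "validate", status: "failed" },
]);
```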

Splitting the code into steps

There are many different ways to split tasks into discrete steps (e.g. actions, sub-tasks, etc.). A simple approach is to chain together multiple queues and workers. This works, but it adds overhead, can be difficult to audit, and will likely require splitting related code into separate jobs.
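The following sketch shows what chaining queues and workers can look like. It uses a hypothetical in-memory `Queue` class in place of a real message broker, and the job shapes and worker names are invented for illustration. Notice that each worker has to know which queue comes next, so the workflow's overall shape is spread across separate jobs.

```typescript
// Hypothetical in-memory stand-in for a message queue; a real system would use a broker.
class Queue<T> {
  private items: T[] = [];
  push(item: T): void {
    this.items.push(item);
  }
  pop(): T | undefined {
    return this.items.shift();
  }
}

interface UploadJob {
  filename: string;
}
interface EnrichJob {
  filename: string;
  valid: boolean;
}

const validateQueue = new Queue<UploadJob>();
const enrichQueue = new Queue<EnrichJob>();
const processed: string[] = [];

// Worker 1: validates the upload, then hands off to the next queue.
function validateWorker(): void {
  const job = validateQueue.pop();
  if (!job) return;
  // Placeholder check; a real worker would download and inspect the file.
  const valid = job.filename.endsWith(".csv");
  enrichQueue.push({ filename: job.filename, valid });
}

// Worker 2: lives in a separate job and only sees what worker 1 passed along.
function enrichWorker(): void {
  const job = enrichQueue.pop();
  if (!job || !job.valid) return;
  processed.push(job.filename);
}

validateQueue.push({ filename: "contacts.csv" });
validateWorker();
enrichWorker();
```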

Another approach is to use a workflow orchestration framework. Each has its own paradigm for building this logic, often with a significant learning curve.

Writing critical workflows as code

Our goal was to enable developers to write complex jobs that combine multiple steps right in code, while minimizing the custom DSLs and concepts they need to learn.

We don't think writing step functions or workflows as a large JSON configuration is a great developer experience. Using an online GUI to design these isn't always ideal either, as the logic or DAG may be edited or managed outside of the actual code's version control.

We also believe workflows should be simple to run everywhere, from local development to production, where code can be deployed to any number of platforms.

Let's take a use case to demonstrate our approach. Consider a CRM product that lets a user upload a large CSV file of contact information. The API triggers the workflow by sending the event api/contact_list.uploaded. This advanced CRM tool validates the upload, uses third party APIs to enrich the data with information about each contact's business, and then waits for the user to review and filter the list before inserting the data into the CRM database. When the user approves or rejects the list and selects filters, the API sends another event: api/contact_list.reviewed. Here's what the code could look like:

import { inngest } from "./client";

inngest.createFunction(
  { name: "Contacts Import and enrichment" },
  { event: "api/contact_list.uploaded" },
  async ({ event, step }) => {
    const { isValid, errors } = await step.run("Validate upload contents", async () => {
      // Download the csv file, validate columns and data in each row
      const { isValid, errors } = await downloadAndValidateCSV(event.data.filename);
      return { isValid, errors };
    });

    if (!isValid) {
      return await step.run("Notify user of invalid contents", async () =>
        await sendContactsImportFailedEmail(event.user.id, errors)
      );
    }

    // Enrichment may fail at times due to a networking blip
    await step.run("Enrich contacts information", async () => {
      // Call a third party API service to enrich each contact's info,
      // then upload the data to an object store when complete
    });

    const listReviewedEvent = await step.waitForEvent("api/contact_list.reviewed", {
      timeout: "7d",
      match: "data.upload_id", // data.upload_id is in both events and must match to proceed
    });

    // No review within the timeout yields null; this example treats that like a rejection
    if (!listReviewedEvent || listReviewedEvent.data.is_approved === false) {
      return await step.run("Delete uploaded contact lists", () => { /* ...*/ });
    }

    const { totalUsersAdded } = await step.run("Create contacts in CRM", async () => {
      const contacts = await downloadEnrichedContactList(event.data.filename);
      const filteredContacts = applyFilters(contacts, listReviewedEvent.data.filters);
      return await insertContactsIntoCRMDatabase(event.data.account_id, filteredContacts);
    });

    await step.run("Notify user of successful import", async () =>
      await sendContactsImportSuccessEmail(event.user.id, totalUsersAdded)
    );
  }
)

All code within each step.run() callback runs independently and is retried upon failure, resuming from where the function left off.
Using step.waitForEvent() enables the workflow to wait for additional user interactions or events via event coordination.
The results of each step, including failures and retries, are logged and can be inspected on the Inngest dashboard for complete observability.

More info on these and other tools is in our docs.
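To illustrate the idea behind match: "data.upload_id", here is a small sketch of pairing a later event with the event that triggered the workflow. It is a conceptual model only, not how Inngest implements matching; `WorkflowEvent`, `getPath`, `findMatch`, and the sample payload values are all hypothetical.

```typescript
// Hypothetical event shape, reduced to the fields this example needs.
interface WorkflowEvent {
  name: string;
  data: Record<string, unknown>;
}

// Reads a dotted path such as "data.upload_id" from an event.
function getPath(event: WorkflowEvent, path: string): unknown {
  let current: unknown = event;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Returns the first incoming event with the expected name whose value at `match`
// equals the trigger's value at the same path, or null if none qualifies.
function findMatch(
  trigger: WorkflowEvent,
  incoming: WorkflowEvent[],
  name: string,
  match: string,
): WorkflowEvent | null {
  const expected = getPath(trigger, match);
  if (expected === undefined) return null;
  return incoming.find((e) => e.name === name && getPath(e, match) === expected) ?? null;
}

const uploaded: WorkflowEvent = {
  name: "api/contact_list.uploaded",
  data: { upload_id: "up_123", filename: "contacts.csv" },
};
const incoming: WorkflowEvent[] = [
  { name: "api/contact_list.reviewed", data: { upload_id: "up_999", is_approved: true } },
  { name: "api/contact_list.reviewed", data: { upload_id: "up_123", is_approved: true } },
];

const reviewed = findMatch(uploaded, incoming, "api/contact_list.reviewed", "data.upload_id");
```

Only the second incoming event shares the trigger's upload_id, so a review of a different upload cannot resume this run by accident.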

Advanced monitoring and querying

Query your data with Insights

Inngest now includes Insights—the ability to query your events and runs using SQL directly in the dashboard:

```sql
SELECT event_name, COUNT(*) as event_count
FROM events
WHERE created_at > NOW() - INTERVAL '7 days'
GROUP BY event_name
ORDER BY event_count DESC;
```

Insights enables:

Custom queries for event analysis
Run performance tracking
Data export for custom reporting
Schema explorer to discover available data

Insights is particularly powerful for AI workflows: you can track token usage, model calls, and agent performance directly from your workflow data.


Export metrics to Datadog

For teams using Datadog, you can now export all Inngest metrics for centralized monitoring and alerting. Configure the integration in your Inngest dashboard under Settings > Integrations > Datadog.

Available metrics include:

inngest_function_run_scheduled_total
inngest_function_run_started_total
inngest_function_run_ended_total
And more


Over to you

For critical code that runs in the background, developers need guarantees and observability in a package that is simple to use and easy to maintain. Investing in any solution should also mean transparency, portability, and the potential to self-host. We keep our system completely open source, including the execution engine that powers all of this functionality and the SDK: inngest/inngest, inngest/inngest-js.

Our beta release is out today and can be used with Inngest Cloud, which is free to start using. Come join us on Discord or GitHub to share feedback or get support building critical workflows of your own.