Reliably run critical workflows
Break multi-step AI pipelines and complex business logic into durable, independently retried steps.
Many applications run mission-critical code in the background: processing payments, orchestrating multi-step AI pipelines, syncing data across systems, or handling complex fulfillment logic. Because this work is asynchronous, its success or failure is harder to observe than that of a synchronous API request. And when a job runs for minutes or hours, reliability and durability become essential.
You might start with a single long function, but what happens when it fails halfway through? Does the entire job restart from the beginning? If you're running an AI pipeline that calls an LLM, stores the result, then calls another LLM to evaluate the output, a failure on the evaluation step shouldn't force you to re-run (and re-pay for) the first LLM call.
The answer is to break your long-running jobs into discrete steps. Each step runs independently, can throw its own errors, retries on its own, and passes data to the next step. This is the core of durable execution: your workflow makes progress even when individual pieces fail.
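To make the idea concrete, here is a minimal in-memory sketch of step memoization. It is an illustration under simplifying assumptions, not how Inngest is implemented: the names `makeStepTools` and `runDurably` are hypothetical, the steps are synchronous for brevity, and a `Map` stands in for persisted state.

```typescript
// Minimal in-memory sketch of step memoization (hypothetical, not Inngest's implementation).
type StepState = Map<string, unknown>;

interface StepTools {
  run<T>(id: string, fn: () => T): T;
}

function makeStepTools(state: StepState): StepTools {
  return {
    run<T>(id: string, fn: () => T): T {
      // A step that already succeeded returns its saved result without re-running.
      if (state.has(id)) return state.get(id) as T;
      const result = fn(); // if this throws, nothing is saved for this step
      state.set(id, result);
      return result;
    },
  };
}

function runDurably<T>(workflow: (step: StepTools) => T, maxAttempts = 3): T {
  const state: StepState = new Map(); // stands in for durable storage
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return workflow(makeStepTools(state));
    } catch (err) {
      lastError = err; // completed steps stay in `state` for the next attempt
    }
  }
  throw lastError;
}

// Example: the second step fails once; the first step is not repeated.
const calls = { summarize: 0, evaluate: 0 };

const result = runDurably((step) => {
  const summary = step.run("summarize", () => {
    calls.summarize += 1;
    return "short summary";
  });
  return step.run("evaluate", () => {
    calls.evaluate += 1;
    if (calls.evaluate === 1) throw new Error("simulated timeout");
    return `evaluated: ${summary}`;
  });
});

console.log(result, calls);
```

In this sketch the workflow function is re-entered after the failure, but the "summarize" step returns its saved result instead of running again, so the expensive call happens only once.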
§With Inngest
Inngest lets you compose workflows using built-in step tools. Each step.run() call is independently executed and retried. The following example demonstrates a multi-step AI content pipeline triggered by a document upload:
```typescript
import { inngest } from "./client";

export const processDocument = inngest.createFunction(
  { id: "process-uploaded-document", triggers: [{ event: "api/document.uploaded" }] },
  async ({ event, step }) => {
    // Step 1: Validate the upload
    const { isValid, errors } = await step.run(
      "validate-upload",
      async () => {
        const file = await storage.getFile(event.data.filename);
        return validateDocument(file);
      }
    );

    if (!isValid) {
      return await step.run(
        "notify-invalid-upload",
        async () => await sendUploadFailedEmail(event.data.userId, errors)
      );
    }

    // Step 2: Extract and summarize content with an LLM
    const summary = await step.run("summarize-content", async () => {
      const content = await storage.getFileContent(event.data.filename);
      return await llm.summarize(content);
    });

    // Step 3: Classify the document
    const classification = await step.run("classify-document", async () => {
      return await llm.classify(summary, {
        categories: ["invoice", "contract", "report", "correspondence"],
      });
    });

    // Step 4: Store results
    await step.run("save-results", async () => {
      await db.documents.update({
        where: { id: event.data.documentId },
        data: { summary, classification: classification.category },
      });
    });

    // Step 5: Notify the user
    await step.run(
      "notify-success",
      async () => await sendProcessingCompleteEmail(
        event.data.userId,
        event.data.documentId
      )
    );
  }
);
```

Each step runs independently and retries when it fails. If the LLM call in step 3 times out, Inngest retries just that step without re-running the validation or the first LLM call. The data returned from each step is persisted and available to subsequent steps automatically.
§Alternative approaches
You can implement durable workflows with AWS Step Functions or Temporal, but each comes with a significant learning curve and operational overhead. Step Functions requires you to define workflows in a JSON DSL (the Amazon States Language). Temporal requires running its own server infrastructure and learning a framework-specific programming model.
Inngest's approach is serverless and designed to feel like writing normal code. You define steps using step.run() inside a regular function, deploy it alongside your application, and Inngest handles execution, retries, and state management.
Rolling your own solution is also possible, but over time it tends to grow into a system of chained queues with separately deployed workers, each with their own retry logic, state management, and logging. This can become difficult to maintain and reason about.
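As a rough illustration of that pattern, the sketch below chains two in-memory queues, each with its own worker. Everything in it is hypothetical (the `Queue` class, the worker functions, the `MAX_ATTEMPTS` limit, and the simulated failure); a real system would use a message broker and separately deployed processes.

```typescript
// Hypothetical hand-rolled pipeline: one queue and one worker per stage.
interface Job<T> {
  payload: T;
  attempts: number;
}

class Queue<T> {
  private jobs: Job<T>[] = [];
  enqueue(payload: T, attempts = 0): void {
    this.jobs.push({ payload, attempts });
  }
  dequeue(): Job<T> | undefined {
    return this.jobs.shift();
  }
}

const MAX_ATTEMPTS = 3;
const summarizeQueue = new Queue<{ docId: string }>();
const classifyQueue = new Queue<{ docId: string; summary: string }>();
const completed: { docId: string; category: string }[] = [];
const deadLetters: string[] = [];

let classifyFailuresLeft = 1; // simulate one transient failure in the second stage

function summarizeWorker(): void {
  const job = summarizeQueue.dequeue();
  if (!job) return;
  try {
    const summary = `summary of ${job.payload.docId}`;
    // State is carried forward by hand inside the next queue's message.
    classifyQueue.enqueue({ docId: job.payload.docId, summary });
  } catch {
    // Retry logic, copy 1.
    if (job.attempts + 1 < MAX_ATTEMPTS) {
      summarizeQueue.enqueue(job.payload, job.attempts + 1);
    } else {
      deadLetters.push(job.payload.docId);
    }
  }
}

function classifyWorker(): void {
  const job = classifyQueue.dequeue();
  if (!job) return;
  try {
    if (classifyFailuresLeft > 0) {
      classifyFailuresLeft -= 1;
      throw new Error("simulated timeout");
    }
    completed.push({ docId: job.payload.docId, category: "report" });
  } catch {
    // Retry logic, copy 2: the same bookkeeping, duplicated per stage.
    if (job.attempts + 1 < MAX_ATTEMPTS) {
      classifyQueue.enqueue(job.payload, job.attempts + 1);
    } else {
      deadLetters.push(job.payload.docId);
    }
  }
}

// Drive both workers for a few ticks, as separate processes would in production.
summarizeQueue.enqueue({ docId: "doc-1" });
for (let tick = 0; tick < 5; tick++) {
  summarizeWorker();
  classifyWorker();
}

console.log(completed, deadLetters);
```

Even at two stages, the retry bookkeeping is duplicated and intermediate state has to be threaded through message payloads. Each additional stage adds another queue, another worker, and another copy of that logic.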