# Step parallelism

- If you’re using a serverless platform to host, code will run in true parallelism similar to multi-threading (without shared state)
- Each step will be individually retried

### Platform support

**Parallelism works across all providers and platforms**.  True parallelism is supported for serverless functions;  if you’re using a single Express server you’ll be splitting all parallel jobs amongst a single-threaded node server.

## Running steps in parallel

You can run steps in parallel via `Promise.all()`:

- Create each step via [`step.run()`](/docs-markdown/reference/functions/step-run) without awaiting, which returns an unresolved promise.
- Await all steps via `Promise.all()`. This triggers all steps to run in parallel via separate executions.

A common use case is to split work into chunks:

```ts
import { Inngest } from "inngest";

const inngest = new Inngest({ id: "signup-flow" });

export const fn = inngest.createFunction(
  { id: "post-payment-flow", triggers: { event: "stripe/charge.created" } },
  async ({ event, step }) => {
    // These steps are not `awaited` and run in parallel when Promise.all
    // is invoked.
    const sendEmail = step.run("confirmation-email", async () => {
      const emailID = await sendEmail(event.data.email);
      return emailID;
    });

    const updateUser = step.run("update-user", async () => {
      return db.updateUserWithCharge(event);
    });

    // Run both steps in parallel.  Once complete, Promise.all will return all
    // parallelized state here.
    //
    // This ensures that all steps complete as fast as possible, and we still have
    // access to each step's data once they're complete.
    const [emailID, updates] = await Promise.all([sendEmail, updateUser]);

    return { emailID, updates };
  }
);
```

When each step is finished, Inngest will aggregate each step's state and re-invoke the function with all state available.

### Step parallelism in Python

Inngest supports parallel steps regardless of whether you're using asynchronous or synchronous code. For both approaches, you can use `step.parallel`:

#### async - with `inngest.Step` and `await ctx.group.parallel()`

```py
@client.create_function(
  fn_id="my-fn",
  trigger=inngest.TriggerEvent(event="my-event"),
)
async def fn(ctx: inngest.Context) -> None:
  user_id = ctx.event.data["user_id"]

  (updated_user, sent_email) = await ctx.group.parallel(
    (
      lambda: step.run("update-user", update_user, user_id),
      lambda: step.run("send-email", send_email, user_id),
    )
  )
```

#### sync - with `inngest.StepSync` and `group.parallel()`

```py
@client.create_function(
  fn_id="my-fn",
  trigger=inngest.TriggerEvent(event="my-event"),
)
def fn(ctx: inngest.ContextSync) -> None:
  user_id = ctx.event.data["user_id"]

  (updated_user, sent_email) = ctx.group.parallel(
    (
      lambda: ctx.step.run("update-user", update_user, user_id),
      lambda: ctx.step.run("send-email", send_email, user_id),
    )
  )
```

## Optimizing parallel step performance

Without optimized parallelism, parallel steps require 2 requests per step to your application. If you have many parallel steps (e.g., hundreds), this can lead to:

- High number of HTTP requests to your application
- Increased ingress bandwidth
- Higher CPU usage from request parsing

Optimized parallelism reduces this to just 1 request per parallel step, significantly improving performance for functions with many parallel operations.

### TypeScript

You can disable optimized parallelism by setting `optimizeParallelism: false` on your client or function.

**Important considerations:**

- **`Promise.race` behavior**: With optimized parallelism (the default in v4), `Promise.race` waits for all parallel steps to complete before resolving. If you need early resolution behavior, use `group.parallel()`:

```ts
const winner = await group.parallel(async () => {
  return Promise.race([
    step.run("a", () => "a"),
    step.run("b", () => "b"),
  ]);
});
```

- **Sequential steps in parallel groups**: Steps that run sequentially within different parallel branches may not execute in the order you expect. For example:

```ts
const stepOrder: string[] = [];

export const fn = inngest.createFunction(
  { id: "fn-1", triggers: { event: "event-1" } },
  async ({ step }) => {
    await Promise.all([
      (async () => {
        await step.run("fast.1", async () => {
          stepOrder.push("fast.1");
        });
        await step.run("fast.2", async () => {
          stepOrder.push("fast.2");
        });
      })(),
      (async () => {
        await step.run("slow.1", async () => {
          await sleep(1000);
          stepOrder.push("slow.1");
        });
        await step.run("slow.2", async () => {
          await sleep(1000);
          stepOrder.push("slow.2");
        });
      })(),
    ]);

    // With optimizeParallelism: ['fast.1', 'slow.1', 'fast.2', 'slow.2']
    // Without optimizeParallelism: ['fast.1', 'fast.2', 'slow.1', 'slow.2']
  }
);
```

### Python: Optimized by default with opt-out

Python always uses optimized parallelism by default, as it doesn't have an equivalent to `Promise.race` to worry about. However, you can opt out at the group level if you need sequential steps within parallel groups to run independently.

Use the `parallel_mode` parameter to control this behavior:

```py
import inngest
import asyncio

@inngest_client.create_function(
    fn_id="my-fn",
    trigger=inngest.TriggerEvent(event="my-event"),
)
async def fn(ctx: inngest.Context) -> None:
    async def fast_group() -> None:
        await ctx.step.run("a", lambda: asyncio.sleep(1))
        await ctx.step.run("b", lambda: asyncio.sleep(1))

    async def slow_group() -> None:
        await ctx.step.run("x", lambda: asyncio.sleep(10))
        await ctx.step.run("y", lambda: asyncio.sleep(10))

    # Using RACE mode makes steps run in expected order: a, b, x, y
    await ctx.group.parallel(
        (fast_group, slow_group), 
        parallel_mode=inngest.ParallelMode.RACE
    )
```

**Without specifying `parallel_mode`**, the steps will run in the order: `a`, `x`, `b`, `y` (optimized mode). Everything is still correct, but step `b` doesn't run until step `x` completes.

**With `parallel_mode=inngest.ParallelMode.RACE`**, the steps run in the expected order: `a`, `b`, `x`, `y`, but with the performance trade-off of more requests.

## Chunking jobs

A common use case is to chunk work. For example, when using OpenAI's APIs you might need to chunk a user's input and run the API on many chunks, then aggregate all data:

```ts
import { Inngest } from "inngest";

const inngest = new Inngest({ id: "signup-flow" });

export const fn = inngest.createFunction(
  { id: "summarize-text", triggers: { event: "app/text.summarize" } },
  async ({ event, step }) => {
    const chunks = splitTextIntoChunks(event.data.text);

    const summaries = await Promise.all(
      chunks.map((chunk, index) =>
        step.run(`summarize-chunk-${index}`, () => summarizeChunk(chunk))
      )
    );

    await step.run("summarize-summaries", () => summarizeSummaries(summaries));
  }
);
```

This allows you to run many independent steps, wait until they're all finished, then fetch the results from all steps within a few lines of code. Doing this in a traditional system would require creating many jobs, polling the status of all jobs, and manually combining state.

## Limitations

Currently, the total data returned from **all** steps must be under 4MB (eg. a single step can return a max of. 4MB, or 4 steps can return a max of 1MB each).  Functions are also limited to a maximum of 1,000 steps.

## Parallelism vs fan-out

Another technique similar to parallelism is fan-out ([read the guide here](/docs-markdown/guides/fan-out-jobs)):  when one function sends events to trigger other functions.  Here are the key differences:

- Both patterns run jobs in parallel
- You can access the output of steps ran in parallel within your function, whereas with fan-out you cannot
- Parallelism has a limit of 1,000 steps, though you can create as many functions as you'd like using fan-out
- You can replay events via fan-out, eg. to test functions locally
- You can retry individual functions easily if they permanently fail, whereas if a step permanently fails (after retrying) the function itself will fail and terminate.
- Fan-out splits functionality into different functions, using step functions keeps all related logic in a single, easy to read function