# AI Inference

You can build complex AI workflows and call model providers as steps using two step methods, `step.ai.infer()` and `step.ai.wrap()`, or our AgentKit SDK. They work with any model provider, and all offer full AI observability:

- `step.ai.wrap()` wraps other AI SDKs (OpenAI, Anthropic, and Vercel AI SDK) as a step, augmenting the observability of your Inngest Functions with information such as prompts and tokens used.
- `step.ai.infer()` offloads the inference request to Inngest's infrastructure, pausing your function execution until the request finishes. This can be a significant cost saver if you deploy to serverless functions.
- [AgentKit](https://agentkit.inngest.com) allows you to easily create single model calls or agentic workflows.

For building a full ReAct-style agent loop where each LLM call and tool execution is a retriable step, see the [agent tool loop pattern](/docs-markdown/ai-patterns/agent-tool-loops). To expand your system with [sub-agent delegation](/docs-markdown/ai-patterns/sub-agent-delegation), use `step.invoke()` and `step.sendEvent()`.

### Benefits

Using `step.ai` or [AgentKit](https://agentkit.inngest.com) allows you to:

- Automatically monitor AI usage in production to ensure quality output
- Easily iterate and test prompts in the dev server
- Track requests and responses from foundational inference providers
- Track how inference calls work together in multi-step or agentic workflows

## Step tools: `step.ai`

### `step.ai.infer()`

Using `step.ai.infer()` allows you to call any inference provider's endpoints by offloading the request to Inngest's infrastructure.
All requests and responses are automatically tracked within your workflow traces.

**Request offloading**

In serverless environments, your function is not executing while the request is in progress, which means you don't pay for function execution while waiting for the provider's response.
Once the request finishes, your function resumes with the inference result's data. Inngest never logs or stores your API keys or authentication headers. Authentication originates from your own functions.
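To make the saving concrete, here is a rough back-of-the-envelope sketch. The timings are hypothetical, not measurements, and the function names are illustrative rather than part of the Inngest SDK. It compares the billed compute time of a function that waits on the provider in-process with one that offloads the wait.

```ts
// Hypothetical timings, in milliseconds, for one function run.
interface RunTimings {
  setupMs: number; // work before the inference request is sent
  inferenceMs: number; // time the provider takes to respond
  postMs: number; // work after the response arrives
}

// Waiting in-process: the function is billed for the whole run.
function billedInProcess(t: RunTimings): number {
  return t.setupMs + t.inferenceMs + t.postMs;
}

// Offloaded: the function is not executing while the request is in flight,
// so only the work before and after the request is billed.
function billedOffloaded(t: RunTimings): number {
  return t.setupMs + t.postMs;
}

const sampleRun: RunTimings = { setupMs: 50, inferenceMs: 8000, postMs: 150 };
const inProcessMs = billedInProcess(sampleRun); // 8200
const offloadedMs = billedOffloaded(sampleRun); // 200

console.log(`in-process: ${inProcessMs} ms, offloaded: ${offloadedMs} ms`);
```

This ignores cold starts and the overhead of re-invoking the function after the request finishes, so treat it as an upper bound on the saving.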

Here's an example which calls OpenAI:

```ts {{ title: "TypeScript" }}
export default inngest.createFunction(
  { id: "summarize-contents", triggers: { event: "app/ticket.created" } },
  async ({ event, step }) => {

    // This calls your model's chat endpoint, adding AI observability,
    // metrics, datasets, and monitoring to your calls.
    const response = await step.ai.infer("call-openai", {
      model: step.ai.models.openai({ model: "gpt-4o" }),
      // body is the model request, which is strongly typed depending on the model
      body: {
        messages: [{
          role: "user",
          content: "Write instructions for improving short term memory",
        }],
      },
    });

    // The response is also strongly typed depending on the model.
    return response.choices;
  }
);
```

```py {{ title: "Python" }}
import inngest
from inngest.experimental import ai

client = inngest.Inngest(app_id="my-app")

@client.create_function(
    fn_id="summarize-contents",
    trigger=inngest.TriggerEvent(event="app/ticket.created"),
)
async def summarize_contents(ctx: inngest.Context) -> object:
    # This calls your model's chat endpoint, adding AI observability,
    # metrics, datasets, and monitoring to your calls.
    res = await ctx.step.ai.infer(
        "call-openai",
        adapter=ai.openai.Adapter(
            # Placeholder key: load the real key from your environment or secret store.
            auth_key="sk-openai-000000",
            model="gpt-4o",
        ),
        # body is the model request
        body={
            "messages": [
                {
                    "role": "user",
                    "content": "Write instructions for improving short term memory",
                }
            ],
        },
    )

    # The response is a dict matching the provider's response
    return res["choices"]
```

**"PDF processing with Claude Sonnet and step.ai.infer()"**: [Use step.ai.infer() to process a PDF with Claude Sonnet.](https://github.com/inngest/inngest-js/tree/main/examples/step-ai/anthropic-claude-pdf-processing/#readme)

### `step.ai.wrap()` (TypeScript only)

Using `step.ai.wrap()` allows you to wrap other TypeScript AI SDKs, treating each inference call as a step. This allows you to easily convert AI calls to steps with full observability without changing much application-level code:

```ts {{ title: "Vercel AI SDK" }}
import { generateText } from "ai"
import { openai } from "@ai-sdk/openai"

export default inngest.createFunction(
  { id: "summarize-contents", triggers: { event: "app/ticket.created" } },
  async ({ event, step }) => {

    // This calls `generateText` with the given arguments, adding AI observability,
    // metrics, datasets, and monitoring to your calls.
    const { text } = await step.ai.wrap("using-vercel-ai", generateText, {
      model: openai("gpt-4-turbo"),
      prompt: "What is love?"
    });

  }
);
```

```ts {{ title: "Anthropic SDK" }}
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic();

export default inngest.createFunction(
  { id: "summarize-contents", triggers: { event: "app/ticket.created" } },
  async ({ event, step }) => {

    // This calls `anthropic.messages.create` with the given arguments, adding
    // AI observability, metrics, datasets, and monitoring to your calls.
    // The method is bound so the client keeps its instance context (see Limitations).
    const createMessage = anthropic.messages.create.bind(anthropic.messages);
    const result = await step.ai.wrap("using-anthropic", createMessage, {
      model: "claude-3-5-sonnet-20241022",
      max_tokens: 1024,
      messages: [{ role: "user", content: "Hello, Claude" }],
    });

  }
);
```

In this case, instead of calling the SDK directly, you specify the SDK function you want to call and the function's arguments separately within `step.ai.wrap()`.
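The calling convention can be illustrated with a small stand-in. `wrapLike` and `fakeGenerate` below are hypothetical and are not Inngest's implementation. The sketch only shows why passing the function and its arguments separately lets a wrapper record the input and output of each call.

```ts
// Hypothetical record of one wrapped call.
interface CallRecord {
  stepId: string;
  input: unknown[];
  output: unknown;
}

const callLog: CallRecord[] = [];

// Hypothetical stand-in for a wrapper: it receives the function and its
// arguments separately, so it can observe both the input and the output.
async function wrapLike<A extends unknown[], R>(
  stepId: string,
  fn: (...args: A) => Promise<R>,
  ...args: A
): Promise<R> {
  const output = await fn(...args);
  callLog.push({ stepId, input: args, output });
  return output;
}

// Hypothetical stand-in for an SDK call such as `generateText`.
async function fakeGenerate(opts: { prompt: string }): Promise<{ text: string }> {
  return { text: `echo: ${opts.prompt}` };
}

// Direct call:  fakeGenerate({ prompt: "hello" })
// Wrapped call: the same function and arguments, passed separately.
const wrappedResult = wrapLike("fake-generate", fakeGenerate, { prompt: "hello" });

wrappedResult.then((r) => console.log(r.text, callLog.length));
```

Because the wrapper sees the arguments as data rather than as a call that has already been made, it can log them and, in principle, replay the call with edited arguments.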

### Supported providers

`step.ai.infer()` currently supports the following providers:

- `openai`, including any OpenAI-compatible API such as Perplexity
- `gemini`
- `anthropic`
- `grok`
- `azure-openai`
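"OpenAI-compatible" here means the provider accepts the same chat request shape at a different base URL. The sketch below is a plain illustration, not Inngest code: `buildChatRequest` is a hypothetical helper, and the base URLs, the `/chat/completions` path, and the `sonar` model name are assumptions drawn from the providers' public APIs.

```ts
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a chat completion request for any provider
// that follows the OpenAI request shape. Only the base URL, key, and model change.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): ChatRequest {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}

const chatMessages: ChatMessage[] = [{ role: "user", content: "Hello" }];

// Assumed base URLs; check each provider's documentation before relying on them.
const openAiRequest = buildChatRequest("https://api.openai.com/v1", "sk-placeholder", "gpt-4o", chatMessages);
const perplexityRequest = buildChatRequest("https://api.perplexity.ai/", "pplx-placeholder", "sonar", chatMessages);

console.log(openAiRequest.url);
console.log(perplexityRequest.url);
```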

### Limitations

- Support for streaming responses from providers is coming soon, alongside real-time support with Inngest functions.

- When using `step.ai.wrap` with SDK clients that need their instance context preserved between
  invocations, you currently have to bind the client method outside the `step.ai.wrap` call, like so:

```ts {{ title: "Wrap Anthropic SDK" }}
import Anthropic from "@anthropic-ai/sdk";
const anthropic = new Anthropic();

export const anthropicWrapGenerateText = inngest.createFunction(
  { id: "anthropic-wrap-generateText", triggers: { event: "anthropic/wrap.generate.text" } },
  async ({ event, step }) => {
    //
    // Will fail because anthropic client requires instance context
    // to be preserved across invocations.
    await step.ai.wrap(
      "using-anthropic",
      anthropic.messages.create,
      {
        model: "claude-3-5-sonnet-20241022",
        max_tokens: 1024,
        messages: [{ role: "user", content: "Hello, Claude" }],
      },
    );

    //
    // Will work because we bind to preserve instance context
    const createCompletion = anthropic.messages.create.bind(anthropic.messages);
    await step.ai.wrap(
      "using-anthropic",
      createCompletion,
      {
        model: "claude-3-5-sonnet-20241022",
        max_tokens: 1024,
        messages: [{ role: "user", content: "Hello, Claude" }],
      },
    );
  },
);
```

```ts {{ title: "Wrap OpenAI SDK" }}
import OpenAI from "openai";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export const openAIWrapCompletionCreate = inngest.createFunction(
  { id: "openai-wrap-completion-create", triggers: { event: "openai/wrap.completion.create" } },
  async ({ event, step }) => {
    //
    // Will fail because the OpenAI client requires instance context
    // to be preserved across invocations.
    await step.ai.wrap(
      "openai.wrap.completions",
      openai.chat.completions.create,
      {
        model: "gpt-4o-mini",
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          {
            role: "user",
            content: "Write a haiku about recursion in programming.",
          },
        ],
      },
    );

    //
    // Will work because we bind to preserve instance context
    const createCompletion = openai.chat.completions.create.bind(
      openai.chat.completions,
    );

    const response = await step.ai.wrap(
      "openai-wrap-completions",
      createCompletion,
      {
        model: "gpt-4o-mini",
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          {
            role: "user",
            content: "Write a haiku about recursion in programming.",
          },
        ],
      },
    );
  },
);
```
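The underlying cause is ordinary JavaScript `this` binding rather than anything specific to Inngest. The minimal sketch below uses a hypothetical `FakeClient` class (not a real SDK) to show why a method passed by reference loses its instance and how `bind` restores it.

```ts
// Hypothetical client whose method depends on instance state via `this`.
class FakeClient {
  private apiKey = "test-key";

  create(prompt: string): string {
    return `${this.apiKey}:${prompt}`;
  }
}

// Calls whatever function it is given, the way a wrapper would.
function callLater(fn: (prompt: string) => string, prompt: string): string {
  return fn(prompt);
}

const fakeClient = new FakeClient();

// Passing the method by reference detaches it from `fakeClient`. Class bodies
// run in strict mode, so `this` is undefined inside `create` and the call throws.
let unboundWorked = false;
try {
  unboundWorked = callLater(fakeClient.create, "hi") === "test-key:hi";
} catch {
  unboundWorked = false;
}

// Binding fixes `this` to the instance before the method is passed along.
const boundResult = callLater(fakeClient.create.bind(fakeClient), "hi");

console.log(unboundWorked, boundResult);
```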

- When using `step.ai.wrap`, you can edit prompts and rerun steps in the dev server,
  but only if the arguments are JSON serializable.

```ts {{ title: "Vercel AI SDK" }}
import { generateText as vercelGenerateText } from "ai";
import { openai as vercelOpenAI } from "@ai-sdk/openai";

export const vercelWrapGenerateText = inngest.createFunction(
  { id: "vercel-wrap-generate-text", triggers: { event: "vercel/wrap.generate.text" } },
  async ({ event, step }) => {
    //
    // Will work but you will not be able to edit the prompt and rerun the step in the dev server.
    await step.ai.wrap(
      "vercel-openai-generateText",
      vercelGenerateText,
      {
        model: vercelOpenAI("gpt-4o-mini"),
        prompt: "Write a haiku about recursion in programming.",
      },
    );

    //
    // Will work and you will be able to edit the prompt and rerun the step in the dev server because
    // the arguments to step.ai.wrap are JSON serializable.
    const args = {
      model: "gpt-4o-mini",
      prompt: "Write a haiku about recursion in programming.",
    };

    const gen = ({ model, prompt }: { model: string; prompt: string }) =>
      vercelGenerateText({
        model: vercelOpenAI(model),
        prompt,
      });

    await step.ai.wrap("using-vercel-ai", gen, args);
  },
);
```
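One way to check ahead of time whether an argument object qualifies is to round-trip it through JSON and compare the result with the original. The helper `survivesJsonRoundTrip` below is a hypothetical illustration, not part of the Inngest SDK. It treats anything that changes during the round trip (functions, class instances, `undefined` fields) as not serializable.

```ts
// Structural equality for plain JSON-like values only.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) {
    return false;
  }
  // Class instances (Date, SDK model objects, ...) do not survive JSON intact.
  const proto = Object.getPrototypeOf(a);
  if (proto !== Object.prototype && proto !== Array.prototype && proto !== null) {
    return false;
  }
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  return aKeys.every((k) =>
    deepEqual((a as Record<string, unknown>)[k], (b as Record<string, unknown>)[k]),
  );
}

// Hypothetical helper: true only if the value is unchanged by a JSON round trip.
function survivesJsonRoundTrip(value: unknown): boolean {
  try {
    const encoded: string | undefined = JSON.stringify(value);
    if (encoded === undefined) return false; // e.g. a bare function
    return deepEqual(value, JSON.parse(encoded));
  } catch {
    return false; // e.g. circular references or BigInt values
  }
}

const plainArgs = { model: "gpt-4o-mini", prompt: "Write a haiku." };

// Hypothetical stand-in for a provider model object that carries behavior.
const argsWithModelObject = {
  model: { id: "gpt-4o-mini", doGenerate: () => "..." },
  prompt: "Write a haiku.",
};

console.log(survivesJsonRoundTrip(plainArgs)); // true
console.log(survivesJsonRoundTrip(argsWithModelObject)); // false
```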

- The TypeScript definition of `step.ai.wrap` will, for the most part, infer allowable inputs from the
  signature of the wrapped function. However, where the wrapped function has complex
  overloads, such as Vercel's `generateObject`, a type cast may be necessary.

*Note*: Future versions of the TypeScript SDK will correctly infer these complex types, but for now we
require type casting to ensure backward compatibility.

```ts {{ title: "Vercel AI SDK" }}
import { generateObject as vercelGenerateObject } from "ai";
import { openai as vercelOpenAI } from "@ai-sdk/openai";
import { z } from "zod";

export const vercelWrapSchema = inngest.createFunction(
  { id: "vercel-wrap-generate-object", triggers: { event: "vercel/wrap.generate.object" } },
  async ({ event, step }) => {
    //
    // Calling generateObject directly is fine
    await vercelGenerateObject({
      model: vercelOpenAI("gpt-4o-mini"),
      schema: z.object({
        recipe: z.object({
          name: z.string(),
          ingredients: z.array(
            z.object({ name: z.string(), amount: z.string() }),
          ),
          steps: z.array(z.string()),
        }),
      }),
      prompt: "Generate a lasagna recipe.",
    });

    //
    // step.ai.wrap requires type casting
    await step.ai.wrap(
      "vercel-openai-generateObject",
      vercelGenerateObject,
      {
        model: vercelOpenAI("gpt-4o-mini"),
        schema: z.object({
          recipe: z.object({
            name: z.string(),
            ingredients: z.array(
              z.object({ name: z.string(), amount: z.string() }),
            ),
            steps: z.array(z.string()),
          }),
        }),
        prompt: "Generate a lasagna recipe.",
      } as any,
    );
  },
);
```
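One common reason overloads defeat inference is that TypeScript's `Parameters<>` utility reads only the last overload of a function. The sketch below demonstrates this with a hypothetical overloaded `generate` function and a hypothetical `wrapTyped` helper. It is not the actual `step.ai.wrap` type definition, which may differ.

```ts
// Two overloads: the return type depends on the `output` field.
function generate(opts: { output: "text"; prompt: string }): string;
function generate(opts: { output: "count"; prompt: string }): number;
function generate(opts: { output: "text" | "count"; prompt: string }): string | number {
  return opts.output === "text" ? opts.prompt.toUpperCase() : opts.prompt.length;
}

// Hypothetical wrapper that derives its argument types from the wrapped function.
function wrapTyped<F extends (...args: any[]) => any>(
  fn: F,
  ...args: Parameters<F>
): ReturnType<F> {
  return fn(...args);
}

// `Parameters<typeof generate>` is taken from the LAST overload only, so the
// compiler accepts arguments matching `output: "count"` ...
const countResult = wrapTyped(generate, { output: "count", prompt: "hello" });

// ... but rejects arguments for the first overload unless they are cast:
// wrapTyped(generate, { output: "text", prompt: "hello" }); // type error
const textResult: unknown = wrapTyped(
  generate,
  { output: "text", prompt: "hello" } as any,
);

console.log(countResult, textResult);
```

The cast silences the compiler without changing runtime behavior, which is why the call still works once the arguments are cast.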

## AgentKit: AI and agent orchestration

AgentKit is a simple, standardized way to implement model calling, whether as individual calls, complex workflows, or agentic flows.

Here's an example of a single model call:

```ts {{ title: "TypeScript" }}

import { agenticOpenai as openai, createAgent } from "@inngest/agent-kit";

export default inngest.createFunction(
  { id: "summarize-contents", triggers: { event: "app/ticket.created" } },
  async ({ event, step }) => {

    // Create a new agent with a system prompt (you can add optional tools, too)
    const writer = createAgent({
      name: "writer",
      system: "You are an expert writer.  You write readable, concise, simple content.",
      model: openai({ model: "gpt-4o", step }),
    });

    // Run the agent with an input.  This automatically uses steps
    // to call your AI model.
    const { output } = await writer.run("Write a tweet on how AI works");
  }
);

```

[Read the full AgentKit docs here](https://agentkit.inngest.com) and [see the code on GitHub](https://github.com/inngest/agent-kit).