## What is LLM Orchestration?
LLM orchestration frameworks provide the abstractions and tools needed to build complex AI applications. They handle the plumbing — prompt management, model switching, tool integration, memory, retrieval, and chain composition — so you can focus on application logic.
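At its core, the "chain composition" these frameworks provide is function composition over prompts, model calls, and parsers. A minimal sketch of the plumbing being abstracted, using a stubbed model (`fake_model` is a stand-in, not a real API client):

```python
# Minimal sketch of the "plumbing" an orchestration framework abstracts:
# a prompt template, a model call, and an output parser composed into a chain.
# `fake_model` is a stand-in for a real LLM client.

def prompt_template(template: str):
    def format_prompt(inputs: dict) -> str:
        return template.format(**inputs)
    return format_prompt

def fake_model(prompt: str) -> str:
    # A real step would call a provider API here
    return f"[model response to: {prompt}]"

def output_parser(raw: str) -> str:
    return raw.strip()

def chain(*steps):
    """Compose steps left-to-right, like LCEL's .pipe()."""
    def run(inputs):
        value = inputs
        for step in steps:
            value = step(value)
        return value
    return run

qa_chain = chain(
    prompt_template("Explain {topic}: {question}"),
    fake_model,
    output_parser,
)

print(qa_chain({"topic": "AI agents", "question": "What is ReAct?"}))
```

Everything a framework adds — retries, streaming, tracing, swappable models — hangs off this basic composition pattern.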
## Framework Comparison
| Framework | Language | Focus | Best For |
|---|---|---|---|
| LangChain | Python, TypeScript | General-purpose chains & agents | Broad LLM applications |
| LangGraph | Python, TypeScript | Stateful agent workflows | Complex agent systems |
| LlamaIndex | Python, TypeScript | Data indexing & retrieval | RAG applications |
| Vercel AI SDK | TypeScript | Streaming UI + tool use | Next.js / React apps |
| Semantic Kernel | C#, Python | Enterprise AI integration | .NET / Microsoft stack |
## LangChain
LangChain is the most widely adopted framework with the richest ecosystem of integrations. Its core concepts are models, prompts, chains, tools, and memory.
```typescript
// LangChain in TypeScript
import { ChatAnthropic } from "@langchain/anthropic";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { RunnableSequence, RunnablePassthrough } from "@langchain/core/runnables";

const model = new ChatAnthropic({ model: "claude-sonnet-4-20250514" });

// Simple chain with LCEL (LangChain Expression Language)
const chain = ChatPromptTemplate.fromMessages([
  ["system", "You are a helpful assistant that explains {topic} concepts."],
  ["user", "{question}"],
])
  .pipe(model)
  .pipe(new StringOutputParser());

const result = await chain.invoke({
  topic: "AI agents",
  question: "What is the ReAct pattern?",
});

// Complex chain with parallel steps: the object literal is coerced into a
// RunnableMap whose branches all receive the same { text } input
const analysisChain = RunnableSequence.from([
  {
    summary: ChatPromptTemplate.fromMessages([
      ["user", "Summarize: {text}"],
    ]).pipe(model).pipe(new StringOutputParser()),
    sentiment: ChatPromptTemplate.fromMessages([
      ["user", "Analyze sentiment of: {text}"],
    ]).pipe(model).pipe(new StringOutputParser()),
    text: new RunnablePassthrough(),
  },
  ChatPromptTemplate.fromMessages([
    ["user", "Summary: {summary}\nSentiment: {sentiment}\nCreate a report."],
  ]).pipe(model).pipe(new StringOutputParser()),
]);
```
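The object literal in `RunnableSequence.from` acts as a parallel map: every branch receives the same input, and the branch outputs are merged into one dict that feeds the next step. A language-agnostic sketch of that merge semantics (the branch functions here are stubs, not model calls):

```python
# Sketch of the parallel-map step in the chain above: each branch gets
# the same input dict, and the branch outputs are merged into one dict
# that feeds the next step. Branches here are stubs, not model calls.

def run_parallel(branches: dict, inputs: dict) -> dict:
    # A framework would run these concurrently; sequential is fine for a sketch
    return {name: branch(inputs) for name, branch in branches.items()}

branches = {
    "summary": lambda x: f"summary of {x['text']}",
    "sentiment": lambda x: f"sentiment of {x['text']}",
    "text": lambda x: x["text"],  # passthrough
}

merged = run_parallel(branches, {"text": "quarterly report"})
print(merged)
# merged now carries summary, sentiment, and text keys for the report prompt
```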
## LlamaIndex
LlamaIndex specializes in connecting LLMs to data. It provides powerful abstractions for indexing, retrieval, and query engines.
```python
# LlamaIndex RAG pipeline
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.anthropic import Anthropic
from llama_index.embeddings.openai import OpenAIEmbedding

# Configure LLM and embeddings
Settings.llm = Anthropic(model="claude-sonnet-4-20250514")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

# Load and index documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Create a query engine
query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact",  # compact, refine, tree_summarize
)

# Query
response = query_engine.query("What is our vacation policy?")
print(response)
print("Sources:", [n.node.metadata for n in response.source_nodes])

# Chat engine (conversational)
chat_engine = index.as_chat_engine(
    chat_mode="condense_plus_context",
    similarity_top_k=5,
)
r1 = chat_engine.chat("Tell me about the API rate limits")
r2 = chat_engine.chat("Can I increase them?")  # Handles follow-ups
```
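The `condense_plus_context` chat mode handles follow-ups in two stages: first the new question is condensed with the chat history into a standalone query, then that query drives retrieval. A rough sketch of the control flow with stubbed steps (`condense` and `retrieve` here are illustrative placeholders, not LlamaIndex internals):

```python
# Rough sketch of the condense_plus_context flow: rewrite the follow-up
# into a standalone question using history, then retrieve with it.
# condense() and retrieve() are stubs, not LlamaIndex internals.

def condense(history: list[str], question: str) -> str:
    # A real implementation asks the LLM to rewrite the question;
    # here we just splice in the previous turn for ambiguous pronouns.
    if history and ("them" in question or "it" in question):
        return f"{question} (regarding: {history[-1]})"
    return question

def retrieve(query: str, top_k: int = 5) -> list[str]:
    return [f"doc matching '{query}' #{i}" for i in range(top_k)]

history: list[str] = []
for question in ["Tell me about the API rate limits", "Can I increase them?"]:
    standalone = condense(history, question)
    context = retrieve(standalone, top_k=2)
    history.append(question)
    print(standalone, "->", len(context), "docs")
```

Without the condense step, "Can I increase them?" would hit the vector store verbatim and retrieve nothing useful.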
## Vercel AI SDK
The Vercel AI SDK is a natural fit for TypeScript-first applications, especially with Next.js. It provides streaming, tool use, and multi-step agent patterns behind a clean, modern API.
```typescript
// Vercel AI SDK - streaming with tools
import { streamText, tool } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

// In a Next.js API route
export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: anthropic("claude-sonnet-4-20250514"),
    messages,
    maxSteps: 5, // Agent mode: keep going until a text answer or 5 steps
    tools: {
      searchKnowledge: tool({
        description: "Search the knowledge base for information",
        parameters: z.object({
          query: z.string().describe("Search query"),
        }),
        execute: async ({ query }) => {
          // Retrieve from your vector store (client setup not shown)
          const results = await vectorStore.search(query, 5);
          return results.map((r) => r.content).join("\n");
        },
      }),
    },
    onFinish: async ({ text, toolCalls, toolResults }) => {
      // Log usage, save to database, etc.
      console.log("Completed with", toolCalls.length, "tool calls");
    },
  });

  return result.toDataStreamResponse();
}

// Client-side with useChat hook
// "use client";
// import { useChat } from "ai/react";
// const { messages, input, handleSubmit } = useChat();
```
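`maxSteps` turns a single generation into a loop: when the model requests a tool, the SDK executes it, appends the result to the conversation, and calls the model again, repeating until the model answers in plain text or the step cap is hit. A stubbed sketch of that loop (the fake model and tool here are illustrative, not the SDK's internals):

```python
# Sketch of the multi-step tool loop behind maxSteps: call the model,
# execute any requested tool, feed the result back, repeat until the
# model returns text or the step cap is hit. Model and tool are stubs.

def fake_model(messages: list[dict]) -> dict:
    # Asks for a tool on the first turn, answers on the second
    if not any(m["role"] == "tool" for m in messages):
        return {"type": "tool_call", "tool": "searchKnowledge",
                "args": {"query": "rate limits"}}
    return {"type": "text", "text": "The rate limit is 60 req/min."}

def search_knowledge(query: str) -> str:
    return f"kb results for '{query}'"

TOOLS = {"searchKnowledge": search_knowledge}

def run_agent(messages: list[dict], max_steps: int = 5) -> str:
    for _ in range(max_steps):
        reply = fake_model(messages)
        if reply["type"] == "text":
            return reply["text"]
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "(step limit reached)"

print(run_agent([{"role": "user", "content": "What are the rate limits?"}]))
```

The step cap matters: without it, a model that keeps requesting tools would loop indefinitely.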
## When to Use Each Framework
### Decision Guide
- Use LangChain when you need maximum flexibility, many integrations, or are building complex chains
- Use LangGraph when building stateful agents with complex control flow and persistence
- Use LlamaIndex when RAG is your primary use case and you want the fastest path to production retrieval
- Use Vercel AI SDK when building Next.js/React apps with streaming UI and tool use
- Use the raw provider API when your use case is simple or you need maximum control and minimal dependencies
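For the raw-API path, the whole "framework" is a request payload and one HTTP call. A sketch of the Anthropic Messages API request shape — the payload only; the actual `anthropic` SDK call is commented out so this runs without an API key:

```python
# Sketch of the raw-API path: no framework, just a request payload.
# The actual network call (via the `anthropic` SDK) is commented out.

payload = {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": "You are a helpful assistant.",
    "messages": [
        {"role": "user", "content": "What is the ReAct pattern?"},
    ],
}

# import anthropic
# client = anthropic.Anthropic()
# response = client.messages.create(**payload)
# print(response.content[0].text)

print(payload["model"], len(payload["messages"]))
```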
### Framework Selection Tips
- Don't over-abstract: If you only need 1-2 LLM calls, use the raw API instead of a framework
- Frameworks evolve fast: LangChain v0.3+ and LlamaIndex v0.11+ have significantly different APIs from earlier versions
- Mix and match: You can use LlamaIndex for indexing and LangChain for agents — they're compatible
- Watch the dependency tree: Frameworks bring many transitive dependencies — audit for security and size
## Summary
LLM orchestration frameworks accelerate development by providing battle-tested abstractions for common patterns. LangChain offers breadth, LlamaIndex offers depth in RAG, the Vercel AI SDK offers the best TypeScript DX, and Semantic Kernel serves the .NET ecosystem. Choose based on your primary use case, language preference, and the complexity of your application. When in doubt, start with the raw API and adopt a framework only when the abstractions provide clear value.