# LangChain & CrewAI
Paygent ships first-party callbacks for LangChain and CrewAI. They plug into the framework's native callback system and meter every LLM call the framework makes.
But — and this is important — you don't always need them.
## When to use framework callbacks vs. auto-instrumentation

LangChain and CrewAI both use the OpenAI / Anthropic SDKs internally. So when Paygent monkey-patches `openai.chat.completions.create`, that patch fires whether the call comes from your code or from inside a LangChain `ChatOpenAI`. Auto-instrumentation works out of the box.
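To see why this works, here's a toy sketch (illustrative only, nothing Paygent-specific; the `FakeCompletions` class and `metered_create` wrapper are invented for this demo) of how a patch installed at the SDK level catches framework-internal calls:

```python
# Toy illustration: a patch installed on the SDK object fires for every
# caller, including framework internals -- not just your own code.
calls = []

class FakeCompletions:
    def create(self, **kwargs):
        return {"usage": {"total_tokens": 42}}

completions = FakeCompletions()
_original_create = completions.create

def metered_create(**kwargs):
    response = _original_create(**kwargs)
    calls.append(response["usage"]["total_tokens"])  # metering side effect
    return response

completions.create = metered_create  # the "monkey-patch"

def framework_internal_call():
    # A framework like LangChain calls the same patched SDK object,
    # so its calls are metered too -- no framework integration needed.
    return completions.create(model="fake-model")

completions.create(model="fake-model")  # your direct call: metered
framework_internal_call()               # framework's call: also metered
print(calls)  # -> [42, 42]
```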
You only need the framework callback if:

- You want framework-specific metadata (which agent, which task, which chain step) attached to events.
- You're using a custom LLM client that LangChain understands but Paygent's patcher doesn't.
- You disabled auto-instrumentation (`auto_instrument=False`) and want metering only via the callback.
For most apps, auto-instrumentation alone is enough. Try that first. If you find an LLM call sneaking through unmetered, add the framework callback as a backstop.
## Don't double-count

If you use auto-instrumentation and the framework callback simultaneously, you'd meter the same call twice. Paygent's framework callbacks detect this case and skip themselves when the patcher is active and a `paygent_context` is set. You can stack them safely: the callback only fires when the patcher won't.
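The rule above can be modeled as a tiny truth table. This is an illustrative sketch, not Paygent's source; `callback_should_meter` is a made-up name for the decision the callback makes internally:

```python
# Toy model of the per-call de-dup rule.

def callback_should_meter(sdk_patched: bool, context_set: bool) -> bool:
    """The framework callback meters a call only when the patcher won't.
    The patcher meters iff the SDK is patched AND a paygent_context is set."""
    patcher_will_meter = sdk_patched and context_set
    return not patcher_will_meter

# Patched + context set: the patcher handles it, so the callback skips.
assert callback_should_meter(sdk_patched=True, context_set=True) is False
# Any other combination: the callback meters, so no call goes uncounted.
assert callback_should_meter(sdk_patched=True, context_set=False) is True
assert callback_should_meter(sdk_patched=False, context_set=True) is True
```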
## LangChain integration

### Install

```bash
pip install "paygent[langchain]"
```

This pulls in `langchain-core`. If you're already using a heavier extra (`langchain`, `langchain-openai`), you don't need the `[langchain]` extra; the callback only depends on `langchain-core`.
### Basic usage with ChatOpenAI

```python
from langchain_openai import ChatOpenAI
from paygent import Paygent
from paygent.integrations import LangChainCallback

pg = Paygent.init(api_key="pg_live_...")
llm = ChatOpenAI(model="gpt-4o-mini")
cb = LangChainCallback(pg, user_id="user_123")

response = llm.invoke(
    "Summarize the French Revolution in two sentences.",
    config={"callbacks": [cb]},
)
print(response.content)
```
The `LangChainCallback` reads the response's `llm_output["token_usage"]` (or per-generation `usage_metadata` in newer LangChain), builds a `UsageEvent`, and pushes it to Paygent's queue.
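The extraction logic might look roughly like this. This is an illustrative sketch, not Paygent's actual source; `extract_usage` and the returned dict shape are invented here, but the two lookup locations match what LangChain has historically reported:

```python
# Sketch: pull token counts from the two places LangChain reports usage --
# llm_output["token_usage"] on the result, or per-generation usage_metadata.

def extract_usage(llm_output, generations):
    if llm_output and "token_usage" in llm_output:
        u = llm_output["token_usage"]
        return {"input": u.get("prompt_tokens", 0),
                "output": u.get("completion_tokens", 0)}
    # Newer LangChain: sum usage_metadata across generations.
    totals = {"input": 0, "output": 0}
    found = False
    for gen in generations:
        meta = gen.get("usage_metadata")
        if meta:
            found = True
            totals["input"] += meta.get("input_tokens", 0)
            totals["output"] += meta.get("output_tokens", 0)
    return totals if found else None

# Older-style payload:
assert extract_usage(
    {"token_usage": {"prompt_tokens": 12, "completion_tokens": 30}}, []
) == {"input": 12, "output": 30}
# Newer-style payload:
assert extract_usage(
    None, [{"usage_metadata": {"input_tokens": 5, "output_tokens": 7}}]
) == {"input": 5, "output": 7}
```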
### Usage with chains

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
])
chain = prompt | llm | StrOutputParser()

cb = LangChainCallback(pg, user_id="user_123")
result = chain.invoke({"input": "Explain monads"}, config={"callbacks": [cb]})
```
### Usage with agents

```python
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool

@tool
def search(q: str) -> str:
    """Search the web for information."""
    return f"Results for '{q}'"

# Tools agents need an agent_scratchpad placeholder in their prompt.
agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_openai_tools_agent(llm, [search], agent_prompt)
executor = AgentExecutor(agent=agent, tools=[search])
cb = LangChainCallback(pg, user_id="user_123")

result = executor.invoke(
    {"input": "What's new in AI today?"},
    config={"callbacks": [cb]},
)
```
### Auto-instrument + LangChain callback together

If you want to be safe, keep both: auto-instrumentation catches the LLM calls, and the callback catches anything the patcher might miss. The callback's de-dup logic handles the overlap:
```python
pg = Paygent.init(api_key="...")  # auto_instrument=True (default)
cb = LangChainCallback(pg, user_id="user_123")

with paygent_context(user_id="user_123"):  # patcher will meter
    result = chain.invoke({"input": "..."}, config={"callbacks": [cb]})
    # Patcher meters the call; callback detects this and skips.

# No paygent_context → patcher won't meter; callback DOES meter.
result = chain.invoke({"input": "..."}, config={"callbacks": [cb]})
```
The decision is per-call: if `paygent_context` is set AND the SDK is instrumented, the callback skips. Otherwise, the callback meters.
### Complete LangChain example

```python
"""A LangChain chain metered with Paygent."""
import os

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

from paygent import Paygent, PaygentLimitExceeded
from paygent.integrations import LangChainCallback

pg = Paygent.init(api_key=os.environ["PAYGENT_API_KEY"])
pg.on_soft_gate(lambda r: print(f"⚠ {r.message}"))

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.2)
prompt = ChatPromptTemplate.from_messages([
    ("system", "You write concise summaries in 2 sentences."),
    ("user", "{topic}"),
])
chain = prompt | llm | StrOutputParser()

user_id = "user_123"
cb = LangChainCallback(pg, user_id=user_id, metadata={"feature": "summarize"})

try:
    for topic in ["Quantum entanglement", "Mitochondria", "Async/await"]:
        result = chain.invoke({"topic": topic}, config={"callbacks": [cb]})
        print(f"\n{topic}:\n  {result}")
except PaygentLimitExceeded as e:
    print(f"\nBlocked: {e.guard_result.message}")

pg.flush()
usage = pg.get_usage(user_id)
print(f"\nTotal: ${usage.period_cost:.4f}, {usage.period_tokens_total} tokens")
pg.shutdown()
```
## CrewAI integration

### Install

```bash
pip install "paygent[crewai]"
```
### Basic usage

CrewAI exposes a `step_callback` on `Crew`. Paygent's `CrewAICallback` is a callable, so you can pass it directly.
```python
from crewai import Agent, Crew, Task
from paygent import Paygent
from paygent.integrations import CrewAICallback

pg = Paygent.init(api_key="pg_live_...")
cb = CrewAICallback(pg, user_id="user_123")

researcher = Agent(
    role="Researcher",
    goal="Find recent news on a topic",
    backstory="You are a meticulous researcher.",
)
task = Task(
    description="Find 3 facts about quantum computing.",
    expected_output="A bullet list of 3 facts.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task], step_callback=cb)
result = crew.kickoff()
print(result)
```
The callback runs after every agent step. It tries to extract token counts from the step output's `token_usage`, `usage_metrics`, or `tokens` attribute, plus any nested `result.usage` dict. If a step isn't an LLM call (no tokens captured, no model), the callback skips it.
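The probing order might look roughly like this. This is an illustrative sketch, not Paygent's actual source; `extract_step_tokens` is a made-up name, and the attribute names are the ones listed above:

```python
import types

# Sketch: probe a CrewAI step output for token counts under several
# possible attribute names, then fall back to a nested result.usage dict.

def extract_step_tokens(step_output):
    for attr in ("token_usage", "usage_metrics", "tokens"):
        value = getattr(step_output, attr, None)
        if isinstance(value, int):
            return value
        if isinstance(value, dict) and "total_tokens" in value:
            return value["total_tokens"]
    result = getattr(step_output, "result", None)
    usage = getattr(result, "usage", None)
    if isinstance(usage, dict) and "total_tokens" in usage:
        return usage["total_tokens"]
    return None  # not an LLM step; the callback skips it

llm_step = types.SimpleNamespace(token_usage={"total_tokens": 99})
tool_step = types.SimpleNamespace(result="no usage here")
assert extract_step_tokens(llm_step) == 99
assert extract_step_tokens(tool_step) is None
```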
### Multi-agent crew

```python
researcher = Agent(role="Researcher", goal="Gather facts", backstory="...")
writer = Agent(role="Writer", goal="Produce a summary", backstory="...")

research_task = Task(
    description="Find 5 facts about Mars.",
    expected_output="bullet list",
    agent=researcher,
)
writing_task = Task(
    description="Write a 3-sentence summary using the research.",
    expected_output="paragraph",
    agent=writer,
)

cb = CrewAICallback(pg, user_id="user_123")
crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    step_callback=cb,
)
result = crew.kickoff()
```
### Auto-instrument + CrewAI callback together

Same de-dup story as LangChain. CrewAI uses `litellm` or the OpenAI SDK under the hood, so the patcher catches the calls. The callback only fires when the patcher won't (i.e. no `paygent_context` set).

In CrewAI, the typical pattern is callback only, no `paygent_context`. The callback's `user_id` does the attribution.
```python
pg = Paygent.init(api_key="...", auto_instrument=False)  # callback-only mode
cb = CrewAICallback(pg, user_id="user_123")
crew = Crew(..., step_callback=cb)
crew.kickoff()
```
If you do want auto-instrumentation as a backstop, leave `auto_instrument=True` and just use the callback. The de-dup logic handles it.
### Complete CrewAI example

```python
"""A CrewAI crew metered with Paygent."""
import os

from crewai import Agent, Crew, Task

from paygent import Paygent
from paygent.integrations import CrewAICallback

pg = Paygent.init(api_key=os.environ["PAYGENT_API_KEY"])
pg.on_usage(lambda e: print(f"  metered {e.total_tokens} {e.model} tokens"))

user_id = "user_123"
cb = CrewAICallback(pg, user_id=user_id, metadata={"crew": "research"})

researcher = Agent(
    role="Researcher",
    goal="Find current information",
    backstory="You're a thorough researcher.",
    verbose=False,
)
task = Task(
    description="Name 3 recent breakthroughs in renewable energy.",
    expected_output="A bullet list of 3 items.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[task], step_callback=cb)
result = crew.kickoff()
print("\nResult:", result)

pg.flush()
print(f"Spend: ${pg.get_usage(user_id).period_cost:.4f}")
pg.shutdown()
```
## Other frameworks
Anything that uses the OpenAI / Anthropic Python SDK underneath works with auto-instrumentation out of the box. No special integration needed.
That includes:
- LlamaIndex, which uses `openai` directly under the hood
- AutoGen, which uses `openai` directly
- DSPy, which uses `openai` directly
- Pydantic AI, which uses provider SDKs directly
- Bare `openai` / `anthropic` SDK calls
- Anything you wrote yourself
For these, just wrap your call site in `paygent_context`:
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from paygent import Paygent, paygent_context

pg = Paygent.init(api_key="pg_live_...")

documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

with paygent_context(user_id="user_123"):
    # LlamaIndex makes OpenAI calls underneath; the patcher catches them.
    response = index.as_query_engine().query("What's in the docs?")
```
If a framework wraps the LLM call in a way that hides it from the patcher (e.g. uses a custom HTTP client that doesn't go through `openai.chat.completions.create`), you'll see zero events. In that case, fall back to:
- The framework's own callback system if it has one (write a custom Paygent adapter; `LangChainCallback` is ~100 lines and a good template)
- `pg.wrap()` / `pg.awrap()` at the call site
- Manually constructing `UsageEvent` and calling `pg._event_sink(event)` (advanced)
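A custom adapter's overall shape might look like the sketch below. Everything here is a hypothetical stand-in: `StepEvent` and `make_step_adapter` are invented for illustration, and a real adapter would instead build Paygent's `UsageEvent` and push it via `pg._event_sink`, mirroring what `LangChainCallback` does.

```python
import types
from dataclasses import dataclass
from typing import Callable

@dataclass
class StepEvent:  # stand-in for Paygent's UsageEvent
    user_id: str
    input_tokens: int
    output_tokens: int

def make_step_adapter(sink: Callable[[StepEvent], None], user_id: str):
    """Return a callable suitable for a framework's per-step hook."""
    def on_step(step_output) -> None:
        usage = getattr(step_output, "token_usage", None)
        if not usage:
            return  # not an LLM step: nothing to meter
        sink(StepEvent(
            user_id=user_id,
            input_tokens=usage.get("prompt_tokens", 0),
            output_tokens=usage.get("completion_tokens", 0),
        ))
    return on_step

# Exercise the adapter with fake step outputs:
events = []
adapter = make_step_adapter(events.append, user_id="user_123")
adapter(types.SimpleNamespace(token_usage={"prompt_tokens": 10, "completion_tokens": 4}))
adapter(types.SimpleNamespace(tool_output="not an LLM step"))
assert len(events) == 1 and events[0].input_tokens == 10
```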
## Next steps

- Streaming: token capture for streamed responses
- Callbacks & Events: what the callbacks fire (same event model across all integrations)
- SDK Reference: full callback class signatures