LLM Monitoring

Sentry auto-generates LLM Monitoring data for common providers in Python, but you may need to manually annotate spans for other frameworks.

Span conventions

Span Operations

| Span OP | Description |
| --- | --- |
| ai.pipeline.* | The top-level span, which corresponds to one or more AI operations and helper functions |
| ai.run.* | A unit of work - a tool call, LLM execution, or helper method |
| ai.chat_completions.* | An LLM chat operation |
| ai.embeddings.* | An LLM embedding creation operation |
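As a rough illustration of how the wildcard patterns above might be read, the following sketch maps a concrete span op (such as `ai.chat_completions.openai`) to its pattern. The helper name `classify_ai_op` and the `AI_SPAN_OPS` mapping are hypothetical and not part of any SDK. Treating the bare `ai.pipeline` op as a match for `ai.pipeline.*` is an assumption based on the example spans shown later on this page.

```python
from fnmatch import fnmatchcase

# Op patterns copied from the table above; the descriptions are paraphrased.
AI_SPAN_OPS = {
    "ai.pipeline.*": "top-level pipeline span",
    "ai.run.*": "unit of work (tool call, LLM execution, helper)",
    "ai.chat_completions.*": "LLM chat operation",
    "ai.embeddings.*": "LLM embedding creation",
}


def classify_ai_op(op):
    """Return the matching op pattern, or None if op is not an AI span op."""
    for pattern in AI_SPAN_OPS:
        # Assumption: "ai.pipeline.*" should also match the bare "ai.pipeline" op.
        if op == pattern[:-2] or fnmatchcase(op, pattern):
            return pattern
    return None
```

A check like this could be useful when validating manually annotated spans before they are sent.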

Span Data

| Attribute | Type | Description | Examples | Notes |
| --- | --- | --- | --- | --- |
| ai.input_messages | string | The input messages sent to the model | [{"role": "user", "message": "hello"}] | |
| ai.completion_tokens.used | int | The number of tokens used to respond to the message | 10 | required for cost calculation |
| ai.prompt_tokens.used | int | The number of tokens used to process just the prompt | 20 | required for cost calculation |
| ai.total_tokens.used | int | The total number of tokens used to process the prompt | 30 | required for charts and cost calculation |
| ai.model_id | string | The vendor-specific ID of the model used | "gpt-4" | required for cost calculation |
| ai.streaming | boolean | Whether the request was streamed back | true | |
| ai.responses | list | The response messages sent back by the AI model | ["hello", "world"] | |
| ai.pipeline.name | string | The description of the parent ai.pipeline span | My AI pipeline | required for charts |
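For frameworks that need manual annotation, the attributes above could be attached with a small helper along these lines. This is a minimal sketch: `annotate_llm_span` and `FakeSpan` are hypothetical names, and the helper only assumes the span object exposes a `set_data(key, value)` method, as `sentry_sdk` spans do. Computing the total as prompt plus completion tokens is also an assumption, chosen to match the 20 + 10 = 30 example values.

```python
def annotate_llm_span(span, model_id, prompt_tokens, completion_tokens,
                      pipeline_name=None, streaming=False):
    """Attach the model and token attributes listed above to a span.

    `span` is any object exposing set_data(key, value).
    """
    span.set_data("ai.model_id", model_id)
    span.set_data("ai.prompt_tokens.used", prompt_tokens)
    span.set_data("ai.completion_tokens.used", completion_tokens)
    # Assumption: total = prompt + completion, as in the 20 + 10 = 30 examples.
    span.set_data("ai.total_tokens.used", prompt_tokens + completion_tokens)
    span.set_data("ai.streaming", streaming)
    if pipeline_name is not None:
        span.set_data("ai.pipeline.name", pipeline_name)


class FakeSpan:
    """Stand-in for a real span, used only to show the resulting data."""

    def __init__(self):
        self.data = {}

    def set_data(self, key, value):
        self.data[key] = value


span = FakeSpan()
annotate_llm_span(span, "gpt-4", 20, 10, pipeline_name="My AI pipeline")
```

In real code the `FakeSpan` would be replaced by the span returned from the SDK's span-creation API.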

Instrumentation

When a user creates a new AI pipeline, the SDK automatically creates spans that instrument both the pipeline and its AI operations.

Example

import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track
from openai import OpenAI

# Initialize the SDK before any instrumented code runs.
sentry_sdk.init(...)

openai = OpenAI()

@ai_track(description="My AI pipeline")
def invoke_pipeline():
    result = openai.chat.completions.create(
        model="some-model", messages=[{"role": "system", "content": "hello"}]
    ).choices[0].message.content

    return openai.chat.completions.create(
        model="some-model", messages=[{"role": "system", "content": result}]
    ).choices[0].message.content


This should result in the following spans.

<span op:"ai.pipeline" description:"My AI pipeline">
	<span op:"ai.chat_completions.openai" description:"OpenAI Chat Completion" data[ai.total_tokens.used]:15 data[ai.pipeline.name]:"My AI pipeline" />
	<span op:"ai.chat_completions.openai" description:"OpenAI Chat Completion" data[ai.total_tokens.used]:20 data[ai.pipeline.name]:"My AI pipeline" />
</span>

Notice that the ai.pipeline.name data attribute of each child span matches the description of the parent ai.pipeline.* span.
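One way such propagation could work is with a context variable that the pipeline decorator sets and child spans read. The sketch below uses only the standard library and is not the SDK's actual implementation. `track_pipeline`, `child_span_data`, and `_pipeline_name` are hypothetical names introduced for illustration.

```python
import contextvars
import functools

# Hypothetical context variable holding the name of the active pipeline.
_pipeline_name = contextvars.ContextVar("pipeline_name", default=None)


def track_pipeline(description):
    """Hypothetical decorator: marks the wrapped call as an AI pipeline."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            token = _pipeline_name.set(description)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore the previous value so later calls are unaffected.
                _pipeline_name.reset(token)
        return wrapper
    return decorator


def child_span_data(total_tokens):
    """Build the data dict a child span would carry inside a pipeline."""
    data = {"ai.total_tokens.used": total_tokens}
    name = _pipeline_name.get()
    if name is not None:
        data["ai.pipeline.name"] = name
    return data


@track_pipeline("My AI pipeline")
def run_pipeline():
    # Two child operations, mirroring the two chat completion spans above.
    return [child_span_data(15), child_span_data(20)]
```

Calling `run_pipeline()` yields two data dicts that both carry the pipeline name, while `child_span_data` called outside a pipeline omits the attribute.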
