langsmith-cli

Alpha — This CLI is under active development. Commands, flags, and output schemas may change between releases. Feedback and bug reports welcome via GitHub Issues.

An agent-first CLI for querying and managing LangSmith resources.

Built for AI coding agents (deepagents, Claude Code, Cursor, etc.) and developers who need fast, scriptable access to projects, traces, runs, datasets, evaluators, experiments, and threads.

Installation

Install script (recommended)

curl -fsSL https://cli.langsmith.com/install.sh | sh

GitHub releases

Download the latest binary for your platform from GitHub Releases.

Authentication

Set your API key as an environment variable:

export LANGSMITH_API_KEY="lsv2_pt_..."

Optionally set defaults:

export LANGSMITH_ENDPOINT="https://api.smith.langchain.com"  # Change for self-hosted deployments
export LANGSMITH_PROJECT="my-default-project"                 # Default project for queries

Or pass them as flags:

langsmith --api-key lsv2_pt_... trace list --project my-app
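
When the CLI is launched from another program, the same settings can be supplied through the child process environment. The following is a minimal Python sketch that assumes only the three variable names shown above; `build_env` is a hypothetical helper for illustration and is not part of the CLI.

```python
import os


def build_env(api_key, endpoint=None, project=None, base=None):
    """Return a copy of the environment with LangSmith settings applied.

    `base` defaults to the current process environment; it is a parameter
    only so the helper can be exercised without touching os.environ.
    """
    if not api_key:
        raise ValueError("LANGSMITH_API_KEY must not be empty")
    env = dict(os.environ if base is None else base)
    env["LANGSMITH_API_KEY"] = api_key
    # Optional settings are written only when a value is given.
    if endpoint is not None:
        env["LANGSMITH_ENDPOINT"] = endpoint
    if project is not None:
        env["LANGSMITH_PROJECT"] = project
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking `langsmith`.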

Quick Start

# List tracing projects
langsmith project list

# List recent traces in a project
langsmith trace list --project my-app --limit 5

# Get a specific trace with full detail
langsmith trace get <trace-id> --project my-app --full

# List LLM calls with token counts
langsmith run list --project my-app --run-type llm --include-metadata

# List datasets
langsmith dataset list

# List experiments for a dataset
langsmith experiment list --dataset my-eval-set

Output Formats

All commands default to JSON output for agent consumption:

langsmith trace list --project my-app  # JSON array to stdout

Use --format pretty for human-readable tables and trees:

langsmith --format pretty trace list --project my-app

Write to a file with -o:

langsmith trace list --project my-app -o traces.json
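
Because the default output is JSON, a calling script can decode stdout directly. The sketch below is one possible wrapper in Python, not an official client. It assumes that stdout contains nothing but the JSON document and that failures produce a non-zero exit code; neither is stated in this README. `run_json` and its `runner` parameter are hypothetical names.

```python
import json
import subprocess


def run_json(args, runner=subprocess.run):
    """Run `langsmith <args>` and decode its JSON stdout.

    `runner` is injectable so the function can be tested without the
    binary installed; by default it is subprocess.run.
    """
    result = runner(
        ["langsmith", *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For example, `run_json(["trace", "list", "--project", "my-app"])` would return the decoded array from the first command above.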

Command Reference

`project` — List tracing projects

A tracing project (session) is a namespace that groups related traces. This command lists only tracing projects, not experiments; use `langsmith experiment list` for those.

# List tracing projects (default limit: 20)
langsmith project list
langsmith project list --limit 50

# Filter by name
langsmith project list --name-contains chatbot

# Human-readable table
langsmith --format pretty project list

`trace` — Query and export traces

A trace is a tree of runs representing one end-to-end invocation of your application.
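
To make the tree structure concrete, here is a small Python sketch that nests a flat list of runs under their parents. The field names `id`, `parent_run_id`, and `name` are assumptions about the JSON output, which this README does not specify and which may change between releases.

```python
from collections import defaultdict


def build_tree(runs):
    """Nest a flat list of run dicts under their parents."""
    children = defaultdict(list)
    for run in runs:
        # Runs without a parent are grouped under the key None.
        children[run.get("parent_run_id")].append(run)

    def attach(run):
        return {
            "name": run["name"],
            "children": [attach(child) for child in children[run["id"]]],
        }

    # The roots of the tree are the runs that have no parent.
    return [attach(root) for root in children[None]]
```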

# List recent traces (default limit: 20)
langsmith trace list --project my-app
langsmith trace list --project my-app --limit 50 --last-n-minutes 60

# Filter traces
langsmith trace list --project my-app --error           # Only errors
langsmith trace list --project my-app --min-latency 5   # Slow traces (>5s)
langsmith trace list --project my-app --tags production  # By tag
langsmith trace list --project my-app --name "agent"     # By name

# Include additional fields
langsmith trace list --project my-app --include-metadata   # + status, duration, tokens, costs
langsmith trace list --project my-app --include-io         # + inputs, outputs, error
langsmith trace list --project my-app --include-feedback   # + feedback_stats
langsmith trace list --project my-app --full               # All fields (metadata + io + feedback)

# Show trace hierarchy (fetches full run tree for each trace)
langsmith trace list --project my-app --show-hierarchy --limit 3

# Get a specific trace
langsmith trace get <trace-id> --project my-app --full

# Export traces to JSONL files (one per trace)
langsmith trace export ./traces --project my-app --limit 20 --full

# Custom filename pattern (supports {trace_id} and {name} placeholders)
langsmith trace export ./traces --project my-app --filename-pattern "{name}_{trace_id}.jsonl"
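
The placeholder substitution in `--filename-pattern` can be pictured with Python's own brace formatting. This is only an illustration of the idea: how the CLI actually expands the pattern, and whether it sanitizes names for the filesystem, is not documented here. `expand_pattern` is a hypothetical helper.

```python
from string import Formatter

# The two placeholders this README documents.
ALLOWED = {"trace_id", "name"}


def expand_pattern(pattern, trace_id, name):
    """Fill {trace_id} and {name} in a filename pattern."""
    fields = {
        field for _, field, _, _ in Formatter().parse(pattern) if field is not None
    }
    unknown = fields - ALLOWED
    if unknown:
        raise ValueError(f"unsupported placeholders: {sorted(unknown)}")
    return pattern.format(trace_id=trace_id, name=name)
```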

`run` — Query individual runs

A run is a single step within a trace (LLM call, tool call, chain step, etc.).

# List LLM calls (default limit: 50)
langsmith run list --project my-app --run-type llm
langsmith run list --project my-app --run-type tool --name search

# Find expensive calls
langsmith run list --project my-app --run-type llm --min-tokens 1000 --include-metadata

# Include feedback scores
langsmith run list --project my-app --include-feedback

# Get a specific run
langsmith run get <run-id> --full

# Export to JSONL (default limit: 100)
langsmith run export llm_calls.jsonl --project my-app --run-type llm --full
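
An exported JSONL file holds one JSON object per line, so it can be read back with the standard library. The sketch below also ranks runs by token count; the `total_tokens` field name is an assumption about the export schema, and both function names are hypothetical.

```python
import json


def read_jsonl(path):
    """Yield one decoded object per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def top_by_tokens(runs, n=5, key="total_tokens"):
    """Return the n runs with the highest token count (missing counts as 0)."""
    return sorted(runs, key=lambda run: run.get(key) or 0, reverse=True)[:n]
```

A typical call would be `top_by_tokens(read_jsonl("llm_calls.jsonl"))`.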

`thread` — Query conversation threads

A thread groups multiple root runs that share a `thread_id`, as in a multi-turn conversation.

# List threads (requires --project)
langsmith thread list --project my-chatbot
langsmith thread list --project my-chatbot --last-n-minutes 120

# Get all turns in a thread
langsmith thread get <thread-id> --project my-chatbot --full
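
The grouping that defines a thread can be sketched in a few lines of Python. It assumes each root run is a dict with a top-level `thread_id` and an ISO 8601 `start_time` in a single consistent format; where these values actually appear in the CLI output is not specified in this README.

```python
from collections import defaultdict


def group_by_thread(root_runs):
    """Group root runs by thread_id, ordering each thread's turns by start time."""
    threads = defaultdict(list)
    for run in root_runs:
        threads[run["thread_id"]].append(run)
    # ISO 8601 timestamps in one format and timezone sort correctly as strings.
    for turns in threads.values():
        turns.sort(key=lambda run: run["start_time"])
    return dict(threads)
```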

`dataset` — Manage evaluation datasets

# List datasets
langsmith dataset list
langsmith dataset list --name-contains eval

# Get dataset details
langsmith dataset get my-dataset

# Create and delete
langsmith dataset create --name my-eval-set --description "QA pairs for v2"
langsmith dataset delete my-old-dataset --yes

# Export examples to JSON
langsmith dataset export my-dataset ./data.json --limit 500

# Upload from JSON file
langsmith dataset upload data.json --name new-dataset

`example` — Manage dataset examples

# List examples
langsmith example list --dataset my-dataset
langsmith example list --dataset my-dataset --split test --limit 50

# Paginate through examples
langsmith example list --dataset my-dataset --limit 20 --offset 20
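
A script can walk through all examples by increasing `--offset` in steps of `--limit`. The Python sketch below shows the loop only; `fetch_page` is a hypothetical callable that would run `langsmith example list` with the given limit and offset and return the decoded list. Stopping on a short page is an assumption about how the last page looks.

```python
def iter_examples(fetch_page, page_size=20):
    """Yield every item by requesting consecutive pages."""
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        yield from page
        # A page shorter than page_size (including an empty one) is the last.
        if len(page) < page_size:
            break
        offset += page_size
```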

# Create an example
langsmith example create --dataset my-dataset \
  --inputs '{"question": "What is LangSmith?"}' \
  --outputs '{"answer": "A platform for LLM observability"}'

# Create with metadata and split assignment
langsmith example create --dataset my-dataset \
  --inputs '{"question": "What is tracing?"}' \
  --outputs '{"answer": "Recording LLM application execution"}' \
  --metadata '{"source": "manual", "version": 2}' \
  --split test

# Delete an example
langsmith example delete <example-id> --yes
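
When `example create` is driven from code, serializing the payloads with a JSON library avoids hand-written quoting mistakes. The sketch below builds an argument list from the flags shown above; `example_create_args` is a hypothetical helper, and the list is meant to be passed to `subprocess.run` after the `langsmith` executable name.

```python
import json


def example_create_args(dataset, inputs, outputs, metadata=None, split=None):
    """Build the argument list for `langsmith example create`."""
    args = [
        "example", "create",
        "--dataset", dataset,
        "--inputs", json.dumps(inputs),
        "--outputs", json.dumps(outputs),
    ]
    # Optional flags are appended only when a value is given.
    if metadata is not None:
        args += ["--metadata", json.dumps(metadata)]
    if split is not None:
        args += ["--split", split]
    return args
```

Passing the result as a list, rather than joining it into a shell string, means no shell escaping of the JSON is needed.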

`evaluator` — Manage evaluator rules

# List evaluators
langsmith evaluator list

# Upload an offline evaluator (for experiments)
langsmith evaluator upload evals.py \
  --name accuracy --function check_accuracy --dataset my-eval-set

# Upload an online evaluator (for production monitoring)
langsmith evaluator upload evals.py \
  --name latency-check --function check_latency --project my-app

# Set sampling rate (evaluate a fraction of runs, 0.0-1.0)
langsmith evaluator upload evals.py \
  --name latency-check --function check_latency --project my-app --sampling-rate 0.5

# Replace an existing evaluator
langsmith evaluator upload evals.py \
  --name accuracy --function check_accuracy_v2 --dataset my-eval-set --replace --yes

# Delete an evaluator
langsmith evaluator delete accuracy --yes
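
This README does not show the contents of `evals.py`. As a purely illustrative sketch, a function such as `check_accuracy` might compare a run's output with the reference answer. The `(run, example)` signature, the dict-shaped arguments, and the returned `key`/`score` dict are all assumptions made for this example; check the evaluator documentation for the contract the CLI actually expects.

```python
def check_accuracy(run, example):
    """Score 1.0 when the run's answer matches the reference, else 0.0.

    Assumed shapes (not confirmed by this README):
      run     -> {"outputs": {"answer": str}}
      example -> {"outputs": {"answer": str}}
    """
    predicted = (run.get("outputs") or {}).get("answer")
    expected = (example.get("outputs") or {}).get("answer")
    if predicted is None or expected is None:
        return {"key": "accuracy", "score": 0.0}
    # Compare case-insensitively and ignore surrounding whitespace.
    match = predicted.strip().lower() == expected.strip().lower()
    return {"key": "accuracy", "score": 1.0 if match else 0.0}
```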

`experiment` — Query experiment results

# List experiments
langsmith experiment list
langsmith experiment list --dataset my-eval-set

# Get experiment results (feedback stats, run stats)
langsmith experiment get my-experiment-2024-01-15

Filter Options

Most trace and run commands share these filter options:

| Flag | Description | Example |
| --- | --- | --- |
| `--project` | Project name | `--project my-app` |
| `--limit`, `-n` | Max results | `-n 10` |
| `--last-n-minutes` | Time window | `--last-n-minutes 60` |
| `--since` | After ISO timestamp | `--since 2024-01-15T00:00:00Z` |
| `--error / --no-error` | Error status | `--error` |
| `--name` | Name search (case-insensitive) | `--name ChatOpenAI` |
| `--run-type` | Run type (run commands only) | `--run-type llm` |
| `--min-latency` | Min latency (seconds) | `--min-latency 2.5` |
| `--max-latency` | Max latency (seconds) | `--max-latency 10` |
| `--min-tokens` | Min total tokens | `--min-tokens 1000` |
| `--tags` | Tags (comma-separated, OR logic) | `--tags prod,v2` |
| `--filter` | Raw LangSmith filter DSL | `--filter 'eq(status, "error")'` |
| `--trace-ids` | Specific trace IDs | `--trace-ids abc123,def456` |
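
For programmatic use, the shared filters can be assembled into an argument list. The Python sketch below covers a subset of the flags in the table; `filter_args` is a hypothetical helper, and only flag names documented above are used.

```python
def filter_args(project=None, limit=None, last_n_minutes=None,
                error=None, name=None, min_latency=None, tags=None):
    """Translate keyword filters into CLI flags, skipping unset ones."""
    args = []
    if project is not None:
        args += ["--project", project]
    if limit is not None:
        args += ["--limit", str(limit)]
    if last_n_minutes is not None:
        args += ["--last-n-minutes", str(last_n_minutes)]
    if error is not None:
        # True selects errored items, False selects successful ones.
        args.append("--error" if error else "--no-error")
    if name is not None:
        args += ["--name", name]
    if min_latency is not None:
        args += ["--min-latency", str(min_latency)]
    if tags:
        # Multiple tags are passed as one comma-separated value.
        args += ["--tags", ",".join(tags)]
    return args
```

The result can be appended to a base command such as `["langsmith", "trace", "list"]`.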

Requirements

Go 1.23+
golangci-lint (for linting)

License

MIT
