[Design Proposal] Evaluate Search Options for Docusaurus Documentation #316
Closed
AnoshanJ started this conversation in Design Proposals
Current State (16/2/2026):
Problem
Our Docusaurus-based documentation site (https://wso2.github.io/agent-manager/) requires a reliable, scalable, and maintainable search solution that works well across documentation versions and languages.
Currently, we need to decide:
Which search solution best fits our needs (Algolia, Typesense, or client-side)
What trade-offs exist in cost, maintenance, scalability, and features
Whether we should plan for AI-powered search features
User Stories
As an Agent Developer/Platform Administrator, I want fast, accurate, and context-aware search across the documentation so that I can quickly discover configuration options, usage patterns, and operational guidance - ideally through both keyword search and AI-powered question answering.
As a documentation maintainer, I want a scalable, low-maintenance search solution that continues to deliver high-quality results as the documentation grows across pages, versions, and languages, without introducing additional operational complexity or ongoing maintenance cost.
Existing Solutions
Currently, the documentation site does not have a search capability. Users must manually navigate documentation pages or rely on external search engines.
The following search approaches are commonly used with Docusaurus and are evaluated in this proposal:
1. Algolia DocSearch (Hosted)
Officially supported by Docusaurus
Hosted documentation search solution
Free for public technical documentation sites
2. Client-side / Local Search
Search index generated at build time and loaded in the browser
No external dependencies or service costs
Best suited for small documentation sites
3. Typesense DocSearch (Self-managed or Cloud)
Community-supported integration for Docusaurus
Requires operating or paying for a Typesense cluster
Higher operational complexity compared to hosted options
Proposed Solution
Overview
Evaluate three search approaches for Docusaurus (Algolia DocSearch, Typesense DocSearch, and client-side search), with a focus on Algolia DocSearch because of its scalability, zero-cost tier for documentation sites, official support, and extensibility (including AI features).
Design
Option 1: Algolia DocSearch (Recommended)
Architecture
Algolia-hosted crawler indexes public docs
Algolia-hosted search index
Docusaurus search UI queries Algolia directly
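On the Docusaurus side, this architecture comes down to a small block of theme configuration. The following is a minimal sketch, not the site's actual setup: `YOUR_APP_ID`, `YOUR_SEARCH_API_KEY`, and the index name `agent-manager` are placeholders, `buildAlgoliaConfig` is a hypothetical helper, and the local `AlgoliaSearchConfig` type only exists to keep the snippet self-contained. The option names (`appId`, `apiKey`, `indexName`, `contextualSearch`) are the ones the Docusaurus Algolia integration documents under `themeConfig.algolia`.

```typescript
// Local stand-in for the slice of themeConfig.algolia used here.
interface AlgoliaSearchConfig {
  appId: string;
  apiKey: string; // public, search-only API key (safe to ship to the browser)
  indexName: string;
  contextualSearch: boolean; // restrict results to the current version and language
}

// Hypothetical helper: validates the values issued by Algolia and builds the config.
function buildAlgoliaConfig(
  appId: string,
  searchApiKey: string,
  indexName: string
): AlgoliaSearchConfig {
  if (!appId || !searchApiKey || !indexName) {
    throw new Error("appId, searchApiKey, and indexName are all required");
  }
  return { appId, apiKey: searchApiKey, indexName, contextualSearch: true };
}

// Placeholder values; the real ones come from the DocSearch onboarding.
const algolia = buildAlgoliaConfig("YOUR_APP_ID", "YOUR_SEARCH_API_KEY", "agent-manager");
```

In `docusaurus.config.ts`, the resulting object would sit under `themeConfig.algolia`. Only the search-only key belongs there; admin keys should never be committed.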
Advantages
Official Docusaurus support
No infrastructure to manage
Highly relevant and performant search
Optional Ask AI layer on top of the same index
Ask AI enables conversational search using our documentation as context.
Bring Your Own LLM Key (OpenAI, Anthropic, Mistral, etc.)
No Algolia fee for Ask AI itself; the only cost is LLM usage
Fully optional and can be enabled later
Capabilities
Natural-language Q&A
Context-aware answers
Configurations
Maximum token limit per response
Maximum number of search hits per LLM request
Thread depth limit
Contextual search (version + language filtering)
Free DocSearch tier (DocSearch Program) for developer docs and technical blogs:
DocSearch Program Limits (Free)
First 5,000,000 records
50,000,000 search requests / month
1,000,000 recommend requests / month
5,000,000 crawls / month
Application size - 25 GB
Index size - 25 GB
Record size - 100 kB
Number of indices per application - 20
Queries per second - 3
Maximum number of team members - 10
Beyond these limits, we would need to switch to a commercial plan
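To make the free-tier ceiling concrete, usage can be compared against the limits listed above. This is a sketch under stated assumptions: the `Usage` shape and the `exceededLimits` helper are hypothetical, only three of the listed limits are modelled, and the sample usage figures are invented for illustration rather than measured from the site.

```typescript
// Subset of the DocSearch Program limits quoted above.
interface Usage {
  records: number;
  searchRequestsPerMonth: number;
  crawlsPerMonth: number;
}

const FREE_TIER_LIMITS: Usage = {
  records: 5000000,
  searchRequestsPerMonth: 50000000,
  crawlsPerMonth: 5000000,
};

// Returns the names of the limits that the given usage exceeds.
function exceededLimits(usage: Usage): (keyof Usage)[] {
  const keys = Object.keys(FREE_TIER_LIMITS) as (keyof Usage)[];
  return keys.filter((key) => usage[key] > FREE_TIER_LIMITS[key]);
}

// Illustrative numbers only, not real traffic data.
const sampleUsage: Usage = {
  records: 12000,
  searchRequestsPerMonth: 300000,
  crawlsPerMonth: 40,
};
const overLimit = exceededLimits(sampleUsage); // empty array: within the free tier
```

An empty result means the free tier is sufficient; any returned key names the limit that would force a move to a commercial plan.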
Algolia Commercial Plans
These plans apply if we run our own Algolia application (for example, to index non-documentation pages, remove branding, or build custom search) or if we need to upgrade from the free DocSearch tier.
Option 2: Client-side / Local Search
Architecture
Search index generated at build time
Entire index downloaded to browser
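The two steps above can be sketched with a small inverted index. This illustrates the general technique only; it is not how any particular Docusaurus local-search plugin is implemented, and the `Page` type, function names, and sample pages are all hypothetical.

```typescript
interface Page {
  url: string;
  text: string;
}

// Maps each token to the URLs of the pages that contain it.
type SearchIndex = { [token: string]: string[] };

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// "Build time": create the index that would be serialized and shipped as JSON.
function buildIndex(pages: Page[]): SearchIndex {
  const index: SearchIndex = Object.create(null);
  for (const page of pages) {
    for (const token of tokenize(page.text)) {
      const urls = index[token] || (index[token] = []);
      if (urls.indexOf(page.url) === -1) urls.push(page.url);
    }
  }
  return index;
}

// "Browser": return the pages that contain every token in the query.
function search(index: SearchIndex, query: string): string[] {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  return tokens
    .map((t) => (Object.prototype.hasOwnProperty.call(index, t) ? index[t] : []))
    .reduce((acc, urls) => acc.filter((u) => urls.indexOf(u) !== -1));
}

const pages: Page[] = [
  { url: "/docs/install", text: "Install the agent manager" },
  { url: "/docs/config", text: "Configure the agent runtime" },
];
// Simulate shipping the index to the browser as JSON.
const shipped: SearchIndex = JSON.parse(JSON.stringify(buildIndex(pages)));
const hits = search(shipped, "configure agent");
```

Because the whole index travels to the browser, its size grows with the amount of documentation, which is the root of the trade-offs listed for this option.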
Trade-offs
Only suitable for small docs
Slower initial load
Poor relevance compared to hosted search
No version awareness or AI
Option 3: Typesense DocSearch
Architecture
Typesense crawler indexes docs
Data stored in a Typesense cluster (self-hosted or cloud; the recommended cloud cluster costs $21.60/month)
Community-maintained Docusaurus plugin
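For comparison with the Algolia setup, the Docusaurus side of this option might look like the sketch below. The option names follow the community `docusaurus-theme-search-typesense` theme as commonly documented and should be verified against its README before use. The host, API key, and collection name are placeholders, and the local types are only there to keep the snippet self-contained.

```typescript
// Local stand-ins for the theme's configuration shape (assumed, not imported).
interface TypesenseNode {
  host: string;
  port: number;
  protocol: "http" | "https";
}

interface TypesenseThemeConfig {
  typesenseCollectionName: string;
  typesenseServerConfig: {
    nodes: TypesenseNode[];
    apiKey: string; // search-only key, since it is exposed to the browser
  };
  contextualSearch: boolean;
}

const typesense: TypesenseThemeConfig = {
  typesenseCollectionName: "agent-manager-docs", // placeholder collection name
  typesenseServerConfig: {
    // Placeholder host: a self-hosted node or a Typesense Cloud cluster.
    nodes: [{ host: "typesense.example.com", port: 443, protocol: "https" }],
    apiKey: "YOUR_SEARCH_ONLY_API_KEY",
  },
  contextualSearch: true,
};
```

In `docusaurus.config.ts`, this object would sit under `themeConfig.typesense`, with the theme listed under `themes`. Unlike the Algolia option, the cluster behind `nodes` and the crawler that fills the collection are ours to run or pay for.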
Trade-offs
Milestones