
Search Strategies

How SkillSea finds the right skills for your prompts.

Overview

When you type a prompt, SkillSea searches its Neo4j skill database to find relevant skills to load. Four search strategies are available, each with different trade-offs between speed and accuracy.

You type:
"my container keeps crashing"
SkillSea finds:
k8s-debug, docker-registry, k8s-deploy

Smart (Default)

Speed: ~150ms
Accuracy: High
Requires: Fulltext index

Smart search first tries exact name matching, then falls back to fulltext Lucene search. It extracts key terms from your prompt and matches them against skill names, titles, descriptions, and tags. This is the default because it's fast and works without an embedding model.

Best for: Quick keyword-based lookups when you mention skill names directly (e.g., "help me with git commit" matches the "git-commit" skill).
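The two-step flow above can be sketched in Python. This is a rough illustration only; the function name and skill-record fields are hypothetical, not SkillSea's actual internals:

```python
# Illustrative sketch of the "smart" flow: exact name match first,
# then a keyword fallback. Names and fields here are hypothetical.

def smart_search(prompt, skills):
    prompt_lower = prompt.lower()
    terms = set(prompt_lower.split())

    # Step 1: exact name match ("help me with git commit" -> "git-commit")
    exact = [s for s in skills if s["name"].replace("-", " ") in prompt_lower]
    if exact:
        return exact

    # Step 2: fall back to keyword scoring over names, titles,
    # descriptions, and tags (standing in for the Lucene fulltext index)
    scored = []
    for s in skills:
        haystack = " ".join(
            [s["name"], s["title"], s["description"], *s["tags"]]
        ).lower()
        score = sum(1 for t in terms if t in haystack)
        if score:
            scored.append((score, s))
    return [s for score, s in sorted(scored, key=lambda x: -x[0])]
```

The key property is the early return: when a skill name appears verbatim in the prompt, no fulltext scoring runs at all, which is what keeps the common case fast.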

Semantic

Speed: ~400ms
Accuracy: Very High
Requires: Embedding model + vector index

Semantic search uses AI embeddings to find skills by meaning, not just keywords. Your prompt is converted to a vector and compared against pre-computed skill embeddings using cosine similarity.

The embedding model (all-MiniLM-L6-v2) runs locally inside the SkillSea binary. No data is sent to external APIs.

Best for: Natural language queries where you describe what you want, not what skills exist (e.g., "my database queries are slow" finds database-optimization, postgres-admin, caching-strategies).
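Under the hood, ranking by cosine similarity looks roughly like this. Toy 3-dimensional vectors stand in for the model's real 384-dimensional embeddings, and the skill names are illustrative:

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); 1.0 = same direction, 0.0 = orthogonal
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d vectors stand in for real 384-d all-MiniLM-L6-v2 embeddings.
prompt_vec = [0.9, 0.1, 0.2]            # "my database queries are slow"
skill_vecs = {
    "database-optimization": [0.8, 0.2, 0.1],
    "git-commit": [0.1, 0.9, 0.3],
}

# Rank skills by similarity to the prompt, highest first
ranked = sorted(
    skill_vecs,
    key=lambda k: cosine_similarity(prompt_vec, skill_vecs[k]),
    reverse=True,
)
```

Because skill embeddings are pre-computed, a query costs one prompt embedding plus one similarity pass over the stored vectors, which is where most of the ~400ms goes.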

Text2Cypher

Speed: ~800ms
Accuracy: Very High
Requires: Neo4j + MCP tools

Text2Cypher converts your natural language prompt into a Neo4j Cypher query. Instead of matching against pre-computed embeddings or keyword indexes, the AI generates a graph query that traverses your skill database directly.

This enables complex queries that other strategies can't handle — like finding skills by relationships, filtering by tags, or traversing the graph structure (e.g., "skills related to Docker that also handle networking").

The MCP server exposes get_neo4j_schema to inspect the graph structure and read_neo4j_cypher to execute generated queries. The AI uses the schema to write accurate Cypher.
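Conceptually, the pipeline is: fetch the schema, have the AI generate Cypher from it, then execute the query. A minimal sketch with the three steps as pluggable functions (the function names are hypothetical; in practice the schema and execution steps go through the MCP tools above):

```python
# Illustrative text2cypher pipeline. The three callables are stand-ins for
# get_neo4j_schema, the AI's query generation, and read_neo4j_cypher.

def text2cypher(prompt, get_schema, generate_cypher, run_cypher):
    schema = get_schema()                    # inspect labels and relationships
    query = generate_cypher(prompt, schema)  # AI writes Cypher from the schema
    return run_cypher(query)                 # execute against Neo4j
```

The schema step is what keeps generated queries accurate: the AI only references labels and relationship types that actually exist in the graph.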

Best for: Complex graph queries, relationship traversal, and multi-criteria filtering (e.g., "find all deployment skills that depend on container orchestration").

# Example: natural language to Cypher
Prompt: "skills for debugging crashed Kubernetes pods"
# AI generates:
MATCH (s:Skill)-[:TAGGED]->(t:Tag)
WHERE t.name IN ['kubernetes', 'debugging', 'pods']
RETURN s ORDER BY s.score DESC LIMIT 10

Fulltext

Speed: ~100ms
Accuracy: Medium
Requires: Fulltext index

Direct Lucene fulltext search against Neo4j. It's the fastest option, but matching is keyword-based only. Available on the Trial tier, it requires no embedding model and works well as a fallback.
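Before hitting the index, a fulltext strategy typically sanitizes the prompt into a Lucene query by escaping Lucene's operator characters and OR-joining the remaining terms. A hypothetical sketch of that preprocessing (not SkillSea's actual code):

```python
import re

# Lucene's special operator characters and boolean operators
LUCENE_SPECIALS = r'[+\-!(){}\[\]^"~*?:\\/]|&&|\|\|'

def build_lucene_query(prompt):
    # Strip operators so user input can't break the query, then OR the
    # remaining terms so a match on any one of them scores the skill.
    cleaned = re.sub(LUCENE_SPECIALS, " ", prompt)
    terms = [t for t in cleaned.lower().split() if len(t) > 2]
    return " OR ".join(terms)
```

OR-joining (rather than AND) is a deliberate trade-off: it favors recall, which suits a fallback strategy where missing a relevant skill is worse than returning an extra one.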

Comparison

Strategy     Speed    Accuracy   License Tier   Use Case
smart        ~150ms   High       Trial+         Default, keyword + name matching
semantic     ~400ms   Very High  Basic+         Natural language, meaning-based
text2cypher  ~800ms   Very High  Pro+           Graph traversal, relationship queries
fulltext     ~100ms   Medium     Trial+         Fastest, keyword only

Switching Strategy

You can change the search strategy via the CLI, an MCP tool, or an environment variable:

Via CLI

# Set strategy during hook sync
skillsea hook sync --strategy semantic

Via MCP Tool

# In Claude Code, the AI can call:
set_search_strategy(strategy="semantic", scope="local")

Via Environment Variable

SKILLSEA_SEARCH_STRATEGY=semantic

Warming Up the Embedding Model

The first semantic search may take a few seconds while the embedding model loads into memory. To avoid this delay, you can pre-load it:

# CLI
skillsea hook status --warmup
# MCP tool
warmup_semantic()
# Server startup flag
skillsea --preload-embeddings

After warmup, subsequent semantic searches are typically under 50ms.
