How SkillSea finds the right skills for your prompts.
When you type a prompt, SkillSea searches its Neo4j skill database to find relevant skills to load. Four search strategies are available, each with different trade-offs between speed and accuracy.
Smart search first tries exact name matching, then falls back to fulltext Lucene search. It extracts key terms from your prompt and matches them against skill names, titles, descriptions, and tags. This is the default because it's fast and works without an embedding model.
Best for: Quick keyword-based lookups when you mention skill names directly (e.g., "help me with git commit" matches the "git-commit" skill).
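The two-pass behavior described above can be sketched in a few lines of Python. This is an illustrative sketch only: the skill data, term extraction, and scoring are simplified stand-ins, not SkillSea's actual implementation.

```python
# Minimal sketch of smart search: exact name matching first,
# then a fulltext-style fallback ranked by term overlap.
SKILLS = {
    "git-commit": "Write well-formed git commit messages",
    "docker-networking": "Configure container networks",
}

def extract_terms(prompt: str) -> list[str]:
    # Naive term extraction: lowercase alphanumeric tokens.
    return "".join(c if c.isalnum() else " " for c in prompt.lower()).split()

def smart_search(prompt: str) -> list[str]:
    terms = extract_terms(prompt)
    # Pass 1: exact name match (e.g. "git commit" -> "git-commit").
    joined = "-".join(terms)
    exact = [name for name in SKILLS
             if name in joined or name.replace("-", " ") in prompt.lower()]
    if exact:
        return exact
    # Pass 2: fulltext fallback, ranked by number of overlapping terms.
    scored = [(sum(t in desc.lower() for t in terms), name)
              for name, desc in SKILLS.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

print(smart_search("help me with git commit"))  # → ['git-commit']
```

The key property is the ordering: an exact name hit short-circuits the search, so mentioning a skill by name stays fast even as the database grows.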
Semantic search uses AI embeddings to find skills by meaning, not just keywords. Your prompt is converted to a vector and compared against pre-computed skill embeddings using cosine similarity.
The embedding model (all-MiniLM-L6-v2) runs locally inside the SkillSea binary. No data is sent to external APIs.
Best for: Natural language queries where you describe what you want, not what skills exist (e.g., "my database queries are slow" finds database-optimization, postgres-admin, caching-strategies).
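At its core, semantic search is a nearest-neighbor lookup over vectors. The sketch below shows the cosine-similarity ranking step with tiny hand-made vectors standing in for real all-MiniLM-L6-v2 embeddings (all values are made up for illustration):

```python
import math

# Stand-ins for pre-computed skill embeddings (real vectors have 384 dims).
SKILL_EMBEDDINGS = {
    "database-optimization": [0.9, 0.1, 0.0],
    "git-commit":            [0.0, 0.2, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def semantic_search(prompt_vec: list[float], top_k: int = 1) -> list[str]:
    # Rank skills by similarity between prompt and skill embeddings.
    ranked = sorted(SKILL_EMBEDDINGS.items(),
                    key=lambda kv: cosine(prompt_vec, kv[1]), reverse=True)
    return [name for name, _ in ranked[:top_k]]

# A prompt like "my database queries are slow" embeds near the first skill:
print(semantic_search([0.8, 0.2, 0.1]))  # → ['database-optimization']
```

Because ranking happens in embedding space, the prompt never needs to share any keywords with the skill it matches.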
Text2Cypher converts your natural language prompt into a Neo4j Cypher query. Instead of matching against pre-computed embeddings or keyword indexes, the AI generates a graph query that traverses your skill database directly.
This enables complex queries that other strategies can't handle — like finding skills by relationships, filtering by tags, or traversing the graph structure (e.g., "skills related to Docker that also handle networking").
The MCP server exposes get_neo4j_schema to inspect the graph structure and read_neo4j_cypher to execute generated queries. The AI uses the schema to write accurate Cypher.
Best for: Complex graph queries, relationship traversal, and multi-criteria filtering (e.g., "find all deployment skills that depend on container orchestration").
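For the Docker-and-networking example above, the generated query might look like the following. The `Skill` and `Tag` labels and the `TAGGED_WITH` relationship type are assumptions about the schema, not documented names; the AI would discover the real ones via `get_neo4j_schema`.

```cypher
// Hypothetical generated query for "skills related to Docker
// that also handle networking". Labels and relationship types
// are illustrative assumptions.
MATCH (s:Skill)-[:TAGGED_WITH]->(:Tag {name: "docker"})
MATCH (s)-[:TAGGED_WITH]->(:Tag {name: "networking"})
RETURN s.name, s.description
```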
Direct Lucene fulltext search against Neo4j. It's the fastest option, but performs keyword-based matching only. It's available on the Trial tier, doesn't require an embedding model, and works well as a fallback.
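Under the hood this maps to Neo4j's built-in fulltext index procedure. The query below is a sketch; the index name `skillIndex` is an assumption, not a documented SkillSea identifier.

```cypher
// Direct Lucene lookup via Neo4j's fulltext index procedure.
// The index name "skillIndex" is an illustrative assumption.
CALL db.index.fulltext.queryNodes("skillIndex", "git commit")
YIELD node, score
RETURN node.name, score
ORDER BY score DESC
```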
| Strategy | Speed | Accuracy | License Tier | Use Case |
|---|---|---|---|---|
| smart | ~150ms | High | Trial+ | Default, keyword + name matching |
| semantic | ~400ms | Very High | Basic+ | Natural language, meaning-based |
| text2cypher | ~800ms | Very High | Pro+ | Graph traversal, relationship queries |
| fulltext | ~100ms | Medium | Trial+ | Fastest, keyword only |
You can change the search strategy when installing the hook via the CLI, or through the MCP tool:
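A hypothetical invocation might look like this; the exact subcommand and flag names may differ from your installed version:

```shell
# Illustrative only: flag names are assumptions, check `skillsea --help`.
skillsea hook install --search-strategy semantic
```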
The first semantic search may take a few seconds while the embedding model loads into memory. To avoid this delay, you can pre-load it:
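A warmup command might look like the following; the exact subcommand is an assumption and may differ in your version:

```shell
# Illustrative only: pre-loads the embedding model so the first
# semantic search doesn't pay the model-loading cost.
skillsea warmup --strategy semantic
```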
After warmup, subsequent semantic searches are typically under 50ms.