But the "Usage Guidelines" part of the "License" section at the end of the README says: "License required for: Commercial embedding in products you sell or offering Mantic as a hosted service."
This is not entirely accurate, since the software appears to be licensed under AGPLv3, which of course allows use of the software for any purpose, even commercial.
Are you sure that's correct? It says it's AGPL, but the explanation given sounds like what you actually want is the LGPL. AGPL is about what happens if you expose a program as a SaaS, and it is generally banned at companies because of its "viral" nature, i.e. a service that used Mantic would need to be fully open sourced even if the code was never distributed.
LGPL is for libraries: you can use an LGPL'd program in proprietary software, but you have to make the source of the LGPL'd program, including your modifications, available if you distribute it. It doesn't infect the rest of the program, and it doesn't have any clauses that trigger in SaaS scenarios.
Your current explanation doesn't jibe with my understanding of the AGPL. For example, you cannot realistically sell a service that incorporates an AGPL'd component, because it would require you to open source the entire service.
Thanks for the correction; you're right that I mixed up LGPL and AGPL there. I haven't updated the license yet, but I plan to adjust it so it better matches the usage model and doesn't create the "everything must be open source" issue you mentioned. Really appreciate you pointing it out. Thanks, Mike!
Interesting idea, but its very strong path-dependence makes me wary about its general usefulness and reliability. E.g., on the project's own codebase, querying "extract exports" I would have expected `src/dependency-graph.ts`, which has an `extractExports` function, to come 1st rather than 7th. (Though out of ~30 total files, that still puts the expected result in the top 25%.) Trying to search for anything on the chromium repo (just "git clone https://chromium.googlesource.com/chromium", no deps/submodules; only ~44k paths in `git ls-files`) returns "Error: Scanner failed to produce scored files. This is a bug."
Hello! Cool tool, I'm going to give it a try on my personal assistant. The vector DB prices look a bit cynical to me, almost unbelievable. Could you break down how you arrived at the cost estimates, both for the competing vector DBs and for Mantic? For example, I use Weaviate at the moment and I don't come close to that cost even over a full year, with a generous amount of usage from multiple users (~60).
You're absolutely right, it wasn't implemented, just documented. Thanks for catching that!
I just shipped v1.0.13 (literally 5 minutes ago) that implements all three environment variables:
• MANTIC_IGNORE_PATTERNS - Custom glob patterns to exclude files
• MANTIC_MAX_FILES - Limit number of files returned
• MANTIC_TIMEOUT - Search timeout in milliseconds
Also fixed the regex bug that was breaking glob pattern matching.
Appreciate you pointing it out; having users actually test the features is way more valuable than my own QA.
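For anyone curious how the three variables above might be consumed, here's a minimal sketch; the `loadConfig`/`ManticConfig` names, the comma-separated pattern format, and the default values are assumptions on my part, not documented behavior:

    // Hypothetical sketch of reading the MANTIC_* variables.
    // Pattern format (comma-separated) and defaults are assumptions.
    interface ManticConfig {
      ignorePatterns: string[]; // extra globs to exclude from scanning
      maxFiles: number;         // cap on the number of files returned
      timeoutMs: number;        // search timeout in milliseconds
    }

    function loadConfig(env: NodeJS.ProcessEnv = process.env): ManticConfig {
      return {
        ignorePatterns: (env.MANTIC_IGNORE_PATTERNS ?? "")
          .split(",")
          .map((p) => p.trim())
          .filter(Boolean),
        maxFiles: Number(env.MANTIC_MAX_FILES) > 0 ? Number(env.MANTIC_MAX_FILES) : 50,
        timeoutMs: Number(env.MANTIC_TIMEOUT) > 0 ? Number(env.MANTIC_TIMEOUT) : 10_000,
      };
    }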
No embeddings at all, neither stored nor on-the-fly.
Instead of converting queries to vectors, Mantic.sh uses structural inference: it ranks files based on path components, folder depth, naming patterns, and git metadata.
So "stripe webhook" matches /services/stripe/webhook.handler.ts highly because the path literally contains both terms in logical positions, not because their embeddings are close in vector space.
"Cognitive" just means it mirrors how developers already think about code organization, it encodes intent into paths, so searching paths directly often beats semantic similarity.
What do you call the agent workflow where you post a hastily vibecoded Show HN, feed glaringly obvious user feedback reports one by one into the LLM, wait for the LLM to course-correct (YOU'RE ABSOLUTELY RIGHT!), then push a coincidentally timed "bugfix" while informing the user that their feedback was addressed by said bugfix?
I think there's a misunderstanding of Mantic.sh's architecture...
The "weights" you're describing in Mantic.sh are not neural network weights. They're deterministic ranking heuristics, similar to what IDE file pickers use: for example, extension weights, path depth penalties, and filename matches. You can see them directly in brain-scorer.ts (EXTENSION_WEIGHTS).
Correct! The key insight isn't the algorithm itself—it's that structural metadata is enough. Traditional tools assume you need semantic understanding (embeddings), but we found that path structure + filename + recency gets you 90% of the way there in <200ms.
The 'custom ranking' is inspired by how expert developers navigate codebases: they don't read every file, they infer from structure. /services/stripe/webhook.handler.ts is obviously the webhook handler—no need to embed it.
The innovation is removing unnecessary work (content reading, chunking, embedding) and proving that simple heuristics are faster and more deterministic.
Actually, I haven't used JetBrains, didn't know they did something similar until now!
This came from a different angle: I read about how the human brain operates on ~20 watts yet processes information incredibly efficiently. That got me thinking about how developers naturally encode semantics into folder structures without realizing it.
The "cognitive" framing is because we're already doing the work of organizing code meaningfully; Mantic.sh just searches that existing structure instead of recreating it as embeddings. It turns out path-based search is just efficient pattern matching, which explains why it's so fast.
Interesting to hear JetBrains converged on a similar approach from the IDE side though!
Thanks for trying it! This sounds like a bug. Mantic.sh supports all languages (Python, Rust, Go, Java, etc.); it's language-agnostic, since it ranks files by path/filename, not content.
A few debugging questions:
- What query did you run? (e.g., mantic "auth logic")
- What's your project structure? (Is it a monorepo, or does it have a non-standard layout?)
- Can you share the output of mantic "your query" --json?
If it's only returning package.json, it likely means:
- The query is too generic (e.g., mantic "project"), OR
- The file scanner isn't finding your source files (possible .gitignore issue)
Tip: Try running git ls-files | wc -l in your project; if that returns 0 or a very small number, Mantic won't have files to search.
Happy to debug further if you can share more details!
Fair point, the README focuses more on benchmarks than implementation. Here's the short version:
1. Use `git ls-files` instead of walking the filesystem (huge speed win - dropped Chromium 59GB scan from 6.6s to 0.46s)
2. Parse each file path into components (folders, filename, extension)
3. Score each file based on how query terms match path components, weighted by position and depth
4. Return top N matches sorted by score
The core insight: /services/stripe/webhook.handler.ts already encodes the semantic relationship between "stripe" and "webhook" through its structure. No need to read file contents or generate embeddings.
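To make the four steps above concrete, here's a minimal sketch of that pipeline; the function names (listFiles, search, scorePath) and the scoring details are illustrative guesses, not Mantic's actual code:

    import { execFileSync } from "node:child_process";

    // Naive stand-in for the path scorer sketched earlier: count matching query terms.
    const scorePath = (path: string, query: string): number =>
      query.toLowerCase().split(/\s+/).filter((t) => path.toLowerCase().includes(t)).length;

    // Step 1: list tracked files via git instead of walking the filesystem.
    function listFiles(repoRoot: string): string[] {
      return execFileSync("git", ["ls-files"], { cwd: repoRoot, encoding: "utf8" })
        .split("\n")
        .filter(Boolean);
    }

    // Steps 2-4: parse/score every tracked path against the query and keep the best N.
    function search(repoRoot: string, query: string, topN = 10): { path: string; score: number }[] {
      return listFiles(repoRoot)
        .map((path) => ({ path, score: scorePath(path, query) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, topN);
    }

    console.log(search(".", "stripe webhook"));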
I should add an architecture doc to the repo, thanks for the nudge.
But the "Usage Guidelines" part of the "License" section at the end of the README says: "License required for: Commercial embedding in products you sell or offering Mantic as a hosted service."
This is not completely true, since it seems that the software is licensed under AGPLv3, which of course allow the use of the software for any purpose, even commercial.
LGPL is for libraries: you can use an LGPLd program in proprietary software, but you have to make the source of the LGPLd program with modifications available if you distribute it. It doesn't infect the rest of the program, and it doesn't have any clauses that trigger for SaaS scenarios.
Your current explanation doesn't jive with my understanding the AGPL. For example, you cannot realistically sell a service that incorporates an AGPLd component because it'd require you to open source the entire service.
• Chromium timeout - Increased the default to 30s (configurable via MANTIC_TIMEOUT). Now completes in ~23s on 481k files.
• Ranking ("extract exports") - Added two-pass scoring: ultra-fast path/filename scoring first, then lightweight regex extraction of function/class names from the top 50 files. Exact match: +200 boost; partial match: +100. Result: dependency-graph.ts (with extractExports) now ranks #1.
Extras from your feedback:
• Added Python (def), Rust (fn), and Go (func) patterns
• Better camel/snake/keyword handling
• New env var: MANTIC_FUNCTION_SCAN_LIMIT
Performance: +100-200ms overhead. Tested on Chromium (23s) and Cal.com (220ms).
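For the curious, here's roughly what a second pass like that could look like; the +200/+100 boosts and the def/fn/func patterns mirror the numbers above, but the code (definitionBoost, DEF_PATTERNS) is a hedged sketch, not the shipped implementation:

    import { readFileSync } from "node:fs";

    // Second pass: for the top-ranked files only, do a lightweight regex scan for
    // function/class names and boost files whose definitions match the query.
    const DEF_PATTERNS = [
      /\bfunction\s+([A-Za-z0-9_]+)/g, // JS/TS
      /\bclass\s+([A-Za-z0-9_]+)/g,
      /\bdef\s+([A-Za-z0-9_]+)/g,      // Python
      /\bfn\s+([A-Za-z0-9_]+)/g,       // Rust
      /\bfunc\s+([A-Za-z0-9_]+)/g,     // Go
    ];

    function definitionBoost(filePath: string, query: string): number {
      const source = readFileSync(filePath, "utf8");
      const queryKey = query.toLowerCase().replace(/[\s_-]/g, ""); // "extract exports" -> "extractexports"
      let boost = 0;
      for (const pattern of DEF_PATTERNS) {
        for (const match of source.matchAll(pattern)) {
          const name = match[1].toLowerCase();
          if (name === queryKey) boost = Math.max(boost, 200);       // exact match
          else if (name.includes(queryKey) || queryKey.includes(name)) boost = Math.max(boost, 100); // partial
        }
      }
      return boost;
    }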
Huge thanks—feedback like yours is gold!
The cost estimates were rough illustrations for high-usage cloud setups (100 devs × 100 searches/day = ~3.65M queries/year):
• Vector embeddings: ~$0.003/query (OpenAI embeddings + a managed DB like Pinecone) → $10,950/yr
• Sourcegraph: older Enterprise rate (~$91/user/mo) → $109k/yr
• Mantic: $0 (local, no APIs/DBs)
You're spot on: these are high-end figures; Weaviate (esp. self-hosted/compressed) can be way cheaper for moderate use like your ~60 users.
I leaned toward worst-case managed pricing to highlight the "no ongoing cost" upside.
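Spelled out, the arithmetic behind those headline numbers is just the following (all prices are the assumptions stated above):

    // Back-of-the-envelope numbers from the estimate above; all prices are assumptions.
    const devs = 100;
    const searchesPerDevPerDay = 100;
    const queriesPerYear = devs * searchesPerDevPerDay * 365;            // 3,650,000

    const embeddingCostPerQuery = 0.003;                                  // USD: managed embeddings + vector DB
    const embeddingCostPerYear = queriesPerYear * embeddingCostPerQuery;  // $10,950

    const sourcegraphPerUserPerMonth = 91;                                // older Enterprise list price
    const sourcegraphPerYear = sourcegraphPerUserPerMonth * devs * 12;    // $109,200 ≈ $109k

    console.log(embeddingCostPerYear, sourcegraphPerYear);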
Let me know how the trial feels!
edit: That's structural or syntax-aware search.
You should benchmark this against other rankers.
    // Extension weights from brain-scorer.ts: deterministic ranking heuristics, not model weights.
    const EXTENSION_WEIGHTS: Record<string, number> = {
      '.ts': 20, '.tsx': 20,
      '.js': 15, '.jsx': 15,
      '.rs': 20, '.go': 20,
      '.py': 15, '.prisma': 15,
      '.graphql': 10, '.css': 5,
      '.json': 5, '.md': 2
    };
There's no LLM involved in the actual search or scoring; it's a static heuristic engine, not a learned model.
I'd love to have the implementation critiqued on its merits!
This sounds a lot like document search on top of your specific attributes, with a custom ranking algorithm.
https://junegunn.github.io/fzf/
For Cursor:
Click the "Install in Cursor" badge at the top of the README, or Use this deep link: https://cursor.com/en-US/install-mcp?name=mantic&config=eyJ0...
For VS Code:
Click the "Install in VS Code" badge, or Use this deep link: https://vscode.dev/redirect/mcp/install?name=mantic&config=%...
Manual Installation: Add this to your MCP settings (e.g., ~/Library/Application Support/Claude/claude_desktop_config.json):
json { "mcpServers": { "mantic": { "type": "stdio", "command": "npx", "args": ["-y", "mantic.sh@latest", "server"] } } }
Once installed, Claude Desktop (or any MCP client) can call the search_codebase tool to find relevant files before making code changes.
The MCP server implementation is in src/mcp-server.ts if you want to see the code.
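If you want to poke at the server outside an editor, something along these lines should work with the official TypeScript MCP SDK; the client setup follows the SDK's usual layout, but the search_codebase argument shape ({ query }) is my assumption, so check src/mcp-server.ts for the real schema:

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Spawn the Mantic MCP server over stdio, the same way the JSON config above does.
    const transport = new StdioClientTransport({
      command: "npx",
      args: ["-y", "mantic.sh@latest", "server"],
    });

    const client = new Client({ name: "mantic-demo", version: "0.0.1" });
    await client.connect(transport);

    // Call the search_codebase tool; the "query" argument name is a guess.
    const result = await client.callTool({
      name: "search_codebase",
      arguments: { query: "stripe webhook" },
    });
    console.log(result);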