solidworks_mcp.agents.vector_rag
FAISS-backed vector RAG index for SolidWorks design knowledge.
Supports local file and URL ingestion (PDF, Markdown, plain text, HTML), embedding via sentence-transformers (all-MiniLM-L6-v2 by default), and FAISS cosine-similarity search.
Usage::

    from solidworks_mcp.agents.vector_rag import VectorRAGIndex

    # Build / update an index
    idx = VectorRAGIndex(namespace="3d-print-design")
    idx.ingest_text("snap-fit-guide.md content …", source="snap-fit-guide.md")
    idx.save()

    # Query later
    idx2 = VectorRAGIndex.load(namespace="3d-print-design")
    hits = idx2.query("snap fit cantilever deflection", top_k=5)
    for hit in hits:
        print(hit["score"], hit.get("text", ""))

Attributes
Classes
VectorRAGIndex
VectorRAGIndex(namespace: str = 'engineering-reference', *, model_name: str = DEFAULT_MODEL, rag_dir: Path | None = None)
FAISS cosine-similarity index for a named namespace of design knowledge.
Files on disk:
- {rag_dir}/{namespace}.faiss – FAISS flat-IP index
- {rag_dir}/{namespace}.meta.json – chunk metadata (source, text, etc.)
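A flat inner-product index returns cosine similarity when every stored vector and the query are scaled to unit length first. The following is a minimal pure-Python sketch of that idea, not the module's code: `normalize`, `inner_product`, and `search` are hypothetical helpers, and the real index delegates this work to FAISS.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def search(index_vectors, query, top_k=5):
    """Brute-force ("flat") search: score every stored vector, best first."""
    q = normalize(query)
    scored = [(inner_product(q, v), i) for i, v in enumerate(index_vectors)]
    scored.sort(reverse=True)
    return scored[:top_k]


# Stored vectors are normalized once, at insertion time.
stored = [normalize(v) for v in ([1.0, 0.0], [1.0, 1.0], [0.0, 2.0])]
results = search(stored, [3.0, 0.0], top_k=2)
```

Because both sides have unit length, the inner product equals the cosine of the angle between them, so vector magnitude does not affect the ranking.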
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `namespace` | `str` | Namespace used to isolate stored data. | `'engineering-reference'` |
| `model_name` | `str` | Embedding model name to use. | `DEFAULT_MODEL` |
| `rag_dir` | `Path \| None` | Directory where RAG assets are stored. | `None` |
Attributes:
| Name | Type | Description |
|---|---|---|
| `_faiss_path` | `Any` | Path of the FAISS index file for this namespace. |
| `_meta_path` | `Any` | Path of the chunk-metadata JSON file for this namespace. |
| `model_name` | `Any` | Name of the embedding model. |
| `namespace` | `Any` | Namespace of the index. |
| `rag_dir` | `Any` | Directory where RAG assets are stored. |
Initialize the vector RAG index.
Returns:
| Name | Type | Description |
|---|---|---|
| `None` | `None` | None. |
Source code in src/solidworks_mcp/agents/vector_rag.py
Attributes
chunk_count property
Number of chunks currently held in the index.
Returns:
| Name | Type | Description |
|---|---|---|
| `int` | `int` | The chunk count. |
index_path property
Path of the index, returned as a string.
Returns:
| Name | Type | Description |
|---|---|---|
| `str` | `str` | The index path. |
Functions
ingest_text
ingest_text(text: str, *, source: str = 'unknown', chunk_size: int = DEFAULT_CHUNK_SIZE, overlap: int = DEFAULT_OVERLAP, tags: list[str] | None = None, deduplicate: bool = True) -> int
Chunk text, embed chunks, add to FAISS index.
Returns the number of new chunks added.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Input text to chunk and embed. | required |
| `source` | `str` | Source label associated with the input content. | `'unknown'` |
| `chunk_size` | `int` | Maximum number of characters to keep in each chunk. | `DEFAULT_CHUNK_SIZE` |
| `overlap` | `int` | Number of overlapping characters between chunks. | `DEFAULT_OVERLAP` |
| `tags` | `list[str] \| None` | Optional tags associated with the input content. | `None` |
| `deduplicate` | `bool` | Whether duplicate content should be skipped. | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `int` | `int` | The number of new chunks added. |
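How `deduplicate` decides that content is a duplicate is not documented here. One common approach is to hash each chunk and skip any hash already seen. The sketch below assumes that approach; `add_chunks`, `seen_hashes`, and `store` are hypothetical names, and the embedding step is left out.

```python
import hashlib


def add_chunks(chunks, seen_hashes, store, deduplicate=True):
    """Append chunks to store, skipping exact repeats; return how many were added."""
    added = 0
    for chunk in chunks:
        digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
        if deduplicate and digest in seen_hashes:
            continue  # identical text was ingested earlier
        seen_hashes.add(digest)
        store.append(chunk)
        added += 1
    return added


seen, store = set(), []
first = add_chunks(["snap fit", "draft angle"], seen, store)
second = add_chunks(["snap fit", "wall thickness"], seen, store)
```

The second call adds only one chunk, which mirrors how the return value of `ingest_text` counts new chunks rather than all chunks in the input.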
load classmethod
load(namespace: str = 'engineering-reference', *, model_name: str = DEFAULT_MODEL, rag_dir: Path | None = None) -> VectorRAGIndex
Load an existing index from disk. Returns an empty index if not found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `namespace` | `str` | Namespace used to isolate stored data. | `'engineering-reference'` |
| `model_name` | `str` | Embedding model name to use. | `DEFAULT_MODEL` |
| `rag_dir` | `Path \| None` | Directory where RAG assets are stored. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `VectorRAGIndex` | `VectorRAGIndex` | The loaded index, or an empty index if none exists on disk. |
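The "empty index if not found" behaviour can be pictured with the metadata file alone. This is a sketch under assumptions: `load_metadata` is a hypothetical helper, and the metadata file is assumed to hold a JSON list of chunk records, which this page does not confirm.

```python
import json
import tempfile
from pathlib import Path


def load_metadata(rag_dir, namespace):
    """Return stored chunk records, or an empty list when no file exists."""
    meta_path = Path(rag_dir) / f"{namespace}.meta.json"
    if not meta_path.exists():
        return []  # nothing saved yet: start from an empty state
    return json.loads(meta_path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    missing = load_metadata(tmp, "3d-print-design")
    record = [{"source": "snap-fit-guide.md", "text": "example chunk"}]
    meta_file = Path(tmp) / "3d-print-design.meta.json"
    meta_file.write_text(json.dumps(record), encoding="utf-8")
    found = load_metadata(tmp, "3d-print-design")
```

Returning an empty result instead of raising lets callers use the same code path for a first run and for later runs.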
query
Semantic search. Returns a list of {score, id, source, text, tags} dicts sorted by cosine similarity, descending.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query_text` | `str` | Query text used to search the index. | required |
| `top_k` | `int` | Maximum number of matches to return. | `5` |
Returns:
| Type | Description |
|---|---|
| `list[dict[str, Any]]` | The matching chunks, best match first. |
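Since each hit is a plain dict, results can be post-processed with ordinary Python. The sketch below keeps only the best hit per source; `best_hit_per_source` is a hypothetical helper, and the `hits` list is hand-written sample data shaped like the documented keys, not real output.

```python
def best_hit_per_source(hits):
    """Keep only the highest-scoring hit for each source, best first."""
    best = {}
    for hit in hits:
        source = hit.get("source", "unknown")
        if source not in best or hit["score"] > best[source]["score"]:
            best[source] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


# Illustrative stand-in for the list returned by VectorRAGIndex.query().
hits = [
    {"score": 0.82, "id": 3, "source": "snap-fit-guide.md", "text": "Cantilever ...", "tags": []},
    {"score": 0.64, "id": 7, "source": "snap-fit-guide.md", "text": "Deflection ...", "tags": []},
    {"score": 0.41, "id": 1, "source": "tolerances.md", "text": "Clearance ...", "tags": []},
]
top = best_hit_per_source(hits)
```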
save
Persist index and metadata to disk.
Returns:
| Name | Type | Description |
|---|---|---|
| `None` | `None` | None. |
_AwaitableQueryResult
Bases: str
String result that can also be awaited for list-style compatibility.
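A `str` subclass becomes awaitable by defining `__await__`. The sketch below shows the general pattern only: `AwaitableText` is a hypothetical class, and it simply yields its own text when awaited. What the real class returns when awaited is not stated on this page.

```python
import asyncio


class AwaitableText(str):
    """A str that can also be awaited; awaiting yields the same text."""

    def __await__(self):
        async def _resolve():
            return str(self)

        # Delegate to a coroutine so `await obj` works in async code.
        return _resolve().__await__()


async def main():
    result = AwaitableText("context")
    return await result


direct = AwaitableText("context")   # usable as an ordinary string
awaited = asyncio.run(main())       # and also from async callers
```

This lets synchronous and asynchronous callers share one return type without a separate async wrapper.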
Functions
_chunk_text
_chunk_text(text: str, chunk_size: int = DEFAULT_CHUNK_SIZE, overlap: int = DEFAULT_OVERLAP) -> list[str]
Split text into chunks of at most chunk_size characters, with overlap characters shared between consecutive chunks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `text` | `str` | Input text to split. | required |
| `chunk_size` | `int` | Maximum number of characters to keep in each chunk. | `DEFAULT_CHUNK_SIZE` |
| `overlap` | `int` | Number of overlapping characters between chunks. | `DEFAULT_OVERLAP` |

Returns:
| Type | Description |
|---|---|
| `list[str]` | The text chunks, in order. |
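Character chunking with overlap can be sketched as a sliding window whose step is `chunk_size - overlap`. This is an illustration, not the module's implementation: `chunk_text` is a hypothetical function, and the defaults of 800 and 100 are placeholders because the values of `DEFAULT_CHUNK_SIZE` and `DEFAULT_OVERLAP` are not given here.

```python
def chunk_text(text, chunk_size=800, overlap=100):
    """Split text into windows of chunk_size chars that share `overlap` chars."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```

Each chunk starts with the last `overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when it is short enough.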
_get_embedding_model
Return a cached SentenceTransformer, loading it on first call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | Embedding model name to use. | `DEFAULT_MODEL` |

Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The cached SentenceTransformer instance. |
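Loading an embedding model is slow, so caching it per model name avoids repeated loads. The sketch below uses `functools.lru_cache` as one way to get that behaviour; the caching mechanism the module actually uses is not shown on this page, and `FakeModel` and `get_model` are hypothetical stand-ins so the example runs without sentence-transformers installed.

```python
from functools import lru_cache

load_calls = []  # records every real "load" so the caching is observable


class FakeModel:
    """Stand-in for an embedding model; only remembers its name."""

    def __init__(self, name):
        self.name = name


@lru_cache(maxsize=None)
def get_model(model_name):
    """Load the model on first request, then reuse the same instance."""
    load_calls.append(model_name)
    return FakeModel(model_name)


a = get_model("all-MiniLM-L6-v2")
b = get_model("all-MiniLM-L6-v2")
```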
_require_faiss
Return the faiss module, raising ImportError if it is not installed.
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The imported faiss module. |

Raises:
| Type | Description |
|---|---|
| `ImportError` | faiss-cpu is required for vector RAG. Install with: pip install faiss-cpu. |
_require_sentence_transformers
Return the sentence-transformers dependency, raising ImportError if it is not installed.
Returns:
| Name | Type | Description |
|---|---|---|
| `Any` | `Any` | The imported sentence-transformers dependency. |

Raises:
| Type | Description |
|---|---|
| `ImportError` | sentence-transformers is required for vector RAG. Install with: pip install sentence-transformers. |
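Both dependency guards follow the usual optional-import pattern: try the import, and turn a failure into an error that names the package to install. The sketch below is a generic version of that pattern; `require_module` is a hypothetical helper, and the module names in the demo are placeholders chosen so the example runs anywhere.

```python
import importlib


def require_module(module_name, pip_name):
    """Import a module by name, or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{pip_name} is required for vector RAG. "
            f"Install with: pip install {pip_name}"
        ) from exc


# A module that exists is returned as-is.
json_module = require_module("json", "json")

# A missing module produces the actionable message.
try:
    require_module("module_that_does_not_exist_xyz", "example-package")
except ImportError as exc:
    message = str(exc)
```

Deferring the import to call time keeps the rest of the package importable on machines without the heavy dependencies.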
build_solidworks_api_docs_index
build_solidworks_api_docs_index(docs_json_path: Path | None = None, *, rag_dir: Path | None = None, namespace: str = SW_API_DOCS_NAMESPACE) -> VectorRAGIndex
Ingest a solidworks_docs_index_*.json file into a FAISS namespace so the SolidWorks COM/VBA API surface is searchable by Gemma and other agents.
Each COM interface becomes its own chunk; the VBA TypeLib catalogue becomes a single chunk. Call save() on the returned index to persist it to disk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `docs_json_path` | `Path \| None` | Path to the JSON file produced by SolidWorksDocsDiscovery.save_index(). | `None` |
| `rag_dir` | `Path \| None` | Override for the FAISS storage directory. | `None` |
| `namespace` | `str` | FAISS namespace name. | `SW_API_DOCS_NAMESPACE` ("solidworks-api-docs") |

Returns:
| Name | Type | Description |
|---|---|---|
| `VectorRAGIndex` | `VectorRAGIndex` | A populated (but not yet saved) index. Call .save() after ingestion. |
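The "one chunk per COM interface, one chunk for the TypeLib catalogue" rule can be sketched as follows. The JSON layout is not documented on this page, so the `com_interfaces` and `vba_typelib` keys, the `docs_to_chunks` helper, and the sample data are all assumptions made for illustration.

```python
def docs_to_chunks(docs):
    """Turn a docs dict into chunk records: one per interface, one for the TypeLib."""
    chunks = []
    # Assumed layout: {"com_interfaces": {interface_name: [member, ...]}}
    for name, members in docs.get("com_interfaces", {}).items():
        body = "\n".join(f"- {member}" for member in members)
        chunks.append({"source": f"com:{name}", "text": f"{name}\n{body}"})
    # Assumed layout: {"vba_typelib": [library_name, ...]}
    typelib = docs.get("vba_typelib")
    if typelib:
        chunks.append({"source": "vba-typelib", "text": "\n".join(typelib)})
    return chunks


# Illustrative sample only; a real file comes from SolidWorksDocsDiscovery.save_index().
sample = {
    "com_interfaces": {
        "IModelDoc2": ["Save3", "Extension"],
        "ISketchManager": ["CreateLine"],
    },
    "vba_typelib": ["SldWorks", "SwConst"],
}
chunks = docs_to_chunks(sample)
```

Keeping each interface in its own chunk means a query about one interface retrieves its members together rather than fragments from several interfaces.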
query_design_knowledge
query_design_knowledge(query: str, *, namespace: str = 'engineering-reference', top_k: int = 5, rag_dir: Path | None = None, score_threshold: float = 0.25) -> str
Query the FAISS index and return a formatted context string for LLM injection.
Returns empty string if no index or no relevant results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Query text used to search the index. | required |
| `namespace` | `str` | Namespace used to isolate stored data. | `'engineering-reference'` |
| `top_k` | `int` | Maximum number of matches to return. | `5` |
| `rag_dir` | `Path \| None` | Directory where RAG assets are stored. | `None` |
| `score_threshold` | `float` | Minimum score a match needs to be included. | `0.25` |

Returns:
| Name | Type | Description |
|---|---|---|
| `str` | `str` | The formatted context string, or an empty string if there is no index or no relevant result. |
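The threshold-then-format step can be sketched as below. The exact layout of the context string is not documented on this page, so the format used by the hypothetical `format_context` helper is an assumption, as is the sample `hits` data.

```python
def format_context(hits, score_threshold=0.25):
    """Drop low-scoring hits and join the rest into one context string."""
    kept = [h for h in hits if h["score"] >= score_threshold]
    if not kept:
        return ""  # nothing relevant: the caller can skip injection entirely
    blocks = [
        f"[{h['source']}] (score {h['score']:.2f})\n{h['text']}" for h in kept
    ]
    return "\n\n".join(blocks)


# Illustrative hits shaped like the documented query() result.
hits = [
    {"score": 0.80, "source": "snap-fit-guide.md", "text": "Keep strain under the material limit."},
    {"score": 0.10, "source": "unrelated.md", "text": "Off-topic text."},
]
context = format_context(hits)
empty = format_context(hits, score_threshold=0.9)
```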
query_solidworks_api_docs
query_solidworks_api_docs(query: str, *, namespace: str = SW_API_DOCS_NAMESPACE, top_k: int = 5, rag_dir: Path | None = None, score_threshold: float = 0.2) -> str
Semantic search over the SolidWorks COM/VBA API surface.
Returns a formatted markdown context string ready to inject into an LLM system prompt, or an empty string if no relevant results are found.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Natural-language question or task description. | required |
| `top_k` | `int` | Maximum number of chunks to return. | `5` |
| `rag_dir` | `Path \| None` | Override for the FAISS storage directory. | `None` |
| `score_threshold` | `float` | Minimum cosine-similarity score (0–1) to include a chunk. | `0.2` |

Returns:
| Name | Type | Description |
|---|---|---|
| `str` | `str` | The formatted markdown context string, or an empty string if no relevant results are found. |
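Because the function returns an empty string when nothing relevant is found, a caller can decide whether to extend the system prompt with a simple truthiness check. The sketch below shows that pattern; `build_system_prompt` and the section heading are hypothetical, and the context strings are hand-written rather than real output.

```python
def build_system_prompt(base_prompt, context):
    """Append retrieved context to the prompt only when there is any."""
    if not context:
        return base_prompt  # no relevant docs: leave the prompt unchanged
    return f"{base_prompt}\n\n## Reference material\n{context}"


base = "You are a SolidWorks automation assistant."
with_docs = build_system_prompt(base, "IModelDoc2: Save3, Extension")
without_docs = build_system_prompt(base, "")
```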