solidworks_mcp.ui.local_llm¶

Local LLM integration helpers for the SolidWorks MCP UI.

Provides three layers of typed abstraction:

1. Hardware – detect GPU VRAM / system RAM and pick the right Gemma tier.
2. Config – LocalLLMConfig is the single source of truth for endpoint, model name, and tier choice, shared by the UI, server endpoints, and the pydantic-ai agent runner.
3. Agent runner – run_local_agent() mirrors _run_structured_agent in service.py but routes exclusively to a local Ollama server. Both accept any BaseModel subclass as result_type so callers get a fully typed, validated response regardless of which backend they use.

Model tiers (Gemma 4 family via Ollama's OpenAI-compatible API):

Tier	Model	Approx. VRAM	Intended use
`small`	`gemma4:e2b`	~0-4 GB (CPU-capable)	Edge and smoke tests
`balanced`	`gemma4:e4b`	~8 GB	Recommended default for local planning
`large`	`gemma4:26b`	~18 GB	Workstation-class local evaluation

Usage:

    from solidworks_mcp.ui.local_llm import probe_local_model, run_local_agent
    from solidworks_mcp.agents.schemas import ClarificationResponse

    probe = await probe_local_model()  # LocalModelProbeResult
    result = await run_local_agent(
        system_prompt="You are a SolidWorks CAD assistant.",
        user_prompt="How many sketch constraints are needed for a slot?",
        result_type=ClarificationResponse,
        config=probe.to_config(),
    )

Attributes¶

GEMMA_TIERS `module-attribute` ¶

GEMMA_TIERS: dict[str, GemmaTierSpec] = {'small': GemmaTierSpec(ollama='gemma4:e2b', service='local:gemma4:e2b', label='Gemma 4 E2B (small — CPU or 4 GB VRAM)', min_vram_gb=0, min_ram_gb=8), 'balanced': GemmaTierSpec(ollama='gemma4:e4b', service='local:gemma4:e4b', label='Gemma 4 E4B (balanced — 8 GB VRAM)', min_vram_gb=8, min_ram_gb=16), 'large': GemmaTierSpec(ollama='gemma4:26b', service='local:gemma4:26b', label='Gemma 4 26B (large — 18 GB VRAM)', min_vram_gb=18, min_ram_gb=32)}

OLLAMA_DEFAULT_ENDPOINT `module-attribute` ¶

OLLAMA_DEFAULT_ENDPOINT = 'http://127.0.0.1:11434'

OLLAMA_OPENAI_ENDPOINT `module-attribute` ¶

OLLAMA_OPENAI_ENDPOINT = f'{OLLAMA_DEFAULT_ENDPOINT}/v1'

_T `module-attribute` ¶

_T = TypeVar('_T', bound=BaseModel)

logger `module-attribute` ¶

logger = getLogger(__name__)

Classes¶

GemmaTierSpec ¶

Bases: BaseModel

Hardware and model metadata for a single Gemma inference tier.

Attributes:

Name	Type	Description
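`label`	`str`	Human-readable tier name, used in status messages and in the all_tiers mapping.
`min_ram_gb`	`float`	Minimum system RAM, in GB, required for recommend_model_tier() to select this tier.
`min_vram_gb`	`float`	Minimum GPU VRAM, in GB, required for recommend_model_tier() to select this tier.
`ollama`	`str`	Ollama model tag (for example gemma4:e4b), used in pull commands.
`service`	`str`	Service-level model identifier: the Ollama tag prefixed with local: (for example local:gemma4:e4b).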

LocalAgentResult ¶

Bases: BaseModel, Generic[_T]

Typed envelope wrapping a structured pydantic-ai agent response.

data holds the validated result_type instance; config echoes back the LocalLLMConfig used so callers can log or audit provenance. success is False and error is set when the agent returned a RecoverableFailure or raised an exception.

Attributes:

Name	Type	Description
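`config`	`LocalLLMConfig`	The configuration used for the run, echoed back for logging or auditing.
`data`	`Any`	The validated result_type instance when success is True.
`error`	`str \| None`	Error message when success is False: the exception text or the RecoverableFailure explanation.
`retry_hint`	`str \| None`	Retry hint taken from the RecoverableFailure, if it provides one.
`success`	`bool`	True when data holds a validated result, False otherwise.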
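
Callers are expected to check success before reading data. The sketch below illustrates that pattern with a hypothetical dataclass stand-in, `AgentEnvelope`, which only borrows the field names listed above; it is not the real pydantic model, and `describe` is an illustrative helper rather than part of this module.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass
class AgentEnvelope(Generic[T]):
    """Hypothetical stand-in for LocalAgentResult (illustration only)."""

    success: bool
    data: Optional[T] = None
    error: Optional[str] = None
    retry_hint: Optional[str] = None


def describe(result: AgentEnvelope[T]) -> str:
    """Summarise an envelope, branching on success before touching data."""
    if result.success:
        return f"ok: {result.data!r}"
    # On failure, surface the error and append the retry hint when present.
    hint = f" (hint: {result.retry_hint})" if result.retry_hint else ""
    return f"failed: {result.error}{hint}"
```

Branching on success first keeps the failure path explicit, which matters because data is typed as Any and carries no guarantee when the run failed.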

LocalLLMConfig ¶

Bases: BaseModel

Runtime configuration for a local Ollama LLM connection.

Passed from the probe result into run_local_agent() or directly into _build_agent_model() in service.py to keep settings consistent across all layers (UI state, server endpoints, pydantic-ai agent runner).

Attributes:

Name	Type	Description
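`api_key`	`str`	API key sent to the OpenAI-compatible endpoint. from_env() reads LOCAL_OPENAI_API_KEY and falls back to "local".
`endpoint`	`str`	Base URL of the Ollama server (for example http://127.0.0.1:11434).
`ollama_model`	`str`	Ollama model tag (for example gemma4:e4b).
`openai_endpoint`	`str`	Base URL of Ollama's OpenAI-compatible API: endpoint plus /v1.
`service_model`	`str`	Service-level model identifier with the local: prefix (for example local:gemma4:e4b).
`tier`	`Literal['small', 'balanced', 'large']`	The selected Gemma tier.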

Functions¶

from_env `classmethod` ¶

from_env() -> LocalLLMConfig

Build config from environment variables, falling back to defaults.

Returns:

Name	Type	Description
`LocalLLMConfig`	`LocalLLMConfig`	Config resolved from SOLIDWORKS_UI_OLLAMA_ENDPOINT, SOLIDWORKS_UI_MODEL, and LOCAL_OPENAI_API_KEY, with defaults for any unset variable.

Source code in src/solidworks_mcp/ui/local_llm.py

@classmethod
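def from_env(cls) -> LocalLLMConfig:
    """Build config from environment variables, falling back to defaults.

    Reads ``SOLIDWORKS_UI_OLLAMA_ENDPOINT``, ``SOLIDWORKS_UI_MODEL``, and
    ``LOCAL_OPENAI_API_KEY``.

    Returns:
        LocalLLMConfig: Config for the tier whose service identifier matches
            ``SOLIDWORKS_UI_MODEL``, or the small tier if none matches.
    """
    endpoint = os.getenv("SOLIDWORKS_UI_OLLAMA_ENDPOINT", OLLAMA_DEFAULT_ENDPOINT)
    service_model = os.getenv("SOLIDWORKS_UI_MODEL", "local:gemma4:e4b")
    # Map the service identifier back to its tier; unknown values fall back to "small".
    tier = "small"
    for t, spec in GEMMA_TIERS.items():
        if spec.service == service_model:
            tier = t
            break
    spec = GEMMA_TIERS[tier]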
    return cls(
        endpoint=endpoint,
        openai_endpoint=f"{endpoint}/v1",
        tier=tier,  # type: ignore[arg-type]
        ollama_model=spec.ollama,
        service_model=spec.service,
        api_key=os.getenv("LOCAL_OPENAI_API_KEY", "local"),
    )
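
The tier lookup in from_env() can be sketched in isolation. The block below is a minimal stand-alone illustration, assuming the three service identifiers shown in GEMMA_TIERS; `SERVICE_TO_TIER` and `resolve_tier` are hypothetical names, and the environment is passed in as a plain mapping so the behaviour is easy to exercise.

```python
from typing import Mapping

# Service identifiers copied from GEMMA_TIERS; the dict itself is illustrative.
SERVICE_TO_TIER = {
    "local:gemma4:e2b": "small",
    "local:gemma4:e4b": "balanced",
    "local:gemma4:26b": "large",
}


def resolve_tier(env: Mapping[str, str]) -> str:
    """Return the tier for SOLIDWORKS_UI_MODEL, mirroring from_env()."""
    # An unset variable defaults to the balanced model.
    service_model = env.get("SOLIDWORKS_UI_MODEL", "local:gemma4:e4b")
    # An unrecognised identifier falls back to the small tier.
    return SERVICE_TO_TIER.get(service_model, "small")
```

Note the asymmetry this exposes: an unset variable yields the balanced tier, while a set but unrecognised value yields the small tier.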

LocalModelProbeResult ¶

Bases: BaseModel

Full hardware-detection and Ollama availability result.

Returned by probe_local_model() and serialised as the JSON response from GET /api/ui/local-model/probe. The to_config() helper converts directly into a LocalLLMConfig ready for run_local_agent().

Attributes:

Name	Type	Description
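`all_tiers`	`dict[str, str]`	Mapping of tier name to human-readable label for every entry in GEMMA_TIERS.
`available`	`bool`	True if the Ollama server responded at endpoint.
`endpoint`	`str`	Base URL of the Ollama server that was probed.
`label`	`str`	Human-readable label of the recommended tier.
`ollama_model`	`str`	Ollama model tag of the recommended tier.
`openai_endpoint`	`str`	Base URL of Ollama's OpenAI-compatible API: endpoint plus /v1.
`pull_command`	`str`	Shell command that pulls the recommended model (ollama pull followed by ollama_model).
`pulled_models`	`list[str]`	Model names Ollama reports as already pulled; empty when Ollama is unavailable.
`ram_gb`	`float`	Detected system RAM in GB, rounded to one decimal place (0.0 if detection failed).
`service_model`	`str`	Service-level model identifier of the recommended tier.
`status_message`	`str`	Human-readable summary: Ollama is not running, the model is ready, or the pull command to run.
`tier`	`Literal['small', 'balanced', 'large']`	The tier recommended for the detected hardware.
`tier_already_pulled`	`bool`	True if any name in pulled_models contains ollama_model.
`vram_gb`	`float`	Detected GPU VRAM in GB, rounded to one decimal place (0.0 if detection failed).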

Functions¶

to_config ¶

to_config() -> LocalLLMConfig

Convert probe result into a ready-to-use LocalLLMConfig.

Returns:

Name	Type	Description
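`LocalLLMConfig`	`LocalLLMConfig`	Config carrying the probe's endpoint, openai_endpoint, tier, ollama_model, and service_model. api_key is not copied and takes the LocalLLMConfig default.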

Source code in src/solidworks_mcp/ui/local_llm.py

def to_config(self) -> LocalLLMConfig:
    """Convert probe result into a ready-to-use ``LocalLLMConfig``.

    Returns:
        LocalLLMConfig: The result produced by the operation.
    """
    return LocalLLMConfig(
        endpoint=self.endpoint,
        openai_endpoint=self.openai_endpoint,
        tier=self.tier,
        ollama_model=self.ollama_model,
        service_model=self.service_model,
    )

LocalModelPullRequest ¶

Bases: BaseModel

Request body for POST /api/ui/local-model/pull.

Attributes:

Name	Type	Description
`endpoint`	`str \| None`	Optional Ollama base URL to send the pull to.
`model`	`str`	Ollama model tag to pull.

LocalModelPullResult ¶

Bases: BaseModel

Result from POST /api/ui/local-model/pull.

Attributes:

Name	Type	Description
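`error`	`str \| None`	Exception text when the pull request failed.
`model`	`str`	The model tag that was requested.
`queued`	`bool`	True if Ollama answered the pull request, False if the request raised an exception.
`response`	`dict[str, Any] \| None`	Parsed JSON body returned by Ollama's /api/pull endpoint on success.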

LocalModelQueryRequest ¶

Bases: BaseModel

Request body for POST /api/ui/local-model/query.

Attributes:

Name	Type	Description
`endpoint`	`str \| None`	Optional Ollama base URL to query.
`model`	`str \| None`	Optional model to query.
`prompt`	`str`	The user prompt to send.
`system_prompt`	`str`	The system prompt to send.

Functions¶

_detect_gpu_vram_gb ¶

_detect_gpu_vram_gb() -> float

Return best-effort GPU VRAM estimate in GB, or 0.0 on failure.

Returns:

Name	Type	Description
`float`	`float`	Largest GPU memory size detected, in GB, or 0.0 if detection fails.

Source code in src/solidworks_mcp/ui/local_llm.py

def _detect_gpu_vram_gb() -> float:
    """Return best-effort GPU VRAM estimate in GB, or 0.0 on failure.

    Returns:
        float: The computed numeric result.
    """
    # Try nvidia-smi first
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
            stderr=subprocess.DEVNULL,
            timeout=5,
            text=True,
        )
        mib = max(
            int(x.strip()) for x in out.strip().splitlines() if x.strip().isdigit()
        )
        return mib / 1024.0
    except Exception:
        pass

    # Try wmic on Windows
    if platform.system() == "Windows":
        try:
            out = subprocess.check_output(
                ["wmic", "path", "win32_VideoController", "get", "AdapterRAM"],
                stderr=subprocess.DEVNULL,
                timeout=5,
                text=True,
            )
            bytes_vals = [
                int(x.strip())
                for x in out.splitlines()
                if x.strip().lstrip("-").isdigit() and int(x.strip()) > 0
            ]
            if bytes_vals:
                return max(bytes_vals) / (1024**3)
        except Exception:
            pass

    return 0.0
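
The nvidia-smi branch boils down to parsing one MiB value per GPU and keeping the largest. The following is a minimal sketch of that parsing step on its own, with a hypothetical helper name (`parse_nvidia_smi_gb`) and made-up sample output; unlike the function above, it handles the empty case explicitly instead of relying on an exception.

```python
def parse_nvidia_smi_gb(output: str) -> float:
    """Convert `nvidia-smi --query-gpu=memory.total` output (MiB per line) to GB."""
    # Keep only lines that are plain integers; values such as "[N/A]" are skipped.
    values = [
        int(line.strip())
        for line in output.splitlines()
        if line.strip().isdigit()
    ]
    if not values:
        return 0.0
    # On multi-GPU machines the largest single card decides the tier.
    return max(values) / 1024.0
```

Taking the maximum rather than the sum reflects the source's choice: the estimate is the VRAM of the biggest single GPU, not the total across cards.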

_detect_system_ram_gb ¶

_detect_system_ram_gb() -> float

Return total system RAM in GB.

Returns:

Name	Type	Description
`float`	`float`	Total system RAM in GB, or 0.0 if it cannot be determined.

Source code in src/solidworks_mcp/ui/local_llm.py

def _detect_system_ram_gb() -> float:
    """Return total system RAM in GB.

    Returns:
        float: The computed numeric result.
    """
    try:
        import psutil  # optional dependency

        return psutil.virtual_memory().total / (1024**3)
    except ImportError:
        pass

    if platform.system() == "Windows":
        try:
            out = subprocess.check_output(
                ["wmic", "computersystem", "get", "TotalPhysicalMemory"],
                stderr=subprocess.DEVNULL,
                timeout=5,
                text=True,
            )
            for line in out.splitlines():
                line = line.strip()
                if line.isdigit():
                    return int(line) / (1024**3)
        except Exception:
            pass

    return 0.0

_ollama_health `async` ¶

_ollama_health(endpoint: str = OLLAMA_DEFAULT_ENDPOINT) -> bool

Return True if Ollama HTTP server is responding.

Parameters:

Name	Type	Description	Default
`endpoint`	`str`	Base URL of the Ollama server.	`OLLAMA_DEFAULT_ENDPOINT`

Returns:

Name	Type	Description
`bool`	`bool`	True if GET {endpoint}/api/tags returns HTTP 200, otherwise False.

Source code in src/solidworks_mcp/ui/local_llm.py

async def _ollama_health(endpoint: str = OLLAMA_DEFAULT_ENDPOINT) -> bool:
    """Return True if the Ollama HTTP server is responding.

    Args:
        endpoint (str): Base URL of the Ollama server. Defaults to
            OLLAMA_DEFAULT_ENDPOINT.

    Returns:
        bool: True if ``GET {endpoint}/api/tags`` returns HTTP 200, otherwise False.
    """
    import urllib.request

    loop = asyncio.get_event_loop()
    try:

        def _get() -> bool:
            """Blocking GET on ``/api/tags``; runs in the default executor.

            Returns:
                bool: True if the response status is 200, otherwise False.
            """

            try:
                with urllib.request.urlopen(f"{endpoint}/api/tags", timeout=3) as r:
                    return r.status == 200
            except Exception:
                return False

        return await loop.run_in_executor(None, _get)
    except Exception:
        return False
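
The pattern here is a blocking HTTP call pushed onto a worker thread so the event loop is not stalled. Below is a minimal sketch of the same idea using `asyncio.to_thread` in place of `run_in_executor`; the names `ollama_is_up` and `_http_status` and the injectable `get_status` parameter are assumptions made for illustration (the injection makes the check testable without a running server) and are not part of this module.

```python
import asyncio
import urllib.request
from typing import Callable


def _http_status(url: str, timeout: float = 3.0) -> int:
    """Blocking GET that returns the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


async def ollama_is_up(
    endpoint: str,
    get_status: Callable[[str], int] = _http_status,
) -> bool:
    """Return True if GET {endpoint}/api/tags answers with HTTP 200."""

    def _probe() -> bool:
        try:
            return get_status(f"{endpoint}/api/tags") == 200
        except Exception:
            # Connection refused, timeout, bad URL: all count as "not up".
            return False

    # Run the blocking probe in a worker thread.
    return await asyncio.to_thread(_probe)
```

In production code the default `get_status` performs the real request; a test can pass a stub such as `lambda url: 200` instead.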

_ollama_list_models `async` ¶

_ollama_list_models(endpoint: str = OLLAMA_DEFAULT_ENDPOINT) -> list[str]

Return list of model names currently pulled in Ollama.

Parameters:

Name	Type	Description	Default
`endpoint`	`str`	Base URL of the Ollama server.	`OLLAMA_DEFAULT_ENDPOINT`

Returns:

Type	Description
`list[str]`	Names of the models Ollama reports via /api/tags; empty if the request fails.

Source code in src/solidworks_mcp/ui/local_llm.py

async def _ollama_list_models(endpoint: str = OLLAMA_DEFAULT_ENDPOINT) -> list[str]:
    """Return list of model names currently pulled in Ollama.

    Args:
        endpoint (str): Base URL of the Ollama server. Defaults to
            OLLAMA_DEFAULT_ENDPOINT.

    Returns:
        list[str]: Model names reported by ``/api/tags``; empty if the request fails.
    """
    import json
    import urllib.request

    loop = asyncio.get_event_loop()

    def _get() -> list[str]:
        """Blocking GET on ``/api/tags``; runs in the default executor.

        Returns:
            list[str]: The ``name`` of each entry under ``models``, or an empty list.
        """

        try:
            with urllib.request.urlopen(f"{endpoint}/api/tags", timeout=5) as r:
                data = json.loads(r.read())
                return [m.get("name", "") for m in data.get("models", [])]
        except Exception:
            return []

    return await loop.run_in_executor(None, _get)
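
The parsing step assumes the /api/tags body is a JSON object with a models array whose entries carry a name key, as the code above reads it. A minimal sketch of that step alone follows; `model_names` is a hypothetical helper and the payload in the usage note is made-up sample data.

```python
import json


def model_names(body: bytes) -> list[str]:
    """Extract model names from an /api/tags response body."""
    try:
        data = json.loads(body)
    except ValueError:
        # Malformed JSON is treated the same as "no models".
        return []
    if not isinstance(data, dict):
        return []
    # Entries without a "name" key yield an empty string, as in the source.
    return [m.get("name", "") for m in data.get("models", [])]
```

For example, `model_names(b'{"models": [{"name": "gemma4:e4b"}]}')` yields a one-element list containing the tag.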

probe_local_model `async` ¶

probe_local_model(endpoint: str | None = None) -> LocalModelProbeResult

Probe Ollama for availability and return a typed recommendation result.

The returned LocalModelProbeResult can be forwarded directly as a FastAPI JSON response (it is a BaseModel). Call .to_config() on the result to build a LocalLLMConfig for run_local_agent().

Parameters:

Name	Type	Description	Default
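`endpoint`	`str \| None`	Ollama base URL to probe. If None, uses SOLIDWORKS_UI_OLLAMA_ENDPOINT, then OLLAMA_DEFAULT_ENDPOINT.	`None`

Returns:

Name	Type	Description
`LocalModelProbeResult`	`LocalModelProbeResult`	Detected hardware, the recommended tier, and Ollama availability, with a status message describing the next step.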

Source code in src/solidworks_mcp/ui/local_llm.py

async def probe_local_model(
    endpoint: str | None = None,
) -> LocalModelProbeResult:
    """Probe Ollama for availability and return a typed recommendation result.

    The returned ``LocalModelProbeResult`` can be forwarded directly as a FastAPI JSON
    response (it is a ``BaseModel``).  Call ``.to_config()`` on the result to build a
    ``LocalLLMConfig`` for ``run_local_agent()``.

    Args:
        endpoint (str | None): The endpoint value. Defaults to None.

    Returns:
        LocalModelProbeResult: The result produced by the operation.
    """
    resolved_endpoint = endpoint or os.getenv(
        "SOLIDWORKS_UI_OLLAMA_ENDPOINT", OLLAMA_DEFAULT_ENDPOINT
    )

    vram_gb = _detect_gpu_vram_gb()
    ram_gb = _detect_system_ram_gb()
    tier = recommend_model_tier(vram_gb=vram_gb, ram_gb=ram_gb)
    spec = GEMMA_TIERS[tier]

    available = await _ollama_health(resolved_endpoint)
    pulled_models: list[str] = []
    if available:
        pulled_models = await _ollama_list_models(resolved_endpoint)

    tier_model = spec.ollama
    tier_already_pulled = any(tier_model in m for m in pulled_models)

    if not available:
        status = (
            f"Ollama is not running at {resolved_endpoint}. "
            "Install from https://ollama.com and run: ollama serve"
        )
    elif tier_already_pulled:
        status = f"Ready: {spec.label} is loaded in Ollama."
    else:
        status = (
            f"Ollama is running. Pull the recommended model with: "
            f"ollama pull {tier_model}"
        )

    return LocalModelProbeResult(
        available=available,
        endpoint=resolved_endpoint,
        openai_endpoint=f"{resolved_endpoint}/v1",
        tier=tier,  # type: ignore[arg-type]
        ollama_model=tier_model,
        service_model=spec.service,
        label=spec.label,
        vram_gb=round(vram_gb, 1),
        ram_gb=round(ram_gb, 1),
        pulled_models=pulled_models,
        tier_already_pulled=tier_already_pulled,
        pull_command=f"ollama pull {tier_model}",
        status_message=status,
        all_tiers={k: v.label for k, v in GEMMA_TIERS.items()},
    )
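
The status logic has three outcomes, and the "already pulled" check is a substring match rather than an exact comparison. The sketch below isolates both, using a hypothetical helper (`probe_status`) that returns short status keywords instead of the full messages; the quantised tag in the usage note is invented sample data.

```python
def probe_status(
    available: bool, tier_model: str, pulled: list[str]
) -> tuple[bool, str]:
    """Return (tier_already_pulled, status keyword) as probe_local_model decides them."""
    # Substring match: a pulled name only has to contain the tier's tag.
    already = any(tier_model in name for name in pulled)
    if not available:
        return already, "not running"
    if already:
        return already, "ready"
    return already, f"ollama pull {tier_model}"
```

Because the match is by substring, a pulled model named `gemma4:e2b-sample-variant` would count as satisfying the `gemma4:e2b` tier.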

pull_ollama_model `async` ¶

pull_ollama_model(model: str, endpoint: str | None = None) -> LocalModelPullResult

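Pull an Ollama model via POST /api/pull. The blocking HTTP request runs in a worker thread so the event loop stays free; because the call is awaited with stream=False, the coroutine resolves only after Ollama responds or the request fails (the socket timeout is 300 seconds).

Returns a typed LocalModelPullResult with queued=True on success.

Parameters:

Name	Type	Description	Default
`model`	`str`	Ollama model tag to pull, for example gemma4:e4b.	required
`endpoint`	`str \| None`	Ollama base URL. If None, uses SOLIDWORKS_UI_OLLAMA_ENDPOINT, then OLLAMA_DEFAULT_ENDPOINT.	`None`

Returns:

Name	Type	Description
`LocalModelPullResult`	`LocalModelPullResult`	queued=True with the parsed response on success, or queued=False with error set on failure.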

Source code in src/solidworks_mcp/ui/local_llm.py

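async def pull_ollama_model(
    model: str,
    endpoint: str | None = None,
) -> LocalModelPullResult:
    """Pull an Ollama model via ``POST /api/pull``.

    The blocking HTTP request runs in a worker thread so the event loop stays free.
    Because the call is awaited with ``stream=False``, the coroutine resolves only
    after Ollama responds or the request fails (the socket timeout is 300 seconds).

    Returns a typed ``LocalModelPullResult`` with ``queued=True`` on success.

    Args:
        model (str): Ollama model tag to pull, for example ``gemma4:e4b``.
        endpoint (str | None): Ollama base URL. If None, uses
            ``SOLIDWORKS_UI_OLLAMA_ENDPOINT``, then ``OLLAMA_DEFAULT_ENDPOINT``.

    Returns:
        LocalModelPullResult: ``queued=True`` with the parsed response on success,
            or ``queued=False`` with ``error`` set on failure.
    """
    import json
    import urllib.request

    resolved_endpoint = endpoint or os.getenv(
        "SOLIDWORKS_UI_OLLAMA_ENDPOINT", OLLAMA_DEFAULT_ENDPOINT
    )
    loop = asyncio.get_event_loop()

    def _pull() -> LocalModelPullResult:
        """Send the blocking pull request; runs in the default executor.

        Returns:
            LocalModelPullResult: Success or failure envelope for the pull.
        """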

        payload = json.dumps({"name": model, "stream": False}).encode()
        req = urllib.request.Request(
            f"{resolved_endpoint}/api/pull",
            data=payload,
            method="POST",
            headers={"Content-Type": "application/json"},
        )
        try:
            with urllib.request.urlopen(req, timeout=300) as r:
                body = json.loads(r.read())
                return LocalModelPullResult(queued=True, model=model, response=body)
        except Exception as exc:
            return LocalModelPullResult(queued=False, model=model, error=str(exc))

    return await loop.run_in_executor(None, _pull)
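
The request that `_pull` sends can be built and inspected without contacting a server. The sketch below assembles the same `urllib.request.Request` in a hypothetical helper, `build_pull_request`, so the URL, method, and JSON payload are visible.

```python
import json
import urllib.request


def build_pull_request(endpoint: str, model: str) -> urllib.request.Request:
    """Build the POST request for Ollama's /api/pull without sending it."""
    # stream=False asks for a single JSON reply instead of progress chunks.
    payload = json.dumps({"name": model, "stream": False}).encode()
    return urllib.request.Request(
        f"{endpoint}/api/pull",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual pull; keeping construction separate makes the payload easy to verify in isolation.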

recommend_model_tier ¶

recommend_model_tier(vram_gb: float = 0.0, ram_gb: float = 0.0) -> str

Return 'small' | 'balanced' | 'large' based on available hardware.

Parameters:

Name	Type	Description	Default
`vram_gb`	`float`	Available GPU VRAM in GB.	`0.0`
`ram_gb`	`float`	Available system RAM in GB.	`0.0`

Returns:

Name	Type	Description
`str`	`str`	The recommended tier name: 'large', 'balanced', or 'small'.

Source code in src/solidworks_mcp/ui/local_llm.py

def recommend_model_tier(vram_gb: float = 0.0, ram_gb: float = 0.0) -> str:
    """Return 'small' | 'balanced' | 'large' based on available hardware.

    Tiers are checked from largest to smallest; the first whose VRAM and RAM
    minimums are both met is returned.

    Args:
        vram_gb (float): Available GPU VRAM in GB. Defaults to 0.0.
        ram_gb (float): Available system RAM in GB. Defaults to 0.0.

    Returns:
        str: The recommended tier name.
    """
    for tier in ("large", "balanced", "small"):
        spec = GEMMA_TIERS[tier]
        if vram_gb >= spec.min_vram_gb and ram_gb >= spec.min_ram_gb:
            return tier
    return "small"  # fallback: no tier's minimums were met (e.g. RAM undetected)
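
To see how the thresholds interact, the selection loop can be reproduced with plain tuples. This is a stand-alone sketch, assuming the min_vram_gb and min_ram_gb values listed in GEMMA_TIERS; `TIER_MINIMUMS` and `recommend` are illustrative names only.

```python
# (min_vram_gb, min_ram_gb) per tier, copied from GEMMA_TIERS.
TIER_MINIMUMS = {
    "small": (0, 8),
    "balanced": (8, 16),
    "large": (18, 32),
}


def recommend(vram_gb: float = 0.0, ram_gb: float = 0.0) -> str:
    """Pick the largest tier whose VRAM and RAM minimums are both met."""
    for tier in ("large", "balanced", "small"):
        min_vram, min_ram = TIER_MINIMUMS[tier]
        if vram_gb >= min_vram and ram_gb >= min_ram:
            return tier
    # Nothing matched (for example RAM detection returned 0.0).
    return "small"
```

Both conditions must hold, so a 24 GB GPU paired with 16 GB of system RAM is recommended the balanced tier, not the large one.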

run_local_agent `async` ¶

run_local_agent(*, system_prompt: str, user_prompt: str, result_type: type[_T], config: LocalLLMConfig | None = None, rag_query: str | None = None, rag_namespace: str = 'solidworks-api-docs') -> LocalAgentResult[_T]

Run a pydantic-ai Agent against the local Ollama server and return a typed LocalAgentResult.

This mirrors _run_structured_agent in service.py but is self-contained in this module so any layer (UI route, service function, or CLI) can call local inference without importing the full service graph.

Parameters:

Name	Type	Description	Default
`system_prompt`	`str`	Instruction preamble for the LLM.	required
`user_prompt`	`str`	The concrete question or task.	required
`result_type`	`type[_T]`	A BaseModel subclass. pydantic-ai validates the LLM output against this schema and retries automatically on parse failures.	required
`config`	`LocalLLMConfig \| None`	Connection settings. Defaults to LocalLLMConfig.from_env().	`None`
`rag_query`	`str \| None`	If provided, the FAISS namespace named by rag_namespace is queried with this string and the top results are appended to system_prompt as grounded API context for the model. Pass the same text as user_prompt for a simple "augment with API docs" pattern, or a more specific sub-query for targeted retrieval. If retrieval fails, the run continues without augmentation.	`None`
`rag_namespace`	`str`	FAISS namespace to query when rag_query is set. Defaults to "solidworks-api-docs" (the COM/VBA surface index).	`'solidworks-api-docs'`

Returns:

Type	Description
`LocalAgentResult[_T]`	success=True with data set to a validated result_type instance, or success=False with an error message.

Source code in src/solidworks_mcp/ui/local_llm.py

async def run_local_agent(
    *,
    system_prompt: str,
    user_prompt: str,
    result_type: type[_T],
    config: LocalLLMConfig | None = None,
    rag_query: str | None = None,
    rag_namespace: str = "solidworks-api-docs",
) -> LocalAgentResult[_T]:
    """Run a pydantic-ai ``Agent`` against the local Ollama server.

    This mirrors ``_run_structured_agent`` in ``service.py`` but is self-contained in this
    module so any layer (UI route, service function, or CLI) can call local inference
    without importing the full service graph.

    Args:
        system_prompt (str): Instruction preamble for the LLM.
        user_prompt (str): The concrete question or task.
        result_type (type[_T]): A ``BaseModel`` subclass. pydantic-ai validates the LLM
            output against this schema and retries automatically on parse failures.
        config (LocalLLMConfig | None): Connection settings. Defaults to
            ``LocalLLMConfig.from_env()``.
        rag_query (str | None): If provided, the FAISS namespace named by
            ``rag_namespace`` is queried with this string and the top results are
            appended to ``system_prompt`` as grounded API context. Pass the same text
            as ``user_prompt`` for a simple "augment with API docs" pattern, or a more
            specific sub-query for targeted retrieval.
        rag_namespace (str): FAISS namespace to query when ``rag_query`` is set.
            Defaults to ``"solidworks-api-docs"`` (the COM/VBA surface index).

    Returns:
        LocalAgentResult[_T]: ``success=True`` with ``data`` set to a validated
            ``result_type`` instance, or ``success=False`` with an ``error`` message.
    """
    from ..agents.schemas import RecoverableFailure  # avoid circular at import time

    resolved_config = config or LocalLLMConfig.from_env()

    # --- RAG augmentation ---
    augmented_system_prompt = system_prompt
    if rag_query:
        try:
            from ..agents.vector_rag import (
                query_design_knowledge,
                query_solidworks_api_docs,
            )

            api_context = (
                query_solidworks_api_docs(rag_query)
                if rag_namespace == "solidworks-api-docs"
                else query_design_knowledge(rag_query, namespace=rag_namespace)
            )
            if api_context:
                augmented_system_prompt = f"{system_prompt}\n\n{api_context}"
                logger.debug(
                    "run_local_agent: injected %d RAG chars from '%s'",
                    len(api_context),
                    rag_namespace,
                )
        except Exception as _rag_exc:
            logger.debug("RAG augmentation skipped: %s", _rag_exc)

    try:
        from pydantic_ai import Agent
        from pydantic_ai.models.openai import OpenAIChatModel
        from pydantic_ai.providers.openai import OpenAIProvider
    except ImportError:  # pragma: no cover
        return LocalAgentResult(
            success=False,
            error="pydantic-ai is not installed in this environment.",
            config=resolved_config,
        )

    model_id = (
        resolved_config.service_model.split(":", 1)[1]
        if resolved_config.service_model.startswith("local:")
        else resolved_config.service_model
    )
    provider = OpenAIProvider(
        base_url=resolved_config.openai_endpoint,
        api_key=resolved_config.api_key,
    )
    configured_model = OpenAIChatModel(model_id, provider=provider)

    agent: Agent[None, _T | RecoverableFailure] = Agent(
        configured_model,
        system_prompt=augmented_system_prompt,
        output_type=[result_type, RecoverableFailure],  # type: ignore[list-item]
    )

    try:
        result = await agent.run(user_prompt)
        payload = result.data if hasattr(result, "data") else result.output
    except Exception as exc:
        logger.exception("run_local_agent failed")
        return LocalAgentResult(
            success=False,
            error=str(exc),
            config=resolved_config,
        )

    if isinstance(payload, RecoverableFailure):
        return LocalAgentResult(
            success=False,
            error=payload.explanation,
            retry_hint=getattr(payload, "retry_hint", None),
            config=resolved_config,
        )

    return LocalAgentResult(success=True, data=payload, config=resolved_config)
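
Two small string transformations in run_local_agent() are easy to miss: the local: prefix is stripped before the model name reaches the OpenAI-compatible endpoint, and retrieved context is appended after the system prompt. A minimal sketch of both follows, using hypothetical helper names (`ollama_model_id`, `augment_prompt`) that do not exist in the module.

```python
def ollama_model_id(service_model: str) -> str:
    """Strip a leading 'local:' so the name matches what Ollama serves."""
    if service_model.startswith("local:"):
        # Split once so the tag's own colon (gemma4:e4b) is preserved.
        return service_model.split(":", 1)[1]
    return service_model


def augment_prompt(system_prompt: str, context: str) -> str:
    """Append retrieved context to the system prompt, if there is any."""
    if not context:
        return system_prompt
    return f"{system_prompt}\n\n{context}"
```

Splitting with a limit of one is what keeps `local:gemma4:e4b` from being cut at the second colon.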
