Troubleshooting & Known Limitations

Common problems and where to look. Most issues fall into one of three buckets: the connector isn’t on the classpath, the LLM endpoint/credentials are wrong, or the RAG/tool configuration doesn’t line up.

The template doesn’t appear / “connector not found”

The overlay isn’t enabled. Confirm AI is on for your distribution. run/run4 print “AI Agent connector enabled” at startup; under --production it stays off unless you pass --ai or set AI_AGENT_ENABLED=true. Tomcat needs lib-ai/ on common.loader. WildFly needs the module directory and the import to be present. See Installation.
Template path not configured. The modeler must scan classpath*:element-templates/*.json (see Installation).
connectorId mismatch. It must be exactly cibseven-ai-agent (or cibseven-knowledge-ingestor).

Authentication / endpoint errors (401, connection refused, wrong host)

Missing key. Set apiKey on the task or the OPENAI_API_KEY environment variable (see Configuring the Agent).
Wrong endpoint. Check baseUrl (or OPENAI_BASE_URL). For Ollama use http://localhost:11434/v1; for Azure/OpenRouter see the provider recipes in Configuring the Agent.
Gateway expects the key in a header. Put it in customHeaders (e.g. {"Authorization": "Bearer ..."}) rather than apiKey.
Placeholder key reaches the public API. If baseUrl is an internal gateway that injects the real credential, a placeholder apiKey may still be sent and rejected — verify the gateway’s auth expectations.

“Model not found” / unexpected model

The task left Model empty and the deployment default resolves to a model your provider doesn’t serve. Set model explicitly, or configure the default via CIBSEVEN_CONNECT_AI_AGENT_DEFAULT_MODEL / -Dcibseven.connect.ai-agent.defaultModel (see Installation). The built-in fallback is a placeholder, not a guarantee.

RAG returns nothing / errors

Embedding dimension mismatch. The biggest gotcha: embeddingDimension on the AI Agent task must match what was used at ingestion (384 for local AllMiniLmL6V2, 1536 for text-embedding-ada-002). Mismatched dimensions break retrieval. See RAG.
Different store/table. The AI Agent and the Knowledge Ingestor must use the same pgHost/pgDatabase/pgTable.
pgvector missing. Run CREATE EXTENSION IF NOT EXISTS vector; on the database.
Too strict a filter. If minRagScore is high, all chunks may be filtered out — start at 0.0.
Embeddings hitting the wrong endpoint. A remote embeddingModelName currently always calls the public OpenAI endpoint regardless of baseUrl — expect connection failures in air-gapped setups and use the local model instead. See RAG.
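The dimension rule above can be checked before deployment. The following is a minimal sketch, not part of the connector: the class and method names are hypothetical, and the only dimensions it knows are the two quoted in this section. It assumes you can read the embeddingDimension values configured on the Knowledge Ingestor and on the AI Agent task.

```java
import java.util.Map;

// Hypothetical helper, not shipped with the connector.
class EmbeddingDimensionCheck {

    // Dimensions quoted in this section; add entries for other models you use.
    static final Map<String, Integer> KNOWN_DIMENSIONS = Map.of(
            "AllMiniLmL6V2", 384,
            "text-embedding-ada-002", 1536);

    /** Returns null when the settings are consistent, otherwise a description of the problem. */
    static String check(String modelName, int ingestDimension, int agentDimension) {
        if (ingestDimension != agentDimension) {
            return "embeddingDimension mismatch: ingested with " + ingestDimension
                    + ", agent task uses " + agentDimension;
        }
        Integer expected = KNOWN_DIMENSIONS.get(modelName);
        if (expected != null && expected != agentDimension) {
            return modelName + " produces " + expected
                    + "-dimensional vectors, but embeddingDimension is " + agentDimension;
        }
        return null; // consistent, or the model is unknown to this sketch
    }

    public static void main(String[] args) {
        // Ingested with the local model, but the agent task was left at 1536.
        System.out.println(check("AllMiniLmL6V2", 384, 1536));
    }
}
```

A null result only means the two settings agree with each other and with the table above; it does not confirm that the vectors already in pgTable were written with that dimension.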

Tool problems

toolClasses won’t load. The connector fails fast if a class is missing, lacks a public no-arg constructor, or isn’t on the engine classpath. Check the FQN spelling and packaging (see Tools).
MCP duplicate tool names. Give each MCP server a distinct name so tools are prefixed <name>__<tool> and don’t collide (see Tools).
MCP unreachable / 401. Verify the url and any auth headers per server entry.
ProcessStarterTool seems to hang. Large maxRetries or pollIntervalMillis values (chosen by the model) make it poll for a long time on a blocking thread. Constrain them via the prompt; see the limitations below.
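To see why distinct MCP server names matter, the <name>__<tool> prefixing mentioned above can be modelled in a few lines. This is an illustrative sketch only: the class and its methods are hypothetical and do not reflect the connector's internals. It shows that the same tool name on two differently named servers stays unique, while two entries sharing a name collide.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the naming scheme, not the connector's code.
class McpToolNames {

    /** Builds the LLM-visible name in the <name>__<tool> form. */
    static String qualify(String serverName, String toolName) {
        return serverName + "__" + toolName;
    }

    /** Returns every qualified name that would be registered more than once. */
    static Set<String> findCollisions(List<Map.Entry<String, List<String>>> servers) {
        Set<String> seen = new HashSet<>();
        Set<String> collisions = new TreeSet<>();
        for (Map.Entry<String, List<String>> server : servers) {
            for (String tool : server.getValue()) {
                String qualified = qualify(server.getKey(), tool);
                if (!seen.add(qualified)) {
                    collisions.add(qualified); // already registered by an earlier entry
                }
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        // Two server entries that were both named "docs" (example data).
        List<Map.Entry<String, List<String>>> servers = List.of(
                Map.entry("docs", List.of("search")),
                Map.entry("docs", List.of("search", "fetch")));
        System.out.println(findCollisions(servers));
    }
}
```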

Where did my output / audit go?

No agentOutput. Ensure the output mapping binds ${output} to a variable.
Audit timeline. It’s in the process variable cibseven-connect-ai-agent_<activityId>, not an output parameter. It may be missing because persistChatLog is turned off or because chatLogVariable.enabled=false is set globally. See Audit Trail.
Content looks hashed. Redaction is on (redactContent) — that’s expected (see Audit Trail).

Output varies, or JSON won’t parse

Different result each run. Expected — LLM output is non-deterministic. Branch/validate on meaning, not exact text. Lower-variance models/settings help but don’t guarantee identical output.
agentOutput won’t parse as JSON. ${output} is a String; parse it with Spin (S(agentOutput)) in a following step. The model may add prose or deviate from the requested shape — ask for JSON explicitly (see Examples) and route unparseable output through a validation/error gateway. See Getting Started.
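When the model wraps its JSON in prose, one defensive option is to cut out the first balanced {...} span before parsing. The sketch below is a hypothetical helper (not part of the connector) that does only that: it tracks brace depth and skips braces inside string literals. It does not validate the JSON, so the result should still be parsed and checked, and anything it cannot extract should go down the validation/error path.

```java
import java.util.Optional;

// Hypothetical helper for pre-cleaning agent output before parsing.
class JsonExtractor {

    /** Returns the first balanced {...} span in text, or empty if there is none. */
    static Optional<String> firstJsonObject(String text) {
        int start = text.indexOf('{');
        if (start < 0) {
            return Optional.empty();
        }
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                if (escaped) {
                    escaped = false;      // character after a backslash is taken literally
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return Optional.of(text.substring(start, i + 1));
                }
            }
        }
        return Optional.empty(); // braces never balanced: treat as unparseable
    }

    public static void main(String[] args) {
        String agentOutput = "Sure, here is the result:\n{\"status\": \"ok\"}\nAnything else?";
        System.out.println(firstJsonObject(agentOutput).orElse("<no JSON found>"));
    }
}
```

A stray { in the prose before the real object would defeat this heuristic, which is one more reason to ask for JSON-only output in the prompt.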

The agent task failed / raised an incident

The LLM call threw (timeout, 401/429/5xx, bad request). With asyncBefore the job retries, then raises an incident — inspect the incident message in Cockpit. Common causes are in the Authentication/endpoint and Model sections above. Consider a timeout/error boundary event for graceful handling.

Memory doesn’t persist

Chat memory is JVM-local: it is lost on restart and not shared across cluster nodes. This is by design in this release; see Chat Memory.

Known Limitations

This section documents the current constraints of the connector. Review these before relying on it for production-critical flows. For security implications, see Security & Data Handling.

Chat memory is JVM-local

The chat-memory store defaults to an in-memory implementation. Conversations are lost on engine restart and not shared across nodes in an HA/clustered deployment, and the message-window limit is enforced per JVM. Run a single node if conversations only need to stay consistent between restarts, or replace the backing store at startup with a persistent implementation if they must survive restarts or be shared across nodes. See Chat Memory.

No transaction rollback for agent-started subprocesses

A process started by ProcessStarterTool commits in its own transaction. If the agent task or a later step rolls back, the started subprocess is not undone — there is no rollback symmetry. Design tool-driven starts to be idempotent or compensable. See Developer Guide.

`ProcessStarterTool` poll budget is unbounded server-side

maxRetries and pollIntervalMillis come from the model’s tool-call arguments with no server-side ceiling, and polling uses a blocking sleep on a tool thread. A large budget can hold that thread for a long time and, under load, pressure the job executor. Constrain expectations via the prompt until a server-side cap lands. See Tools.
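For orientation, a server-side ceiling of the kind described as missing could look roughly like the sketch below. Everything in it is an assumption: the class, the cap values, and the idea of applying it in a custom tool of your own are hypothetical, and the shipped ProcessStarterTool does not do this. It clamps the model-supplied values and exposes the worst-case blocking time that results.

```java
// Hypothetical clamp, not part of the shipped ProcessStarterTool.
class PollBudget {

    // Illustrative ceilings; pick values that suit your job executor.
    static final int MAX_RETRIES_CAP = 20;
    static final long POLL_INTERVAL_CAP_MILLIS = 5_000L;

    final int maxRetries;
    final long pollIntervalMillis;

    private PollBudget(int maxRetries, long pollIntervalMillis) {
        this.maxRetries = maxRetries;
        this.pollIntervalMillis = pollIntervalMillis;
    }

    /** Clamps model-supplied tool-call arguments into the allowed range. */
    static PollBudget clamp(int requestedRetries, long requestedIntervalMillis) {
        int retries = Math.max(0, Math.min(requestedRetries, MAX_RETRIES_CAP));
        long interval = Math.max(1L, Math.min(requestedIntervalMillis, POLL_INTERVAL_CAP_MILLIS));
        return new PollBudget(retries, interval);
    }

    /** Upper bound on how long polling can block a thread with this budget. */
    long worstCaseWaitMillis() {
        return maxRetries * pollIntervalMillis;
    }

    public static void main(String[] args) {
        // A model asking for a million retries at one-minute intervals.
        PollBudget budget = clamp(1_000_000, 60_000L);
        System.out.println(budget.maxRetries + " retries, "
                + budget.pollIntervalMillis + " ms interval, worst case "
                + budget.worstCaseWaitMillis() + " ms");
    }
}
```

With these example caps the worst case is 20 × 5,000 ms = 100 seconds per tool call, which gives a concrete number to size against the job executor's thread count.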

The engine-call executor is never shut down

ProcessStarterTool uses a static daemon thread pool that is not explicitly shut down. On hot redeploy of the engine this can retain a classloader reference (a potential leak). Not an issue for normal start/stop lifecycles. See Developer Guide.

Remote embeddings ignore `baseUrl`

When a remote embeddingModelName is set, the embedding calls go to the public OpenAI endpoint regardless of the chat baseUrl (and customHeaders are not applied to them). For air-gapped or data-residency-constrained deployments, use the local AllMiniLmL6V2 model so no document or query text leaves your network. See RAG.

The chat-log audit variable can grow large

The per-activity cibseven-connect-ai-agent_<activityId> variable accumulates every event (including system-prompt text, tool inventories, and stack traces) and can become a large JSON value per AI activity per process instance. For high-volume or stateless workloads, disable it (persistChatLog / chatLogVariable.enabled) and route audit to an external sink, weighing that against any EU AI Act record-keeping obligations that apply to you. See Audit Trail.

The default model name is a placeholder

The built-in fallback model is a configurable placeholder, not a guarantee that your provider serves it. Always set model on the task or a deployment-wide default. See Installation.

`toolClasses` has no allowlist

Any class named in toolClasses is loaded and its @Tool methods are exposed to the LLM. Treat the field as privileged configuration; an allowlist / @AgentTool SPI is a planned hardening. See Tools.

No MCP tool allowlist / lazy discovery yet

All tools exposed by a configured MCP server are registered. There is no per-task allowlist/filter to narrow the LLM-visible set, and no lazy tool discovery — both are planned. Disambiguation across servers (name prefixing) is shipped. See Tools.

No RAG vector index by default

The pgvector store is created without an ANN index, so retrieval is a linear scan — fine for small/medium knowledge bases; add an index out-of-band for very large stores. See RAG.

Not in this release: process-as-tool / ad-hoc subprocess orchestration

The umbrella concept and early presentations describe deeper “process-as-tool” and ad-hoc-subprocess orchestration (auto-exposing engine processes as MCP tools via an mcpProcess_ convention, AgentAdHocSubProcessBehavior, an AgentProcessEnginePlugin with Spring auto-configuration). None of that shipped in this module. The shipped connector exposes processes to the agent only through the built-in ProcessStarterTool (see Tools), and any MCP “process executor” integration is via a separate plugin configured through mcpServers.