AI Agent Connector

The AI Agent Connector brings agentic AI into CIB seven business processes. It is a CIB seven engine connect plugin: you drop an AI Agent service task into a BPMN model and point it at an LLM provider. The agent then reasons over your input, optionally calls tools, retrieves knowledge, remembers the conversation, and returns a result that the next BPMN step can consume. All of this runs under the process engine’s transaction, authorization, and history machinery.

The plugin ships two connectors:

Connector	`connectorId`	Purpose
AI Agent	`cibseven-ai-agent`	Runs an LLM agent inside a service task: prompt → (tools / RAG / memory) → answer.
Knowledge Ingestor	`cibseven-knowledge-ingestor`	Embeds text into a pgvector store so the AI Agent can retrieve it later (RAG).

Two different “AI Agent” features: don’t confuse them

CIB seven has two unrelated AI features with similar names. This documentation covers only the AI Agent Connector.

Aspect	AI Agent Connector (this doc)	Modeler “BPMN AI Agent” / wizard
When	Runtime — executes during a process instance	Design-time — helps you author BPMN
Where	`connect/ai-agent` engine connect plugin	CIB seven Modeler (webclient)
Configured via	Element-template fields + `CIBSEVEN_CONNECT_AI_AGENT_DEFAULT_MODEL`	`cibseven.webclient.modeler.ai` / `GET /agent/config`
Connectors	`cibseven-ai-agent`, `cibseven-knowledge-ingestor`	—

What it can do

LLM agent loop — sends your message plus a system prompt to an OpenAI-compatible model and returns the model’s text answer. See Configuring the Agent.
Tools (function calling) — the model can call Java @Tool classes and MCP servers. A built-in ProcessStarterTool lets the agent launch a CIB seven process by key and react to its result. See Tools.
RAG (retrieval-augmented generation) — ground answers in your own documents using a PostgreSQL pgvector store and a local or remote embedding model. See RAG.
Chat memory — keep a multi-turn conversation across BPMN steps (e.g. a human-feedback loop). See Chat Memory.
Reasoning models — pass effort/summary hints to reasoning-capable models. See Configuring the Agent.
Audit trail & AI transparency — every interaction is recorded in a per-activity chat-log variable, output is marked as AI-generated (aiMeta), and content can be redacted. See Audit Trail.
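
To make the RAG item above more concrete, here is a minimal sketch of the retrieval step: rank stored chunks by cosine similarity to a query embedding and keep the best few. It is not the connector's implementation. In a real setup pgvector performs the similarity search and an embedding model produces the vectors. The class name RagRetrievalSketch, the Chunk record, and the tiny 3-dimensional vectors are assumptions made for illustration.

```java
import java.util.Comparator;
import java.util.List;

// Illustrative only: an in-memory stand-in for the similarity search
// that a vector store such as pgvector would perform.
final class RagRetrievalSketch {

    // A stored text chunk with its embedding vector (hypothetical type).
    record Chunk(String text, double[] embedding) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the maxResults chunks most similar to the query, best first.
    static List<String> topK(List<Chunk> store, double[] query, int maxResults) {
        return store.stream()
                .sorted(Comparator.comparingDouble((Chunk c) -> -cosine(c.embedding(), query)))
                .limit(maxResults)
                .map(Chunk::text)
                .toList();
    }

    public static void main(String[] args) {
        List<Chunk> store = List.of(
                new Chunk("refund policy", new double[] {1, 0, 0}),
                new Chunk("office hours", new double[] {0, 1, 0}),
                new Chunk("return shipping", new double[] {0.9, 0.1, 0}));
        // The retrieved texts would be appended to the prompt before the model is called.
        System.out.println(topK(store, new double[] {1, 0, 0}, 2));
    }
}
```

The number of chunks kept here plays the same role as the result limit in a RAG setup: each extra chunk gives the model more grounding but also lengthens the prompt.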

Architecture at a glance

Built on LangChain4j — the agent loop, tool calling, RAG, and embedding integrations come from LangChain4j 1.14.
Talks to any OpenAI-compatible API (/v1/chat/completions or the Responses API for reasoning summaries). The default endpoint is https://api.openai.com/v1; point baseUrl elsewhere for Azure OpenAI, OpenRouter, Ollama, or a self-hosted gateway. Because the interface is provider-neutral, you can switch providers — or self-host — without re-modeling, which avoids vendor lock-in.
Registers with the engine through the CIB seven Connect SPI (ConnectorProvider via META-INF/services). No engine changes, no @ComponentScan.
RAG uses pgvector on PostgreSQL — no separate vector database to operate.
Embeddings default to a local all-MiniLM-L6-v2 model (384-dim, no external call); a remote OpenAI embedding model is optional.
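
As a rough illustration of what "OpenAI-compatible" means on the wire, the sketch below builds (but does not send) a /chat/completions request with the JDK's own HTTP client. The connector itself goes through LangChain4j, so this is not its code. The class name, the model name "example-model", and the local URL (the usual default for Ollama's OpenAI-compatible endpoint) are assumptions for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Illustrative only: shows how the base URL selects the provider
// while the request shape stays the same.
final class ChatCompletionsRequestSketch {

    // Minimal JSON string escaping, sufficient for this example only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Request body with one system message and one user message.
    static String body(String model, String systemPrompt, String userMessage) {
        return "{\"model\":\"" + escape(model) + "\",\"messages\":["
                + "{\"role\":\"system\",\"content\":\"" + escape(systemPrompt) + "\"},"
                + "{\"role\":\"user\",\"content\":\"" + escape(userMessage) + "\"}]}";
    }

    // Builds a POST to <baseUrl>/chat/completions with a bearer token.
    static HttpRequest build(String baseUrl, String apiKey, String model,
                             String systemPrompt, String userMessage) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body(model, systemPrompt, userMessage)))
                .build();
    }

    public static void main(String[] args) {
        // Same call, two providers: only the base URL (and key) differ.
        HttpRequest hosted = build("https://api.openai.com/v1", "YOUR_API_KEY",
                "example-model", "You are a claims assistant.", "Summarize this claim.");
        HttpRequest local = build("http://localhost:11434/v1", "unused",
                "example-model", "You are a claims assistant.", "Summarize this claim.");
        System.out.println(hosted.uri());
        System.out.println(local.uri());
    }
}
```

Because only the base URL and credentials change between providers, the BPMN model does not need to be touched when you switch.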

Non-invasive by design — the adoption path

The connector is delivered as a removable overlay. If the plugin JARs are absent, the engine behaves exactly as before and pulls in no AI dependencies. Customers adopt at their own pace:

No AI — don’t install (or remove) the overlay. The engine is unchanged.
Basic AI — add the overlay, set an LLM API key, drop an AI Agent service task. You get the agent loop plus the audit trail.
Full AI — add PostgreSQL + pgvector for RAG, expose tools via Java @Tool classes or MCP servers, and enable chat memory for multi-turn flows.

How the overlay is enabled or removed per distribution is covered in Installation & Enablement.

Requirements

Java 17+ (required by LangChain4j). All current CIB seven distributions already run on Java 17+.
An OpenAI-compatible LLM endpoint and, for hosted providers, an API key.
Optional: PostgreSQL with the vector (pgvector) extension — only needed for RAG.
Optional: MCP servers if you want to expose external tools to the agent.

Cost & token usage

Every agent invocation calls an LLM and consumes provider tokens, which paid providers bill for. Cost and latency scale with how much context you send and how much output the model produces, so the feature levers are also the cost levers:

Model — pick a model sized to the task; smaller/cheaper models for simple classification, larger ones only where needed.
Chat memory (chatMemoryMaxMessages) — replayed history is re-sent every turn; a smaller window costs less.
RAG (maxRagResults) — each retrieved chunk is added to the prompt; retrieve fewer for less cost.
Tools / reasoning — tool schemas, tool results, and reasoning effort all add tokens.

There is no built-in per-call token cap in the connector; control spend through the levers above and your provider’s own limits/budgets. A local provider (e.g. Ollama) removes per-token cost entirely.
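
To get a feel for how these levers add up, the following back-of-the-envelope sketch estimates the prompt size of a single agent turn. Every number in it is a made-up input, not a measurement. The class name, the parameter names other than chatMemoryMaxMessages and maxRagResults, and the simple additive model are assumptions. It also treats maxRagResults as an upper bound that is always reached.

```java
// Illustrative arithmetic only: all token counts and prices are hypothetical inputs.
final class PromptCostSketch {

    // Estimated prompt tokens for one agent turn:
    // system prompt + user message + replayed history + retrieved chunks + tool schemas.
    static long promptTokens(long systemTokens, long userTokens,
                             int historyMessages, int chatMemoryMaxMessages, long avgMessageTokens,
                             int maxRagResults, long avgChunkTokens,
                             long toolSchemaTokens) {
        // Only the most recent messages inside the memory window are re-sent.
        long replayed = Math.min(historyMessages, chatMemoryMaxMessages) * avgMessageTokens;
        // Assumes retrieval returns the maximum number of chunks.
        long retrieved = (long) maxRagResults * avgChunkTokens;
        return systemTokens + userTokens + replayed + retrieved + toolSchemaTokens;
    }

    // Cost of one call, given prices per million input and output tokens.
    static double cost(long promptTokens, long outputTokens,
                       double inputPricePerMillion, double outputPricePerMillion) {
        return promptTokens / 1_000_000.0 * inputPricePerMillion
                + outputTokens / 1_000_000.0 * outputPricePerMillion;
    }

    public static void main(String[] args) {
        // Same conversation (40 messages so far), two configurations.
        long wide = promptTokens(200, 100, 40, 20, 150, 5, 300, 400);
        long narrow = promptTokens(200, 100, 40, 6, 150, 2, 300, 400);
        System.out.println("wide settings:   " + wide + " prompt tokens");
        System.out.println("narrow settings: " + narrow + " prompt tokens");
    }
}
```

In this toy model, shrinking the memory window from 20 to 6 messages and the retrieval limit from 5 to 2 chunks more than halves the prompt, which is why those two settings are usually the first place to look when spend is too high.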

Maturity

The AI Agent Connector is introduced with the 2.2 release. It is functionally complete for the capabilities documented here. Open hardening and enhancement items are tracked and listed in Limitations & Known Issues. Review that chapter and Security & Data Handling before production use.

Where to go next

New to the connector? Start with Getting Started.
Operator setting up a distribution? See Installation & Enablement.
Looking for a specific parameter? See the Configuration Reference.
Want to know exactly what’s recorded? See Audit Trail & AI Transparency.
Concerned about security, data flows, or secrets? See Security & Data Handling.
Want to know where the edges are? See Limitations & Known Issues.