Polling Client - Data Copy and Import

The Polling Client is a core component of CIB ins7ght responsible for continuously synchronizing data from CIB seven to the CIB ins7ght database. This section explains the architecture, data flow, and mechanisms behind this critical component.


Architecture Overview

The Polling Client operates as a scheduled background service within the CIB ins7ght application. It does not query the CIB seven database directly; it retrieves process data through the CIB seven REST API and stores it in the CIB ins7ght database for analysis.

Key Components:

  • Polling Scheduler: Orchestrates the polling cycles and manages timing
  • Data Provider: Interfaces with CIB seven REST API to fetch data
  • Data Transformer: Converts CIB seven data format to CIB ins7ght schema
  • Data Persister: Stores transformed data into the CIB ins7ght database

Data Flow

CIB seven Database (Read-Only)
        ↓
   REST API Layer
        ↓
  Polling Scheduler
        ↓
   Data Providers
        ↓
  Data Transformers
        ↓
 CIB ins7ght Database
        ↓
  Analytics Engine

Polling Mechanism

Polling Cycle

The Polling Client operates in continuous cycles:

  1. Trigger: The scheduler initiates a polling cycle based on the configured interval
  2. Fetch: Each data provider retrieves new or updated records from CIB seven
  3. Transform: Data is converted to the CIB ins7ght schema
  4. Persist: Transformed data is saved to the CIB ins7ght database
  5. Wait: The system waits for the next polling interval

Default Polling Interval: 10 seconds (configurable)
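
The cycle above can be sketched as a small loop. This is only an illustration of the trigger, fetch, transform, persist, wait sequence, not the actual CIB ins7ght implementation; the names run_cycle, poll_forever, provider, transform and persist are hypothetical, and the max_cycles and sleep parameters exist only so the loop can be exercised in a test.

```python
import time
from typing import Callable, Iterable, List, Optional

# Hypothetical callables standing in for a data provider, a transformer
# and a persister. None of these names come from CIB ins7ght itself.
Provider = Callable[[], Iterable[dict]]
Transformer = Callable[[dict], dict]
Persister = Callable[[List[dict]], None]


def run_cycle(provider: Provider, transform: Transformer, persist: Persister) -> int:
    """Run one fetch -> transform -> persist pass and return the record count."""
    raw_records = provider()                         # 2. Fetch
    rows = [transform(raw) for raw in raw_records]   # 3. Transform
    if rows:
        persist(rows)                                # 4. Persist
    return len(rows)


def poll_forever(provider: Provider, transform: Transformer, persist: Persister,
                 interval_seconds: float = 10.0,
                 max_cycles: Optional[int] = None,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Repeat cycles, waiting interval_seconds after each one (step 5)."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        run_cycle(provider, transform, persist)      # 1. Trigger
        cycles += 1
        sleep(interval_seconds)                      # 5. Wait
    return cycles
```

A fixed sleep after each cycle means the effective period is the interval plus the time the cycle itself took; a real scheduler may instead trigger at a fixed rate.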

Incremental Synchronization

After the initial full import, the Polling Client uses incremental synchronization:

  • Tracks the last successfully imported timestamp for each data type
  • Only fetches records created or modified after the last sync
  • Minimizes network traffic and processing overhead
  • Keeps the data in CIB ins7ght up to date in near real time
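
A minimal sketch of the timestamp tracking described above, assuming each record carries a modification timestamp under a hypothetical "modified" key. In practice the filtering would more likely be done by the REST query itself; the sketch filters in memory only to stay self-contained. The names select_new_records and Watermarks are illustrative, not part of CIB ins7ght.

```python
from datetime import datetime, timezone
from typing import Dict, List, Tuple


def select_new_records(records: List[dict],
                       last_sync: datetime) -> Tuple[List[dict], datetime]:
    """Return the records modified after last_sync and the advanced watermark."""
    fresh = [r for r in records if r["modified"] > last_sync]
    # If nothing is new, the watermark stays where it was.
    new_watermark = max((r["modified"] for r in fresh), default=last_sync)
    return fresh, new_watermark


class Watermarks:
    """Keeps one last-imported timestamp per data type (in memory only)."""

    def __init__(self, epoch: datetime) -> None:
        self._epoch = epoch                    # used before the first sync
        self._by_type: Dict[str, datetime] = {}

    def get(self, data_type: str) -> datetime:
        return self._by_type.get(data_type, self._epoch)

    def advance(self, data_type: str, timestamp: datetime) -> None:
        # Never move a watermark backwards.
        if timestamp > self.get(data_type):
            self._by_type[data_type] = timestamp
```

Note that a strict "greater than" comparison can miss a record that shares the exact watermark timestamp but arrives later; a production implementation would need a rule for that edge case.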

Data Sources

The Polling Client retrieves data from CIB seven via REST API:

1. Process Instances

Purpose: Historical process instance information including execution status and timing

Key Data Elements:

  • Process instance identifiers
  • Process definition references
  • Start and end timestamps
  • Duration metrics
  • Current state (ACTIVE, COMPLETED, CANCELED)

Synchronization: Incremental updates based on timestamp, fetching only new or modified instances since last poll.

2. Activity Instances

Purpose: Individual activity execution data for detailed performance analysis

Key Data Elements:

  • Activity instance identifiers
  • Parent process instance reference
  • Activity ID and name from process definition
  • Start and end timestamps
  • Duration metrics

Synchronization: Continuous sync of activity-level execution data to enable bottleneck analysis.

3. Process Definitions

Purpose: Process definition metadata including versioning information

Key Data Elements:

  • Process definition identifiers
  • Process keys and names
  • Version numbers
  • Deployment references

Synchronization: Triggered when new process versions are deployed to CIB seven.

4. Incidents

Purpose: Process execution incidents and error tracking

Key Data Elements:

  • Incident identifiers
  • Process instance references
  • Incident types and categories
  • Error messages and details
  • Timestamp of occurrence

Synchronization: Incidents are imported on every polling cycle so that process issues become visible promptly.

5. BPMN Process Definitions

Purpose: Serialized BPMN diagrams for process visualization

Key Data Elements:

  • BPMN XML content
  • Resource names
  • Deployment information

Synchronization: Fetched when new process definitions are detected or on-demand for visualization purposes.


Data Providers

Each data type has a dedicated provider component that fetches and processes only that type of data:

Common Provider Functionality:

  • Connection management to CIB seven
  • Error handling and retry logic
  • Timestamp tracking for incremental sync
  • Batch processing capabilities
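
The retry logic is not specified further in this section. As one possible shape, the sketch below retries a failing call with exponential backoff. The backoff strategy, the attempt count, the choice of ConnectionError and the name with_retries are all assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(operation: Callable[[], T],
                 attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying on ConnectionError with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise                            # out of attempts: give up
            sleep(base_delay * 2 ** attempt)     # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

A provider could wrap each REST call in with_retries so that a short outage of CIB seven does not abort the whole polling cycle.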

Provider Types:

The system includes specialized providers for each data source:

  • Process Instance Provider: Fetches process instance data
  • Activity Instance Provider: Retrieves activity execution data
  • Process Definition Provider: Imports process definitions
  • Incident Provider: Collects incident information
  • BPMN Definition Provider: Downloads BPMN process definitions

Data Transformation

Raw data from CIB seven is transformed to match the CIB ins7ght schema:

Transformation Steps:

  1. Field Mapping: CIB seven field names are mapped to their CIB ins7ght equivalents
  2. Data Type Conversion: Date formats and numeric types are standardized
  3. Calculated Fields: Additional metrics are computed (e.g., normalized durations)
  4. Relationship Mapping: Foreign key relationships are established
  5. Validation: Data integrity checks are performed
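
The steps above can be illustrated for a single process instance record. This is a sketch under stated assumptions: the source field names (id, processDefinitionId, startTime, endTime), the target column names, the timestamp format and the function transform_process_instance are hypothetical and do not describe the real CIB ins7ght schema.

```python
from datetime import datetime
from typing import Optional

# Hypothetical mapping from REST field names to target column names.
FIELD_MAP = {
    "id": "instance_id",
    "processDefinitionId": "definition_id",   # reference to the definition row
    "startTime": "started_at",
    "endTime": "ended_at",
}

# Assumed timestamp format, e.g. 2025-10-29T10:00:00.000+0000.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_timestamp(value: Optional[str]) -> Optional[datetime]:
    return datetime.strptime(value, TIMESTAMP_FORMAT) if value else None


def transform_process_instance(raw: dict) -> dict:
    # 1. Field mapping
    row = {target: raw.get(source) for source, target in FIELD_MAP.items()}
    # 5. Validation of required fields (done early so later steps can rely on them)
    if not row["instance_id"] or not row["definition_id"] or not row["started_at"]:
        raise ValueError("record lacks an id, a definition reference or a start time")
    # 2. Data type conversion
    row["started_at"] = parse_timestamp(row["started_at"])
    row["ended_at"] = parse_timestamp(row["ended_at"])
    # 3. Calculated field: duration in seconds, None while still running
    if row["ended_at"] is None:
        row["duration_seconds"] = None
    else:
        duration = (row["ended_at"] - row["started_at"]).total_seconds()
        if duration < 0:
            raise ValueError("end time precedes start time")
        row["duration_seconds"] = duration
    # 4. Relationship mapping: definition_id is kept as the key to the definition
    return row
```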

Initial Import

When CIB ins7ght is first connected to a CIB seven instance:

  1. Full Historical Import: All existing data is retrieved from CIB seven
  2. Baseline Establishment: The system establishes a complete historical dataset
  3. Index Creation: Database indexes are created for optimal query performance
  4. Statistics Calculation: Initial KPIs and metrics are computed

Note: Initial import duration depends on data volume. For large datasets (millions of records), this process may take several hours.


Error Handling and Resilience

The Polling Client handles failures as follows:

Failure Recovery

  • Transaction Management: Database operations use transactions
  • State Persistence: Last successful sync timestamp is tracked in the database
  • Resume from Last Success: Polling resumes from the last known good state after failures
  • Error Logging: Errors are logged for troubleshooting
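
One way to combine transaction management with state persistence is to write the imported rows and the new sync timestamp in the same transaction, so that a failed batch leaves both untouched. The sketch below shows this with SQLite from the Python standard library; the table names incident and sync_state and the function names are hypothetical and do not describe the actual CIB ins7ght database.

```python
import sqlite3
from typing import List, Optional, Tuple


def init_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS incident "
                 "(id TEXT PRIMARY KEY, message TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS sync_state "
                 "(data_type TEXT PRIMARY KEY, last_sync TEXT)")


def persist_batch(conn: sqlite3.Connection, data_type: str,
                  rows: List[Tuple[str, str]], new_watermark: str) -> None:
    """Store the rows and advance the watermark atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT OR REPLACE INTO incident (id, message) VALUES (?, ?)", rows)
        conn.execute(
            "INSERT OR REPLACE INTO sync_state (data_type, last_sync) VALUES (?, ?)",
            (data_type, new_watermark))


def last_sync(conn: sqlite3.Connection, data_type: str) -> Optional[str]:
    """Return the last successful sync timestamp, or None before the first sync."""
    row = conn.execute(
        "SELECT last_sync FROM sync_state WHERE data_type = ?",
        (data_type,)).fetchone()
    return row[0] if row else None
```

After a failure, the next cycle reads last_sync and fetches from that point again, which corresponds to resuming from the last known good state.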

Monitoring

  • Logs: Detailed logging for troubleshooting and monitoring polling activity

Performance Optimization

Batch Processing

  • Data is fetched in configurable page sizes
  • Default page sizes:
    • Deployments: 500
    • Activity Instances: 100,000
    • Process Instances: 500
    • Incidents: 500
  • Reduces memory footprint
  • Optimizes network utilization
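
Page-wise fetching can be sketched as a generator that requests one page at a time and stops when a page comes back short. The sketch assumes offset-style paging, as offered by Camunda 7 compatible REST APIs through the firstResult and maxResults query parameters; whether the Polling Client pages exactly this way is an assumption. The fetch_page callable and the name fetch_all are hypothetical.

```python
from typing import Callable, Iterator, List

# fetch_page(first_result, max_results) returns one page of records.
# In a real client it would issue the HTTP request; here it is injected.
FetchPage = Callable[[int, int], List[dict]]


def fetch_all(fetch_page: FetchPage, page_size: int = 500) -> Iterator[dict]:
    """Yield every record, holding at most one page in memory at a time."""
    if page_size < 1:
        raise ValueError("page_size must be positive")
    first_result = 0
    while True:
        page = fetch_page(first_result, page_size)
        yield from page
        if len(page) < page_size:    # a short (or empty) page is the last one
            return
        first_result += page_size
```

Larger pages mean fewer round trips but more memory per request, which is the trade-off behind the configurable page sizes.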

Configuration

Key configuration properties for the Polling Client:

polling:
  time: 10                                 # Polling interval in seconds
  engineRestUrl: http://localhost:8080/engine-rest
  maxPageSizeDeployments: 500              # Max deployments per request
  maxPageSizeActivityInstances: 100000     # Max activity instances per request
  maxPageSizeProcessInstances: 500         # Max process instances per request
  maxPageSizeIncidents: 500                # Max incidents per request
  batchIntervalInHours: 24                 # Batch processing interval
  auth:
    type: basic                            # Authentication type: basic, webclient, or sso
    basic:
      username: demo
      password: demo

See the Configuration Guide for detailed authentication options.


Monitoring and Troubleshooting

Check Polling Status

Polling Info Endpoint:

GET /pollinfo

This endpoint returns the current polling status and timestamps.

Response Example:

[
  {
    "id": "uuid",
    "fetchFrom": "2025-10-29T10:00:00Z",
    "fetchedUntil": "2025-10-29T10:30:00Z",
    "isPollingSince": "2025-10-29T10:30:15Z"
  }
]

Fields:

  • fetchFrom: Start of the time window covered by the most recent fetch
  • fetchedUntil: End of the time window covered by the most recent fetch
  • isPollingSince: Timestamp at which the current polling cycle started
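
A monitoring script might turn this response into a simple lag figure. The sketch below parses a payload shaped like the example above and reports, per entry, the length of the fetched window and how far fetchedUntil lies behind a given point in time. The function names and the idea of a lag metric are illustrative and not part of CIB ins7ght.

```python
import json
from datetime import datetime, timezone
from typing import List


def parse_utc(value: str) -> datetime:
    # The example timestamps end in "Z"; rewrite it as an explicit offset so
    # that datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_poll_info(payload: str, now: datetime) -> List[dict]:
    """Return window length and lag (both in seconds) for each entry."""
    summary = []
    for entry in json.loads(payload):
        fetch_from = parse_utc(entry["fetchFrom"])
        fetched_until = parse_utc(entry["fetchedUntil"])
        summary.append({
            "id": entry["id"],
            "window_seconds": (fetched_until - fetch_from).total_seconds(),
            "lag_seconds": (now - fetched_until).total_seconds(),
        })
    return summary
```

A lag that keeps growing across several checks suggests that polling has stalled or cannot keep up with the incoming data.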

Application Logs:

Monitor polling activity through application logs for detailed information:

  • Polling start and completion times
  • Number of records fetched per data type
  • Connection status to CIB seven
  • Any errors or warnings

Common Issues

Polling Not Starting:

  • Check CIB seven connection settings (polling.engineRestUrl)
  • Verify authentication configuration (polling.auth)
  • Review application logs for errors
  • Ensure CIB seven REST API is accessible

Slow Import Performance:

  • Increase the page size parameters (polling.maxPageSize*) for larger datasets
  • Check network latency to CIB seven
  • Review database performance and indexing
  • Verify CIB seven server performance

Data Inconsistencies:

  • Verify CIB seven version compatibility
  • Check for schema changes in CIB seven
  • Review transformation logic for errors

Security Considerations

  • Polling Client accesses CIB seven via REST API
  • Credentials should be supplied through environment variables rather than hard-coded in configuration files
  • HTTPS/TLS can be configured for encrypted connections to CIB seven
  • Application logs record polling activity
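
As an illustration of keeping credentials out of files, the sketch below builds an HTTP Basic Authorization header from environment variables. The variable names POLLING_AUTH_BASIC_USERNAME and POLLING_AUTH_BASIC_PASSWORD are hypothetical; consult the Configuration Guide for the names CIB ins7ght actually reads.

```python
import base64
import os
from typing import Dict, Mapping


def basic_auth_header(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a Basic auth header from (hypothetical) environment variables."""
    try:
        username = env["POLLING_AUTH_BASIC_USERNAME"]
        password = env["POLLING_AUTH_BASIC_PASSWORD"]
    except KeyError as missing:
        # Fail loudly instead of silently falling back to default credentials.
        raise RuntimeError(f"missing environment variable {missing}") from None
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}
```

Basic authentication only encodes the credentials, it does not encrypt them, so it should be combined with HTTPS/TLS.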
