Configuring Event Sources

KillIT v3 supports multiple monitoring tools and event sources. This guide covers how to configure each supported integration.

Supported Event Sources

KillIT v3 integrates with the following monitoring platforms and tools:

Infrastructure Monitoring

  • Nagios - Full webhook support with state mapping and service/host events
  • Zabbix - Trigger-based alerts with severity mapping
  • Prometheus - Alertmanager webhook with label extraction
  • PRTG - Network monitoring with sensor data
  • Icinga2 - API integration with state changes

Cloud Platforms

  • AWS CloudWatch - Alarms, metrics, and logs integration
  • Azure Monitor - Alerts, metrics, and activity logs
  • Google Cloud Monitoring - Alerting policies and uptime checks

Application Monitoring

  • New Relic - APM alerts and synthetic monitoring
  • Datadog - Monitors and event stream integration
  • AppDynamics - Application and business transaction alerts
  • Dynatrace - Problem detection and AI-powered insights

Log Management

  • Splunk - Alert actions and saved searches
  • Elastic/ELK - Watcher alerts and anomaly detection
  • Syslog - RFC3164 and RFC5424 formats
  • Windows Event Log - Critical system events

Network & Security

  • SNMP - Trap processing v1/v2c/v3
  • NetFlow/sFlow - Anomaly detection
  • Security Tools - SIEM integration

Custom Integrations

  • REST API - Flexible JSON format
  • Webhooks - Custom payload mapping
  • Email - Parse alerts from email
  • Scripts - CLI and SDK support

Integration Architecture

Event Flow

  1. Ingestion - Events received via webhooks, API, or polling
  2. Normalization - Convert to standard KillIT format
  3. Enrichment - Add CI relationships, business context
  4. Correlation - Identify related events and root causes
  5. Storage - Persist in MongoDB with indexing
  6. Analysis - AI-powered insights and recommendations
  7. Actions - Trigger automations and notifications
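The normalization step (step 2) can be thought of as a pure mapping from a tool-specific payload onto the common event shape. A minimal sketch in Python; the function name, state table, and output keys are illustrative assumptions, not the documented internals:

```python
# Hypothetical sketch of normalization: map a raw Nagios-style payload
# onto a common event shape before enrichment and correlation.
SEVERITY_BY_STATE = {
    "CRITICAL": "critical",
    "WARNING": "warning",
    "OK": "clear",
    "UNKNOWN": "info",
}

def normalize_nagios(raw: dict) -> dict:
    """Convert a Nagios-style payload into a normalized event dict."""
    return {
        "source": "nagios",
        "sourceId": f'{raw["hostname"]}/{raw["service"]}',
        "severity": SEVERITY_BY_STATE.get(raw.get("state", "UNKNOWN"), "info"),
        "title": f'{raw["service"]} is {raw["state"]} on {raw["hostname"]}',
        "hostname": raw["hostname"],
        "timestamp": raw["timestamp"],
    }
```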

High Availability

  • Multiple ingestion endpoints for redundancy
  • Queue-based processing for reliability
  • Automatic retry with exponential backoff
  • Dead letter queue for failed events
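The retry and dead-letter behaviour above can be sketched as follows. The delay schedule (1 s base, 60 s cap) and attempt count are illustrative defaults, not the platform's documented values:

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]

def deliver(send, event, max_attempts: int = 5, dead_letter=None, sleep=lambda s: None):
    """Try `send(event)` with exponential backoff; park failures in `dead_letter`."""
    for delay in [0.0] + backoff_delays(max_attempts - 1):
        sleep(delay)          # injectable so tests need not really sleep
        if send(event):
            return True
    if dead_letter is not None:
        dead_letter.append(event)  # failed events stay available for inspection/replay
    return False
```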

Authentication

All event ingestion endpoints require authentication. You can use either:

  1. JWT Token - For user-based access
  2. API Token - For monitoring system integration

Creating an API Token

curl -X POST https://your-killit-instance/api/auth/create-api-token \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Nagios Integration",
    "permissions": ["event:write"]
  }'

Nagios Integration

Configure Nagios Command

Add this command to your Nagios configuration:

define command {
    command_name    notify-killit
    command_line    /usr/local/bin/notify-killit.sh "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTNAME$" "$SERVICESTATE$" "$SERVICEOUTPUT$" "$LONGSERVICEOUTPUT$" "$TIMET$"
}

Create Notification Script

Create /usr/local/bin/notify-killit.sh and make it executable (chmod +x):

#!/bin/bash

# Quote all expansions: plugin output routinely contains spaces.
# Output containing double quotes will still break the hand-built JSON
# below; build the payload with jq if your plugin outputs can include them.

KILLIT_URL="https://your-killit-instance/api/events/webhook/nagios"
API_TOKEN="your-api-token"

NOTIFICATION_TYPE="$1"
SERVICE_DESC="$2"
HOSTNAME="$3"
SERVICE_STATE="$4"
SERVICE_OUTPUT="$5"
LONG_SERVICE_OUTPUT="$6"
TIMESTAMP="$7"

curl -X POST "$KILLIT_URL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"notification_type\": \"$NOTIFICATION_TYPE\",
    \"service\": \"$SERVICE_DESC\",
    \"hostname\": \"$HOSTNAME\",
    \"state\": \"$SERVICE_STATE\",
    \"output\": \"$SERVICE_OUTPUT\",
    \"long_output\": \"$LONG_SERVICE_OUTPUT\",
    \"timestamp\": $TIMESTAMP
  }"

Configure Service Notifications

The command above uses notification macros ($NOTIFICATIONTYPE$, $SERVICESTATE$, ...), so wire it up as a notification command rather than an event handler. Define a contact that uses it and assign that contact to the service:

define contact {
    contact_name                    killit
    alias                           KillIT Event Management
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-killit
    host_notification_commands      notify-killit
}

define service {
    use                     generic-service
    host_name               production-server
    service_description     CPU Load
    check_command           check_cpu
    contacts                killit
}

Zabbix Integration

Configure Zabbix Media Type

  1. Go to Administration → Media types
  2. Create new media type:
    • Name: KillIT Event Management
    • Type: Webhook
    • Parameters:
      URL: https://your-killit-instance/api/events/webhook/zabbix
      API_TOKEN: your-api-token
      EVENT_ID: {EVENT.ID}
      EVENT_NAME: {EVENT.NAME}
      EVENT_SEVERITY: {EVENT.SEVERITY}
      HOST_NAME: {HOST.NAME}
      TRIGGER_NAME: {TRIGGER.NAME}
      TRIGGER_DESCRIPTION: {TRIGGER.DESCRIPTION}

Webhook Script

// Object names below are for Zabbix 5.0/5.2; Zabbix 5.4+ renamed
// CurlHttpRequest to HttpRequest with addHeader()/post()/getStatus().
try {
    var params = JSON.parse(value);
    var req = new CurlHttpRequest();
    var url = params.URL;

    req.AddHeader('Content-Type: application/json');
    req.AddHeader('Authorization: Bearer ' + params.API_TOKEN);

    var payload = {
        event_id: params.EVENT_ID,
        trigger_name: params.TRIGGER_NAME,
        trigger_description: params.TRIGGER_DESCRIPTION,
        hostname: params.HOST_NAME,
        severity: params.EVENT_SEVERITY,
        timestamp: new Date().toISOString()
    };

    var resp = req.Post(url, JSON.stringify(payload));

    if (req.Status() != 200) {
        throw 'Response code: ' + req.Status();
    }

    return 'OK';
} catch (error) {
    Zabbix.Log(3, 'KillIT webhook error: ' + error);
    throw 'Failed to send event: ' + error;
}

Prometheus Integration

Configure Alertmanager

Edit alertmanager.yml:

global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'killit-webhook'

receivers:
  - name: 'killit-webhook'
    webhook_configs:
      - url: 'https://your-killit-instance/api/events/webhook/prometheus'
        send_resolved: true
        http_config:
          # Alertmanager 0.22+ prefers:  authorization: { credentials: <token> }
          bearer_token: 'your-api-token'

Alert Rules Example

groups:
  - name: example
    rules:
      - alert: HighCPUUsage
        # avg by(instance) keeps the instance label so the annotations below can use it
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
          service: infrastructure
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"

Custom Event Format

For custom integrations, use the standard event ingestion endpoint:

Single Event

curl -X POST https://your-killit-instance/api/events/ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "sourceId": "CUSTOM-12345",
    "severity": "major",
    "title": "Database Connection Pool Exhausted",
    "description": "The connection pool for MySQL has reached its limit",
    "timestamp": "2024-01-15T10:30:00Z",
    "hostname": "db-prod-01",
    "service": "mysql",
    "application": "order-service",
    "environment": "production",
    "tags": ["database", "connection-pool"],
    "ciName": "MYSQL-PROD-01",
    "parentEventId": "EVENT-12344",
    "details": {
      "pool_size": 100,
      "active_connections": 100,
      "waiting_requests": 25,
      "query_time_avg": 250,
      "slow_queries": 15
    }
  }'

ciName (a direct CI reference) and parentEventId (a link to a correlated parent event) are optional. JSON does not permit inline comments, so omit optional fields entirely rather than commenting them out.

Bulk Events

curl -X POST https://your-killit-instance/api/events/bulk-ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "events": [
      {
        "sourceId": "METRIC-001",
        "severity": "warning",
        "title": "High Memory Usage",
        "timestamp": "2024-01-15T10:30:00Z",
        "hostname": "web-01"
      },
      {
        "sourceId": "METRIC-002",
        "severity": "critical",
        "title": "Disk Space Low",
        "timestamp": "2024-01-15T10:31:00Z",
        "hostname": "web-01"
      }
    ]
  }'
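When forwarding many events, batching them into bulk requests keeps you well under the per-minute limits. A small helper that splits an event list into request bodies; the batch size of 100 is an assumption for illustration, not a documented maximum:

```python
import json

def bulk_payloads(events: list, source: str = "custom", batch_size: int = 100) -> list[str]:
    """Split events into JSON request bodies for the bulk-ingest endpoint."""
    return [
        json.dumps({"source": source, "events": events[i:i + batch_size]})
        for i in range(0, len(events), batch_size)
    ]
```

Each returned string can then be POSTed to /api/events/bulk-ingest as one request.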

CloudWatch Integration

Using AWS Lambda

Create a Lambda function to forward CloudWatch alarms:

import json
import os

import requests  # not in the Lambda base runtime; package it with the function or use a layer

def lambda_handler(event, context):
    killit_url = os.environ['KILLIT_URL']
    api_token = os.environ['KILLIT_API_TOKEN']

    # Parse SNS message
    sns_message = json.loads(event['Records'][0]['Sns']['Message'])

    # Convert to KillIT format
    killit_event = {
        'source': 'cloudwatch',
        'sourceId': sns_message.get('AlarmName', 'unknown'),
        'severity': 'critical' if sns_message.get('NewStateValue') == 'ALARM' else 'info',
        'title': sns_message.get('AlarmName'),
        'description': sns_message.get('AlarmDescription', ''),
        'timestamp': sns_message.get('StateChangeTime'),
        'details': {
            'metric': sns_message.get('MetricName'),
            'namespace': sns_message.get('Namespace'),
            'reason': sns_message.get('NewStateReason')
        }
    }

    # Send to KillIT
    response = requests.post(
        f"{killit_url}/api/events/webhook/cloudwatch",
        headers={
            'Authorization': f'Bearer {api_token}',
            'Content-Type': 'application/json'
        },
        json=killit_event
    )

    return {
        'statusCode': response.status_code,
        'body': json.dumps('Event sent to KillIT')
    }

Rate Limits

To protect the system, the following rate limits apply:

  • Single Event Ingestion: 100 requests per minute per IP
  • Bulk Event Ingestion: 10 requests per minute per IP
  • Webhook Endpoints: 1000 requests per minute per IP
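A client can throttle itself against these limits rather than waiting for rejections. A sketch of a sliding-window limiter with an injectable clock (the class and its API are illustrative, not part of any KillIT SDK):

```python
import time

class Throttle:
    """Client-side sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so tests need not really wait
        self.calls = []         # timestamps of recent calls

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 = go now)."""
        now = self.clock()
        self.calls = [t for t in self.calls if now - t < self.window]
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self) -> None:
        """Call after each request actually sent."""
        self.calls.append(self.clock())
```

For example, single-event ingestion would use Throttle(limit=100) and bulk ingestion Throttle(limit=10), matching the table above.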

Best Practices

  1. Use Bulk Ingestion when sending multiple events to reduce API calls
  2. Include CI Information (hostname, IP) to enable automatic correlation
  3. Standardize Severity Levels across different monitoring tools
  4. Use Meaningful Source IDs to prevent duplicate events
  5. Include Timestamps in ISO 8601 format for accurate ordering
  6. Add Relevant Tags to improve searchability and correlation
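Practice 3 (standardized severity levels) is typically a per-tool lookup table. The target names below follow the examples earlier in this guide (critical / major / warning / info); the per-tool mappings themselves are illustrative assumptions:

```python
# Illustrative mapping of tool-native severities onto the levels used
# in this guide's examples. Adjust to your own conventions.
SEVERITY_MAP = {
    "nagios": {"CRITICAL": "critical", "WARNING": "warning", "OK": "info", "UNKNOWN": "info"},
    "zabbix": {"Disaster": "critical", "High": "major", "Average": "warning",
               "Warning": "warning", "Information": "info", "Not classified": "info"},
    "prometheus": {"critical": "critical", "warning": "warning", "info": "info"},
}

def normalize_severity(source: str, raw: str, default: str = "info") -> str:
    """Map a tool-specific severity onto the standard levels, falling back to `default`."""
    return SEVERITY_MAP.get(source, {}).get(raw, default)
```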

Troubleshooting

Events Not Appearing

  1. Check authentication token is valid
  2. Verify endpoint URL is correct
  3. Check rate limits haven't been exceeded
  4. Review event format matches requirements
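Check 4 can be automated with a small validator run before sending. The required-field list below is inferred from the custom event examples in this guide and is an assumption, not a published schema:

```python
from datetime import datetime

# Assumed minimum field set, based on the custom event examples above.
REQUIRED_FIELDS = ("source", "sourceId", "severity", "title")

def validate_event(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event looks sendable."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if not event.get(f)]
    ts = event.get("timestamp")
    if ts is not None:
        try:
            # Accept a trailing 'Z' by rewriting it as an explicit UTC offset.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (ValueError, AttributeError):
            problems.append("timestamp is not ISO 8601")
    return problems
```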

Duplicate Events

  1. Ensure sourceId is unique per event
  2. Check monitoring tool isn't sending duplicates
  3. Review deduplication window settings

Missing CI Correlation

  1. Verify hostname/IP matches CMDB records
  2. Check CI discovery has run recently
  3. Ensure proper tenant/customer context

Next Steps