Configuring Event Sources

KillIT v3 supports multiple monitoring tools and event sources. This guide covers how to configure each supported integration.

Supported Event Sources

KillIT v3 integrates with the following monitoring platforms and tools:

Infrastructure Monitoring

  • Nagios - Full webhook support with state mapping and service/host events
  • Zabbix - Trigger-based alerts with severity mapping
  • Prometheus - Alertmanager webhook with label extraction
  • PRTG - Network monitoring with sensor data
  • Icinga2 - API integration with state changes

Cloud Platforms

  • AWS CloudWatch - Alarms, metrics, and logs integration
  • Azure Monitor - Alerts, metrics, and activity logs
  • Google Cloud Monitoring - Alerting policies and uptime checks

Application Monitoring

  • New Relic - APM alerts and synthetic monitoring
  • Datadog - Monitors and event stream integration
  • AppDynamics - Application and business transaction alerts
  • Dynatrace - Problem detection and AI-powered insights

Log Management

  • Splunk - Alert actions and saved searches
  • Elastic/ELK - Watcher alerts and anomaly detection
  • Syslog - RFC3164 and RFC5424 formats
  • Windows Event Log - Critical system events

Network & Security

  • SNMP - Trap processing v1/v2c/v3
  • NetFlow/sFlow - Anomaly detection
  • Security Tools - SIEM integration

Custom Integrations

  • REST API - Flexible JSON format
  • Webhooks - Custom payload mapping
  • Email - Parse alerts from email
  • Scripts - CLI and SDK support

Integration Architecture

Event Flow

  1. Ingestion - Events received via webhooks, API, or polling
  2. Normalization - Convert to standard KillIT format
  3. Enrichment - Add CI relationships, business context
  4. Correlation - Identify related events and root causes
  5. Storage - Persist in MongoDB with indexing
  6. Analysis - AI-powered insights and recommendations
  7. Actions - Trigger automations and notifications
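The normalization step (step 2) can be thought of as a pure mapping from a tool-specific payload onto the common event shape. A minimal sketch in Python; the function name, state table, and output keys are illustrative assumptions, not the documented internals:

```python
# Hypothetical sketch of normalization: map a raw Nagios-style payload
# onto a common event shape before enrichment and correlation.
SEVERITY_BY_STATE = {
    "CRITICAL": "critical",
    "WARNING": "warning",
    "OK": "clear",
    "UNKNOWN": "info",
}

def normalize_nagios(raw: dict) -> dict:
    """Convert a Nagios-style payload into a normalized event dict."""
    return {
        "source": "nagios",
        "sourceId": f'{raw["hostname"]}/{raw["service"]}',
        "severity": SEVERITY_BY_STATE.get(raw.get("state", "UNKNOWN"), "info"),
        "title": f'{raw["service"]} is {raw["state"]} on {raw["hostname"]}',
        "hostname": raw["hostname"],
        "timestamp": raw["timestamp"],
    }
```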

High Availability

  • Multiple ingestion endpoints for redundancy
  • Queue-based processing for reliability
  • Automatic retry with exponential backoff
  • Dead letter queue for failed events
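The retry and dead-letter behaviour above can be sketched as follows. The delay schedule (1 s base, 60 s cap) and attempt count are illustrative defaults, not the platform's documented values:

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap` seconds."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]

def deliver(send, event, max_attempts: int = 5, dead_letter=None, sleep=lambda s: None):
    """Try `send(event)` with exponential backoff; park failures in `dead_letter`."""
    for delay in [0.0] + backoff_delays(max_attempts - 1):
        sleep(delay)          # injectable so tests need not really sleep
        if send(event):
            return True
    if dead_letter is not None:
        dead_letter.append(event)  # failed events stay available for inspection/replay
    return False
```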

Authentication

All event ingestion endpoints require authentication. You can use either:

  1. JWT Token - For user-based access
  2. API Token - For monitoring system integration

Creating an API Token

curl -X POST https://your-killit-instance/api/auth/create-api-token \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Nagios Integration",
    "permissions": ["event:write"]
  }'

Nagios Integration

Configure Nagios Command

Add this command to your Nagios configuration:

define command {
    command_name    notify-killit
    command_line    /usr/local/bin/notify-killit.sh "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTNAME$" "$SERVICESTATE$" "$SERVICEOUTPUT$" "$LONGSERVICEOUTPUT$" "$TIMET$"
}

Create Notification Script

Create /usr/local/bin/notify-killit.sh and make it executable (chmod +x):

#!/bin/bash

# Quote all expansions: plugin output routinely contains spaces.
# Output containing double quotes will still break the hand-built JSON
# below; build the payload with jq if your plugin outputs can include them.

KILLIT_URL="https://your-killit-instance/api/events/webhook/nagios"
API_TOKEN="your-api-token"

NOTIFICATION_TYPE="$1"
SERVICE_DESC="$2"
HOSTNAME="$3"
SERVICE_STATE="$4"
SERVICE_OUTPUT="$5"
LONG_SERVICE_OUTPUT="$6"
TIMESTAMP="$7"

curl -X POST "$KILLIT_URL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"notification_type\": \"$NOTIFICATION_TYPE\",
    \"service\": \"$SERVICE_DESC\",
    \"hostname\": \"$HOSTNAME\",
    \"state\": \"$SERVICE_STATE\",
    \"output\": \"$SERVICE_OUTPUT\",
    \"long_output\": \"$LONG_SERVICE_OUTPUT\",
    \"timestamp\": $TIMESTAMP
  }"

Configure Service Notifications

The command above uses notification macros ($NOTIFICATIONTYPE$, $SERVICESTATE$, ...), so wire it up as a notification command rather than an event handler. Define a contact that uses it and assign that contact to the service:

define contact {
    contact_name                    killit
    alias                           KillIT Event Management
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-killit
    host_notification_commands      notify-killit
}

define service {
    use                     generic-service
    host_name               production-server
    service_description     CPU Load
    check_command           check_cpu
    contacts                killit
}

Zabbix Integration

Configure Zabbix Media Type

  1. Go to Administration → Media types
  2. Create new media type:
    • Name: KillIT Event Management
    • Type: Webhook
    • Parameters:
      URL: https://your-killit-instance/api/events/webhook/zabbix
      API_TOKEN: your-api-token
      EVENT_ID: {EVENT.ID}
      EVENT_NAME: {EVENT.NAME}
      EVENT_SEVERITY: {EVENT.SEVERITY}
      HOST_NAME: {HOST.NAME}
      TRIGGER_NAME: {TRIGGER.NAME}
      TRIGGER_DESCRIPTION: {TRIGGER.DESCRIPTION}

Webhook Script

// Object names below are for Zabbix 5.0/5.2; Zabbix 5.4+ renamed
// CurlHttpRequest to HttpRequest with addHeader()/post()/getStatus().
try {
    var params = JSON.parse(value);
    var req = new CurlHttpRequest();
    var url = params.URL;

    req.AddHeader('Content-Type: application/json');
    req.AddHeader('Authorization: Bearer ' + params.API_TOKEN);

    var payload = {
        event_id: params.EVENT_ID,
        trigger_name: params.TRIGGER_NAME,
        trigger_description: params.TRIGGER_DESCRIPTION,
        hostname: params.HOST_NAME,
        severity: params.EVENT_SEVERITY,
        timestamp: new Date().toISOString()
    };

    var resp = req.Post(url, JSON.stringify(payload));

    if (req.Status() != 200) {
        throw 'Response code: ' + req.Status();
    }

    return 'OK';
} catch (error) {
    Zabbix.Log(3, 'KillIT webhook error: ' + error);
    throw 'Failed to send event: ' + error;
}

Prometheus Integration

Configure Alertmanager

Edit alertmanager.yml:

global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'killit-webhook'

receivers:
  - name: 'killit-webhook'
    webhook_configs:
      - url: 'https://your-killit-instance/api/events/webhook/prometheus'
        send_resolved: true
        http_config:
          # Alertmanager 0.22+ prefers:  authorization: { credentials: <token> }
          bearer_token: 'your-api-token'

Alert Rules Example

groups:
  - name: example
    rules:
      - alert: HighCPUUsage
        # avg by(instance) keeps the instance label so the annotations below can use it
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
          service: infrastructure
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage is above 80% (current value: {{ $value }}%)"

Custom Event Format

For custom integrations, use the standard event ingestion endpoint:

Single Event

curl -X POST https://your-killit-instance/api/events/ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "sourceId": "CUSTOM-12345",
    "severity": "major",
    "title": "Database Connection Pool Exhausted",
    "description": "The connection pool for MySQL has reached its limit",
    "timestamp": "2024-01-15T10:30:00Z",
    "hostname": "db-prod-01",
    "service": "mysql",
    "application": "order-service",
    "environment": "production",
    "tags": ["database", "connection-pool"],
    "ciName": "MYSQL-PROD-01",
    "parentEventId": "EVENT-12344",
    "details": {
      "pool_size": 100,
      "active_connections": 100,
      "waiting_requests": 25,
      "query_time_avg": 250,
      "slow_queries": 15
    }
  }'

ciName (a direct CI reference) and parentEventId (a link to a correlated parent event) are optional. JSON does not permit inline comments, so omit optional fields entirely rather than commenting them out.

Bulk Events

curl -X POST https://your-killit-instance/api/events/bulk-ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "events": [
      {
        "sourceId": "METRIC-001",
        "severity": "warning",
        "title": "High Memory Usage",
        "timestamp": "2024-01-15T10:30:00Z",
        "hostname": "web-01"
      },
      {
        "sourceId": "METRIC-002",
        "severity": "critical",
        "title": "Disk Space Low",
        "timestamp": "2024-01-15T10:31:00Z",
        "hostname": "web-01"
      }
    ]
  }'
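When forwarding many events, batching them into bulk requests keeps you well under the per-minute limits. A small helper that splits an event list into request bodies; the batch size of 100 is an assumption for illustration, not a documented maximum:

```python
import json

def bulk_payloads(events: list, source: str = "custom", batch_size: int = 100) -> list[str]:
    """Split events into JSON request bodies for the bulk-ingest endpoint."""
    return [
        json.dumps({"source": source, "events": events[i:i + batch_size]})
        for i in range(0, len(events), batch_size)
    ]
```

Each returned string can then be POSTed to /api/events/bulk-ingest as one request.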

CloudWatch Integration

Using AWS Lambda

Create a Lambda function to forward CloudWatch alarms:

import json
import os

import requests  # not in the Lambda base runtime; package it with the function or use a layer

def lambda_handler(event, context):
    killit_url = os.environ['KILLIT_URL']
    api_token = os.environ['KILLIT_API_TOKEN']

    # Parse SNS message
    sns_message = json.loads(event['Records'][0]['Sns']['Message'])

    # Convert to KillIT format
    killit_event = {
        'source': 'cloudwatch',
        'sourceId': sns_message.get('AlarmName', 'unknown'),
        'severity': 'critical' if sns_message.get('NewStateValue') == 'ALARM' else 'info',
        'title': sns_message.get('AlarmName'),
        'description': sns_message.get('AlarmDescription', ''),
        'timestamp': sns_message.get('StateChangeTime'),
        'details': {
            'metric': sns_message.get('MetricName'),
            'namespace': sns_message.get('Namespace'),
            'reason': sns_message.get('NewStateReason')
        }
    }

    # Send to KillIT
    response = requests.post(
        f"{killit_url}/api/events/webhook/cloudwatch",
        headers={
            'Authorization': f'Bearer {api_token}',
            'Content-Type': 'application/json'
        },
        json=killit_event
    )

    return {
        'statusCode': response.status_code,
        'body': json.dumps('Event sent to KillIT')
    }

Rate Limits

To protect the system, the following rate limits apply:

  • Single Event Ingestion: 100 requests per minute per IP
  • Bulk Event Ingestion: 10 requests per minute per IP
  • Webhook Endpoints: 1000 requests per minute per IP
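A client can throttle itself against these limits rather than waiting for rejections. A sketch of a sliding-window limiter with an injectable clock (the class and its API are illustrative, not part of any KillIT SDK):

```python
import time

class Throttle:
    """Client-side sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so tests need not really wait
        self.calls = []         # timestamps of recent calls

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 = go now)."""
        now = self.clock()
        self.calls = [t for t in self.calls if now - t < self.window]
        if len(self.calls) < self.limit:
            return 0.0
        return self.window - (now - self.calls[0])

    def record(self) -> None:
        """Call after each request actually sent."""
        self.calls.append(self.clock())
```

For example, single-event ingestion would use Throttle(limit=100) and bulk ingestion Throttle(limit=10), matching the table above.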

Best Practices

  1. Use Bulk Ingestion when sending multiple events to reduce API calls
  2. Include CI Information (hostname, IP) to enable automatic correlation
  3. Standardize Severity Levels across different monitoring tools
  4. Use Meaningful Source IDs to prevent duplicate events
  5. Include Timestamps in ISO 8601 format for accurate ordering
  6. Add Relevant Tags to improve searchability and correlation
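Practice 3 (standardized severity levels) is typically a per-tool lookup table. The target names below follow the examples earlier in this guide (critical / major / warning / info); the per-tool mappings themselves are illustrative assumptions:

```python
# Illustrative mapping of tool-native severities onto the levels used
# in this guide's examples. Adjust to your own conventions.
SEVERITY_MAP = {
    "nagios": {"CRITICAL": "critical", "WARNING": "warning", "OK": "info", "UNKNOWN": "info"},
    "zabbix": {"Disaster": "critical", "High": "major", "Average": "warning",
               "Warning": "warning", "Information": "info", "Not classified": "info"},
    "prometheus": {"critical": "critical", "warning": "warning", "info": "info"},
}

def normalize_severity(source: str, raw: str, default: str = "info") -> str:
    """Map a tool-specific severity onto the standard levels, falling back to `default`."""
    return SEVERITY_MAP.get(source, {}).get(raw, default)
```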

Troubleshooting

Events Not Appearing

  1. Check authentication token is valid
  2. Verify endpoint URL is correct
  3. Check rate limits haven't been exceeded
  4. Review event format matches requirements
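Check 4 can be automated with a small validator run before sending. The required-field list below is inferred from the custom event examples in this guide and is an assumption, not a published schema:

```python
from datetime import datetime

# Assumed minimum field set, based on the custom event examples above.
REQUIRED_FIELDS = ("source", "sourceId", "severity", "title")

def validate_event(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event looks sendable."""
    problems = [f"missing required field: {f}" for f in REQUIRED_FIELDS if not event.get(f)]
    ts = event.get("timestamp")
    if ts is not None:
        try:
            # Accept a trailing 'Z' by rewriting it as an explicit UTC offset.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except (ValueError, AttributeError):
            problems.append("timestamp is not ISO 8601")
    return problems
```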

Duplicate Events

  1. Ensure sourceId is unique per event
  2. Check monitoring tool isn't sending duplicates
  3. Review deduplication window settings

Missing CI Correlation

  1. Verify hostname/IP matches CMDB records
  2. Check CI discovery has run recently
  3. Ensure proper tenant/customer context

Next Steps