Configuring Event Sources
KillIT v3 supports multiple monitoring tools and event sources. This guide covers how to configure each supported integration.
Supported Event Sources
KillIT v3 integrates with the following monitoring platforms and tools:
Infrastructure Monitoring
- Nagios - Full webhook support with state mapping and service/host events
- Zabbix - Trigger-based alerts with severity mapping
- Prometheus - Alertmanager webhook with label extraction
- PRTG - Network monitoring with sensor data
- Icinga2 - API integration with state changes
Cloud Platforms
- AWS CloudWatch - Alarms, metrics, and logs integration
- Azure Monitor - Alerts, metrics, and activity logs
- Google Cloud Monitoring - Alerting policies and uptime checks
Application Monitoring
- New Relic - APM alerts and synthetic monitoring
- Datadog - Monitors and event stream integration
- AppDynamics - Application and business transaction alerts
- Dynatrace - Problem detection and AI-powered insights
Log Management
- Splunk - Alert actions and saved searches
- Elastic/ELK - Watcher alerts and anomaly detection
- Syslog - RFC3164 and RFC5424 formats
- Windows Event Log - Critical system events
Network & Security
- SNMP - Trap processing v1/v2c/v3
- NetFlow/sFlow - Anomaly detection
- Security Tools - SIEM integration
Custom Integrations
- REST API - Flexible JSON format
- Webhooks - Custom payload mapping
- Email - Parse alerts from email
- Scripts - CLI and SDK support
Integration Architecture
Event Flow
- Ingestion - Events received via webhooks, API, or polling
- Normalization - Convert to standard KillIT format
- Enrichment - Add CI relationships, business context
- Correlation - Identify related events and root causes
- Storage - Persist in MongoDB with indexing
- Analysis - AI-powered insights and recommendations
- Actions - Trigger automations and notifications
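The normalization step above can be sketched as a small mapping function. This is illustrative only: the field names follow the ingestion examples later in this guide, and the severity table is an assumption, not the actual KillIT normalizer.

```python
from datetime import datetime, timezone

# Illustrative severity table; KillIT's real mapping may differ.
SEVERITY_MAP = {"CRITICAL": "critical", "WARNING": "warning", "OK": "info"}

def normalize_event(raw: dict, source: str) -> dict:
    """Convert a source-specific payload into the standard event shape."""
    return {
        "source": source,
        "sourceId": raw.get("id", "unknown"),
        "severity": SEVERITY_MAP.get(str(raw.get("state", "")).upper(), "info"),
        "title": raw.get("summary", ""),
        "hostname": raw.get("host", ""),
        # fall back to "now" when the source omits a timestamp
        "timestamp": raw.get("time") or datetime.now(timezone.utc).isoformat(),
    }

event = normalize_event({"id": "N-1", "state": "CRITICAL",
                         "summary": "CPU high", "host": "web-01"}, "nagios")
```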
High Availability
- Multiple ingestion endpoints for redundancy
- Queue-based processing for reliability
- Automatic retry with exponential backoff
- Dead letter queue for failed events
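The retry and dead-letter behavior above can be sketched as follows; the function name, attempt count, and delays are assumptions for illustration, not KillIT's actual settings.

```python
import random
import time

def deliver_with_retry(send, event, max_attempts=5, base_delay=0.5,
                       dead_letter=None):
    """Retry delivery with exponential backoff; park failures in a DLQ."""
    for attempt in range(max_attempts):
        try:
            return send(event)
        except Exception:
            if attempt == max_attempts - 1:
                # exhausted: move the event to the dead letter queue
                if dead_letter is not None:
                    dead_letter.append(event)
                return None
            # exponential backoff with jitter: 0.5s, 1s, 2s, ...
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```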
Authentication
All event ingestion endpoints require authentication. You can use either:
- JWT Token - For user-based access
- API Token - For monitoring system integration
Creating an API Token
curl -X POST https://your-killit-instance/api/auth/create-api-token \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Nagios Integration",
    "permissions": ["event:write"]
  }'
Nagios Integration
Configure Nagios Command
Add this command to your Nagios configuration:
define command {
    command_name    notify-killit
    command_line    /usr/local/bin/notify-killit.sh "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTNAME$" "$SERVICESTATE$" "$SERVICEOUTPUT$" "$LONGSERVICEOUTPUT$" "$TIMET$"
}
Create Notification Script
Create /usr/local/bin/notify-killit.sh and make it executable (chmod +x /usr/local/bin/notify-killit.sh):
#!/bin/bash
# Forward a Nagios notification to KillIT.
KILLIT_URL="https://your-killit-instance/api/events/webhook/nagios"
API_TOKEN="your-api-token"

NOTIFICATION_TYPE="$1"
SERVICE_DESC="$2"
HOSTNAME="$3"
SERVICE_STATE="$4"
SERVICE_OUTPUT="$5"
LONG_SERVICE_OUTPUT="$6"
TIMESTAMP="$7"

# Note: plugin output containing double quotes or backslashes will break
# this hand-built JSON payload.
curl -X POST "$KILLIT_URL" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  -d "{
    \"notification_type\": \"$NOTIFICATION_TYPE\",
    \"service\": \"$SERVICE_DESC\",
    \"hostname\": \"$HOSTNAME\",
    \"state\": \"$SERVICE_STATE\",
    \"output\": \"$SERVICE_OUTPUT\",
    \"long_output\": \"$LONG_SERVICE_OUTPUT\",
    \"timestamp\": $TIMESTAMP
  }"
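Plugin output often contains quotes that break hand-built JSON in shell. A safer variant builds the payload in Python, where `json.dumps` handles all escaping. This is a sketch: the URL and field names mirror the shell script above, and the script itself is hypothetical.

```python
#!/usr/bin/env python3
# Safer alternative to notify-killit.sh: json.dumps handles quoting.
import json
import sys
import urllib.request

KILLIT_URL = "https://your-killit-instance/api/events/webhook/nagios"
API_TOKEN = "your-api-token"

def build_payload(argv):
    """Map the seven positional Nagios macros to the webhook fields."""
    keys = ["notification_type", "service", "hostname", "state",
            "output", "long_output", "timestamp"]
    payload = dict(zip(keys, argv))
    payload["timestamp"] = int(payload["timestamp"])  # $TIMET$ is an epoch
    return payload

def send(payload):
    req = urllib.request.Request(
        KILLIT_URL,
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {API_TOKEN}",
                 "Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

if len(sys.argv) >= 8:
    send(build_payload(sys.argv[1:8]))
```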
Configure Service Notification
define service {
    use                      generic-service
    host_name                production-server
    service_description      CPU Load
    check_command            check_cpu
    event_handler            notify-killit
    event_handler_enabled    1
}
Zabbix Integration
Configure Zabbix Media Type
- Go to Administration → Media types
- Create new media type:
- Name: KillIT Event Management
- Type: Webhook
- Parameters:
URL: https://your-killit-instance/api/events/webhook/zabbix
API_TOKEN: your-api-token
EVENT_ID: {EVENT.ID}
EVENT_NAME: {EVENT.NAME}
EVENT_SEVERITY: {EVENT.SEVERITY}
HOST_NAME: {HOST.NAME}
TRIGGER_NAME: {TRIGGER.NAME}
TRIGGER_DESCRIPTION: {TRIGGER.DESCRIPTION}
Webhook Script
// Pre-5.4 Zabbix JavaScript API; on Zabbix 5.4+ use HttpRequest with
// addHeader/post/getStatus instead of CurlHttpRequest.
try {
    var params = JSON.parse(value);
    var req = new CurlHttpRequest();
    var url = params.URL;

    req.AddHeader('Content-Type: application/json');
    req.AddHeader('Authorization: Bearer ' + params.API_TOKEN);

    var payload = {
        event_id: params.EVENT_ID,
        trigger_name: params.TRIGGER_NAME,
        trigger_description: params.TRIGGER_DESCRIPTION,
        hostname: params.HOST_NAME,
        severity: params.EVENT_SEVERITY,
        timestamp: new Date().toISOString()
    };

    var resp = req.Post(url, JSON.stringify(payload));
    if (req.Status() != 200) {
        throw 'Response code: ' + req.Status();
    }
    return 'OK';
} catch (error) {
    Zabbix.Log(3, 'KillIT webhook error: ' + error);
    throw 'Failed to send event: ' + error;
}
Prometheus Integration
Configure Alertmanager
Edit alertmanager.yml:
global:
  resolve_timeout: 5m

route:
  group_by: ['alertname', 'cluster', 'service']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 12h
  receiver: 'killit-webhook'

receivers:
  - name: 'killit-webhook'
    webhook_configs:
      - url: 'https://your-killit-instance/api/events/webhook/prometheus'
        http_config:
          # On Alertmanager >= 0.22, prefer:
          #   authorization:
          #     credentials: 'your-api-token'
          bearer_token: 'your-api-token'
        send_resolved: true
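Alertmanager posts a JSON body containing an `alerts` array; on the receiving side, each alert can be flattened into the event format used in this guide. A sketch, with assumed label names and severity handling:

```python
def alertmanager_to_events(body: dict) -> list:
    """Flatten an Alertmanager webhook body into one event per alert."""
    events = []
    for alert in body.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        events.append({
            "source": "prometheus",
            "sourceId": labels.get("alertname", "unknown"),
            # resolved notifications arrive with status == "resolved"
            "severity": "info" if alert.get("status") == "resolved"
                        else labels.get("severity", "warning"),
            "title": annotations.get("summary",
                                     labels.get("alertname", "")),
            "description": annotations.get("description", ""),
            "hostname": labels.get("instance", ""),
            "timestamp": alert.get("startsAt", ""),
        })
    return events
```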
Alert Rules Example
groups:
- name: example
rules:
- alert: HighCPUUsage
expr: 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
for: 5m
labels:
severity: warning
service: infrastructure
annotations:
summary: "High CPU usage detected on {{ $labels.instance }}"
description: "CPU usage is above 80% (current value: {{ $value }}%)"
Custom Event Format
For custom integrations, use the standard event ingestion endpoint:
Single Event
curl -X POST https://your-killit-instance/api/events/ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "sourceId": "CUSTOM-12345",
    "severity": "major",
    "title": "Database Connection Pool Exhausted",
    "description": "The connection pool for MySQL has reached its limit",
    "timestamp": "2024-01-15T10:30:00Z",
    "hostname": "db-prod-01",
    "service": "mysql",
    "application": "order-service",
    "environment": "production",
    "tags": ["database", "connection-pool"],
    "ciName": "MYSQL-PROD-01",
    "parentEventId": "EVENT-12344",
    "details": {
      "pool_size": 100,
      "active_connections": 100,
      "waiting_requests": 25,
      "query_time_avg": 250,
      "slow_queries": 15
    }
  }'
ciName (a direct CI reference) and parentEventId (for linking correlated events) are optional.
Bulk Events
curl -X POST https://your-killit-instance/api/events/bulk-ingest \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "custom",
    "events": [
      {
        "sourceId": "METRIC-001",
        "severity": "warning",
        "title": "High Memory Usage",
        "timestamp": "2024-01-15T10:30:00Z",
        "hostname": "web-01"
      },
      {
        "sourceId": "METRIC-002",
        "severity": "critical",
        "title": "Disk Space Low",
        "timestamp": "2024-01-15T10:31:00Z",
        "hostname": "web-01"
      }
    ]
  }'
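To use the bulk endpoint from code, a client can buffer events and flush them in batches. This is a sketch: the class name, batch size, and flush policy are assumptions, not part of the KillIT SDK.

```python
import json
import urllib.request

class BulkSender:
    """Buffer events and flush them to the bulk-ingest endpoint in batches."""

    def __init__(self, url, token, source="custom", batch_size=50):
        self.url, self.token, self.source = url, token, source
        self.batch_size = batch_size
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        body = {"source": self.source, "events": self.buffer}
        req = urllib.request.Request(
            self.url,
            data=json.dumps(body).encode(),
            headers={"Authorization": f"Bearer {self.token}",
                     "Content-Type": "application/json"})
        urllib.request.urlopen(req)
        self.buffer = []
```

Call `flush()` once more before shutdown so a partially filled batch is not lost.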
CloudWatch Integration
Using AWS Lambda
Create a Lambda function to forward CloudWatch alarms:
import json
import os

import requests  # not in the Lambda runtime by default; bundle it or use a layer

def lambda_handler(event, context):
    killit_url = os.environ['KILLIT_URL']
    api_token = os.environ['KILLIT_API_TOKEN']

    # Parse SNS message
    sns_message = json.loads(event['Records'][0]['Sns']['Message'])

    # Convert to KillIT format
    killit_event = {
        'source': 'cloudwatch',
        'sourceId': sns_message.get('AlarmName', 'unknown'),
        'severity': 'critical' if sns_message.get('NewStateValue') == 'ALARM' else 'info',
        'title': sns_message.get('AlarmName'),
        'description': sns_message.get('AlarmDescription', ''),
        'timestamp': sns_message.get('StateChangeTime'),
        'details': {
            'metric': sns_message.get('MetricName'),
            'namespace': sns_message.get('Namespace'),
            'reason': sns_message.get('NewStateReason')
        }
    }

    # Send to KillIT
    response = requests.post(
        f"{killit_url}/api/events/webhook/cloudwatch",
        headers={
            'Authorization': f'Bearer {api_token}',
            'Content-Type': 'application/json'
        },
        json=killit_event
    )

    return {
        'statusCode': response.status_code,
        'body': json.dumps('Event sent to KillIT')
    }
Rate Limits
To protect the system, the following rate limits apply:
- Single Event Ingestion: 100 requests per minute per IP
- Bulk Event Ingestion: 10 requests per minute per IP
- Webhook Endpoints: 1000 requests per minute per IP
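Clients that may exceed these limits should treat an HTTP 429 response as a signal to back off. A sketch (honoring the standard Retry-After header; whether KillIT sends that header is an assumption):

```python
import time

def post_with_rate_limit(post, payload, max_attempts=3):
    """Retry on HTTP 429, honoring Retry-After when present.

    `post` is any callable returning an object with .status_code and .headers.
    """
    for attempt in range(max_attempts):
        resp = post(payload)
        if resp.status_code != 429:
            return resp
        # fall back to exponential delay when Retry-After is absent
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    return resp  # still rate-limited after all attempts
```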
Best Practices
- Use Bulk Ingestion when sending multiple events to reduce API calls
- Include CI Information (hostname, IP) to enable automatic correlation
- Standardize Severity Levels across different monitoring tools
- Use Meaningful Source IDs to prevent duplicate events
- Include Timestamps in ISO 8601 format for accurate ordering
- Add Relevant Tags to improve searchability and correlation
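Standardizing severity levels can be as simple as a per-source lookup table applied before ingestion. The mappings below are illustrative (Zabbix numeric severities, Nagios state names), not KillIT defaults:

```python
# Illustrative per-source mappings to the levels used in this guide
# (critical / major / warning / info).
SEVERITY_MAPS = {
    "nagios": {"CRITICAL": "critical", "WARNING": "warning",
               "UNKNOWN": "warning", "OK": "info"},
    "zabbix": {"5": "critical", "4": "major", "3": "warning",
               "2": "warning", "1": "info", "0": "info"},
}

def map_severity(source: str, raw: str, default: str = "info") -> str:
    """Translate a source-native severity into a standardized level."""
    return SEVERITY_MAPS.get(source, {}).get(str(raw).upper(), default)
```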
Troubleshooting
Events Not Appearing
- Check authentication token is valid
- Verify endpoint URL is correct
- Check rate limits haven't been exceeded
- Review event format matches requirements
Duplicate Events
- Ensure sourceId is unique per event
- Check monitoring tool isn't sending duplicates
- Review deduplication window settings
Missing CI Correlation
- Verify hostname/IP matches CMDB records
- Check CI discovery has run recently
- Ensure proper tenant/customer context