
Discovery Scheduling

Effective discovery scheduling ensures your CMDB stays current while minimizing impact on network and system resources. NopeSight provides flexible scheduling options that adapt to your infrastructure's needs and operational windows.

Scheduling Architecture

Scheduling Engine

Schedule Types

Fixed Schedules

Daily Discovery:
  schedule: "0 2 * * *"    # 2 AM daily
  targets: all_infrastructure
  type: incremental
  max_duration: 4h

Weekly Full Scan:
  schedule: "0 6 * * 0"    # Sunday 6 AM
  targets: all_infrastructure
  type: full
  max_duration: 12h

Hourly Critical:
  schedule: "0 * * * *"    # Every hour
  targets:
    - tag: critical
    - tag: production
  type: incremental
  max_duration: 45m
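As an illustration of how a scheduler turns a fixed cron expression into concrete run times, here is a minimal Python sketch. It handles only the `*`, `*/n`, and literal-value fields used in the examples above; `next_run` and `field_matches` are hypothetical helpers for illustration, not NopeSight API, and a production parser supports the full cron grammar (ranges, lists, names).

```python
from datetime import datetime, timedelta

def field_matches(field: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', or a literal number."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)

def next_run(expression: str, after: datetime) -> datetime:
    """Return the first minute after `after` matching a 5-field cron expression."""
    minute, hour, dom, month, dow = expression.split()
    t = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # scan at most a year, minute by minute
        cron_dow = (t.weekday() + 1) % 7  # cron convention: 0 = Sunday
        if (field_matches(minute, t.minute) and field_matches(hour, t.hour)
                and field_matches(dom, t.day) and field_matches(month, t.month)
                and field_matches(dow, cron_dow)):
            return t
        t += timedelta(minutes=1)
    raise ValueError(f"no run time found for {expression!r}")
```

For example, starting from 03:00 on Monday 2024-01-01, `"0 2 * * *"` next fires at 02:00 the following day, and `"0 6 * * 0"` next fires the following Sunday at 06:00.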

Dynamic Schedules

Event-Driven:
  triggers:
    - new_device_detected
    - configuration_change
    - incident_created
    - deployment_completed

  response:
    delay: 5m
    type: targeted
    scope: affected_items

Change-Based:
  monitor:
    - deployment_pipeline
    - change_calendar
    - maintenance_windows

  action:
    pre_change: baseline_scan
    post_change: verification_scan
    delay: 30m
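The event-driven pattern above amounts to debouncing: a burst of triggers against the same scope should produce one targeted scan after the configured delay, not one scan per event. A minimal sketch of that coalescing logic, assuming `EventDrivenScheduler` as a hypothetical illustration rather than product API:

```python
from datetime import datetime, timedelta

class EventDrivenScheduler:
    """Coalesce trigger events into delayed, targeted scans (one per scope)."""

    def __init__(self, delay: timedelta = timedelta(minutes=5)):
        self.delay = delay
        self.pending = {}  # scope -> scheduled run time

    def on_event(self, trigger: str, scope: str, now: datetime) -> datetime:
        """Record an event; repeated events for a scope reuse one pending scan."""
        run_at = now + self.delay
        # Keep the earliest pending run so a burst doesn't push the scan out
        if scope not in self.pending or run_at < self.pending[scope]:
            self.pending[scope] = run_at
        return self.pending[scope]

    def due(self, now: datetime) -> list[str]:
        """Pop every scope whose delayed scan has come due."""
        ready = [s for s, t in self.pending.items() if t <= now]
        for s in ready:
            del self.pending[s]
        return ready
```

With a 5-minute delay, a `new_device_detected` event followed two minutes later by a `configuration_change` for the same subnet yields a single scan five minutes after the first event.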

Schedule Configuration

Basic Scheduling

# Schedule configuration example
schedules:
  production_servers:
    name: "Production Server Discovery"
    description: "Critical production infrastructure"
    enabled: true

    timing:
      frequency: every_4_hours
      start_time: "00:00"
      timezone: "America/New_York"
      blackout_windows:
        - start: "08:00"
          end: "09:00"
          days: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
          reason: "Peak business hours"

    targets:
      include:
        - ip_range: "10.1.0.0/16"
        - tags: ["production", "critical"]
      exclude:
        - ip: "10.1.1.1"  # Router
        - tag: "maintenance"

    discovery:
      type: incremental
      methods: ["agent", "wmi", "ssh"]
      timeout: 300
      parallel_jobs: 10
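The blackout-window check reduces to a simple predicate over the configured windows. A sketch, assuming `in_blackout` as a hypothetical helper and windows that never cross midnight:

```python
from datetime import datetime, time

def in_blackout(when: datetime, windows: list[dict]) -> bool:
    """True if `when` falls inside any configured blackout window."""
    for w in windows:
        days = w.get("days")
        if days and when.strftime("%A") not in days:
            continue  # window does not apply on this weekday
        start = time.fromisoformat(w["start"])
        end = time.fromisoformat(w["end"])
        if start <= when.time() < end:
            return True
    return False
```

Using the configuration above: 08:30 on a Monday is blocked, the same time on a Saturday is not, and 09:00 exactly is allowed because the window end is exclusive.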

Advanced Scheduling

Intelligent Scheduling

smart_schedule:
  name: "Adaptive Infrastructure Discovery"

  rules:
    business_criticality:
      critical:
        frequency: real_time
        method: agent_only
      high:
        frequency: every_2_hours
        method: agent_preferred
      medium:
        frequency: every_12_hours
        method: agentless_ok
      low:
        frequency: daily
        method: any

    device_type:
      database_servers:
        frequency: every_hour
        preferred_window: "02:00-05:00"
      web_servers:
        frequency: every_4_hours
        avoid_window: "09:00-17:00"
      workstations:
        frequency: on_login
        max_daily: 2

    change_frequency:
      high_change:        # > 10 changes/week
        frequency: every_2_hours
      moderate_change:    # 3-10 changes/week
        frequency: every_6_hours
      stable:             # < 3 changes/week
        frequency: daily
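When several rule groups match one device, a natural resolution is to let the most aggressive (shortest-interval) match win. The sketch below illustrates that idea; the interval mapping and the `daily` fallback are assumptions for the example, not documented NopeSight behavior:

```python
# Interval, in minutes, implied by each frequency keyword (assumed mapping)
INTERVALS = {
    "real_time": 0, "every_hour": 60, "every_2_hours": 120,
    "every_4_hours": 240, "every_6_hours": 360, "every_12_hours": 720,
    "daily": 1440,
}

RULES = {
    "business_criticality": {"critical": "real_time", "high": "every_2_hours",
                             "medium": "every_12_hours", "low": "daily"},
    "change_frequency": {"high_change": "every_2_hours",
                         "moderate_change": "every_6_hours", "stable": "daily"},
}

def effective_frequency(device: dict) -> str:
    """Pick the shortest interval among all rule groups matching the device."""
    matched = [group[device[attr]]
               for attr, group in RULES.items() if device.get(attr) in group]
    if not matched:
        return "daily"  # assumed fallback when no rule applies
    return min(matched, key=lambda f: INTERVALS[f])
```

A medium-criticality server that changes often is still scanned every two hours, because the change-frequency rule is the stricter of the two.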

Resource-Aware Scheduling

resource_limits:
  global:
    max_concurrent_discoveries: 50
    max_network_bandwidth: "100Mbps"
    max_cpu_usage: 70
    max_memory_usage: "8GB"

  per_target:
    max_connections: 5
    max_bandwidth: "10Mbps"
    backoff_on_error: exponential
    retry_limit: 3

  adaptive_throttling:
    high_load_threshold: 80
    reduce_concurrency_by: 50
    increase_interval_by: 100

priority_queues:
  critical:
    reserved_slots: 20
    max_wait: "5m"
  high:
    reserved_slots: 15
    max_wait: "15m"
  normal:
    reserved_slots: 10
    max_wait: "1h"
  low:
    reserved_slots: 5
    max_wait: "4h"
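The `adaptive_throttling` settings above translate naturally into a pure function: when load crosses the threshold, cut concurrency by the configured percentage and stretch the interval. `apply_throttling` is an illustrative sketch, not product code:

```python
def apply_throttling(concurrency: int, interval_minutes: int, load_pct: float,
                     high_load_threshold: int = 80,
                     reduce_concurrency_by: int = 50,
                     increase_interval_by: int = 100) -> tuple[int, int]:
    """Scale back discovery when host/network load crosses the threshold."""
    if load_pct <= high_load_threshold:
        return concurrency, interval_minutes  # under threshold: no change
    throttled = max(1, concurrency * (100 - reduce_concurrency_by) // 100)
    stretched = interval_minutes * (100 + increase_interval_by) // 100
    return throttled, stretched
```

With the defaults above, 50 parallel jobs on a 60-minute interval become 25 jobs on a 120-minute interval once load exceeds 80%, and concurrency never drops below one worker.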

Scheduling Strategies

Infrastructure-Based Scheduling

strategies:
  geographic_distribution:
    regions:
      us_east:
        window: "02:00-06:00 EST"
        stagger: 15m
      us_west:
        window: "02:00-06:00 PST"
        stagger: 15m
      europe:
        window: "02:00-06:00 CET"
        stagger: 20m
      asia:
        window: "02:00-06:00 JST"
        stagger: 20m

  network_topology:
    core:
      frequency: every_30_minutes
      priority: critical
    distribution:
      frequency: every_2_hours
      priority: high
    access:
      frequency: every_6_hours
      priority: normal
    edge:
      frequency: daily
      priority: low

  service_dependencies:
    tier_1_services:
      discover_first: true
      frequency: continuous
    tier_2_services:
      after: tier_1_services
      frequency: every_hour
    tier_3_services:
      after: tier_2_services
      frequency: every_4_hours
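Staggering targets inside a regional window is simple arithmetic on the window's opening time; `staggered_starts` below is a hypothetical helper for illustration:

```python
from datetime import datetime, timedelta

def staggered_starts(window_open: datetime, stagger: timedelta,
                     targets: list[str]) -> dict[str, datetime]:
    """Offset each target's start inside the regional window by `stagger`."""
    return {name: window_open + i * stagger for i, name in enumerate(targets)}
```

With a 15-minute stagger and a window opening at 02:00, the first target starts at 02:00, the second at 02:15, the third at 02:30, and so on; it remains the operator's job to confirm the last start plus expected duration still fits the window.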

Business-Aligned Scheduling

business_alignment:
  maintenance_windows:
    source: change_management_system
    respect_blackouts: true
    pre_maintenance_scan: "-1h"
    post_maintenance_scan: "+30m"

  business_cycles:
    end_of_month:
      dates: [28, 29, 30, 31]
      reduce_discovery_by: 75%
      priority_only: true

    quarter_end:
      months: [3, 6, 9, 12]
      dates: [25, 26, 27, 28, 29, 30, 31]
      minimal_discovery: true
      defer_non_critical: true

    year_end:
      dates: ["12/24-12/31", "01/01-01/02"]
      emergency_only: true
      manual_approval: required

  sla_driven:
    platinum_sla:
      discovery_interval: 15m
      availability_requirement: 99.99%
    gold_sla:
      discovery_interval: 1h
      availability_requirement: 99.9%
    silver_sla:
      discovery_interval: 4h
      availability_requirement: 99%
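SLA-driven scheduling ultimately reduces to a freshness check per CI: has it been discovered within the interval its tier allows? A sketch, with the tier-to-interval mapping taken from the configuration above and `is_stale` as a hypothetical helper:

```python
from datetime import datetime, timedelta

# Tier-to-interval mapping from the sla_driven configuration above
SLA_INTERVALS = {
    "platinum": timedelta(minutes=15),
    "gold": timedelta(hours=1),
    "silver": timedelta(hours=4),
}

def is_stale(ci: dict, now: datetime) -> bool:
    """True if a CI's last discovery is older than its SLA tier allows."""
    return now - ci["last_discovered"] > SLA_INTERVALS[ci["sla"]]
```

A platinum CI last discovered 30 minutes ago is already stale; a gold CI with the same age is still within its SLA.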

Schedule Management

Web UI Management

Schedule Dashboard:
  Views:
    - Calendar view: Visual schedule timeline
    - List view: Tabular schedule details
    - Gantt chart: Resource utilization
    - Heat map: Discovery density

  Actions:
    - Create/Edit schedules
    - Enable/Disable schedules
    - Run now option
    - Skip next run
    - View history
    - Clone schedule

  Monitoring:
    - Next run times
    - Currently running
    - Success/failure rates
    - Average duration
    - Resource usage

API Management

# Schedule management via API
import os
import requests

api_token = os.environ["NOPESIGHT_API_TOKEN"]
headers = {"Authorization": f"Bearer {api_token}"}
base_url = "https://nopesight.company.com/api"

# Create a new schedule
schedule = {
    "name": "Database Server Discovery",
    "description": "Discover all database servers",
    "enabled": True,
    "schedule": {
        "type": "cron",
        "expression": "0 */4 * * *",  # Every 4 hours
        "timezone": "UTC"
    },
    "targets": {
        "tags": ["database", "production"],
        "discovery_type": "full"
    },
    "options": {
        "timeout": 600,
        "parallel_jobs": 5,
        "retry_failed": True
    }
}

response = requests.post(f"{base_url}/schedules", json=schedule, headers=headers)
response.raise_for_status()
schedule_id = response.json()["id"]

# Trigger immediate discovery
requests.post(f"{base_url}/schedules/{schedule_id}/run", headers=headers)

# Get schedule statistics for the last 7 days
stats = requests.get(
    f"{base_url}/schedules/{schedule_id}/stats",
    params={"period": "7d"},
    headers=headers,
).json()

print(f"Success rate: {stats['success_rate']}%")
print(f"Average duration: {stats['avg_duration_minutes']} minutes")

CLI Management

# NopeSight CLI schedule management

# List all schedules
nopesight schedule list --format table

# Create schedule from file
nopesight schedule create --file production_schedule.yaml

# Update schedule
nopesight schedule update db_servers \
  --frequency "every 2 hours" \
  --window "22:00-06:00"

# Disable schedule temporarily
nopesight schedule disable web_servers \
  --reason "Maintenance" \
  --until "2024-01-20"

# View schedule history
nopesight schedule history db_servers \
  --last 10 \
  --include-details

# Run schedule immediately
nopesight schedule run production_servers \
  --wait --timeout 30m

Schedule Optimization

Performance Analysis

metrics:
  discovery_performance:
    - completion_time
    - success_rate
    - resource_usage
    - queue_depth
    - wait_time

optimization_recommendations:
  overlap_detection:
    finding: "Schedules A and B overlap by 45%"
    recommendation: "Stagger by 2 hours"
    impact: "Reduce resource contention by 40%"

  underutilized_windows:
    finding: "02:00-04:00 window only 20% utilized"
    recommendation: "Move low-priority discoveries here"
    impact: "Better resource distribution"

  long_running_jobs:
    finding: "Full scan takes 6+ hours"
    recommendation: "Split into regional schedules"
    impact: "Reduce completion time by 60%"
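The overlap finding above can be computed directly from two schedules' run windows. `overlap_fraction` is an illustrative helper that works in minutes since midnight and reports overlap relative to the shorter window:

```python
def overlap_fraction(a: tuple[int, int], b: tuple[int, int]) -> float:
    """Fraction of the shorter window that overlaps the other.

    Windows are (start, end) in minutes since midnight, end exclusive.
    """
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return overlap / shorter if shorter else 0.0
```

Two four-hour windows at 02:00-06:00 and 04:00-08:00 overlap by half of each; staggering one of them removes the contention.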

Adaptive Scheduling

// Adaptive scheduling algorithm
const adaptiveScheduler = {
  analyze: function (historicalData) {
    return {
      peak_usage_times: this.findPeakTimes(historicalData),
      optimal_windows: this.findOptimalWindows(historicalData),
      bottlenecks: this.identifyBottlenecks(historicalData),
      recommendations: this.generateRecommendations(historicalData)
    };
  },

  adjust: function (schedule, analysis) {
    if (analysis.bottlenecks.network) {
      schedule.parallel_jobs *= 0.8;
      schedule.bandwidth_limit = "50Mbps";
    }

    if (analysis.peak_usage_times.includes(schedule.start_time)) {
      schedule.start_time = analysis.optimal_windows[0];
    }

    return schedule;
  },

  learn: function (executionResults) {
    // Machine learning feedback loop
    this.updateModel({
      schedule: executionResults.schedule,
      performance: executionResults.metrics,
      success: executionResults.success_rate > 95
    });
  }
};

Monitoring & Alerting

Schedule Monitoring

monitoring:
  dashboards:
    schedule_overview:
      widgets:
        - upcoming_schedules
        - currently_running
        - recent_failures
        - resource_utilization
        - sla_compliance

    performance_metrics:
      widgets:
        - completion_times_trend
        - success_rate_gauge
        - discovery_coverage_map
        - queue_depth_chart
        - bottleneck_analysis

  kpis:
    - discovery_coverage: "> 95%"
    - success_rate: "> 98%"
    - avg_completion_time: "< 30m"
    - resource_utilization: "60-80%"
    - schedule_adherence: "> 95%"
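Evaluating KPIs against thresholds written as strings takes a small parser. `kpi_ok` below is a sketch that assumes only the three shapes used above ("> 95%", "< 30m", "60-80%"):

```python
def kpi_ok(value: float, threshold: str) -> bool:
    """Evaluate a KPI against '> N%', '< Nm', or 'LOW-HIGH%' thresholds."""
    t = threshold.rstrip("%m ").strip()  # drop the unit suffix
    if threshold.startswith(">"):
        return value > float(t.lstrip("> "))
    if threshold.startswith("<"):
        return value < float(t.lstrip("< "))
    low, high = (float(x) for x in t.split("-"))  # range form, e.g. 60-80%
    return low <= value <= high
```

A 98.5% success rate passes "> 98%", a 25-minute average passes "< 30m", and 85% utilization fails the "60-80%" band (note that over-utilization and under-utilization both violate that KPI).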

Alert Configuration

alerts:
  schedule_failures:
    condition: "failed_count > 2"
    severity: high
    notification:
      - email: ops-team@company.com
      - slack: "#infrastructure-alerts"
    auto_action:
      - retry_with_backoff
      - create_incident

  long_running:
    condition: "duration > expected_duration * 2"
    severity: medium
    notification:
      - email: discovery-admin@company.com
    auto_action:
      - check_resource_usage
      - throttle_if_needed

  missed_schedule:
    condition: "missed_run_count > 0"
    severity: high
    notification:
      - sms: on-call
      - email: ops-team@company.com
    auto_action:
      - run_immediately
      - investigate_cause

Best Practices

1. Schedule Design

  • ✅ Align with business hours
  • ✅ Consider geographic distribution
  • ✅ Respect maintenance windows
  • ✅ Plan for growth

2. Resource Management

  • ✅ Monitor resource usage
  • ✅ Implement throttling
  • ✅ Use priority queues
  • ✅ Balance load distribution

3. Reliability

  • ✅ Build in redundancy
  • ✅ Handle failures gracefully
  • ✅ Implement retry logic
  • ✅ Monitor success rates

4. Optimization

  • ✅ Regular performance reviews
  • ✅ Adjust based on metrics
  • ✅ Eliminate redundancy
  • ✅ Continuous improvement

Troubleshooting

Common Issues

Schedules Not Running

Diagnostic Steps:
1. Check schedule status (enabled?)
2. Verify schedule expression
3. Check blackout windows
4. Review system resources
5. Examine scheduler logs

Common Causes:
- Disabled schedule
- Invalid cron expression
- Blackout window active
- Resource limits reached
- Scheduler service down

Performance Degradation

Symptoms:
- Increasing completion times
- High resource usage
- Queue buildup
- Timeout errors

Solutions:
- Reduce parallel jobs
- Increase intervals
- Optimize discovery scope
- Add more workers
- Implement caching

Schedule Analysis

-- Analyze schedule performance over the last 30 days
SELECT
    schedule_name,
    AVG(duration_minutes) AS avg_duration,
    COUNT(*) AS total_runs,
    SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END) AS successful,
    SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END) AS failed,
    ROUND(100.0 * SUM(CASE WHEN status = 'success' THEN 1 ELSE 0 END)
          / COUNT(*), 2) AS success_rate
FROM discovery_runs
WHERE run_date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY schedule_name
ORDER BY success_rate ASC, avg_duration DESC;

Advanced Topics

Multi-Site Scheduling

multi_site:
  coordination:
    mode: distributed
    sites:
      - name: datacenter_east
        timezone: "America/New_York"
        bandwidth_to_central: "1Gbps"
      - name: datacenter_west
        timezone: "America/Los_Angeles"
        bandwidth_to_central: "1Gbps"
      - name: europe_dc
        timezone: "Europe/London"
        bandwidth_to_central: "500Mbps"

  strategy:
    - local_discovery_first
    - aggregate_to_central
    - deduplicate_results
    - sync_on_completion
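The `deduplicate_results` step could be implemented by keeping the newest record per CI key when sites report the same device. The sketch below assumes serial number is that key; a real deployment would substitute the CMDB's own identification and reconciliation rules:

```python
def deduplicate(results: list[dict]) -> list[dict]:
    """Merge per-site discovery results, keeping the newest record per CI key."""
    best = {}
    for record in results:
        key = record["serial_number"]  # assumed unique hardware identifier
        # ISO-8601 timestamps compare correctly as strings
        if key not in best or record["discovered_at"] > best[key]["discovered_at"]:
            best[key] = record
    return list(best.values())
```

If `datacenter_east` and `europe_dc` both report serial `SN1`, only the more recently discovered record survives aggregation.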

Predictive Scheduling

# ML-based schedule optimization (illustrative)
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Load historical data
history = pd.read_csv('discovery_history.csv')

# Features: time_of_day, day_of_week, target_count, discovery_type
# Target: completion_time
features = ['time_of_day', 'day_of_week', 'target_count', 'discovery_type']
X = pd.get_dummies(history[features], columns=['discovery_type'])

model = RandomForestRegressor()
model.fit(X, history['completion_time'])

# Predict the duration of a proposed schedule (same encoding as training)
new_schedule = pd.DataFrame([{
    'time_of_day': 2,        # 02:00, inside the preferred 00:00-06:00 window
    'day_of_week': 6,
    'target_count': 500,
    'discovery_type': 'full',
}])
new_schedule = pd.get_dummies(new_schedule, columns=['discovery_type'])
new_schedule = new_schedule.reindex(columns=X.columns, fill_value=0)

predicted_duration = model.predict(new_schedule)[0]
optimal_start = find_optimal_slot(predicted_duration, '00:00-06:00')  # site-specific helper, not shown
Next Steps