Operations & Maintenance
Chapter 12 — Ongoing Security Operations, Maintenance Procedures, and Lifecycle Management
Router security is not a one-time deployment activity; it requires continuous operational attention to remain effective against evolving threats. This chapter defines the ongoing operations and maintenance procedures required to sustain the security posture established during deployment. It covers daily monitoring tasks, periodic security reviews, patch management, key rotation procedures, configuration change management, and end-of-life planning. The NOC team represents the operational model: continuous monitoring, proactive maintenance, and rapid incident response.
12.1 Security Operations Model
Effective router security operations require a clearly defined operational model with assigned responsibilities, documented procedures, and measurable performance indicators. The NOC team is responsible for day-to-day monitoring and first-level incident response; the network engineering team handles configuration changes and security reviews; the security team owns policy, compliance, and vulnerability management.
12.2 Daily Operations Checklist
The daily operations checklist defines the minimum set of monitoring and verification tasks that must run at least once per day; most are polled automatically at much shorter intervals. These tasks are designed to detect security incidents and configuration drift as early as possible, minimizing the window of exposure. All checklist items should be automated where possible, with alerts sent to the owning team listed in the table when any item crosses its alert threshold.
| Task | Frequency | Method | Alert Threshold | Owner |
|---|---|---|---|---|
| BGP Session Status Check | Every 5 minutes (automated) | SNMPv3 polling; BGP MIB; SNMP trap on state change | Any session down for >5 minutes | NOC (automated alert) |
| CPU Utilization Check | Every 5 minutes (automated) | SNMPv3 polling; CPU utilization OID | CPU >70% for 5 consecutive minutes | NOC (automated alert) |
| CoPP Drop Rate Review | Daily (manual review) | SNMPv3 polling; CoPP statistics; trend analysis | Drop rate increase >50% vs. 7-day average | NOC (daily review) |
| Authentication Failure Review | Daily (manual review) | SIEM query for authentication failure events | More than 5 failures from same source in 1 hour | Security team |
| Configuration Change Detection | Every 4 hours (automated) | Config management system diff against baseline | Any unauthorized change detected | Security team (automated alert) |
| NTP Synchronization Check | Every 15 minutes (automated) | SNMPv3 polling; NTP MIB | Stratum >3 or unsynchronized | NOC (automated alert) |
| Interface Error Rate Check | Every 5 minutes (automated) | SNMPv3 polling; ifInErrors, ifOutErrors OIDs | Error rate >0.01% of total packets on the interface | NOC (automated alert) |
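Two of the thresholds in the table involve a calculation rather than a single reading: the CoPP drop-rate trend and the per-source authentication failure count. The following is a minimal sketch of how those two checks might be evaluated, assuming the drop counts and failure events have already been exported from the polling system and the SIEM. The function names, argument shapes, and the handling of a zero baseline are illustrative assumptions, not part of the documented procedure.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def copp_drop_alert(today_drops, previous_7_days, max_increase=0.50):
    """Return True when today's drops exceed the 7-day average by more than 50%."""
    baseline = mean(previous_7_days)
    if baseline == 0:
        # Assumption: any drops against a zero baseline are worth a review.
        return today_drops > 0
    return (today_drops - baseline) / baseline > max_increase


def auth_failure_sources(events, max_failures=5, window=timedelta(hours=1)):
    """Return the sources with more than max_failures failures in any 1-hour window.

    events is an iterable of (timestamp, source) pairs.
    """
    by_source = defaultdict(list)
    for timestamp, source in events:
        by_source[source].append(timestamp)

    flagged = set()
    for source, stamps in by_source.items():
        stamps.sort()
        start = 0
        for end, stamp in enumerate(stamps):
            # Slide the window start forward until it is within one hour of this event.
            while stamp - stamps[start] > window:
                start += 1
            if end - start + 1 > max_failures:
                flagged.add(source)
                break
    return flagged
```

A sliding window is used for the authentication check so that a burst straddling a clock-hour boundary is still counted as one burst.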
12.3 Periodic Security Review Schedule
Periodic security reviews ensure that the router's security configuration remains aligned with current threats, organizational policies, and vendor recommendations. The review schedule below defines the minimum review frequency for each security domain. Reviews should be performed by a qualified engineer who was not involved in the original deployment, to provide an independent perspective.
| Review Type | Frequency | Scope | Output | Responsible Party |
|---|---|---|---|---|
| Configuration Compliance Review | Monthly | Full configuration diff against approved baseline; verify all hardening controls present | Compliance report; list of deviations with remediation plan | Network Engineering |
| BGP Peer Review | Quarterly | Review all BGP peers; verify authentication; verify max-prefix limits; remove stale peers | BGP peer inventory; list of changes required | Network Engineering |
| ACL Review | Quarterly | Review all ACLs for stale rules, overly permissive rules, and alignment with current network design | ACL review report; list of rules to remove or tighten | Security Team |
| User Account Review | Quarterly | Review all local accounts; verify AAA server accounts; remove accounts for departed staff | Account inventory; list of accounts to remove or modify | Security Team |
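| Vulnerability Assessment | Semi-annually (formal assessment; advisories that meet the Section 12.4 emergency or critical criteria are triaged as they are published) | Review vendor security advisories; assess applicability of published CVEs; plan patching | Vulnerability assessment report; patching plan with timelines | Security Team |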
| Full Security Audit | Annually | Comprehensive review of all security controls; penetration testing of management plane; policy alignment review | Full audit report; risk register update; remediation roadmap | Security Team + External Auditor |
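The Configuration Compliance Review depends on a line-by-line comparison of the running configuration against the approved baseline. As a rough sketch of that comparison, the example below uses Python's standard `difflib` module to list baseline lines that are missing and running-config lines that are unexpected. The function name and the "missing"/"unexpected" labels are hypothetical, and a production tool would also need to account for platform-specific ordering and generated lines such as timestamps.

```python
import difflib


def config_deviations(baseline, running):
    """Return (kind, line) pairs describing how running differs from baseline."""
    base_lines = baseline.splitlines()
    run_lines = running.splitlines()
    matcher = difflib.SequenceMatcher(a=base_lines, b=run_lines, autojunk=False)

    deviations = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # Lines present in the baseline but absent from the running config.
            deviations += [("missing", line) for line in base_lines[i1:i2]]
        if tag in ("insert", "replace"):
            # Lines present in the running config but not in the baseline.
            deviations += [("unexpected", line) for line in run_lines[j1:j2]]
    return deviations
```

An empty result indicates that the two configurations match line for line; any entries would feed the list of deviations in the compliance report.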
12.4 Patch Management Procedures
Router software patches address security vulnerabilities, stability issues, and feature defects. A structured patch management process ensures that critical security patches are applied promptly while minimizing the risk of service disruption. The patch management process must be documented, tested in a lab environment before production deployment, and executed during approved maintenance windows with a tested rollback plan.
| Patch Category | Definition | Target Deployment Timeline | Testing Required | Approval Required |
|---|---|---|---|---|
| Emergency Patch | Any of: actively exploited vulnerability; CVSS score ≥9.0; vendor emergency advisory | Within 72 hours of advisory | Abbreviated lab test (4 hours minimum) | Security Director (emergency approval) |
| Critical Patch | CVSS score 7.0–8.9; no known active exploitation | Within 30 days of advisory | Full lab test (24 hours minimum) | Security Team + Network Engineering |
| High Patch | CVSS score 4.0–6.9; limited exploitation potential | Within 90 days of advisory | Full lab test (48 hours minimum) | Network Engineering |
| Routine Maintenance | CVSS score <4.0; feature updates; stability improvements | Next scheduled maintenance window | Full lab test (72 hours minimum) | Network Engineering |
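The patch categories above can be expressed as a small decision function, which may help when advisories are triaged by a script. The sketch below assumes that any one of the three emergency criteria is sufficient to classify a patch as an Emergency Patch; that reading, along with the `PatchPolicy` type and the function names, is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class PatchPolicy:
    category: str
    deadline: Optional[timedelta]  # None means "next scheduled maintenance window"
    min_lab_test_hours: int


def classify_patch(cvss, actively_exploited=False, vendor_emergency=False):
    """Map an advisory to a patch category from the table above."""
    # Assumption: any single emergency criterion triggers the emergency path.
    if actively_exploited or vendor_emergency or cvss >= 9.0:
        return PatchPolicy("Emergency Patch", timedelta(hours=72), 4)
    if cvss >= 7.0:
        return PatchPolicy("Critical Patch", timedelta(days=30), 24)
    if cvss >= 4.0:
        return PatchPolicy("High Patch", timedelta(days=90), 48)
    return PatchPolicy("Routine Maintenance", None, 72)


def deployment_deadline(advisory_time, policy):
    """Return the latest deployment time, or None for maintenance-window patches."""
    if policy.deadline is None:
        return None
    return advisory_time + policy.deadline
```

For example, `classify_patch(7.5, actively_exploited=True)` falls into the emergency path even though its score alone would place it in the Critical Patch row.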
12.5 Configuration Change Management
All changes to router security configurations must follow a formal change management process. Unauthorized changes are one of the most common causes of security incidents and outages. The change management process ensures that every change is reviewed, tested, approved, and documented before implementation, and that a rollback plan is available if the change causes unexpected issues.
| Change Type | Examples | Required Approvals | Testing Requirement | Rollback Plan |
|---|---|---|---|---|
| Standard Change | Adding a BGP peer, modifying an ACL entry, updating a prefix list | Network Engineering lead | Lab test or peer review | Documented rollback commands; 30-minute observation period |
| Major Change | Upgrading router software, modifying CoPP policy, changing AAA configuration | Network Engineering + Security Team | Full lab test; change advisory board review | Tested rollback procedure; 2-hour observation period; NOC on standby |
| Emergency Change | Blocking an active attack, applying emergency patch, isolating a compromised device | On-call Security Lead (verbal approval acceptable; written within 24 hours) | Abbreviated review; implement immediately if active threat | Document changes made; review and normalize within 48 hours |
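A change management tool could enforce the approval column before a change is scheduled. The sketch below shows one possible way to encode the required approvals and the follow-up deadlines for emergency changes. The role identifiers and function names are hypothetical, and the sketch assumes that both the 24-hour and 48-hour clocks start when verbal approval is given, which the table does not state explicitly.

```python
from datetime import datetime, timedelta

# Approver roles per change type, taken from the table above.
# The identifiers themselves are illustrative.
REQUIRED_APPROVALS = {
    "standard": {"network_engineering_lead"},
    "major": {"network_engineering", "security_team"},
    "emergency": {"on_call_security_lead"},
}


def missing_approvals(change_type, approvals):
    """Return the set of required roles that have not yet approved the change."""
    return REQUIRED_APPROVALS[change_type] - set(approvals)


def emergency_followup_deadlines(approved_at):
    """Return follow-up deadlines for an emergency change.

    Assumption: both deadlines are measured from the time of verbal approval.
    """
    return {
        "written_approval_due": approved_at + timedelta(hours=24),
        "normalization_due": approved_at + timedelta(hours=48),
    }
```

A change would be allowed to proceed only when `missing_approvals` returns an empty set for its type.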
12.6 End-of-Life & Decommissioning
Router hardware and software have defined end-of-life (EoL) dates after which the vendor no longer provides security patches or technical support. Operating routers beyond their EoL date is a significant security risk because newly disclosed vulnerabilities will remain unpatched. The decommissioning process must ensure that sensitive configuration data (including authentication keys, community strings, and routing policies) is securely erased before the device is disposed of or returned.
| EoL Milestone | Definition | Required Action | Timeline |
|---|---|---|---|
| End of Sale | Vendor stops selling the product; no new orders accepted | Begin planning for replacement; do not purchase additional units | Replacement plan within 6 months of the End of Sale announcement |
| End of Software Maintenance | Vendor stops releasing software updates and security patches | Accelerate replacement timeline; implement compensating controls | Replacement deployed within 12 months of the End of Software Maintenance date |
| End of Support | Vendor stops providing any technical support | Device must be decommissioned; no exceptions for internet-facing devices | Decommissioned before the End of Support date |
| Secure Decommissioning | Process of securely removing a device from service | Erase all configuration (for example, write erase followed by reload on Cisco IOS-style platforms; use the vendor's equivalent procedure elsewhere); physically destroy storage media if required; update CMDB; remove from monitoring | Complete within 30 days of service removal |
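The timelines in the table can be turned into concrete calendar dates once the vendor publishes its milestone dates. The following sketch computes those dates using only the Python standard library. The function names and dictionary keys are hypothetical, and month arithmetic is assumed to clamp to the last day of the target month (for example, six months after 31 August becomes the last day of February).

```python
import calendar
from datetime import date, timedelta


def add_months(start, months):
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def eol_deadlines(end_of_sale_announced, end_of_sw_maintenance, end_of_support):
    """Derive planning deadlines from the vendor's published milestone dates."""
    return {
        "replacement_plan_due": add_months(end_of_sale_announced, 6),
        "replacement_deployed_by": add_months(end_of_sw_maintenance, 12),
        "decommission_before": end_of_support,
    }


def decommissioning_due(service_removed):
    """Secure decommissioning must finish within 30 days of service removal."""
    return service_removed + timedelta(days=30)
```

The resulting dates could be recorded alongside each device in the CMDB so that upcoming deadlines are visible during periodic reviews.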