Self Test Training – CCIE RS 400-101: Step-by-Step Troubleshooting Drills
Passing the CCIE Routing & Switching written exam 400-101 (a track since succeeded by CCIE Enterprise Infrastructure) demands not only deep theoretical knowledge but also the ability to diagnose and resolve complex network problems quickly and methodically. This article presents a self-test training approach built on step-by-step troubleshooting drills that develop the diagnostic mindset, practical skills, and speed required for the 400-101 exam and for real-world network operations.
Why troubleshooting drills matter
Troubleshooting is the bridge between knowing protocols and applying them under pressure. The written exam evaluates conceptual mastery and scenario-based problem solving; the lab or practical assessments evaluate hands-on configuration and fault isolation. Structured troubleshooting drills improve:
- Analytical thinking — breaking down complex failures into manageable hypotheses.
- Systematic process — ensuring you don’t miss basic checks while chasing advanced issues.
- Tool familiarity — efficient use of show commands, debugs, and packet captures.
- Time management — prioritizing high-impact fixes under exam or operational time constraints.
Core troubleshooting methodology (step-by-step)
Adopt this repeatable troubleshooting framework for every drill:
- Define the problem
  - Gather symptoms, error messages, affected services, and scope (single host, site-wide).
  - Ask: What changed recently? Is the issue reproducible?
- Establish the baseline
  - Verify design intent and expected behavior (routing, addressing, policy).
  - Check device uptime, configuration consistency, and interface states.
- Form hypotheses
  - List potential causes ordered by likelihood and impact.
- Test hypotheses
  - Use non-invasive checks first: show commands, reachability tests (ping/traceroute).
  - Escalate to packet captures and debugs only when necessary.
- Isolate and fix
  - Implement the least disruptive correction that addresses the root cause.
  - If unsure, apply temporary mitigations to restore service while continuing diagnosis.
- Verify full restoration
  - Confirm end-to-end functionality, performance, and policy compliance.
- Document and learn
  - Record the issue, diagnosis steps, root cause, fix, and preventive actions.
Essential tools and commands to master
Practice these frequently used commands and techniques until they become reflexive:
- Routing protocol states and adjacency: show ip route, show ip bgp summary, show ip bgp neighbors, show ip ospf neighbor, show isis neighbors (show isis adjacency on IOS XR and NX-OS)
- Interface and connectivity: show interfaces, show ip interface brief, ping, traceroute
- Policy and filtering: show access-lists, show route-map, show ip prefix-list, show ip bgp community
- Switching and STP: show spanning-tree, show mac address-table, show vlan brief
- Troubleshooting aids: debug ip packet (use with caution), packet captures (tcpdump, Wireshark), extended ping/traceroute options
- Virtualization/overlay: show vrf, show nve interface, show nve peers, show bgp l2vpn evpn
- Performance counters: show process cpu, show memory, show controllers
Drill categories and examples
Below are focused drill types. For each, attempt to solve within a fixed time budget (e.g., 20–40 minutes), then document steps and lessons.
- Layer 1 & 2 connectivity
  - Symptom: Intermittent host connectivity on VLAN 10.
  - Drill goal: Identify link flaps, duplex mismatches, STP topology changes, or incorrect VLAN assignment.
  - Key checks: show interfaces status, show spanning-tree, show mac address-table, show etherchannel summary.
- IP addressing & subnet issues
  - Symptom: Host cannot reach default gateway.
  - Drill goal: Find misconfigured IP, wrong mask, duplicate IP, or DHCP failure.
  - Key checks: show ip interface brief, show ip arp, show ip dhcp binding, ping gateway from switch/router.
- Routing protocol convergence
  - Symptom: Routes missing or not converging after topology change.
  - Drill goal: Isolate neighbor adjacency failures, authentication mismatches, or route filtering.
  - Key checks: show ip ospf neighbor, debug ip ospf adj (with care), show ip bgp summary, show ip route.
- Route selection and policy
  - Symptom: Traffic takes suboptimal path or preferred route not used.
  - Drill goal: Verify route-maps, attribute manipulation (local preference, MED, AS-path), and distribute-list errors.
  - Key checks: show ip bgp neighbors, show route-map, show ip prefix-list, show ip bgp.
- Inter-VRF/VRF-lite and MPLS
  - Symptom: Route leaking between VRFs fails or L3VPN traffic black-holes.
  - Drill goal: Check RD and route-target import/export, MP-BGP sessions, LDP/RSVP labels, and redistribution into and out of the VRF.
  - Key checks: show vrf detail, show ip bgp vpnv4 all neighbors, show mpls ldp neighbor, show mpls forwarding-table.
- WAN and BGP scale/stability
  - Symptom: Flapping BGP sessions, route churn, or large number of routes causing instability.
  - Drill goal: Identify TTL or MTU problems, route dampening, neighbor misconfiguration, or resource exhaustion.
  - Key checks: show ip bgp summary, show ip bgp flap-statistics, show ip bgp neighbors x advertised-routes, show processes cpu, show processes memory.
- Security and ACL-related problems
  - Symptom: Legitimate traffic blocked unexpectedly.
  - Drill goal: Find ACL order mistakes, implicit denies, or unintended interface ACL application.
  - Key checks: show access-lists, show ip interface, packet captures to confirm matches.
- Overlay/SD-WAN/EVPN/VXLAN issues
  - Symptom: Virtual networks (VNs) not communicating, or MAC/IP routes missing in EVPN.
  - Drill goal: Validate BGP EVPN route types, VTEP peering, VNI mappings, and flood/learn behavior.
  - Key checks: show bgp l2vpn evpn summary, show nve vni, show nve peers, ARP and ND tables.
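For the ACL drill, it helps to see why entry order and the implicit deny matter. The sketch below is a simplified model, not router behavior in full: it matches on source address only (like a standard ACL), and the `evaluate_acl` function and the tuple format are assumptions made for illustration.

```python
import ipaddress


def evaluate_acl(entries, source_ip):
    """Return the action of the first matching entry; fall through to the implicit deny."""
    addr = ipaddress.ip_address(source_ip)
    for action, network in entries:
        if addr in ipaddress.ip_network(network):
            return action  # first match wins, later entries are never evaluated
    return "deny"  # implicit deny at the end of every ACL


# Broad deny placed before the specific permit: the permit is unreachable.
broken = [("deny", "10.0.0.0/8"), ("permit", "10.1.1.0/24")]
# Specific permit first, broad deny second.
fixed = [("permit", "10.1.1.0/24"), ("deny", "10.0.0.0/8")]

print(evaluate_acl(broken, "10.1.1.5"))  # deny
print(evaluate_acl(fixed, "10.1.1.5"))   # permit
print(evaluate_acl(fixed, "192.0.2.1"))  # deny (no entry matches)
```

On a real device, the per-entry match counters in show access-lists serve the same purpose: they show which entry the traffic actually hit.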
Sample step-by-step drill (detailed walkthrough)
Scenario: Remote site cannot reach Internet. BGP is used between site and provider.
- Gather facts: Which hosts/networks are affected? Is outage total or partial? When did it start?
- Baseline: Confirm local LAN is healthy (ping default gateway, check ARP, VLANs).
- Check default route: show ip route 0.0.0.0 0.0.0.0. Is there a route, and is the next hop valid?
- Check BGP: show ip bgp summary. Is the neighbor up? Look at the State/PfxRcd column for the number of prefixes received.
- If neighbor is down:
  - show ip bgp neighbors x: inspect transport, TCP status, timers, and remote-as.
  - Verify TCP connectivity: test reachability to the neighbor IP from the source IP used for the session.
  - Check BGP authentication, TTL security, and eBGP multihop settings.
- If neighbor is up but no routes:
  - Verify what is received and what is advertised: show ip bgp neighbors x routes and show ip bgp neighbors x advertised-routes.
  - Check inbound route filters: show ip prefix-list, route-map applied to neighbor.
- Packet capture: capture TCP handshake for BGP on local router to see resets or SYN issues.
- Fix: correct ACLs, route-maps, or interface addressing; if the provider is misconfigured, coordinate with its NOC.
- Verify: confirm Internet access from affected hosts and check traceroute to known Internet IPs.
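The branching in this walkthrough can be written down as a small decision function, which is a useful way to check that no branch is missing. This is a sketch under stated assumptions: `next_step` and its four boolean or integer inputs are hypothetical names for observations you would collect by hand from the commands above, and the last branch is an addition for the case the walkthrough leaves open.

```python
def next_step(lan_ok, default_route_present, neighbor_up, prefixes_received):
    """Map observations from the walkthrough to the next check to run."""
    if not lan_ok:
        return "Fix the local LAN first: gateway reachability, ARP, VLANs."
    if default_route_present:
        return "Validate the default route's next hop, then traceroute to a known Internet IP."
    if not neighbor_up:
        return "Neighbor down: check TCP reachability, remote-as, authentication, and TTL settings."
    if prefixes_received == 0:
        return "Neighbor up, no routes: check inbound prefix-lists and route-maps, then involve the provider NOC."
    # Not covered explicitly in the walkthrough; added here as an assumption.
    return "Prefixes received but no default route: check whether a default is sent and whether a filter drops it."


print(next_step(lan_ok=True, default_route_present=False,
                neighbor_up=False, prefixes_received=0))
```

Writing your own version of this function after each drill is a quick test of whether you followed a process or just guessed.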
Constructing your own lab drills
- Use virtual labs: VIRL/CML, GNS3, EVE-NG, or cloud devices. Mirror real topologies: dual-homed sites, MPLS core, EVPN overlays.
- Seed faults intentionally: misconfigured masks, ACLs, wrong AS numbers, STP root changes, BGP community filtering.
- Time-box drills and keep a checklist template to record symptoms, commands used, findings, and fixes.
- Rotate between red-team (you introduce faults) and blue-team (you troubleshoot faults introduced by another person) exercises.
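Fault seeding and the checklist template can be combined in a few lines when you practice alone. The following is one possible sketch: `FAULTS`, `DrillRecord`, and `new_drill` are hypothetical names, and the fault list simply reuses the examples given above.

```python
import random
from dataclasses import dataclass, field

# Fault types taken from the list above; extend with your own.
FAULTS = [
    "misconfigured subnet mask",
    "ACL blocking legitimate traffic",
    "wrong BGP AS number",
    "STP root change",
    "BGP community filtering",
]


@dataclass
class DrillRecord:
    fault: str
    time_budget_min: int
    symptoms: list = field(default_factory=list)
    commands_used: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    fix: str = ""


def new_drill(seed=None, time_budget_min=30):
    """Pick a fault at random; pass a seed to make the choice repeatable."""
    rng = random.Random(seed)
    return DrillRecord(fault=rng.choice(FAULTS), time_budget_min=time_budget_min)


drill = new_drill(seed=7)
drill.symptoms.append("Host cannot reach default gateway")
drill.commands_used.append("show ip interface brief")
print(drill)
```

When a partner seeds the fault instead, leave `fault` blank until the drill ends and fill it in during the review.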
Metrics to measure progress
Track these metrics to quantify improvement:
- Mean time to resolve (MTTR) per drill.
- Success rate within time-box.
- Number of correct root-cause identifications vs. false positives.
- Command/technique fluency (how many useful commands executed per drill).
- Knowledge gaps discovered (document and study).
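The first three metrics are easy to compute from a simple drill log. The sketch below assumes a hypothetical record format (`minutes`, `budget`, `resolved`, `root_cause_correct`) and sample values invented purely for illustration; MTTR is averaged over resolved drills only.

```python
from statistics import mean

# Illustrative sample data, not real results.
drills = [
    {"minutes": 25, "budget": 30, "resolved": True, "root_cause_correct": True},
    {"minutes": 40, "budget": 30, "resolved": True, "root_cause_correct": False},
    {"minutes": 30, "budget": 30, "resolved": False, "root_cause_correct": False},
    {"minutes": 18, "budget": 20, "resolved": True, "root_cause_correct": True},
]


def summarize(records):
    """Compute MTTR, in-budget success rate, and root-cause accuracy."""
    resolved = [r for r in records if r["resolved"]]
    in_time = [r for r in resolved if r["minutes"] <= r["budget"]]
    return {
        "mttr_minutes": mean(r["minutes"] for r in resolved) if resolved else None,
        "success_rate": len(in_time) / len(records),
        "root_cause_accuracy": sum(r["root_cause_correct"] for r in records) / len(records),
    }


summary = summarize(drills)
print(summary)
```

Comparing these numbers week over week matters more than their absolute values, since time budgets and fault difficulty vary between drills.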
Common pitfalls and how to avoid them
- Rushing to config changes without verifying: always confirm hypothesis before wide changes.
- Overusing debugs in production-like environments: prefer captures and non-invasive checks first.
- Ignoring basics: power, cables, interface states, and VLAN assignment account for a large share of failures, so check them before anything exotic.
- Not documenting: you’ll repeat the same mistakes if you don’t capture lessons learned.
Recommended study routine (4–8 week plan)
- Weeks 1–2: Focus drills on L1–L3 basics and routing protocols. 4–6 drills/day, short time-box.
- Weeks 3–4: Move to BGP, MPLS, and EVPN scenarios. Include policy/filtering exercises.
- Weeks 5–6: Complex multi-technology scenarios and mixed-fault drills. Emphasize speed.
- Weeks 7–8: Full-scope timed simulations that mimic exam pressure and combine mixed failures.
Final notes
Consistent, structured troubleshooting practice builds both confidence and competence. Use realistic topologies, introduce varied faults, and measure progress. Over time your diagnostic process will become systematic and fast — the key to passing CCIE RS 400-101 and excelling in production networks.