
Part 13: Debugging & Profiling System Surgery

"Performance is not about being fast. It's about not being slow."

Stop using std::cout for debugging. Master Sanitizers, Valgrind, GDB, and Linux Perf.

Performance Profile: Slow Code Analysis (@[/perf-profile])

The Crime Scene

cpp
// ❌ SLOW CODE — But where's the bottleneck?
void ProcessPackets(const std::vector<Packet>& packets) {
    for (const auto& packet : packets) {
        // String allocation on every packet?
        std::string log_msg = "Processing packet: " + packet.id_;
        Logger::Log(log_msg);
        
        // Is this a copy?
        auto data = packet.GetPayload();
        
        // Linear search in inner loop?
        for (const auto& rule : firewall_rules_) {
            if (rule.Matches(packet)) {
                ApplyRule(rule, data);
            }
        }
    }
}

Symptom: HPN Tunnel handles only 10K packets/s instead of 100K.

Question: How do you find out which lines of code are slow without reading the entire codebase?
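Before looking for the slow line, it helps to turn the symptom into a number you can reproduce. The following is a minimal sketch, assuming a hypothetical helper named `MeasureThroughput` and a caller-supplied workload (neither comes from the HPN Tunnel code). It times a loop with `std::chrono::steady_clock` and reports items per second.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>

// Hypothetical helper (not part of the HPN Tunnel code): calls `work(i)`
// once for each of `item_count` items and returns the observed items/second.
double MeasureThroughput(std::size_t item_count,
                         const std::function<void(std::size_t)>& work) {
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = Clock::now();
    for (std::size_t i = 0; i < item_count; ++i) {
        work(i);
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    // Guard against a zero-length interval on very fast workloads.
    const double seconds = elapsed.count() > 0.0 ? elapsed.count() : 1e-9;
    return static_cast<double>(item_count) / seconds;
}
```

A number like this only says how slow the loop is, not where the time goes. Its value is as a baseline: rerun it after every change so that an improvement is measured rather than assumed.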

FlameGraph Reveals the Truth

┌─────────────────────────────────────────────────────────────────────────┐
│                    FLAMEGRAPH - CPU HOTSPOTS                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓  100%  │
│   │                    ProcessPackets()                     │          │
│   ├────────────────────┬───────────────────┬────────────────┤          │
│   │ std::string::op+   │ Packet::GetPayload│  Rule::Matches │          │
│   │    ▓▓▓▓▓▓▓▓▓▓▓     │    ▓▓▓▓▓▓▓▓       │   ▓▓▓▓▓▓▓▓▓▓   │          │
│   │      (35%)         │      (25%)        │     (30%)      │          │
│   ├────────────────────┼───────────────────┼────────────────┤          │
│   │    malloc()        │   memcpy()        │  strcmp()      │          │
│   │      (20%)         │    (15%)          │    (25%)       │          │
│   └────────────────────┴───────────────────┴────────────────┘          │
│                                                                         │
│   🔥 INSIGHT:                                                           │
│   • 35% CPU wasted on string concatenation (malloc!)                    │
│   • 25% CPU on unnecessary data copy                                    │
│   • 30% CPU on linear search O(n)                                       │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
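The `malloc()` frame under the string concatenation can be cross-checked without a profiler by counting heap allocations directly. The sketch below is a diagnostic aid only, under stated assumptions: it replaces the global `operator new` for the whole program, and the names `g_alloc_count`, `CountAllocsConcat`, and `CountAllocsReuse` are hypothetical. It compares building a fresh `std::string` per ID against appending into one pre-reserved buffer.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Counts every call to the replaced global operator new.
static std::atomic<std::size_t> g_alloc_count{0};
// Keeps the string contents observable so the work is not optimized away.
static volatile std::size_t g_sink = 0;

void* operator new(std::size_t size) {
    g_alloc_count.fetch_add(1, std::memory_order_relaxed);
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Builds a new message string per ID, as the slow loop does.
// Returns how many heap allocations happened inside the loop.
std::size_t CountAllocsConcat(const std::vector<std::string>& ids) {
    const std::size_t before = g_alloc_count.load();
    std::size_t checksum = 0;
    for (const auto& id : ids) {
        std::string msg = "Processing packet: " + id;  // fresh buffer each time
        checksum += msg.size() + static_cast<unsigned char>(msg.back());
    }
    g_sink = checksum;
    return g_alloc_count.load() - before;
}

// Appends into one buffer whose capacity was reserved before the loop.
std::size_t CountAllocsReuse(const std::vector<std::string>& ids) {
    std::string buffer;
    buffer.reserve(256);  // single up-front allocation, outside the count
    const std::size_t before = g_alloc_count.load();
    std::size_t checksum = 0;
    for (const auto& id : ids) {
        buffer.clear();  // in practice keeps the reserved capacity
        buffer.append("Processing packet: ");
        buffer.append(id);
        checksum += buffer.size() + static_cast<unsigned char>(buffer.back());
    }
    g_sink = checksum;
    return g_alloc_count.load() - before;
}
```

With messages longer than the small-string buffer, the concatenating version is expected to allocate at least once per ID. The reusing version should allocate nothing inside the loop as long as each message fits in the reserved 256 bytes.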

The Fix (Data-Driven Optimization)

cpp
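// ✅ OPTIMIZED: based on FlameGraph evidence
void ProcessPackets(const std::vector<Packet>& packets) {
    // FIX 1: Reusable per-thread buffer (no malloc per packet in the hot path)
    static thread_local std::string log_buffer;
    log_buffer.reserve(256);

    for (const auto& packet : packets) {
        // FIX 1: Reuse buffer; clear() keeps the reserved capacity
        log_buffer.clear();
        log_buffer.append("Processing: ");
        log_buffer.append(packet.id_);
        // LogAsync must copy the bytes before returning, because
        // log_buffer is overwritten on the next iteration.
        Logger::LogAsync(std::string_view(log_buffer));

        // FIX 2: Bind by const reference. This avoids the copy only if
        // GetPayload() returns a const reference; a by-value return still copies.
        const auto& data = packet.GetPayload();

        // FIX 3: Hash-based lookup, O(1) on average instead of O(n).
        // Assumes each packet signature maps to at most one rule.
        if (auto* rule = rule_cache_.Find(packet.signature_)) {
            ApplyRule(*rule, data);
        }
    }
}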
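The optimized loop relies on `rule_cache_.Find(...)` returning a pointer that is null on a miss, but the cache itself is not shown. One possible shape for it, as a sketch only: the `Rule` fields, the 64-bit signature type, and the `RuleCache` class below are assumptions for illustration, not the HPN Tunnel implementation.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in; the real Rule type is not shown in the source.
struct Rule {
    std::string name;
    bool allow = false;
};

// Exact-match cache keyed by a packet signature (assumed to be a 64-bit value).
class RuleCache {
public:
    // Adds a rule, or overwrites the rule already stored for this signature.
    void Insert(std::uint64_t signature, Rule rule) {
        rules_.insert_or_assign(signature, std::move(rule));
    }

    // Returns a pointer to the cached rule, or nullptr on a miss.
    // Pointers into std::unordered_map survive rehashing; they are
    // invalidated only when that entry is erased.
    const Rule* Find(std::uint64_t signature) const {
        const auto it = rules_.find(signature);
        return it != rules_.end() ? &it->second : nullptr;
    }

private:
    std::unordered_map<std::uint64_t, Rule> rules_;
};
```

The lookup is O(1) on average, not in the worst case. It is also not a drop-in replacement for the original loop, which applied every rule that matched: an exact-match cache yields at most one rule per signature, so the swap is only valid when rules can be keyed that way.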

🚀 HPN TUNNEL RESULT

After optimizing based on the FlameGraph:

  • Before: 10K packets/s
  • After: 80K packets/s
  • Improvement: 8x throughput

Tools Overview

┌─────────────────────────────────────────────────────────────────────────┐
│                    DEBUGGING & PROFILING TOOLKIT                         │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   MEMORY BUGS (Crashes, Corruption)                                     │
│   ─────────────────────────────────                                     │
│   • AddressSanitizer (ASan)  → Use-after-free, Buffer overflow          │
│   • ThreadSanitizer (TSan)   → Data races                               │
│   • Valgrind Memcheck        → Memory leaks (slower)                    │
│                                                                         │
│   PERFORMANCE (Slow Code)                                               │
│   ─────────────────────────                                             │
│   • Linux Perf               → CPU profiling, FlameGraphs               │
│   • Valgrind Massif          → Memory usage over time                   │
│   • Valgrind Cachegrind      → Cache misses                             │
│                                                                         │
│   CRASH INVESTIGATION (Post-mortem)                                     │
│   ─────────────────────────────────                                     │
│   • GDB                      → Interactive debugging                    │
│   • Core Dumps               → "Frozen" crash state                     │
│                                                                         │
│   PRODUCTION MONITORING                                                  │
│   ──────────────────────                                                │
│   • Structured Logging       → Request ID tracing                       │
│   • spdlog                   → High-performance async logging           │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
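To make the "Structured Logging → Request ID tracing" entry concrete, here is a minimal sketch of a single machine-readable log line. The helpers `EscapeJson` and `FormatJsonLog` and the field names `level`, `request_id`, and `msg` are assumptions chosen for illustration. The code is hand-rolled and does not use spdlog's API.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Escapes the characters that JSON requires to be escaped inside a string.
std::string EscapeJson(std::string_view text) {
    std::string out;
    out.reserve(text.size());
    for (const char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (static_cast<unsigned char>(c) < 0x20) {
                    // Remaining control characters become \u00XX.
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x",
                                  static_cast<unsigned>(
                                      static_cast<unsigned char>(c)));
                    out += buf;
                } else {
                    out += c;
                }
        }
    }
    return out;
}

// Formats one log record as a single-line JSON object.
std::string FormatJsonLog(std::string_view level,
                          std::string_view request_id,
                          std::string_view message) {
    std::string line = "{\"level\":\"";
    line += EscapeJson(level);
    line += "\",\"request_id\":\"";
    line += EscapeJson(request_id);
    line += "\",\"msg\":\"";
    line += EscapeJson(message);
    line += "\"}";
    return line;
}
```

Because every line carries the same `request_id` field, a log pipeline can filter on it and reconstruct what happened to one request across components, which is hard to do reliably with free-form printf() output.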

Module Structure

| Lesson | Main content | Workflow |
|---|---|---|
| 🛡️ Sanitizers | ASan, TSan, Shadow Memory | @[/security-scan] |
| 🔬 Profiling | Valgrind, Perf, FlameGraphs | @[/perf-profile] |
| 🔍 GDB | Core Dumps, Interactive Debug | @[/debug] |
| 📊 Logging | spdlog, Structured Logging | @[/observability] |

Prerequisites

Before starting this module, you need:


HPN Debugging Mindset

💀 RULES OF DEBUGGING

  1. Measure First, Optimize Second: don't guess, use a FlameGraph
  2. Reproduce Before Fix: if you can't reproduce it, you don't understand the bug yet
  3. Never Restart Without Core Dump: a core dump is evidence
  4. Logging is for Machines: JSON logs, not printf()
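Rule 3 only helps if the process is allowed to write a core file in the first place, and on many systems the default core size limit is zero. As a sketch under that assumption, a POSIX process can raise its own soft limit to the hard limit at startup with `getrlimit`/`setrlimit`. The function name `EnableCoreDumps` is hypothetical.

```cpp
#include <sys/resource.h>

// Raises the soft core-file size limit to the hard limit for this process.
// Returns true on success. POSIX-only.
bool EnableCoreDumps() {
    struct rlimit limit {};
    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return false;
    }
    // An unprivileged process may raise its soft limit up to the hard limit.
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Calling this early in `main()` has the same effect for that process as raising `ulimit -c` in the launching shell. Whether a core file actually appears, and where, still depends on system configuration, such as `/proc/sys/kernel/core_pattern` on Linux and a hard limit that is not zero.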

Next Steps

🛡️ Sanitizers → ASan, TSan, UBSan