Performance and Profiling
Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics.
Quick Reference
Related Issues:
- #104 - Basic Profiling/Metrics (Tier 1 - Implemented)
- #115 - SpatialHash Implementation (Tier 1 - Active)
- #116 - Dirty Flag System (Tier 1 - Active)
- #117 - Memory Pool for Entities (Tier 1 - Active)
- #113 - Batch Operations for Grid (Tier 1 - Active)
Key Files:
- src/Profiler.h - ScopedTimer RAII helper
- src/ProfilerOverlay.cpp - F3 overlay visualization
- src/GameEngine.h - ProfilingMetrics struct
- tests/benchmark_*.py - Performance benchmarks
Implementation: Commit e9e9cd2 (October 2025)
Profiling Tools
F3 Profiler Overlay
Activation: Press F3 during gameplay
Displays:
- Frame time (ms) with color coding:
  - Green: < 16ms (60+ FPS)
  - Yellow: 16-33ms (30-60 FPS)
  - Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
  - Grid rendering time
  - Entity rendering time
  - FOV overlay time
  - Python script time
  - Animation update time
- Per-frame counts:
  - Grid cells rendered
  - Entities rendered (visible/total)
  - Draw calls
Implementation: src/ProfilerOverlay.cpp::update(), src/ProfilerOverlay.cpp::render()
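The color thresholds listed above can be written as a small classification function. This is a minimal sketch, not the code in src/ProfilerOverlay.cpp: the names FrameColor and classifyFrameTime are hypothetical, and treating exactly 16ms and 33ms as yellow is an assumption.

```cpp
#include <cassert>

// Hypothetical names for illustration; the real overlay code may differ.
enum class FrameColor { Green, Yellow, Red };

// Classify a frame time in milliseconds using the documented thresholds.
// Assumption: the boundary values 16 ms and 33 ms both count as yellow.
inline FrameColor classifyFrameTime(float frameTimeMs) {
    assert(frameTimeMs >= 0.0f && "frame time cannot be negative");
    if (frameTimeMs < 16.0f) return FrameColor::Green;    // 60+ FPS
    if (frameTimeMs <= 33.0f) return FrameColor::Yellow;  // 30-60 FPS
    return FrameColor::Red;                               // below 30 FPS
}
```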
ScopedTimer (RAII)
Automatic timing for code blocks:
#include "Profiler.h"

void expensiveFunction() {
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records duration on destruction
}
How it works:
- Constructor: Records start time
- Destructor: Calculates elapsed time, adds to metric
- Zero overhead if profiling disabled
Implementation: src/Profiler.h::ScopedTimer
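The constructor/destructor behavior described above can be sketched with std::chrono. This is an illustrative stand-in, assuming the timer accumulates milliseconds into a float reference. The class name ScopedTimerSketch is hypothetical, and the real ScopedTimer in src/Profiler.h may differ (for example, in how it skips work when profiling is disabled).

```cpp
#include <cassert>
#include <chrono>

// Illustrative stand-in for ScopedTimer; not the class from src/Profiler.h.
class ScopedTimerSketch {
public:
    // Constructor: remember the target metric and record the start time.
    explicit ScopedTimerSketch(float& targetMs)
        : target_(targetMs), start_(std::chrono::steady_clock::now()) {}

    // Destructor: compute elapsed milliseconds and add them to the metric.
    ~ScopedTimerSketch() {
        const auto end = std::chrono::steady_clock::now();
        const float elapsedMs =
            std::chrono::duration<float, std::milli>(end - start_).count();
        assert(elapsedMs >= 0.0f);  // steady_clock never goes backwards
        target_ += elapsedMs;
    }

    // Non-copyable: each timer owns exactly one measurement.
    ScopedTimerSketch(const ScopedTimerSketch&) = delete;
    ScopedTimerSketch& operator=(const ScopedTimerSketch&) = delete;

private:
    float& target_;
    std::chrono::steady_clock::time_point start_;
};
```

Because the destructor adds to the metric rather than overwriting it, several timed scopes in one frame sum into the same field. That only yields per-frame values if the field is reset at the start of each frame.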
ProfilingMetrics Struct
Centralized metrics collection:
struct ProfilingMetrics {
    float frameTime = 0.0f;         // Current frame time (ms)
    float avgFrameTime = 0.0f;      // 60-frame average
    int fps = 0;                    // Frames per second

    // Detailed timing
    float gridRenderTime = 0.0f;    // Grid rendering
    float entityRenderTime = 0.0f;  // Entity rendering
    float fovOverlayTime = 0.0f;    // FOV overlay
    float pythonScriptTime = 0.0f;  // Python callbacks
    float animationTime = 0.0f;     // Animation updates

    // Per-frame counters
    int gridCellsRendered = 0;
    int entitiesRendered = 0;
    int totalEntities = 0;
    int drawCalls = 0;

    void resetPerFrame();           // Call at start of frame
};
Usage: Access via Resources::game->metrics
Implementation: src/GameEngine.h::ProfilingMetrics
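The 60-frame average behind avgFrameTime and fps could be maintained with a fixed-size ring buffer. The sketch below is an assumption about one reasonable approach, not the engine's actual code. RollingAverage is a hypothetical helper, and deriving FPS as 1000 divided by the average frame time is likewise assumed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: rolling mean over the last N frame times.
template <std::size_t N>
class RollingAverage {
    static_assert(N > 0, "window must hold at least one sample");

public:
    // Record one frame time (ms), replacing the oldest sample once full.
    void push(float frameTimeMs) {
        assert(frameTimeMs >= 0.0f);
        if (count_ == N) {
            sum_ -= samples_[next_];  // next_ points at the oldest sample
        } else {
            ++count_;
        }
        samples_[next_] = frameTimeMs;
        sum_ += frameTimeMs;
        next_ = (next_ + 1) % N;
    }

    // Mean of the samples recorded so far (0 if none).
    float average() const {
        return count_ == 0 ? 0.0f : sum_ / static_cast<float>(count_);
    }

    // FPS implied by the average frame time, rounded to nearest (0 if unknown).
    int fps() const {
        const float avg = average();
        return avg > 0.0f ? static_cast<int>(1000.0f / avg + 0.5f) : 0;
    }

private:
    std::array<float, N> samples_{};
    std::size_t next_ = 0;
    std::size_t count_ = 0;
    float sum_ = 0.0f;
};
```

With this helper, a frame loop would call push(metrics.frameTime) once per frame on a RollingAverage<60> and copy average() and fps() into the metrics struct. The running sum can accumulate small floating-point drift over long sessions; recomputing it from the buffer occasionally would avoid that.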
Optimization Workflow
See Performance-Optimization-Workflow for a detailed guide.
Quick workflow:
- Profile: Press F3, identify bottleneck
- Instrument: Add ScopedTimers around suspect code
- Benchmark: Run tests/benchmark_*.py for baseline
- Optimize: Make targeted changes
- Verify: Re-run benchmark, check improvement
- Iterate: Repeat until acceptable performance
Performance Benchmarks
Benchmark Suite
Files:
- tests/benchmark_static_grid.py - Grid rendering performance
- tests/benchmark_moving_entities.py - Entity rendering + movement
Running benchmarks:
cd build
./mcrogueface --headless --exec ../tests/benchmark_static_grid.py
Output: Frame time statistics, entity counts, render metrics
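To make "frame time statistics" concrete, the sketch below summarizes a list of frame-time samples. The benchmark scripts themselves are Python and their exact output format is not shown here. The FrameStats struct, the summarize function, and the choice of min, max, mean, and nearest-rank 95th percentile are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical summary of a benchmark run; fields chosen for illustration.
struct FrameStats {
    float minMs;
    float maxMs;
    float meanMs;
    float p95Ms;  // nearest-rank 95th percentile
};

// Summarize frame times in milliseconds. Takes the vector by value
// because it sorts its own copy.
inline FrameStats summarize(std::vector<float> samples) {
    assert(!samples.empty() && "need at least one frame sample");
    std::sort(samples.begin(), samples.end());
    const float sum = std::accumulate(samples.begin(), samples.end(), 0.0f);
    // Nearest-rank percentile: rank = ceil(0.95 * n), 1-based.
    const std::size_t rank = (samples.size() * 95 + 99) / 100;
    return FrameStats{
        samples.front(),
        samples.back(),
        sum / static_cast<float>(samples.size()),
        samples[rank - 1],
    };
}
```

A high percentile is often more informative than the mean when comparing runs, because occasional slow frames are what players notice as stutter.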
Current Performance Characteristics
Grid Rendering:
- Static 100x100 grid: ~3-5ms (optimized with culling)
- Large grids (200x200): Can exceed 16ms without optimization
- Bottleneck: Redrawing unchanged cells every frame
Entity Rendering:
- < 100 entities: < 1ms (with culling)
- 1000 entities: ~5-10ms (needs SpatialHash optimization)
- Bottleneck: O(n) iteration, no spatial indexing
Python Callbacks:
- Minimal overhead if callbacks short (< 1ms typical)
- Bottleneck: Heavy computation in Python update loops
Planned Optimizations
Tier 1 (Active Development)
#116: Dirty Flag System
- Problem: Static grids redrawn every frame wastefully
- Solution: Track changes, only redraw when modified
- Expected: 10-50x improvement for static scenes
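The dirty-flag idea can be sketched as follows. This is not the planned #116 design, only a minimal illustration under assumed names: DirtyGridSketch is hypothetical, and a redraw counter stands in for the expensive per-cell draw.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grid that redraws only after a cell actually changes.
class DirtyGridSketch {
public:
    DirtyGridSketch(std::size_t width, std::size_t height)
        : width_(width), cells_(width * height, 0) {}

    // A write marks the cached image stale only if the value changed.
    void setCell(std::size_t x, std::size_t y, int value) {
        assert(x < width_ && y * width_ + x < cells_.size());
        int& cell = cells_[y * width_ + x];
        if (cell != value) {
            cell = value;
            dirty_ = true;
        }
    }

    // Called once per frame. Returns true if an expensive redraw happened,
    // false if the cached result was reused.
    bool render() {
        if (!dirty_) return false;
        ++redrawCount_;  // stands in for redrawing every cell
        dirty_ = false;
        return true;
    }

    int redrawCount() const { return redrawCount_; }

private:
    std::size_t width_;
    std::vector<int> cells_;
    bool dirty_ = true;  // the first frame must always draw
    int redrawCount_ = 0;
};
```

On a static screen, render() does the expensive work once and then returns immediately on every later frame, which is where the expected speedup for static scenes would come from.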
#115: SpatialHash
- Problem: O(n) entity iteration for spatial queries
- Solution: Hash grid for average-case O(1) entity lookups by position
- Expected: 100x+ improvement for large entity counts
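A position-keyed hash can be sketched with std::unordered_map. This is an illustration of the general technique, not the #115 implementation: SpatialHashSketch, its integer entity IDs, and the one-bucket-per-cell layout are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical spatial index: one bucket of entity IDs per grid cell.
class SpatialHashSketch {
public:
    using EntityId = int;

    void insert(EntityId id, int x, int y) {
        buckets_[key(x, y)].push_back(id);
    }

    // Returns false if the entity was not found at (x, y).
    bool remove(EntityId id, int x, int y) {
        auto it = buckets_.find(key(x, y));
        if (it == buckets_.end()) return false;
        auto& ids = it->second;
        auto pos = std::find(ids.begin(), ids.end(), id);
        if (pos == ids.end()) return false;
        ids.erase(pos);
        if (ids.empty()) buckets_.erase(it);  // keep the map compact
        return true;
    }

    // Moving an entity is a remove from the old cell plus an insert.
    void move(EntityId id, int fromX, int fromY, int toX, int toY) {
        const bool wasPresent = remove(id, fromX, fromY);
        assert(wasPresent && "entity was not at the stated position");
        (void)wasPresent;  // avoid an unused warning when NDEBUG is set
        insert(id, toX, toY);
    }

    // Entities at a cell: average-case O(1) lookup instead of an O(n) scan.
    const std::vector<EntityId>& at(int x, int y) const {
        static const std::vector<EntityId> empty;
        auto it = buckets_.find(key(x, y));
        return it == buckets_.end() ? empty : it->second;
    }

private:
    // Pack both coordinates into one 64-bit key (handles negatives too).
    static std::uint64_t key(int x, int y) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(x)) << 32) |
               static_cast<std::uint64_t>(static_cast<std::uint32_t>(y));
    }

    std::unordered_map<std::uint64_t, std::vector<EntityId>> buckets_;
};
```

A radius query would visit only the cells inside the radius and concatenate their buckets, so its cost depends on the area searched rather than on the total entity count.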
#113: Batch Operations
- Problem: Python/C++ boundary crossings for individual cells
- Solution: NumPy-style batch operations
- Expected: 10-100x improvement for bulk updates
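The cost being targeted is the number of calls that cross the Python/C++ boundary, not the work done per cell. The sketch below models that on the C++ side only, under assumed names: BatchGridSketch, setTile, and fillRect are hypothetical and are not the planned #113 API, and the apiCalls counter stands in for boundary crossings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grid; apiCalls() models Python/C++ boundary crossings.
class BatchGridSketch {
public:
    BatchGridSketch(int width, int height)
        : width_(width),
          height_(height),
          tiles_(static_cast<std::size_t>(width) *
                     static_cast<std::size_t>(height),
                 0) {}

    // Per-cell API: one crossing for every cell written.
    void setTile(int x, int y, int value) {
        ++apiCalls_;
        tiles_[index(x, y)] = value;
    }

    // Batch API: one crossing, then the loop runs entirely in C++.
    void fillRect(int x0, int y0, int w, int h, int value) {
        ++apiCalls_;
        for (int y = y0; y < y0 + h; ++y) {
            for (int x = x0; x < x0 + w; ++x) {
                tiles_[index(x, y)] = value;
            }
        }
    }

    int tileAt(int x, int y) const { return tiles_[index(x, y)]; }
    long apiCalls() const { return apiCalls_; }

private:
    std::size_t index(int x, int y) const {
        assert(x >= 0 && x < width_ && y >= 0 && y < height_);
        return static_cast<std::size_t>(y) * static_cast<std::size_t>(width_) +
               static_cast<std::size_t>(x);
    }

    int width_;
    int height_;
    std::vector<int> tiles_;
    long apiCalls_ = 0;
};
```

Filling a 10x10 region takes 100 calls through setTile but a single call through fillRect, with identical resulting tiles.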
#117: Memory Pool
- Problem: Entity allocation/deallocation overhead
- Solution: Pre-allocated entity pool, recycling
- Expected: Reduced memory fragmentation, faster spawning
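A pre-allocated pool with a free list is one common way to recycle entities. The sketch below illustrates the general pattern, not the #117 design: EntitySketch and EntityPoolSketch are hypothetical types, and the fixed capacity is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entity payload for illustration.
struct EntitySketch {
    int x = 0;
    int y = 0;
    bool alive = false;
};

// Fixed-capacity pool: all storage is allocated once, up front.
class EntityPoolSketch {
public:
    explicit EntityPoolSketch(std::size_t capacity) : slots_(capacity) {
        freeList_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) freeList_.push_back(i - 1);
    }

    // Hand out a recycled slot; returns nullptr when the pool is exhausted.
    EntitySketch* acquire() {
        if (freeList_.empty()) return nullptr;
        const std::size_t index = freeList_.back();
        freeList_.pop_back();
        slots_[index] = EntitySketch{};  // reset state left by the last user
        slots_[index].alive = true;
        return &slots_[index];
    }

    // Return a slot to the pool; no memory is freed.
    void release(EntitySketch* entity) {
        assert(entity >= slots_.data() &&
               entity < slots_.data() + slots_.size());
        assert(entity->alive && "slot released twice");
        entity->alive = false;
        freeList_.push_back(static_cast<std::size_t>(entity - slots_.data()));
    }

    std::size_t available() const { return freeList_.size(); }

private:
    std::vector<EntitySketch> slots_;    // never resized, so pointers stay valid
    std::vector<std::size_t> freeList_;  // indices of unused slots
};
```

Spawning and despawning then cost a vector push or pop rather than a heap allocation, and entities stay contiguous in memory.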
Common Performance Issues
Issue: Low FPS on Static Screens
Symptom: Frame time shown in red, high grid render time, nothing moving
Cause: Grid redrawing unchanged cells
Solution:
- Short-term: Reduce grid size
- Long-term: Wait for #116 (Dirty flags)
Issue: Slow Entity Queries
Symptom: High Python script time when finding nearby entities
Cause: O(n) iteration through all entities
Solution:
- Short-term: Limit entity counts (< 100)
- Long-term: Wait for #115 (SpatialHash)
Issue: Frame Drops During Bulk Updates
Symptom: Frame time shown in yellow or red when updating many grid cells from Python
Cause: Python/C++ boundary crossings
Solution:
- Short-term: Batch updates manually in C++
- Long-term: Wait for #113 (Batch operations)
Instrumentation Examples
Adding Metrics to New Systems
// In the system's update() or render() function.
// mySystemTime and itemsRendered are example fields that you would
// first add to ProfilingMetrics.
void MySystem::render() {
    ScopedTimer timer(Resources::game->metrics.mySystemTime);

    // Your rendering code
    for (auto& item : items) {
        item->render();
        Resources::game->metrics.itemsRendered++;
    }
}
Adding New Metric Fields
- Add field to ProfilingMetrics in src/GameEngine.h
- Reset in resetPerFrame() if per-frame counter
- Display in src/ProfilerOverlay.cpp::update()
- Instrument code with ScopedTimer
Related Systems
- Grid-System - Grid rendering instrumented with ScopedTimer
- Performance-Optimization-Workflow - Detailed optimization process
- Writing-Tests - Performance test creation
Design Decisions
Why F3 Toggle?
- Familiar convention for debug overlays (for example, Minecraft's F3 debug screen)
- Non-intrusive: Disabled by default
- Real-time feedback during gameplay
Why RAII ScopedTimer?
- Automatic cleanup prevents missing timer stops
- Exception-safe timing
- Zero overhead when profiling disabled
Tradeoffs:
- Overlay rendering adds small overhead (~0.1ms)
- Metrics collection adds branching (negligible)
- But: Essential visibility for optimization
See Also:
- Commit e9e9cd2: Profiling system implementation
- Performance-Optimization-Workflow for step-by-step guide