Add "Performance-and-Profiling"

John McCardle 2025-10-25 20:59:11 +00:00
parent f60d04f762
commit 1f06157855
1 changed files with 250 additions and 0 deletions

@@ -0,0 +1,250 @@
# Performance and Profiling
Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics.
## Quick Reference
**Related Issues:**
- [#104](../../issues/104) - Basic Profiling/Metrics (Tier 1 - Implemented)
- [#115](../../issues/115) - SpatialHash Implementation (Tier 1 - Active)
- [#116](../../issues/116) - Dirty Flag System (Tier 1 - Active)
- [#117](../../issues/117) - Memory Pool for Entities (Tier 1 - Active)
- [#113](../../issues/113) - Batch Operations for Grid (Tier 1 - Active)
**Key Files:**
- `src/Profiler.h` - ScopedTimer RAII helper
- `src/ProfilerOverlay.cpp` - F3 overlay visualization
- `src/GameEngine.h` - ProfilingMetrics struct
- `tests/benchmark_*.py` - Performance benchmarks
**Implementation:** Commit e9e9cd2 (October 2025)
## Profiling Tools
### F3 Profiler Overlay
**Activation:** Press F3 during gameplay
**Displays:**
- Frame time (ms) with color coding:
  - Green: < 16ms (60+ FPS)
  - Yellow: 16-33ms (30-60 FPS)
  - Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
  - Grid rendering time
  - Entity rendering time
  - FOV overlay time
  - Python script time
  - Animation update time
- Per-frame counts:
  - Grid cells rendered
  - Entities rendered (visible/total)
  - Draw calls
**Implementation:** `src/ProfilerOverlay.cpp::update()`, `src/ProfilerOverlay.cpp::render()`
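
The color thresholds and the 60-frame average listed above can be sketched as follows. This is a minimal illustration, not the code in `src/ProfilerOverlay.cpp`: `FrameColor`, `classifyFrameTime`, and `FrameAverager` are hypothetical names, and treating exactly 16ms and 33ms as yellow is an assumption.

```cpp
#include <array>
#include <cstddef>

// Hypothetical names for illustration; not taken from src/ProfilerOverlay.cpp.
enum class FrameColor { Green, Yellow, Red };

// Thresholds from the list above: < 16 ms green, 16-33 ms yellow, > 33 ms red.
inline FrameColor classifyFrameTime(float frameMs) {
    if (frameMs < 16.0f) return FrameColor::Green;
    if (frameMs <= 33.0f) return FrameColor::Yellow;
    return FrameColor::Red;
}

// Rolling average over the most recent 60 frame times.
class FrameAverager {
public:
    void push(float frameMs) {
        sum_ += frameMs - samples_[next_];   // replace the oldest sample (slots start at 0)
        samples_[next_] = frameMs;
        next_ = (next_ + 1) % samples_.size();
        if (count_ < samples_.size()) ++count_;
    }

    float averageMs() const {
        return count_ ? sum_ / static_cast<float>(count_) : 0.0f;
    }

    // FPS derived from the averaged frame time, rounded to the nearest integer.
    int fps() const {
        const float avg = averageMs();
        return avg > 0.0f ? static_cast<int>(1000.0f / avg + 0.5f) : 0;
    }

private:
    std::array<float, 60> samples_{};
    std::size_t next_ = 0;
    std::size_t count_ = 0;
    float sum_ = 0.0f;
};
```

Averaging the frame time and deriving FPS from it keeps the displayed number stable even when a single frame spikes.
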
### ScopedTimer (RAII)
Automatic timing for code blocks:
```cpp
#include "Profiler.h"

void expensiveFunction() {
    // functionTime is a placeholder; use a field that exists in ProfilingMetrics
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records duration on destruction
}
```
**How it works:**
- Constructor: Records start time
- Destructor: Calculates elapsed time, adds to metric
- Negligible overhead when profiling is disabled (at most a branch; see Tradeoffs)
**Implementation:** `src/Profiler.h::ScopedTimer`
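
The constructor/destructor behavior described above could look roughly like the sketch below. It assumes a `std::chrono::steady_clock` time source and a `float&` accumulator in milliseconds. `ScopedTimerSketch` is a hypothetical stand-in, and the real class in `src/Profiler.h` may differ, for example in how it handles disabled profiling.

```cpp
#include <chrono>

// Hypothetical stand-in for ScopedTimer; the real class may differ.
class ScopedTimerSketch {
public:
    // Records the start time and remembers which metric to update.
    explicit ScopedTimerSketch(float& targetMs)
        : target_(targetMs), start_(std::chrono::steady_clock::now()) {}

    // Adds the elapsed milliseconds to the referenced metric on scope exit.
    ~ScopedTimerSketch() {
        const auto end = std::chrono::steady_clock::now();
        target_ += std::chrono::duration<float, std::milli>(end - start_).count();
    }

    // Non-copyable: a copy would record the same interval twice.
    ScopedTimerSketch(const ScopedTimerSketch&) = delete;
    ScopedTimerSketch& operator=(const ScopedTimerSketch&) = delete;

private:
    float& target_;
    std::chrono::steady_clock::time_point start_;
};
```

Because the destructor adds to the metric rather than overwriting it, several timed scopes in one frame accumulate into the same field.
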
### ProfilingMetrics Struct
Centralized metrics collection:
```cpp
struct ProfilingMetrics {
    float frameTime = 0.0f;        // Current frame time (ms)
    float avgFrameTime = 0.0f;     // 60-frame average
    int fps = 0;                   // Frames per second

    // Detailed timing
    float gridRenderTime = 0.0f;   // Grid rendering
    float entityRenderTime = 0.0f; // Entity rendering
    float fovOverlayTime = 0.0f;   // FOV overlay
    float pythonScriptTime = 0.0f; // Python callbacks
    float animationTime = 0.0f;    // Animation updates

    // Per-frame counters
    int gridCellsRendered = 0;
    int entitiesRendered = 0;
    int totalEntities = 0;
    int drawCalls = 0;

    void resetPerFrame();          // Call at start of frame
};
```
**Usage:** Access via `Resources::game->metrics`
**Implementation:** `src/GameEngine.h::ProfilingMetrics`
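
The struct only declares `resetPerFrame()`. One plausible body is sketched below on a trimmed, hypothetical copy of the struct (`MetricsSketch`). Which fields the real implementation resets is an assumption here: since timers add to their metric, the accumulated timings are zeroed along with the counters, while the averaged values persist.

```cpp
// Trimmed, hypothetical copy of ProfilingMetrics for illustration only.
struct MetricsSketch {
    float avgFrameTime = 0.0f;     // running average: kept across frames
    int fps = 0;                   // derived from the average: kept across frames
    float gridRenderTime = 0.0f;   // accumulated by timers during the frame
    float entityRenderTime = 0.0f;
    int gridCellsRendered = 0;
    int entitiesRendered = 0;
    int drawCalls = 0;

    // Assumption: accumulated timings and counters are zeroed each frame.
    void resetPerFrame() {
        gridRenderTime = 0.0f;
        entityRenderTime = 0.0f;
        gridCellsRendered = 0;
        entitiesRendered = 0;
        drawCalls = 0;
    }
};
```
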
## Optimization Workflow
See [[Performance-Optimization-Workflow]] for detailed guide.
**Quick workflow:**
1. **Profile**: Press F3, identify bottleneck
2. **Instrument**: Add ScopedTimers around suspect code
3. **Benchmark**: Run `tests/benchmark_*.py` for baseline
4. **Optimize**: Make targeted changes
5. **Verify**: Re-run benchmark, check improvement
6. **Iterate**: Repeat until acceptable performance
## Performance Benchmarks
### Benchmark Suite
**Files:**
- `tests/benchmark_static_grid.py` - Grid rendering performance
- `tests/benchmark_moving_entities.py` - Entity rendering + movement
**Running benchmarks:**
```bash
cd build
./mcrogueface --headless --exec ../tests/benchmark_static_grid.py
```
**Output:** Frame time statistics, entity counts, render metrics
### Current Performance Characteristics
**Grid Rendering:**
- Static 100x100 grid: ~3-5ms (optimized with culling)
- Large grids (200x200): Can exceed 16ms without optimization
- Bottleneck: Redrawing unchanged cells every frame
**Entity Rendering:**
- < 100 entities: < 1ms (with culling)
- 1000 entities: ~5-10ms (needs SpatialHash optimization)
- Bottleneck: O(n) iteration, no spatial indexing
**Python Callbacks:**
- Minimal overhead when callbacks are short (typically < 1ms)
- Bottleneck: Heavy computation in Python update loops
## Planned Optimizations
### Tier 1 (Active Development)
**[#116](../../issues/116): Dirty Flag System**
- Problem: Static grids are redrawn every frame even when nothing has changed
- Solution: Track changes, only redraw when modified
- Expected: 10-50x improvement for static scenes
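
The dirty-flag idea can be illustrated with a minimal sketch. `DirtyGridSketch` is hypothetical and is not the design tracked in #116. The "redraw" is a counter standing in for re-rendering every cell into a cached texture.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical grid that redraws its cached output only after a change.
class DirtyGridSketch {
public:
    DirtyGridSketch(std::size_t w, std::size_t h)
        : width_(w), cells_(w * h, 0) {}

    void setCell(std::size_t x, std::size_t y, int value) {
        int& cell = cells_[y * width_ + x];
        if (cell != value) {   // writes that change nothing keep the cache valid
            cell = value;
            dirty_ = true;
        }
    }

    // Returns true if the (stand-in) expensive redraw actually ran.
    bool render() {
        if (!dirty_) return false;   // reuse the cached output
        ++redrawCount_;              // stand-in for drawing every cell
        dirty_ = false;
        return true;
    }

    int redrawCount() const { return redrawCount_; }

private:
    std::size_t width_;
    std::vector<int> cells_;
    bool dirty_ = true;   // the first frame must draw
    int redrawCount_ = 0;
};
```

A single flag per grid is the coarsest option. Per-chunk flags would limit the redraw to the modified region at the cost of more bookkeeping.
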
**[#115](../../issues/115): SpatialHash**
- Problem: O(n) entity iteration for spatial queries
- Solution: Hash grid for average-case O(1) entity lookups by position
- Expected: 100x+ improvement for large entity counts
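
A minimal sketch of a spatial hash keyed by integer cell coordinates follows. `SpatialHashSketch` and its methods are hypothetical and not the planned McRogueFace API. It assumes entities are identified by an `int` id and occupy exactly one cell.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical sketch; not the implementation planned in #115.
class SpatialHashSketch {
public:
    using EntityId = int;

    void insert(EntityId id, int x, int y) {
        buckets_[key(x, y)].push_back(id);
    }

    void remove(EntityId id, int x, int y) {
        auto it = buckets_.find(key(x, y));
        if (it == buckets_.end()) return;
        auto& ids = it->second;
        ids.erase(std::remove(ids.begin(), ids.end(), id), ids.end());
        if (ids.empty()) buckets_.erase(it);   // drop empty buckets
    }

    // Entities occupying exactly this cell: one hash lookup on average.
    std::vector<EntityId> at(int x, int y) const {
        auto it = buckets_.find(key(x, y));
        if (it == buckets_.end()) return {};
        return it->second;
    }

    // Entities within a square of the given radius around (cx, cy).
    // Cost depends on the area scanned, not on the total entity count.
    std::vector<EntityId> inRadius(int cx, int cy, int r) const {
        std::vector<EntityId> out;
        for (int y = cy - r; y <= cy + r; ++y) {
            for (int x = cx - r; x <= cx + r; ++x) {
                auto it = buckets_.find(key(x, y));
                if (it != buckets_.end())
                    out.insert(out.end(), it->second.begin(), it->second.end());
            }
        }
        return out;
    }

private:
    // Packs both 32-bit coordinates into one 64-bit key (handles negatives).
    static std::uint64_t key(int x, int y) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(x)) << 32) |
               static_cast<std::uint64_t>(static_cast<std::uint32_t>(y));
    }

    std::unordered_map<std::uint64_t, std::vector<EntityId>> buckets_;
};
```

When an entity moves, it must be removed from its old cell and inserted at the new one, so the structure trades a small cost per move for much cheaper queries.
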
**[#113](../../issues/113): Batch Operations**
- Problem: Python/C++ boundary crossings for individual cells
- Solution: NumPy-style batch operations
- Expected: 10-100x improvement for bulk updates
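
The saving from batching comes from replacing many small calls with one large one. The sketch below models that on the C++ side only. `BatchGridSketch`, `setTile`, and `fillRect` are hypothetical names, and the `apiCalls` counter stands in for Python/C++ boundary crossings, which this sketch does not actually perform.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical grid; names are illustrative, not the McRogueFace API.
struct BatchGridSketch {
    std::size_t width;
    std::size_t height;
    std::vector<int> tiles;
    int apiCalls = 0;   // each call stands in for one boundary crossing

    BatchGridSketch(std::size_t w, std::size_t h)
        : width(w), height(h), tiles(w * h, 0) {}

    // Per-cell entry point: one crossing per cell.
    void setTile(std::size_t x, std::size_t y, int value) {
        ++apiCalls;
        tiles[y * width + x] = value;
    }

    // Batch entry point: one crossing for a whole rectangle, clamped to the grid.
    void fillRect(std::size_t x0, std::size_t y0,
                  std::size_t w, std::size_t h, int value) {
        ++apiCalls;
        const std::size_t x1 = std::min(x0 + w, width);
        const std::size_t y1 = std::min(y0 + h, height);
        for (std::size_t y = y0; y < y1; ++y)
            for (std::size_t x = x0; x < x1; ++x)
                tiles[y * width + x] = value;
    }
};
```

Filling a 100x100 grid takes 10,000 `setTile` calls but a single `fillRect` call, with the inner loop running entirely in C++.
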
**[#117](../../issues/117): Memory Pool**
- Problem: Entity allocation/deallocation overhead
- Solution: Pre-allocated entity pool, recycling
- Expected: Reduced memory fragmentation, faster spawning
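
A pre-allocated pool with a free list is one common way to recycle entity storage. The sketch below is a minimal, fixed-capacity version. `EntitySketch` and `EntityPoolSketch` are hypothetical and far simpler than a real entity type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified entity for illustration.
struct EntitySketch {
    int x = 0;
    int y = 0;
    bool alive = false;
};

// Fixed-capacity pool: all storage is allocated once, slots are recycled.
class EntityPoolSketch {
public:
    explicit EntityPoolSketch(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) free_.push_back(i - 1);
    }

    // Returns nullptr when the pool is exhausted; never allocates.
    EntitySketch* spawn(int x, int y) {
        if (free_.empty()) return nullptr;
        EntitySketch& e = slots_[free_.back()];
        free_.pop_back();
        e.x = x;
        e.y = y;
        e.alive = true;
        return &e;
    }

    // Marks the slot dead and makes it available for the next spawn.
    void despawn(EntitySketch* e) {
        if (e == nullptr || !e->alive) return;
        e->alive = false;
        free_.push_back(static_cast<std::size_t>(e - slots_.data()));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<EntitySketch> slots_;   // never resized, so pointers stay valid
    std::vector<std::size_t> free_;     // indices of unused slots
};
```

Because the slot vector is never resized, pointers returned by `spawn` remain valid for the lifetime of the pool, and spawning costs a pop from the free list rather than a heap allocation.
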
## Common Performance Issues
### Issue: Low FPS on Static Screens
**Symptom:** Red frame time in the F3 overlay, high grid render time, nothing moving
**Cause:** Grid redrawing unchanged cells
**Solution:**
- Short-term: Reduce grid size
- Long-term: Wait for [#116](../../issues/116) (Dirty flags)
### Issue: Slow Entity Queries
**Symptom:** High Python script time when finding nearby entities
**Cause:** O(n) iteration through all entities
**Solution:**
- Short-term: Limit entity counts (< 100)
- Long-term: Wait for [#115](../../issues/115) (SpatialHash)
### Issue: Frame Drops During Bulk Updates
**Symptom:** Yellow or red frame time when updating many grid cells from Python
**Cause:** Python/C++ boundary crossings
**Solution:**
- Short-term: Batch updates manually in C++
- Long-term: Wait for [#113](../../issues/113) (Batch operations)
## Instrumentation Examples
### Adding Metrics to New Systems
```cpp
// In system's update() or render() function
// mySystemTime and itemsRendered are placeholders: add them to ProfilingMetrics first
void MySystem::render() {
    ScopedTimer timer(Resources::game->metrics.mySystemTime);

    // Your rendering code
    for (auto& item : items) {
        item->render();
        Resources::game->metrics.itemsRendered++;
    }
}
```
### Adding New Metric Fields
1. Add field to `ProfilingMetrics` in `src/GameEngine.h`
2. Reset in `resetPerFrame()` if per-frame counter
3. Display in `src/ProfilerOverlay.cpp::update()`
4. Instrument code with ScopedTimer
## Related Systems
- [[Grid-System]] - Grid rendering instrumented with ScopedTimer
- [[Performance-Optimization-Workflow]] - Detailed optimization process
- [[Writing-Tests]] - Performance test creation
## Design Decisions
**Why F3 Toggle?**
- Standard in game development (Minecraft, Unity, etc.)
- Non-intrusive: Disabled by default
- Real-time feedback during gameplay
**Why RAII ScopedTimer?**
- Automatic cleanup prevents missing timer stops
- Exception-safe timing
- Negligible overhead when profiling is disabled
**Tradeoffs:**
- Overlay rendering adds small overhead (~0.1ms)
- Metrics collection adds branching (negligible)
- But: Essential visibility for optimization
---
**See Also:**
- Commit e9e9cd2: Profiling system implementation
- [[Performance-Optimization-Workflow]] for step-by-step guide