Add "Performance-and-Profiling"
parent
f60d04f762
commit
1f06157855
# Performance and Profiling

Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics.

## Quick Reference

**Related Issues:**
- [#104](../../issues/104) - Basic Profiling/Metrics (Tier 1 - Implemented)
- [#115](../../issues/115) - SpatialHash Implementation (Tier 1 - Active)
- [#116](../../issues/116) - Dirty Flag System (Tier 1 - Active)
- [#117](../../issues/117) - Memory Pool for Entities (Tier 1 - Active)
- [#113](../../issues/113) - Batch Operations for Grid (Tier 1 - Active)

**Key Files:**
- `src/Profiler.h` - ScopedTimer RAII helper
- `src/ProfilerOverlay.cpp` - F3 overlay visualization
- `src/GameEngine.h` - ProfilingMetrics struct
- `tests/benchmark_*.py` - Performance benchmarks

**Implementation:** Commit e9e9cd2 (October 2025)

## Profiling Tools

### F3 Profiler Overlay

**Activation:** Press F3 during gameplay

**Displays:**
- Frame time (ms) with color coding:
  - Green: < 16ms (60+ FPS)
  - Yellow: 16-33ms (30-60 FPS)
  - Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
  - Grid rendering time
  - Entity rendering time
  - FOV overlay time
  - Python script time
  - Animation update time
- Per-frame counts:
  - Grid cells rendered
  - Entities rendered (visible/total)
  - Draw calls

**Implementation:** `src/ProfilerOverlay.cpp::update()`, `src/ProfilerOverlay.cpp::render()`
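
The color thresholds listed above can be expressed as a small classification function. This is a minimal sketch, not the code in `src/ProfilerOverlay.cpp`: the `FrameColor` enum and `classifyFrameTime` name are hypothetical, and only the 16ms and 33ms cutoffs come from this page.

```cpp
// Hypothetical helper illustrating the overlay's color bands.
// Only the 16 ms and 33 ms thresholds are taken from the documentation.
enum class FrameColor { Green, Yellow, Red };

FrameColor classifyFrameTime(float frameTimeMs) {
    if (frameTimeMs < 16.0f) {
        return FrameColor::Green;   // < 16 ms
    }
    if (frameTimeMs <= 33.0f) {
        return FrameColor::Yellow;  // 16-33 ms
    }
    return FrameColor::Red;         // > 33 ms
}
```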

### ScopedTimer (RAII)

Automatic timing for code blocks:

```cpp
#include "Profiler.h"

void expensiveFunction() {
    // functionTime is a placeholder; pass the ProfilingMetrics field you want to accumulate into
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records duration on destruction
}
```

**How it works:**
- Constructor: Records start time
- Destructor: Calculates elapsed time, adds to metric
- Zero overhead if profiling disabled

**Implementation:** `src/Profiler.h::ScopedTimer`
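
For readers unfamiliar with the pattern, the constructor/destructor behavior described above might look roughly like the following. This is a sketch under assumptions, not a copy of `src/Profiler.h`: the class is named `ScopedTimerSketch` to make that clear, it assumes the metric is a `float` holding milliseconds, and it omits any "profiling disabled" switch.

```cpp
#include <chrono>

// Illustrative stand-in for the real ScopedTimer; details are assumed.
class ScopedTimerSketch {
public:
    // Remember where to accumulate and when the scope started.
    explicit ScopedTimerSketch(float& targetMs)
        : target_(targetMs), start_(std::chrono::steady_clock::now()) {}

    // On scope exit, add the elapsed milliseconds to the referenced metric.
    ~ScopedTimerSketch() {
        const std::chrono::duration<float, std::milli> elapsed =
            std::chrono::steady_clock::now() - start_;
        target_ += elapsed.count();
    }

    // A timer is tied to one scope, so copying is disabled.
    ScopedTimerSketch(const ScopedTimerSketch&) = delete;
    ScopedTimerSketch& operator=(const ScopedTimerSketch&) = delete;

private:
    float& target_;
    std::chrono::steady_clock::time_point start_;
};
```

Because the destructor adds to the metric rather than overwriting it, several timed scopes in one frame accumulate into the same field.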

### ProfilingMetrics Struct

Centralized metrics collection:

```cpp
struct ProfilingMetrics {
    float frameTime = 0.0f;         // Current frame time (ms)
    float avgFrameTime = 0.0f;      // 60-frame average
    int fps = 0;                    // Frames per second

    // Detailed timing
    float gridRenderTime = 0.0f;    // Grid rendering
    float entityRenderTime = 0.0f;  // Entity rendering
    float fovOverlayTime = 0.0f;    // FOV overlay
    float pythonScriptTime = 0.0f;  // Python callbacks
    float animationTime = 0.0f;     // Animation updates

    // Per-frame counters
    int gridCellsRendered = 0;
    int entitiesRendered = 0;
    int totalEntities = 0;
    int drawCalls = 0;

    void resetPerFrame();           // Call at start of frame
};
```

**Usage:** Access via `Resources::game->metrics`

**Implementation:** `src/GameEngine.h::ProfilingMetrics`
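
One way a 60-frame average such as `avgFrameTime` could be maintained is a fixed-size ring buffer with a running sum. The sketch below is an assumption about the technique, not the engine's code: `FrameTimeAverager`, `push`, and `fpsFromMs` are hypothetical names, and only the window size of 60 comes from this page.

```cpp
#include <array>
#include <cstddef>

// Hypothetical rolling average over the last 60 frame times (ms).
class FrameTimeAverager {
public:
    // Record one frame time and return the average over the current window.
    float push(float frameTimeMs) {
        sum_ -= samples_[next_];          // drop the sample being overwritten
        samples_[next_] = frameTimeMs;
        sum_ += frameTimeMs;
        next_ = (next_ + 1) % kWindow;
        if (count_ < kWindow) {
            ++count_;                     // window is still filling up
        }
        return sum_ / static_cast<float>(count_);
    }

    // Convert an average frame time (ms) to a rounded FPS value.
    static int fpsFromMs(float avgMs) {
        return avgMs > 0.0f ? static_cast<int>(1000.0f / avgMs + 0.5f) : 0;
    }

private:
    static constexpr std::size_t kWindow = 60;
    std::array<float, kWindow> samples_{};  // zero-initialized
    std::size_t next_ = 0;
    std::size_t count_ = 0;
    float sum_ = 0.0f;
};
```

A running sum keeps each update O(1); a production version might periodically recompute the sum to limit floating-point drift.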

## Optimization Workflow

See [[Performance-Optimization-Workflow]] for a detailed guide.

**Quick workflow:**
1. **Profile**: Press F3 and identify the bottleneck
2. **Instrument**: Add ScopedTimers around suspect code
3. **Benchmark**: Run `tests/benchmark_*.py` to establish a baseline
4. **Optimize**: Make targeted changes
5. **Verify**: Re-run the benchmark and check for improvement
6. **Iterate**: Repeat until performance is acceptable

## Performance Benchmarks

### Benchmark Suite

**Files:**
- `tests/benchmark_static_grid.py` - Grid rendering performance
- `tests/benchmark_moving_entities.py` - Entity rendering + movement

**Running benchmarks:**
```bash
cd build
./mcrogueface --headless --exec ../tests/benchmark_static_grid.py
```

**Output:** Frame time statistics, entity counts, render metrics

### Current Performance Characteristics

**Grid Rendering:**
- Static 100x100 grid: ~3-5ms (optimized with culling)
- Large grids (200x200): Can exceed 16ms without optimization
- Bottleneck: Redrawing unchanged cells every frame

**Entity Rendering:**
- < 100 entities: < 1ms (with culling)
- 1000 entities: ~5-10ms (needs SpatialHash optimization)
- Bottleneck: O(n) iteration, no spatial indexing

**Python Callbacks:**
- Minimal overhead if callbacks are short (< 1ms typical)
- Bottleneck: Heavy computation in Python update loops

## Planned Optimizations

### Tier 1 (Active Development)

**[#116](../../issues/116): Dirty Flag System**
- Problem: Static grids are redrawn every frame even when nothing has changed
- Solution: Track changes, only redraw when modified
- Expected: 10-50x improvement for static scenes
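
The dirty-flag idea can be sketched as follows. This is an illustration of the general technique only, not the design planned in #116: `CachedGrid`, `setCell`, and `rebuildCount` are hypothetical names, and the "redraw" is reduced to a counter standing in for real rendering work.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical grid that only rebuilds its cached output when a cell changed.
class CachedGrid {
public:
    CachedGrid(int width, int height)
        : width_(width),
          cells_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 0) {}

    // Writing the same value again does not mark the grid dirty.
    void setCell(int x, int y, int tile) {
        int& cell = cells_[static_cast<std::size_t>(y) * width_ + x];
        if (cell != tile) {
            cell = tile;
            dirty_ = true;
        }
    }

    // Returns true if the cached output had to be rebuilt this frame.
    bool render() {
        if (!dirty_) {
            return false;   // reuse the cached output as-is
        }
        ++rebuildCount_;    // stand-in for redrawing every cell into a cache
        dirty_ = false;
        return true;
    }

    int rebuildCount() const { return rebuildCount_; }

private:
    int width_;
    std::vector<int> cells_;
    bool dirty_ = true;     // the first frame always needs a full draw
    int rebuildCount_ = 0;
};
```

With this structure, a static scene pays for one rebuild and then only the cost of presenting the cached result each frame.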

**[#115](../../issues/115): SpatialHash**
- Problem: O(n) entity iteration for spatial queries
- Solution: Hash grid for average O(1) entity lookups by position
- Expected: 100x+ improvement for large entity counts
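
A minimal spatial hash keyed by grid position might look like the sketch below. It is not the implementation tracked in #115: the `SpatialHash` interface shown here (`insert`, `remove`, `move`, `at`) is hypothetical, entities are represented by plain integer ids, and one bucket per grid cell is assumed.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical position -> entity-id index with one bucket per grid cell.
class SpatialHash {
public:
    void insert(int id, int x, int y) { buckets_[key(x, y)].push_back(id); }

    void remove(int id, int x, int y) {
        auto it = buckets_.find(key(x, y));
        if (it == buckets_.end()) {
            return;
        }
        std::vector<int>& ids = it->second;
        ids.erase(std::remove(ids.begin(), ids.end(), id), ids.end());
        if (ids.empty()) {
            buckets_.erase(it);  // keep the map free of empty buckets
        }
    }

    // Entities must be re-indexed whenever they change cells.
    void move(int id, int fromX, int fromY, int toX, int toY) {
        remove(id, fromX, fromY);
        insert(id, toX, toY);
    }

    // Average O(1) lookup of every entity id at a cell.
    std::vector<int> at(int x, int y) const {
        auto it = buckets_.find(key(x, y));
        return it == buckets_.end() ? std::vector<int>{} : it->second;
    }

private:
    // Pack both coordinates into a single 64-bit key.
    static std::uint64_t key(int x, int y) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(x)) << 32) |
               static_cast<std::uint64_t>(static_cast<std::uint32_t>(y));
    }

    std::unordered_map<std::uint64_t, std::vector<int>> buckets_;
};
```

A "nearby entities" query then checks only the cells inside the search area instead of iterating over every entity.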

**[#113](../../issues/113): Batch Operations**
- Problem: Python/C++ boundary crossings for individual cells
- Solution: NumPy-style batch operations
- Expected: 10-100x improvement for bulk updates

**[#117](../../issues/117): Memory Pool**
- Problem: Entity allocation/deallocation overhead
- Solution: Pre-allocated entity pool, recycling
- Expected: Reduced memory fragmentation, faster spawning
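
The pre-allocation and recycling idea can be illustrated with a fixed-capacity pool backed by a free list. This is a sketch of the general technique rather than the design in #117: the `Entity` struct here is a simplified stand-in, and `EntityPool`, `spawn`, `despawn`, and `available` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for the engine's entity type.
struct Entity {
    int x = 0;
    int y = 0;
    bool alive = false;
};

// Hypothetical fixed-capacity pool: all storage is allocated once up front.
class EntityPool {
public:
    explicit EntityPool(std::size_t capacity) : slots_(capacity) {
        free_.reserve(capacity);
        for (std::size_t i = capacity; i > 0; --i) {
            free_.push_back(i - 1);  // lowest index ends up on top
        }
    }

    // Reuses a free slot; returns nullptr when the pool is exhausted.
    Entity* spawn(int x, int y) {
        if (free_.empty()) {
            return nullptr;
        }
        const std::size_t index = free_.back();
        free_.pop_back();
        Entity& e = slots_[index];
        e.x = x;
        e.y = y;
        e.alive = true;
        return &e;
    }

    // Marks the slot as reusable; no memory is released.
    void despawn(Entity* e) {
        if (e == nullptr || !e->alive) {
            return;
        }
        e->alive = false;
        free_.push_back(static_cast<std::size_t>(e - slots_.data()));
    }

    std::size_t available() const { return free_.size(); }

private:
    std::vector<Entity> slots_;      // never resized after construction
    std::vector<std::size_t> free_;  // indices of unused slots
};
```

Spawning and despawning become index pushes and pops, so no heap allocation happens after the pool is constructed.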

## Common Performance Issues

### Issue: Low FPS on Static Screens

**Symptom:** Red FPS counter, high grid render time, nothing moving

**Cause:** Grid redrawing unchanged cells

**Solution:**
- Short-term: Reduce grid size
- Long-term: Wait for [#116](../../issues/116) (Dirty flags)

### Issue: Slow Entity Queries

**Symptom:** High Python script time when finding nearby entities

**Cause:** O(n) iteration through all entities

**Solution:**
- Short-term: Limit entity counts (< 100)
- Long-term: Wait for [#115](../../issues/115) (SpatialHash)

### Issue: Frame Drops During Bulk Updates

**Symptom:** Yellow/red FPS when updating many grid cells from Python

**Cause:** Python/C++ boundary crossings

**Solution:**
- Short-term: Batch updates manually in C++
- Long-term: Wait for [#113](../../issues/113) (Batch operations)

## Instrumentation Examples

### Adding Metrics to New Systems

```cpp
// In system's update() or render() function
// (mySystemTime and itemsRendered are example fields; add them to ProfilingMetrics first)
void MySystem::render() {
    ScopedTimer timer(Resources::game->metrics.mySystemTime);

    // Your rendering code
    for (auto& item : items) {
        item->render();
        Resources::game->metrics.itemsRendered++;
    }
}
```

### Adding New Metric Fields

1. Add the field to `ProfilingMetrics` in `src/GameEngine.h`
2. Reset it in `resetPerFrame()` if it is a per-frame counter
3. Display it in `src/ProfilerOverlay.cpp::update()`
4. Instrument the code with ScopedTimer
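
Steps 1 and 2 might look like the following on a trimmed-down copy of the struct. This is only a sketch: `ProfilingMetricsSketch` is a stand-in for the real `ProfilingMetrics`, the fields `mySystemTime` and `itemsRendered` are hypothetical examples, and the body of `resetPerFrame()` is assumed because this page shows only its declaration.

```cpp
// Trimmed-down stand-in for ProfilingMetrics with two example fields added.
struct ProfilingMetricsSketch {
    float frameTime = 0.0f;     // existing field (ms)
    int drawCalls = 0;          // existing per-frame counter

    // Step 1: new fields for the system being instrumented (hypothetical)
    float mySystemTime = 0.0f;  // accumulated by a ScopedTimer each frame
    int itemsRendered = 0;      // incremented once per rendered item

    // Step 2: clear everything that accumulates within a frame.
    // Assumed behavior; the real resetPerFrame() may reset a different set.
    void resetPerFrame() {
        drawCalls = 0;
        mySystemTime = 0.0f;
        itemsRendered = 0;
    }
};
```

Timing fields are reset here as well because a timer that adds to its metric would otherwise keep growing across frames.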

## Related Systems

- [[Grid-System]] - Grid rendering instrumented with ScopedTimer
- [[Performance-Optimization-Workflow]] - Detailed optimization process
- [[Writing-Tests]] - Performance test creation

## Design Decisions

**Why F3 Toggle?**
- Standard in game development (Minecraft, Unity, etc.)
- Non-intrusive: Disabled by default
- Real-time feedback during gameplay

**Why RAII ScopedTimer?**
- Automatic cleanup prevents missing timer stops
- Exception-safe timing
- Zero overhead when profiling disabled

**Tradeoffs:**
- Overlay rendering adds small overhead (~0.1ms)
- Metrics collection adds branching (negligible)
- Benefit: Essential visibility for optimization

---

**See Also:**
- Commit e9e9cd2: Profiling system implementation
- [[Performance-Optimization-Workflow]] for step-by-step guide