Add "Performance-and-Profiling"
parent f60d04f762, commit 1f06157855

# Performance and Profiling

Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics.

## Quick Reference

**Related Issues:**
- [#104](../../issues/104) - Basic Profiling/Metrics (Tier 1 - Implemented)
- [#115](../../issues/115) - SpatialHash Implementation (Tier 1 - Active)
- [#116](../../issues/116) - Dirty Flag System (Tier 1 - Active)
- [#117](../../issues/117) - Memory Pool for Entities (Tier 1 - Active)
- [#113](../../issues/113) - Batch Operations for Grid (Tier 1 - Active)

**Key Files:**
- `src/Profiler.h` - ScopedTimer RAII helper
- `src/ProfilerOverlay.cpp` - F3 overlay visualization
- `src/GameEngine.h` - ProfilingMetrics struct
- `tests/benchmark_*.py` - Performance benchmarks

**Implementation:** Commit e9e9cd2 (October 2025)

## Profiling Tools

### F3 Profiler Overlay

**Activation:** Press F3 during gameplay

**Displays:**
- Frame time (ms) with color coding:
  - Green: < 16ms (60+ FPS)
  - Yellow: 16-33ms (30-60 FPS)
  - Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
  - Grid rendering time
  - Entity rendering time
  - FOV overlay time
  - Python script time
  - Animation update time
- Per-frame counts:
  - Grid cells rendered
  - Entities rendered (visible/total)
  - Draw calls

**Implementation:** `src/ProfilerOverlay.cpp::update()`, `src/ProfilerOverlay.cpp::render()`

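The color bands listed above can be expressed as a small classification function. This is only a sketch of the thresholds as documented here: `FrameColor` and `classifyFrameTime` are hypothetical names, and the actual logic in `src/ProfilerOverlay.cpp` may be structured differently.

```cpp
// Hypothetical helper illustrating the documented thresholds;
// not copied from src/ProfilerOverlay.cpp.
enum class FrameColor { Green, Yellow, Red };

// Map a frame time in milliseconds to one of the overlay's color bands.
inline FrameColor classifyFrameTime(float frameTimeMs) {
    if (frameTimeMs < 16.0f) return FrameColor::Green;    // 60+ FPS
    if (frameTimeMs <= 33.0f) return FrameColor::Yellow;  // 30-60 FPS
    return FrameColor::Red;                               // below 30 FPS
}
```
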
### ScopedTimer (RAII)

Automatic timing for code blocks:

```cpp
#include "Profiler.h"

void expensiveFunction() {
    // functionTime stands in for whichever metric field you are timing
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records duration on destruction
}
```

**How it works:**
- Constructor: Records start time
- Destructor: Calculates elapsed time, adds to metric
- Zero overhead if profiling disabled

**Implementation:** `src/Profiler.h::ScopedTimer`

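For readers unfamiliar with the pattern, the constructor/destructor behavior described above can be sketched with `std::chrono`. This is an illustration under assumptions, not the contents of `src/Profiler.h`: the class name `ScopedTimerSketch` is hypothetical, and it assumes the metric is a `float` in milliseconds that the timer accumulates into.

```cpp
#include <chrono>

// Illustrative RAII timer; the real ScopedTimer in src/Profiler.h may differ.
class ScopedTimerSketch {
public:
    // Remember where to store the result and when timing started.
    explicit ScopedTimerSketch(float& targetMs)
        : target_(targetMs), start_(std::chrono::steady_clock::now()) {}

    // On scope exit (including via exception), add the elapsed
    // milliseconds to the referenced metric.
    ~ScopedTimerSketch() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        target_ += std::chrono::duration<float, std::milli>(elapsed).count();
    }

    // Copying would record the same interval twice, so forbid it.
    ScopedTimerSketch(const ScopedTimerSketch&) = delete;
    ScopedTimerSketch& operator=(const ScopedTimerSketch&) = delete;

private:
    float& target_;
    std::chrono::steady_clock::time_point start_;
};
```

Because the destructor adds to the metric rather than overwriting it, several timed scopes in one frame sum into the same field. That only yields a per-frame figure if the field is reset each frame.
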
### ProfilingMetrics Struct

Centralized metrics collection:

```cpp
struct ProfilingMetrics {
    float frameTime = 0.0f;         // Current frame time (ms)
    float avgFrameTime = 0.0f;      // 60-frame average
    int fps = 0;                    // Frames per second

    // Detailed timing
    float gridRenderTime = 0.0f;    // Grid rendering
    float entityRenderTime = 0.0f;  // Entity rendering
    float fovOverlayTime = 0.0f;    // FOV overlay
    float pythonScriptTime = 0.0f;  // Python callbacks
    float animationTime = 0.0f;     // Animation updates

    // Per-frame counters
    int gridCellsRendered = 0;
    int entitiesRendered = 0;
    int totalEntities = 0;
    int drawCalls = 0;

    void resetPerFrame();           // Call at start of frame
};
```

**Usage:** Access via `Resources::game->metrics`

**Implementation:** `src/GameEngine.h::ProfilingMetrics`

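One way to maintain the 60-frame average behind `avgFrameTime` and `fps` is a fixed-size ring buffer with a running sum. The sketch below assumes that design. `FrameTimeWindow` and its members are hypothetical names and are not taken from `src/GameEngine.h`.

```cpp
#include <array>
#include <cstddef>

// Hypothetical rolling window over the last 60 frame times.
class FrameTimeWindow {
public:
    static constexpr std::size_t kWindow = 60;

    // Record one frame time (ms), overwriting the oldest sample.
    void push(float frameTimeMs) {
        sum_ -= samples_[next_];
        samples_[next_] = frameTimeMs;
        sum_ += frameTimeMs;
        next_ = (next_ + 1) % kWindow;
        if (count_ < kWindow) ++count_;
    }

    // Mean of the samples recorded so far (0 if none yet).
    float averageMs() const {
        return count_ == 0 ? 0.0f : sum_ / static_cast<float>(count_);
    }

    // Frames per second implied by the average, rounded to nearest.
    int fps() const {
        const float avg = averageMs();
        return avg > 0.0f ? static_cast<int>(1000.0f / avg + 0.5f) : 0;
    }

private:
    std::array<float, kWindow> samples_{};  // zero-initialized
    float sum_ = 0.0f;
    std::size_t next_ = 0;
    std::size_t count_ = 0;
};
```

The running sum keeps `push` constant-time. Over very long sessions a float running sum can drift slightly, so periodically recomputing it from the samples is a reasonable safeguard.
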
## Optimization Workflow

See [[Performance-Optimization-Workflow]] for detailed guide.

**Quick workflow:**
1. **Profile**: Press F3, identify bottleneck
2. **Instrument**: Add ScopedTimers around suspect code
3. **Benchmark**: Run `tests/benchmark_*.py` for baseline
4. **Optimize**: Make targeted changes
5. **Verify**: Re-run benchmark, check improvement
6. **Iterate**: Repeat until acceptable performance

## Performance Benchmarks

### Benchmark Suite

**Files:**
- `tests/benchmark_static_grid.py` - Grid rendering performance
- `tests/benchmark_moving_entities.py` - Entity rendering + movement

**Running benchmarks:**
```bash
cd build
./mcrogueface --headless --exec ../tests/benchmark_static_grid.py
```

**Output:** Frame time statistics, entity counts, render metrics

### Current Performance Characteristics

**Grid Rendering:**
- Static 100x100 grid: ~3-5ms (optimized with culling)
- Large grids (200x200): Can exceed 16ms without optimization
- Bottleneck: Redrawing unchanged cells every frame

**Entity Rendering:**
- < 100 entities: < 1ms (with culling)
- 1000 entities: ~5-10ms (needs SpatialHash optimization)
- Bottleneck: O(n) iteration, no spatial indexing

**Python Callbacks:**
- Minimal overhead if callbacks short (< 1ms typical)
- Bottleneck: Heavy computation in Python update loops

## Planned Optimizations

### Tier 1 (Active Development)

**[#116](../../issues/116): Dirty Flag System**
- Problem: Static grids redrawn every frame wastefully
- Solution: Track changes, only redraw when modified
- Expected: 10-50x improvement for static scenes

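The dirty-flag idea can be illustrated independently of the engine. The following is a minimal sketch, not the planned McRogueFace implementation: `CachedGridSketch` is a hypothetical class, and a rebuild counter stands in for the expensive redraw.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration of the dirty-flag pattern from #116.
class CachedGridSketch {
public:
    CachedGridSketch(std::size_t width, std::size_t height)
        : width_(width), cells_(width * height, 0) {}

    // A write marks the cached render stale only if the value changes.
    void setCell(std::size_t x, std::size_t y, int tile) {
        int& cell = cells_[y * width_ + x];
        if (cell != tile) {
            cell = tile;
            dirty_ = true;
        }
    }

    // Redraw only when something changed; returns true if a rebuild ran.
    bool render() {
        if (!dirty_) return false;  // reuse the previous frame's result
        ++rebuildCount_;            // stand-in for the expensive redraw
        dirty_ = false;
        return true;
    }

    int rebuildCount() const { return rebuildCount_; }

private:
    std::size_t width_;
    std::vector<int> cells_;
    bool dirty_ = true;  // the first frame always draws
    int rebuildCount_ = 0;
};
```

With this pattern a static scene pays for one redraw and then skips the work on every later frame, which is where the expected improvement for static scenes would come from.
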
**[#115](../../issues/115): SpatialHash**
- Problem: O(n) entity iteration for spatial queries
- Solution: Hash grid for average O(1) entity lookups by position
- Expected: 100x+ improvement for large entity counts

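A position-keyed hash map is one simple way to get constant-time average lookups. The sketch below assumes integer cell coordinates and integer entity IDs. `SpatialHashSketch` and its methods are hypothetical and do not describe the design tracked in #115.

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical spatial hash keyed by integer cell position.
class SpatialHashSketch {
public:
    using EntityId = int;

    void insert(EntityId id, int x, int y) {
        buckets_[key(x, y)].push_back(id);
    }

    // Remove an entity from the cell it was inserted at.
    void remove(EntityId id, int x, int y) {
        auto it = buckets_.find(key(x, y));
        if (it == buckets_.end()) return;
        auto& ids = it->second;
        ids.erase(std::remove(ids.begin(), ids.end(), id), ids.end());
        if (ids.empty()) buckets_.erase(it);
    }

    // Entities at exactly (x, y): one hash lookup instead of a full scan.
    std::vector<EntityId> at(int x, int y) const {
        auto it = buckets_.find(key(x, y));
        return it == buckets_.end() ? std::vector<EntityId>{} : it->second;
    }

private:
    // Pack both coordinates into one 64-bit key (negative values included).
    static std::uint64_t key(int x, int y) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(x)) << 32) |
               static_cast<std::uint32_t>(y);
    }

    std::unordered_map<std::uint64_t, std::vector<EntityId>> buckets_;
};
```

A moving entity must be removed from its old cell and inserted at its new one, so this trades a small cost per move for much cheaper queries.
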
**[#113](../../issues/113): Batch Operations**
- Problem: Python/C++ boundary crossings for individual cells
- Solution: NumPy-style batch operations
- Expected: 10-100x improvement for bulk updates

**[#117](../../issues/117): Memory Pool**
- Problem: Entity allocation/deallocation overhead
- Solution: Pre-allocated entity pool, recycling
- Expected: Reduced memory fragmentation, faster spawning

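The pre-allocation and recycling idea can be sketched with a fixed slot array and a free list. This is an assumption-laden illustration: `PooledEntity` and `EntityPoolSketch` are hypothetical types, and the real entity class and the design in #117 may look quite different.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an entity; not the engine's entity class.
struct PooledEntity {
    int x = 0;
    int y = 0;
    bool alive = false;
};

// Fixed-capacity pool: all slots are allocated once, then recycled.
class EntityPoolSketch {
public:
    explicit EntityPoolSketch(std::size_t capacity) : slots_(capacity) {
        freeList_.reserve(capacity);
        // Push indices in reverse so slot 0 is handed out first.
        for (std::size_t i = capacity; i > 0; --i) freeList_.push_back(i - 1);
    }

    // Returns a recycled slot, or nullptr when the pool is exhausted.
    PooledEntity* spawn(int x, int y) {
        if (freeList_.empty()) return nullptr;
        PooledEntity& e = slots_[freeList_.back()];
        freeList_.pop_back();
        e.x = x;
        e.y = y;
        e.alive = true;
        return &e;
    }

    // Return a slot to the pool; ignores null and already-dead entities.
    void despawn(PooledEntity* e) {
        if (e == nullptr || !e->alive) return;
        e->alive = false;
        freeList_.push_back(static_cast<std::size_t>(e - slots_.data()));
    }

    std::size_t available() const { return freeList_.size(); }

private:
    std::vector<PooledEntity> slots_;    // never resized, so pointers stay valid
    std::vector<std::size_t> freeList_;  // indices of unused slots
};
```
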
## Common Performance Issues

### Issue: Low FPS on Static Screens

**Symptom:** Red FPS counter, high grid render time, nothing moving

**Cause:** Grid redrawing unchanged cells

**Solution:**
- Short-term: Reduce grid size
- Long-term: Wait for [#116](../../issues/116) (Dirty flags)

### Issue: Slow Entity Queries

**Symptom:** High Python script time when finding nearby entities

**Cause:** O(n) iteration through all entities

**Solution:**
- Short-term: Limit entity counts (< 100)
- Long-term: Wait for [#115](../../issues/115) (SpatialHash)

### Issue: Frame Drops During Bulk Updates

**Symptom:** Yellow/red FPS when updating many grid cells from Python

**Cause:** Python/C++ boundary crossings

**Solution:**
- Short-term: Batch updates manually in C++
- Long-term: Wait for [#113](../../issues/113) (Batch operations)

## Instrumentation Examples

### Adding Metrics to New Systems

```cpp
// In system's update() or render() function.
// mySystemTime and itemsRendered are example fields; add them to
// ProfilingMetrics before using them.
void MySystem::render() {
    ScopedTimer timer(Resources::game->metrics.mySystemTime);

    // Your rendering code
    for (auto& item : items) {
        item->render();
        Resources::game->metrics.itemsRendered++;
    }
}
```

### Adding New Metric Fields

1. Add field to `ProfilingMetrics` in `src/GameEngine.h`
2. Reset in `resetPerFrame()` if per-frame counter
3. Display in `src/ProfilerOverlay.cpp::update()`
4. Instrument code with ScopedTimer

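Steps 1 and 2 might look like the following for the `mySystemTime` and `itemsRendered` fields used in the example above. This is a cut-down sketch: `ProfilingMetricsSketch` is a hypothetical stand-in for the real struct, and resetting the timing field each frame is an assumption based on ScopedTimer adding to its metric.

```cpp
// Hypothetical, reduced version of ProfilingMetrics showing only the
// two new fields; the real struct lives in src/GameEngine.h.
struct ProfilingMetricsSketch {
    float mySystemTime = 0.0f;  // step 1: new timing field (ms)
    int itemsRendered = 0;      // step 1: new per-frame counter

    // Step 2: clear per-frame values at the start of each frame so that
    // accumulated timings and counts do not leak across frames.
    void resetPerFrame() {
        mySystemTime = 0.0f;
        itemsRendered = 0;
    }
};
```
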
## Related Systems

- [[Grid-System]] - Grid rendering instrumented with ScopedTimer
- [[Performance-Optimization-Workflow]] - Detailed optimization process
- [[Writing-Tests]] - Performance test creation

## Design Decisions

**Why F3 Toggle?**
- Standard in game development (Minecraft, Unity, etc.)
- Non-intrusive: Disabled by default
- Real-time feedback during gameplay

**Why RAII ScopedTimer?**
- Automatic cleanup prevents missing timer stops
- Exception-safe timing
- Zero overhead when profiling disabled

**Tradeoffs:**
- Overlay rendering adds small overhead (~0.1ms)
- Metrics collection adds branching (negligible)
- But: Essential visibility for optimization

---

**See Also:**
- Commit e9e9cd2: Profiling system implementation
- [[Performance-Optimization-Workflow]] for step-by-step guide