# Performance and Profiling

Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics, or use the benchmark API to capture detailed timing data to disk.

## Quick Reference

**Related Issues:**

- [#104](../../issues/104) - Basic Profiling/Metrics (Closed - Implemented)
- [#148](../../issues/148) - Dirty Flag RenderTexture Caching (Closed - Implemented)
- [#123](../../issues/123) - Chunk-based Grid Rendering (Closed - Implemented)
- [#115](../../issues/115) - SpatialHash Implementation (Open - Tier 1)
- [#113](../../issues/113) - Batch Operations for Grid (Open - Tier 1)
- [#117](../../issues/117) - Memory Pool for Entities (Open - Tier 1)

**Key Files:**

- `src/Profiler.h` - ScopedTimer RAII helper
- `src/ProfilerOverlay.cpp` - F3 overlay visualization
- `src/GameEngine.h` - ProfilingMetrics struct

---

## Benchmark API

The benchmark API captures detailed per-frame timing data to JSON files. C++ handles all of the timing; Python processes the results afterward.

### Basic Usage

```python
import mcrfpy

# Start capturing benchmark data
mcrfpy.start_benchmark()

# ... run your test scenario ...

# Stop and get the output filename
filename = mcrfpy.end_benchmark()
print(f"Benchmark saved to: {filename}")
# e.g., "benchmark_12345_20250528_143022.json"
```

### Adding Log Messages

Mark specific events within the benchmark:

```python
mcrfpy.start_benchmark()

# Your code...
mcrfpy.log_benchmark("Player spawned")

# More code...
mcrfpy.log_benchmark("Combat started")

filename = mcrfpy.end_benchmark()
```

Log messages appear in the `logs` array of each frame in the output JSON.

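As one example of putting those log entries to use, a small helper can pull every logged event out of a capture, paired with the timing of the frame it landed in. This is a sketch, not part of the engine API; the field names follow the Output Format section below.

```python
import json

def extract_events(filename):
    """Collect (frame_number, frame_time_ms, message) tuples for every
    logged event in a benchmark capture file."""
    with open(filename) as f:
        data = json.load(f)

    events = []
    for frame in data["frames"]:
        # "logs" holds messages recorded via log_benchmark() during this frame
        for message in frame.get("logs", []):
            events.append((frame["frame_number"], frame["frame_time_ms"], message))
    return events
```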
### Output Format

The JSON file contains per-frame data:

```json
{
  "frames": [
    {
      "frame_number": 1,
      "frame_time_ms": 12.5,
      "grid_render_time_ms": 8.2,
      "entity_render_time_ms": 2.1,
      "python_time_ms": 1.8,
      "logs": ["Player spawned"]
    },
    ...
  ],
  "summary": {
    "total_frames": 1000,
    "avg_frame_time_ms": 14.2,
    "max_frame_time_ms": 28.5,
    "min_frame_time_ms": 8.1
  }
}
```

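The per-subsystem fields make it easy to see where frame time is actually going. A minimal sketch (assuming the field names above; this helper is illustrative, not an engine API) that averages each timed subsystem across a capture:

```python
import json

def subsystem_breakdown(filename):
    """Average each timed subsystem across all frames, so the dominant
    cost (grid vs. entity vs. Python) stands out."""
    with open(filename) as f:
        frames = json.load(f)["frames"]

    keys = ["grid_render_time_ms", "entity_render_time_ms", "python_time_ms"]
    # Missing fields default to 0.0 so partial captures still analyze cleanly
    return {k: sum(fr.get(k, 0.0) for fr in frames) / len(frames) for k in keys}
```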
### Processing Results

Because Python processes results *after* capture, analysis overhead doesn't skew the measurements:

```python
import json

def analyze_benchmark(filename):
    with open(filename) as f:
        data = json.load(f)

    frames = data["frames"]
    slow_frames = [fr for fr in frames if fr["frame_time_ms"] > 16.67]

    print(f"Total frames: {len(frames)}")
    print(f"Slow frames (>16.67ms): {len(slow_frames)}")
    print(f"Average: {data['summary']['avg_frame_time_ms']:.2f}ms")

    # Find what was happening during slow frames
    for frame in slow_frames[:5]:
        print(f"  Frame {frame['frame_number']}: {frame['frame_time_ms']:.1f}ms")
        if frame.get("logs"):
            print(f"    Logs: {frame['logs']}")
```

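Averages can hide stutter; percentiles surface it. A hedged sketch (again just a post-processing helper, using the documented `frame_time_ms` field) computing nearest-rank percentiles of frame time:

```python
import json
import math

def frame_time_percentiles(filename, percentiles=(50, 95, 99)):
    """Nearest-rank percentiles of frame time; p95/p99 expose the worst
    frames that an average smooths over."""
    with open(filename) as f:
        times = sorted(fr["frame_time_ms"] for fr in json.load(f)["frames"])

    def rank(p):
        # Nearest-rank: the value at ceil(p% of n), 1-indexed
        return times[max(0, math.ceil(p / 100 * len(times)) - 1)]

    return {p: rank(p) for p in percentiles}
```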
---

## F3 Profiler Overlay

**Activation:** Press F3 during gameplay

**Displays:**
- Frame time (ms) with color coding:
  - Green: < 16ms (60+ FPS)
  - Yellow: 16-33ms (30-60 FPS)
  - Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
  - Grid rendering time
  - Entity rendering time
  - Python script time
  - Animation update time
- Per-frame counts:
  - Grid cells rendered
  - Entities rendered (visible/total)
  - Draw calls

**Implementation:** `src/ProfilerOverlay.cpp`

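The color thresholds are just the frame-time/FPS identity, fps = 1000 / frame_time_ms, evaluated at the 16ms and 33ms boundaries:

```python
def fps(frame_time_ms):
    """Frames per second implied by a per-frame cost in milliseconds."""
    return 1000.0 / frame_time_ms

# 16ms -> 62.5 FPS (green boundary), 33ms -> ~30.3 FPS (red boundary)
```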
---

## Current Performance

### Implemented Optimizations

**Chunk-based Rendering** ([#123](../../issues/123)):
- Large grids divided into chunks (~256 cells each)
- Only visible chunks processed
- 1000x1000+ grids render efficiently

**Dirty Flag Caching** ([#148](../../issues/148)):
- Layers track changes per-chunk
- Unchanged chunks reuse a cached RenderTexture
- Static scenes: near-zero CPU cost after the initial render

**Viewport Culling:**
- Only cells/entities within the viewport processed
- Camera position and zoom respected

### Current Bottlenecks

**Entity Spatial Queries** - O(n) iteration:
- Finding entities at a position requires checking all entities
- Becomes noticeable at 500+ entities
- **Solution:** [#115](../../issues/115) SpatialHash

**Bulk Grid Updates** - Python/C++ boundary:
- Many individual `layer.set()` calls are slower than batch operations
- Each call crosses the Python/C++ boundary
- **Solution:** [#113](../../issues/113) Batch Operations

**Entity Allocation** - Memory fragmentation:
- Frequent spawn/destroy cycles fragment memory
- **Solution:** [#117](../../issues/117) Memory Pool

---

## Optimization Workflow

1. **Profile**: Press F3, identify which metric is high
2. **Benchmark**: Use `start_benchmark()` to capture detailed data
3. **Analyze**: Process the JSON to find patterns in slow frames
4. **Optimize**: Make targeted changes
5. **Verify**: Re-run the benchmark, compare results
6. **Iterate**: Repeat until performance is acceptable

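The verify step boils down to diffing two captures. A sketch of such a comparison (a hypothetical helper, not an engine API; it reads the `summary` block produced by `end_benchmark()`):

```python
import json

def compare_runs(before_file, after_file):
    """Delta in summary stats between two captures; negative values
    mean the optimization helped."""
    def summary(path):
        with open(path) as f:
            return json.load(f)["summary"]

    before, after = summary(before_file), summary(after_file)
    return {key: round(after[key] - before[key], 2)
            for key in ("avg_frame_time_ms", "max_frame_time_ms")}
```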
### Performance Targets

| Metric | Target | Notes |
|--------|--------|-------|
| Frame time | < 16.67ms | 60 FPS |
| Grid render | < 5ms | For typical game grids |
| Entity render | < 2ms | For < 200 entities |
| Python callbacks | < 2ms | Keep logic light |

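Targets like these can be enforced mechanically against a capture. A minimal sketch of a budget check, assuming only the documented `summary` fields (the helper itself is hypothetical):

```python
import json

def check_frame_budget(filename, frame_budget_ms=16.67):
    """True when the capture's average frame time stays inside the
    60 FPS frame-time budget."""
    with open(filename) as f:
        summary = json.load(f)["summary"]
    return summary["avg_frame_time_ms"] < frame_budget_ms
```

Hooked into a test script, this turns a performance regression into a failing assertion instead of a subjective judgment.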
---

## C++ Instrumentation

### ScopedTimer (RAII)

Automatic timing for code blocks:

```cpp
#include "Profiler.h"

void expensiveFunction() {
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records the duration on destruction
}
```

### Adding New Metrics

1. Add a field to `ProfilingMetrics` in `src/GameEngine.h`
2. Reset it in `resetPerFrame()` if it is a per-frame counter
3. Display it in `src/ProfilerOverlay.cpp::update()`
4. Instrument the code with `ScopedTimer`

---

## Related Systems

- [[Grid-Rendering-Pipeline]] - Chunk caching and dirty flags
- [[Entity-Management]] - Entity performance considerations
- [[Writing-Tests]] - Performance test creation

---

*Last updated: 2025-11-29*