Update "Performance-and-Profiling"

John McCardle 2025-11-29 23:30:31 +00:00
parent 86ddc7d383
commit ac23c3b889
1 changed files with 226 additions and 0 deletions

@ -0,0 +1,226 @@
# Performance and Profiling
Performance monitoring and optimization infrastructure for McRogueFace. Press F3 in-game to see real-time metrics, or use the benchmark API to capture detailed timing data to disk.
## Quick Reference
**Related Issues:**
- [#104](../../issues/104) - Basic Profiling/Metrics (Closed - Implemented)
- [#148](../../issues/148) - Dirty Flag RenderTexture Caching (Closed - Implemented)
- [#123](../../issues/123) - Chunk-based Grid Rendering (Closed - Implemented)
- [#115](../../issues/115) - SpatialHash Implementation (Open - Tier 1)
- [#113](../../issues/113) - Batch Operations for Grid (Open - Tier 1)
- [#117](../../issues/117) - Memory Pool for Entities (Open - Tier 1)
**Key Files:**
- `src/Profiler.h` - ScopedTimer RAII helper
- `src/ProfilerOverlay.cpp` - F3 overlay visualization
- `src/GameEngine.h` - ProfilingMetrics struct
---
## Benchmark API
The benchmark API captures detailed per-frame timing data to JSON files. All timing happens in C++; Python only processes the results afterward.
### Basic Usage
```python
import mcrfpy
# Start capturing benchmark data
mcrfpy.start_benchmark()
# ... run your test scenario ...
# Stop and get the output filename
filename = mcrfpy.end_benchmark()
print(f"Benchmark saved to: {filename}")
# e.g., "benchmark_12345_20250528_143022.json"
```
### Adding Log Messages
Mark specific events within the benchmark:
```python
mcrfpy.start_benchmark()
# Your code...
mcrfpy.log_benchmark("Player spawned")
# More code...
mcrfpy.log_benchmark("Combat started")
filename = mcrfpy.end_benchmark()
```
Log messages appear in the `logs` array of each frame in the output JSON.
### Output Format
The JSON file contains per-frame data:
```json
{
  "frames": [
    {
      "frame_number": 1,
      "frame_time_ms": 12.5,
      "grid_render_time_ms": 8.2,
      "entity_render_time_ms": 2.1,
      "python_time_ms": 1.8,
      "logs": ["Player spawned"]
    },
    ...
  ],
  "summary": {
    "total_frames": 1000,
    "avg_frame_time_ms": 14.2,
    "max_frame_time_ms": 28.5,
    "min_frame_time_ms": 8.1
  }
}
```
### Processing Results
Since Python processes results *after* capture, analysis overhead doesn't skew the measurements:
```python
import json

def analyze_benchmark(filename):
    with open(filename) as f:
        data = json.load(f)

    frames = data["frames"]
    slow_frames = [fr for fr in frames if fr["frame_time_ms"] > 16.67]

    print(f"Total frames: {len(frames)}")
    print(f"Slow frames (>16.67ms): {len(slow_frames)}")
    print(f"Average: {data['summary']['avg_frame_time_ms']:.2f}ms")

    # Find what was happening during slow frames
    for frame in slow_frames[:5]:
        print(f"  Frame {frame['frame_number']}: {frame['frame_time_ms']:.1f}ms")
        if frame.get("logs"):
            print(f"    Logs: {frame['logs']}")
```
---
## F3 Profiler Overlay
**Activation:** Press F3 during gameplay
**Displays:**
- Frame time (ms) with color coding:
- Green: < 16ms (60+ FPS)
- Yellow: 16-33ms (30-60 FPS)
- Red: > 33ms (< 30 FPS)
- FPS (averaged over 60 frames)
- Detailed breakdowns:
- Grid rendering time
- Entity rendering time
- Python script time
- Animation update time
- Per-frame counts:
- Grid cells rendered
- Entities rendered (visible/total)
- Draw calls
**Implementation:** `src/ProfilerOverlay.cpp`
---
## Current Performance
### Implemented Optimizations
**Chunk-based Rendering** ([#123](../../issues/123)):
- Large grids divided into chunks (~256 cells each)
- Only visible chunks processed
- 1000x1000+ grids render efficiently
**Dirty Flag Caching** ([#148](../../issues/148)):
- Layers track changes per-chunk
- Unchanged chunks reuse cached RenderTexture
- Static scenes: near-zero CPU cost after initial render (pattern sketched below)
**Viewport Culling:**
- Only cells/entities within viewport processed
- Camera position and zoom respected
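To make the chunk/dirty-flag pattern concrete, here is a toy Python model of the idea. It is an illustration only; the engine's real implementation uses cached C++ RenderTextures (see [[Grid-Rendering-Pipeline]]):
```python
# Toy model of chunked rendering with per-chunk dirty flags (illustration
# only; the engine does this in C++ with cached RenderTextures).
class Chunk:
    def __init__(self):
        self.dirty = True   # must render at least once
        self.cache = None   # stands in for a cached RenderTexture

    def render(self, draw_cells):
        if self.dirty:                # redraw only when something changed
            self.cache = draw_cells()
            self.dirty = False
        return self.cache             # static chunks reuse the cached result

def render_visible(chunks, visible_ids, draw_cells):
    # Viewport culling: off-screen chunks are never touched at all.
    return [chunks[i].render(draw_cells) for i in visible_ids]
```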
### Current Bottlenecks
**Entity Spatial Queries** - O(n) iteration:
- Finding entities at a position requires checking every entity
- Becomes noticeable at 500+ entities
- **Solution:** [#115](../../issues/115) SpatialHash
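The data structure [#115](../../issues/115) proposes is a hash from cell coordinates to the entities in that cell. A minimal Python sketch of the idea (the actual fix will live in C++, not script code):
```python
# Minimal spatial-hash sketch: bucket entities by cell so point queries
# are O(1) instead of O(n). Illustrates the idea behind #115 only.
from collections import defaultdict

class SpatialHash:
    def __init__(self):
        self._buckets = defaultdict(set)

    def insert(self, entity, x, y):
        self._buckets[(x, y)].add(entity)

    def move(self, entity, old_xy, new_xy):
        self._buckets[old_xy].discard(entity)
        self._buckets[new_xy].add(entity)

    def entities_at(self, x, y):
        return self._buckets.get((x, y), set())
```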
**Bulk Grid Updates** - Python/C++ boundary:
- Many individual `layer.set()` calls are slower than batch operations
- Each call crosses the Python/C++ boundary
- **Solution:** [#113](../../issues/113) Batch Operations
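To see this cost directly, time a burst of single-cell writes. A hypothetical harness, assuming a `layer` object with the `set()` method mentioned above (the exact `(x, y, value)` signature is illustrative):
```python
import time

def time_single_writes(layer, width, height, value):
    # Each layer.set() call is one Python/C++ boundary crossing.
    start = time.perf_counter()
    for y in range(height):
        for x in range(width):
            layer.set(x, y, value)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{width * height} set() calls took {elapsed_ms:.1f}ms")
```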
**Entity Allocation** - Memory fragmentation:
- Frequent spawn/destroy cycles fragment memory
- **Solution:** [#117](../../issues/117) Memory Pool
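Until [#117](../../issues/117) lands, scripts can sidestep some of this churn by recycling entities instead of destroying them. A generic pool sketch (hypothetical helper, not part of mcrfpy):
```python
# Generic object pool: reuse objects instead of allocating fresh ones.
# The engine-side fix (#117) applies the same idea in C++.
class Pool:
    def __init__(self, factory, reset):
        self._factory = factory  # creates a new object when the pool is empty
        self._reset = reset      # re-initializes a recycled object
        self._free = []

    def acquire(self, *args):
        obj = self._free.pop() if self._free else self._factory()
        self._reset(obj, *args)
        return obj

    def release(self, obj):
        self._free.append(obj)
```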
---
## Optimization Workflow
1. **Profile**: Press F3, identify which metric is high
2. **Benchmark**: Use `start_benchmark()` to capture detailed data
3. **Analyze**: Process JSON to find patterns in slow frames
4. **Optimize**: Make targeted changes
5. **Verify**: Re-run benchmark, compare results (see the comparison script below)
6. **Iterate**: Repeat until acceptable performance
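For step 5, a small comparison script using only the documented summary fields (the file paths are whatever `end_benchmark()` returned on each run):
```python
import json

def summary(filename):
    with open(filename) as f:
        return json.load(f)["summary"]

def compare(baseline_file, current_file):
    # Verify a change against a saved baseline benchmark.
    before, after = summary(baseline_file), summary(current_file)
    delta = after["avg_frame_time_ms"] - before["avg_frame_time_ms"]
    print(f"avg: {before['avg_frame_time_ms']:.2f}ms -> "
          f"{after['avg_frame_time_ms']:.2f}ms ({delta:+.2f}ms)")
    print(f"max: {before['max_frame_time_ms']:.2f}ms -> "
          f"{after['max_frame_time_ms']:.2f}ms")
```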
### Performance Targets
| Metric | Target | Notes |
|--------|--------|-------|
| Frame time | < 16.67ms | 60 FPS |
| Grid render | < 5ms | For typical game grids |
| Entity render | < 2ms | For < 200 entities |
| Python callbacks | < 2ms | Keep logic light |
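These targets can be checked mechanically against a capture. A sketch using the per-frame fields from the output format above (thresholds mirror the table):
```python
import json

# Per-frame benchmark fields paired with the targets from the table.
TARGETS_MS = {
    "frame_time_ms": 16.67,
    "grid_render_time_ms": 5.0,
    "entity_render_time_ms": 2.0,
    "python_time_ms": 2.0,
}

def check_targets(filename):
    with open(filename) as f:
        frames = json.load(f)["frames"]
    for metric, limit in TARGETS_MS.items():
        misses = sum(1 for fr in frames if fr.get(metric, 0) > limit)
        status = "OK " if misses == 0 else "MISS"
        print(f"{status} {metric}: {misses}/{len(frames)} frames over {limit}ms")
```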
---
## C++ Instrumentation
### ScopedTimer (RAII)
Automatic timing for code blocks:
```cpp
#include "Profiler.h"
void expensiveFunction() {
    ScopedTimer timer(Resources::game->metrics.functionTime);
    // ... code to measure ...
    // Timer automatically records duration on destruction
}
```
### Adding New Metrics
1. Add field to `ProfilingMetrics` in `src/GameEngine.h`
2. Reset in `resetPerFrame()` if per-frame counter
3. Display in `src/ProfilerOverlay.cpp::update()`
4. Instrument code with ScopedTimer
---
## Related Systems
- [[Grid-Rendering-Pipeline]] - Chunk caching and dirty flags
- [[Entity-Management]] - Entity performance considerations
- [[Writing-Tests]] - Performance test creation
---
*Last updated: 2025-11-29*