Video pipeline, Modal integration, and insights
AI Video Analysis
Surflink's AI video analysis is powered by a Modal serverless backend running the SurfVision model. The system processes surf session footage to detect surfers, track movement, classify maneuvers, and generate structured analytics.
Video Processing Pipeline
1. Upload
The coach uploads video files through the /upload page. Files can come from:
- Direct file upload (drag-and-drop or file picker)
- Frame.io import (browsing connected accounts)
Before upload, files are optionally compressed client-side using FFmpeg WASM to reduce file size and upload time.
2. Submit to Modal
Each clip is sent to the Modal API endpoint:
POST $NEXT_PUBLIC_MODAL_API_URL/analyze
The API returns a job_id for tracking the processing job.
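The submit step can be sketched in TypeScript as below. The multipart `file` field and the JSON `{ job_id }` response shape are assumptions; only the endpoint path and the returned `job_id` are documented above.

```typescript
// Sketch of the clip-submit step. Assumes a multipart "file" field and a
// JSON { job_id } response -- the real request shape may differ.
const MODAL_API_URL = process.env.NEXT_PUBLIC_MODAL_API_URL ?? "";

export function analyzeUrl(base: string): string {
  // Normalize trailing slashes so the endpoint is always `<base>/analyze`.
  return `${base.replace(/\/+$/, "")}/analyze`;
}

export async function submitClip(file: Blob): Promise<string> {
  const body = new FormData();
  body.append("file", file);
  const res = await fetch(analyzeUrl(MODAL_API_URL), { method: "POST", body });
  if (!res.ok) throw new Error(`Analyze request failed: ${res.status}`);
  const { job_id } = await res.json();
  return job_id; // used to poll /status/{job_id}
}
```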
3. Processing
The Modal backend runs the SurfVision model which:
- Detects and tracks individual surfers across frames
- Classifies actions/maneuvers per surfer per frame
- Groups detected actions into wave rides
- Generates an annotated output video with overlays
- Re-encodes the output to H.264 via FFmpeg for browser playback
- Produces a structured JSON stats file
Progress is written every 30 frames during analysis. After frame analysis completes, the backend writes a "Re-encoding video..." stage at 98% during FFmpeg re-encode, then "Complete" at 100% before saving the final output. This ensures the UI always reflects the current processing stage rather than appearing stuck.
4. Polling
The frontend polls for progress:
GET $NEXT_PUBLIC_MODAL_API_URL/status/{job_id}
For multi-clip sessions, all clips are polled in parallel every 3 seconds using Promise.allSettled. The session detail page shows per-clip processing indicators with progress bars, stage labels, and ETA until each job completes.
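The parallel poll described above can be sketched as follows. The `JobStatus` shape and the linear ETA heuristic are assumptions, since the status payload fields are not documented here.

```typescript
// Sketch of one polling tick across all clips. The JobStatus fields are
// assumed; only the /status/{job_id} endpoint is documented.
export type JobStatus = {
  job_id: string;
  progress: number; // 0-100
  stage: string;    // e.g. "Re-encoding video..."
  done: boolean;
};

export function etaSeconds(progress: number, elapsedSeconds: number): number | null {
  // Simple linear ETA: assumes the remaining work proceeds at the observed rate.
  if (progress <= 0 || progress >= 100) return null;
  return (elapsedSeconds / progress) * (100 - progress);
}

export async function pollAll(base: string, jobIds: string[]): Promise<JobStatus[]> {
  const results = await Promise.allSettled(
    jobIds.map(async (id) => {
      const res = await fetch(`${base}/status/${id}`);
      if (!res.ok) throw new Error(`status ${res.status}`);
      return (await res.json()) as JobStatus;
    })
  );
  // Keep the clips that responded; failed polls simply retry on the next
  // 3-second tick, so one slow clip never blocks the others.
  return results.flatMap((r) => (r.status === "fulfilled" ? [r.value] : []));
}
```

Using `Promise.allSettled` (rather than `Promise.all`) is what lets one clip's failed request leave the other clips' progress bars updating.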
5. CDN Transfer (Storage Caching)
Once analysis completes, a centralized CDN queue manages downloading processed videos and uploading them to Supabase Storage:
- All callers (poll loop, page-load recovery, manual retry) push to a single FIFO queue backed by refs
- Concurrency is limited to MAX_CDN_CONCURRENCY = 2 workers globally -- no competing queues
- Local GPU clips: triggerClipStoreLocal downloads the video from the local server into the browser and uploads it to Supabase Storage
- Cloud (Modal) clips: triggerClipStoreCloud calls /api/clips/store, which downloads from Modal server-side and uploads to Supabase Storage
- Automatic retry: transient failures (timeouts, network errors) are retried up to 3 times with exponential backoff (2s, 4s, 8s) before the clip is marked cdn_error
- Stats are fetched alongside the video and saved to the stats_json column on the clip/session record
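The queue mechanics above (single FIFO, two global workers, 2s/4s/8s backoff) can be sketched like this. The `Task` shape and function names other than `MAX_CDN_CONCURRENCY` are illustrative, not Surflink's actual identifiers.

```typescript
// Sketch of the CDN transfer queue: one FIFO, at most two in-flight
// transfers, and exponential backoff before giving up on a task.
const MAX_CDN_CONCURRENCY = 2;
const MAX_ATTEMPTS = 3;

export function backoffMs(attempt: number): number {
  // attempt 1 -> 2000ms, attempt 2 -> 4000ms, attempt 3 -> 8000ms
  return 2000 * 2 ** (attempt - 1);
}

type Task = () => Promise<void>;
const queue: Task[] = [];
let active = 0;

export function enqueue(task: Task): void {
  // All callers (poll loop, page-load recovery, manual retry) funnel here.
  queue.push(task);
  drain();
}

function drain(): void {
  while (active < MAX_CDN_CONCURRENCY && queue.length > 0) {
    const task = queue.shift()!;
    active++;
    runWithRetry(task).finally(() => {
      active--;
      drain(); // pull the next queued transfer when a worker frees up
    });
  }
}

async function runWithRetry(task: Task): Promise<void> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await task();
      return;
    } catch {
      if (attempt === MAX_ATTEMPTS) return; // caller marks cdn_error
      await new Promise((r) => setTimeout(r, backoffMs(attempt)));
    }
  }
}
```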
6. Streaming
Processed videos are served through the /api/video/stream proxy endpoint, which supports byte-range requests for efficient scrubbing. The proxy allowlists Modal, Supabase, and Frame.io hosts.
Local GPU clips bypass the stream proxy entirely -- the browser plays directly from localhost:{port}/result/{job_id}/video for immediate playback while CDN transfer runs in the background. Once the CDN transfer completes, playback switches to the Supabase Storage URL.
Access to the stream proxy is protected by HMAC-signed tokens -- clients call /api/video/sign (which requires authentication) to obtain a time-limited signed URL, then pass that to the stream proxy. The middleware bypass for /api/video/stream (required to avoid 431 errors from range request cookies) is compensated by this token verification.
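A minimal sketch of the HMAC token scheme, assuming a `src|exp` payload and `src`/`exp`/`sig` query parameters (the real /api/video/sign format is not specified above):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign: produce a time-limited stream URL. Payload format and query
// parameter names are assumptions for illustration.
export function signStreamUrl(src: string, expiresAtMs: number, secret: string): string {
  const sig = createHmac("sha256", secret).update(`${src}|${expiresAtMs}`).digest("hex");
  return `/api/video/stream?src=${encodeURIComponent(src)}&exp=${expiresAtMs}&sig=${sig}`;
}

// Verify: the stream proxy recomputes the HMAC and checks expiry, which is
// what compensates for the middleware bypass on /api/video/stream.
export function verifyStreamToken(
  src: string,
  expiresAtMs: number,
  sig: string,
  secret: string,
  nowMs: number
): boolean {
  if (nowMs > expiresAtMs) return false; // token expired
  const expected = createHmac("sha256", secret).update(`${src}|${expiresAtMs}`).digest("hex");
  const a = Buffer.from(sig, "hex");
  const b = Buffer.from(expected, "hex");
  // Constant-time comparison avoids leaking the signature via timing.
  return a.length === b.length && timingSafeEqual(a, b);
}
```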
AI Insights Panel
The AIInsightsPanel component renders the analysis results in five tabs:
Overview
- Summary cards: waves detected, surfers tracked, actions classified, session duration
- AI-generated text summary of the session
- Top actions bar chart showing the most frequent maneuvers
Actions Detected
- Expandable list of all classified actions
- Categories: paddling, riding, turning, cutback, aerial, wipeout, duck dive, popup, floater, bottom turn, top turn, tube ride
- Each action entry includes timestamp -- click to seek the video to that moment
Wave Analysis
- Per-wave breakdown with surfer count, duration, and classified actions
- Click-to-seek timestamps for each wave
Surfer Tracking
- Per-surfer statistics: waves ridden, time in water, total actions
- Links surfer track IDs to student profiles via the session_surfers table
- Coaches can assign tracked surfers to specific students
Timeline
- Chronological event timeline spanning the full session
- Color-coded action markers
- Click to seek to any event in the video
Multi-Clip Sessions
Sessions can contain multiple clips (e.g., different camera angles or time segments). The parseStats function in AIInsightsPanel aggregates data across all clips, merging action counts, wave data, surfer tracking, and timeline events into a unified view.
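The action-count merge can be sketched as below. The `ClipStats` shape mirrors the stats JSON, but the function is an illustrative reduction, not Surflink's actual parseStats.

```typescript
// Sketch of cross-clip aggregation: sum action frequencies over every
// clip's stats to drive a unified "top actions" view.
type ClipStats = {
  actions: { type: string }[];
};

export function mergeActionCounts(clips: ClipStats[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const clip of clips) {
    for (const action of clip.actions) {
      counts[action.type] = (counts[action.type] ?? 0) + 1;
    }
  }
  return counts;
}
```

Wave data, surfer tracks, and timeline events merge the same way, except timeline entries also need their timestamps offset by each clip's start time so seeking still lands on the right frame.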
Stats JSON Structure
The AI backend returns a JSON payload with this general structure:
{
  "summary": "AI-generated text summary...",
  "duration_seconds": 180,
  "total_waves": 12,
  "total_surfers": 3,
  "actions": [
    {
      "type": "cutback",
      "surfer_id": 1,
      "start_ms": 45200,
      "end_ms": 47800,
      "confidence": 0.92
    }
  ],
  "waves": [
    {
      "wave_number": 1,
      "start_ms": 10000,
      "end_ms": 25000,
      "surfers": [1, 2],
      "actions": [...]
    }
  ],
  "surfer_tracks": [
    {
      "track_id": 1,
      "waves_ridden": 5,
      "total_actions": 18,
      "time_in_water_seconds": 120
    }
  ],
  "timeline": [
    {
      "timestamp_ms": 45200,
      "type": "cutback",
      "surfer_id": 1
    }
  ]
}
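For frontend consumers such as AIInsightsPanel, the payload above can be typed as a sketch; the real payload may carry optional or additional fields beyond what the example shows.

```typescript
// Type sketch mirroring the stats JSON example. Field names come from the
// payload above; the guard checks only a few representative fields.
export interface SurfAction {
  type: string;
  surfer_id: number;
  start_ms: number;
  end_ms: number;
  confidence: number;
}

export interface SessionStats {
  summary: string;
  duration_seconds: number;
  total_waves: number;
  total_surfers: number;
  actions: SurfAction[];
  waves: {
    wave_number: number;
    start_ms: number;
    end_ms: number;
    surfers: number[];
    actions: SurfAction[];
  }[];
  surfer_tracks: {
    track_id: number;
    waves_ridden: number;
    total_actions: number;
    time_in_water_seconds: number;
  }[];
  timeline: { timestamp_ms: number; type: string; surfer_id: number }[];
}

export function isSessionStats(x: unknown): x is SessionStats {
  // Shallow structural check before trusting a fetched stats_json blob.
  const s = x as SessionStats;
  return (
    typeof s === "object" &&
    s !== null &&
    typeof s.summary === "string" &&
    Array.isArray(s.actions) &&
    Array.isArray(s.timeline)
  );
}
```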