Based on ajsteele/faceHR — the original idea and core algorithm (Eulerian color magnification of skin pixels) come from that project. This repo is a reimplementation as an interactive Gradio web app with automatic skin segmentation powered by Meta's SAM 3 (Segment Anything Model 3).
Built with Claude Code.
Extracts the photoplethysmographic (PPG) signal from video of human skin, then amplifies the subtle color changes caused by blood flow so they become visible to the naked eye. The end goal is to make arterial pulsation visible enough to help locate arteries (e.g. for arterial blood draws).
A sample output video is available in output.mp4.
```sh
uv run app.py
```

This opens a Gradio web UI on `0.0.0.0:7860`. Upload or record a video, click on skin to pick the chroma reference, adjust parameters, and process.
Dependencies (gradio, numpy, scipy, opencv-python-headless, matplotlib, torch, transformers, cupy-cuda12x) are resolved automatically by uv via PEP 723 inline metadata.
- Python >= 3.12
- uv (recommended) or pip
- CUDA-capable GPU (for CuPy and SAM 3 skin segmentation)
- Skin segmentation — Meta's SAM 3 automatically segments skin regions using text-prompted segmentation; alternatively the user clicks a skin pixel and chroma-keying in YUV space classifies skin vs. non-skin.
- Spatial averaging — Mean color of skin pixels per frame produces a raw PPG time-series.
- Temporal bandpass — Two cascaded moving averages (high-pass to remove baseline drift, low-pass to smooth noise) isolate the cardiac frequency band.
- Eulerian color magnification — The filtered PPG signal is amplified and added back onto skin pixels in YUV chrominance space, making blood flow color changes visible.
The user clicks a skin pixel. Its YUV chrominance (U, V components only — luminance Y is discarded) is stored as `skin_chroma`.
For every frame, each pixel is classified as skin or not-skin:
```
distance² = (U_pixel − U_skin)² + (V_pixel − V_skin)²
is_skin   = distance² < chroma_similarity   # default threshold = 100
```

This is a simple Euclidean distance in UV space (squared, to avoid a sqrt). The result is a boolean mask `skin_key`.
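As a minimal NumPy sketch of this chroma-key step (function and array names here are illustrative, not necessarily the repo's identifiers):

```python
import numpy as np

def skin_mask(frame_yuv: np.ndarray, skin_chroma: np.ndarray,
              chroma_similarity: float = 100.0) -> np.ndarray:
    """Boolean mask of pixels whose UV chrominance is close to the reference.

    frame_yuv: (H, W, 3) YUV frame; skin_chroma: (2,) reference [U, V].
    """
    uv = frame_yuv[..., 1:3].astype(np.float64)
    d2 = ((uv - skin_chroma) ** 2).sum(axis=-1)   # squared UV distance, no sqrt
    return d2 < chroma_similarity

# Tiny example: a 1x2 frame where only the first pixel matches the reference
frame = np.array([[[100.0, 120.0, 130.0], [100.0, 200.0, 40.0]]])
mask = skin_mask(frame, np.array([122.0, 128.0]))
# first pixel: d² = 2² + 2² = 8 < 100 → skin; second pixel: d² is huge → not skin
```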
For each frame, the mean color of all skin-keyed pixels is computed:

```
ppg_yuv[i] = mean(frame_yuv[skin_key])   # shape (3,) = [Y, U, V]
ppg_rgb[i] = mean(frame_rgb[skin_key])   # shape (3,) = [B, G, R]
```
Over N frames this produces a 1D time-series per channel: the raw PPG signal. The cardiac pulse modulates skin color (mainly via hemoglobin absorption), so this average tracks blood volume changes.
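The spatial-averaging step can be sketched as follows (a hypothetical helper, not the repo's actual function):

```python
import numpy as np

def raw_ppg(frames_yuv, masks):
    """Per-frame mean YUV colour over skin pixels -> (N, 3) raw PPG series."""
    # frame[mask] selects the skin pixels as an (n_skin, 3) array
    return np.array([f[m].mean(axis=0) for f, m in zip(frames_yuv, masks)])

# Two uniform 2x2 frames with all pixels keyed as skin
frames = [np.full((2, 2, 3), 50.0), np.full((2, 2, 3), 60.0)]
masks = [np.ones((2, 2), bool), np.ones((2, 2), bool)]
ppg = raw_ppg(frames, masks)   # shape (2, 3): one [Y, U, V] mean per frame
```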
The raw PPG is noisy and has a slow-moving baseline (lighting drift, movement). Two moving-average steps clean it:
```
ppg = ppg - moving_average(ppg, n_bg_ma=90)   # high-pass: remove baseline drift
ppg = moving_average(ppg, n_smooth_ma=6)      # low-pass: smooth out noise
```

This is a bandpass filter implemented as two cascaded moving averages:
- Subtracting a wide (90-frame) moving average removes frequencies below ~fps/90 Hz (breathing, lighting changes).
- Applying a narrow (6-frame) moving average suppresses high-frequency noise above ~fps/6 Hz.
The result is then normalized to a target amplitude `delta`:

```
ppg_filtered = delta * ppg / max(|ppg|)
```
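A runnable sketch of the cascaded-moving-average bandpass, assuming a convolution-based moving average (the repo's own `moving_average` may handle edges differently):

```python
import numpy as np

def moving_average(x, n):
    """Moving average via convolution, same length as input.

    Note: np.convolve zero-pads, so expect artifacts near the edges.
    """
    return np.convolve(x, np.ones(n) / n, mode="same")

def bandpass_ppg(ppg, n_bg_ma=90, n_smooth_ma=6, delta=1.0):
    ppg = ppg - moving_average(ppg, n_bg_ma)   # high-pass: remove slow drift
    ppg = moving_average(ppg, n_smooth_ma)     # low-pass: smooth noise
    return delta * ppg / np.abs(ppg).max()     # normalize to ±delta

# Synthetic PPG: baseline + slow drift + a 1.2 Hz cardiac component at 30 fps
fs = 30.0
t = np.arange(300) / fs
raw = 10 + 0.5 * t + np.sin(2 * np.pi * 1.2 * t)
filtered = bandpass_ppg(raw)   # drift removed, peak amplitude = delta
```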
```
ppg_w_ma = mean(ppg_rgb_ma, axis=1)   # average of filtered B, G, R
```

This collapses the three RGB channels into a single scalar per frame — a luminance-like "white" PPG signal. This is what gets amplified back onto the video.
In the second pass over the video, each frame is reconstructed with the PPG signal amplified back onto skin pixels:
```
# For each frame i:
colours_w = [0, skin_key * ppg_w_ma[i], skin_key * ppg_w_ma[i]]  # [Y=0, U=signal, V=signal]
output = frame_yuv + colours_w * 10000  # amplify and add in YUV space
```

Key details:

- The Y (luminance) component is left at 0 — only chrominance (U, V) is boosted. This avoids brightness flicker and emphasizes the color shift caused by blood.
- The multiplier `10000` is the amplification factor.
- Only pixels inside `skin_key` receive the boost (others get 0).
- The result is converted back to BGR for display/saving.
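The reconstruction step can be sketched as follows (names are illustrative; the scalar signal value stands in for `ppg_w_ma[i]`):

```python
import numpy as np

def magnify_frame(frame_yuv, skin_key, signal, amplification=10000.0):
    """Add an amplified scalar PPG value onto skin pixels' U and V channels."""
    out = frame_yuv.astype(np.float64).copy()
    boost = skin_key.astype(np.float64) * signal * amplification
    out[..., 1] += boost   # U
    out[..., 2] += boost   # V
    return out             # Y untouched: no brightness flicker

# One skin pixel in a 2x2 frame; signal 0.001 × 10000 = boost of 10
frame = np.full((2, 2, 3), 100.0)
mask = np.array([[True, False], [False, False]])
out = magnify_frame(frame, mask, 0.001)
```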
This is a simplified form of Eulerian Video Magnification (Wu et al., 2012): instead of spatially decomposing into pyramids, it uses a single spatial region (skin mask) and temporal bandpass (moving averages).
A sliding 256-frame window with Welch's method estimates the power spectral density of the PPG signal. The dominant frequency peak in the 0.8–3 Hz range gives the heart rate in Hz (×60 = BPM). This is implemented but commented out.
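Since this feature is commented out in the repo, here is a minimal sketch of the same idea using SciPy (`estimate_bpm` is a hypothetical helper):

```python
import numpy as np
from scipy.signal import welch

def estimate_bpm(ppg, fs, lo=0.8, hi=3.0, nperseg=256):
    """Heart rate from the dominant Welch PSD peak in the cardiac band."""
    f, pxx = welch(ppg, fs=fs, nperseg=min(nperseg, len(ppg)))
    band = (f >= lo) & (f <= hi)
    return 60.0 * f[band][np.argmax(pxx[band])]   # Hz -> BPM

# Synthetic 1.2 Hz pulse at 30 fps, i.e. 72 BPM (recovered up to PSD bin width)
fs = 30.0
t = np.arange(512) / fs
ppg = np.sin(2 * np.pi * 1.2 * t)
bpm = estimate_bpm(ppg, fs)
```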
```
Video frames
 → YUV conversion
 → Chroma-key skin mask (UV distance)
 → Spatial mean of skin pixels per frame → raw PPG time-series
 → Temporal bandpass (subtract slow MA, apply fast MA) → filtered PPG
 → Average RGB channels → scalar "white" PPG
 → Multiply back onto skin pixels in UV space × 10000
 → Output: video with amplified blood flow color
```
| Parameter | Default | Purpose |
|---|---|---|
| `chroma_similarity` | 100 (= 10²) | Skin detection threshold in UV² space |
| `n_bg_ma` | 90 | High-pass: frames for baseline removal |
| `n_smooth_ma` | 6 | Low-pass: frames for noise smoothing |
| `delta` | 1 | Normalized PPG amplitude |
| Amplification | 10000 | Multiplier when adding signal back to frames |
- Per-pixel PPG instead of spatial average: The current code averages all skin pixels into one scalar. To locate arteries, you need the PPG signal per pixel (or per small patch). Compute `magnify_colour_ma` on a per-pixel or patch-grid basis.
- Narrow bandpass around cardiac frequency: Use ~0.8–2.5 Hz (48–150 BPM). The moving-average bandpass is simple but imprecise — consider a Butterworth or FIR filter for tighter control.
- Amplitude map: The amplitude of the per-pixel PPG correlates with local blood volume pulsation. Arteries will show higher amplitude than veins or capillary beds. Render this as a heatmap overlay.
- Phase map: Arterial sites pulse earlier than venous. A phase-delay map (via cross-correlation or Hilbert transform against a reference PPG) can further disambiguate arteries.
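The phase-map idea could be sketched with SciPy's Hilbert transform as below; this is not implemented in the repo, and all names here are illustrative:

```python
import numpy as np
from scipy.signal import hilbert

def phase_delay_map(pixel_ppg, ref_ppg):
    """Mean wrapped phase of each pixel's PPG relative to a reference, in radians.

    pixel_ppg: (H, W, N) filtered per-pixel PPG; ref_ppg: (N,) reference.
    Negative values mean the pixel pulses later than the reference.
    """
    ref_phase = np.angle(hilbert(ref_ppg))
    px_phase = np.angle(hilbert(pixel_ppg, axis=-1))
    # Wrap the per-sample difference to (-pi, pi], then average over time
    return np.angle(np.exp(1j * (px_phase - ref_phase))).mean(axis=-1)

# Toy "image": pixel (0,0) pulses with the reference, pixel (0,1) 0.1 s later
fs, f0 = 30.0, 1.2
t = np.arange(300) / fs
ref = np.sin(2 * np.pi * f0 * t)
late = np.sin(2 * np.pi * f0 * (t - 0.1))
dphi = phase_delay_map(np.stack([ref, late]).reshape(1, 2, -1), ref)
```

At a known cardiac frequency f0, a lag of `dphi` radians corresponds to `dphi / (2π·f0)` seconds, so earlier-pulsing arterial sites stand out against later venous ones.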
- GPU acceleration: The current code uses CuPy/cuSignal. For a Gradio app, consider whether GPU is available or if NumPy/SciPy suffices for shorter clips.
- Original idea and algorithm: ajsteele/faceHR
- Based on Eulerian Video Magnification (Wu et al., 2012)
- Meta's SAM 3 (Segment Anything Model 3) for automatic skin segmentation (HuggingFace)
