thiswillbeyourgithub/SAM3-Skin-HeartRate

SAM3-Skin-HeartRate — Blood Flow Visualization via Remote Photoplethysmography (rPPG)

Based on ajsteele/faceHR — the original idea and core algorithm (Eulerian color magnification of skin pixels) come from that project. This repo is a reimplementation as an interactive Gradio web app with automatic skin segmentation powered by Meta's SAM 3 (Segment Anything Model 3).

Built with Claude Code.

What It Does

Extracts the photoplethysmographic (PPG) signal from video of human skin, then amplifies the subtle color changes caused by blood flow so they become visible to the naked eye. The end goal is to make arterial pulsation visible enough to help locate arteries (e.g. for arterial blood draws).

Screenshot & Output

Screenshot of the Gradio UI

A sample output video is available in output.mp4.

Quick Start

uv run app.py

This starts a Gradio web UI listening on 0.0.0.0:7860 (on the same machine, open http://localhost:7860 in a browser). Upload or record a video, click on skin to pick the chroma reference, adjust parameters, and process.

Dependencies (gradio, numpy, scipy, opencv-python-headless, matplotlib, torch, transformers, cupy-cuda12x) are resolved automatically by uv via PEP 723 inline metadata.
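For readers unfamiliar with PEP 723, the inline metadata is a specially formatted comment block at the top of the script. The block below is an illustration assembled from the dependency list and Python version stated in this README; the actual header in app.py may differ (for example, it may pin versions).

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "gradio",
#     "numpy",
#     "scipy",
#     "opencv-python-headless",
#     "matplotlib",
#     "torch",
#     "transformers",
#     "cupy-cuda12x",
# ]
# ///
```

With a header like this, uv run creates an isolated environment and installs the listed packages before executing the script.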

Requirements

  • Python >= 3.12
  • uv (recommended) or pip
  • CUDA-capable GPU (for CuPy and SAM 3 skin segmentation)

How It Works

  1. Skin segmentation — Meta's SAM 3 automatically segments skin regions using text-prompted segmentation; alternatively the user clicks a skin pixel and chroma-keying in YUV space classifies skin vs. non-skin.
  2. Spatial averaging — Mean color of skin pixels per frame produces a raw PPG time-series.
  3. Temporal bandpass — Two cascaded moving averages (high-pass to remove baseline drift, low-pass to smooth noise) isolate the cardiac frequency band.
  4. Eulerian color magnification — The filtered PPG signal is amplified and added back onto skin pixels in YUV chrominance space, making blood flow color changes visible.

Detailed Math

1. Skin Segmentation via Chroma Keying

The user clicks a skin pixel. Its YUV chrominance (U, V components only — luminance Y is discarded) is stored as skin_chroma.

For every frame, each pixel is classified as skin or not-skin:

distance² = (U_pixel - U_skin)² + (V_pixel - V_skin)²
is_skin = distance² < chroma_similarity   (default threshold = 100)

This is a simple Euclidean distance in UV space (squared, to avoid a sqrt). The result is a boolean mask skin_key.
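A minimal NumPy sketch of this classification, for illustration only. The function name chroma_key_mask and the synthetic frame are hypothetical; the frame is assumed to already be in Y, U, V channel order (with OpenCV that would typically come from cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YUV)).

```python
import numpy as np

def chroma_key_mask(frame_yuv, skin_chroma, chroma_similarity=100.0):
    """Boolean mask of pixels whose (U, V) lies close to skin_chroma.

    frame_yuv: (H, W, 3) array in Y, U, V order.
    skin_chroma: (U, V) pair sampled from the clicked pixel.
    chroma_similarity: threshold on the *squared* UV distance.
    """
    # Cast to float first: subtracting uint8 values would wrap around.
    uv = frame_yuv[..., 1:3].astype(np.float32)
    diff = uv - np.asarray(skin_chroma, dtype=np.float32)
    dist_sq = np.sum(diff * diff, axis=-1)
    return dist_sq < chroma_similarity

# Synthetic 2x4 frame: left half has skin-like chroma, right half is neutral grey.
frame = np.zeros((2, 4, 3), dtype=np.uint8)
frame[:, :2] = (120, 110, 150)
frame[:, 2:] = (120, 128, 128)

# Use the top-left pixel as the "clicked" reference.
skin_key = chroma_key_mask(frame, skin_chroma=frame[0, 0, 1:3])
```

Because the comparison is on the squared distance, the default threshold of 100 corresponds to a radius of 10 in the UV plane.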

2. Spatial Averaging — Extracting the PPG Signal

For each frame, the mean color of all skin-keyed pixels is computed:

ppg_yuv[i] = mean(frame_yuv[skin_key])   → shape (3,) = [Y, U, V]
ppg_rgb[i] = mean(frame_rgb[skin_key])   → shape (3,) = [B, G, R]

Over N frames this produces a 1D time-series per channel: the raw PPG signal. The cardiac pulse modulates skin color (mainly via hemoglobin absorption), so this average tracks blood volume changes.
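The averaging step can be sketched as follows. The helper names spatial_mean and extract_raw_ppg are hypothetical, and returning NaN for an empty mask is an assumption made here; the README does not say how frames without skin pixels are handled.

```python
import numpy as np

def spatial_mean(frame, skin_key):
    """Mean colour over masked pixels, shape (3,). NaN if the mask is empty."""
    if not skin_key.any():
        return np.full(3, np.nan)
    # Boolean indexing with an (H, W) mask yields a (K, 3) array of skin pixels.
    return frame[skin_key].mean(axis=0)

def extract_raw_ppg(frames, masks):
    """Stack per-frame means into an (N, 3) raw PPG time-series."""
    return np.stack([spatial_mean(f, m) for f, m in zip(frames, masks)])

# Synthetic clip: skin rows brighten by one level per frame, the rest stays black.
mask = np.zeros((4, 4), dtype=bool)
mask[:2] = True
frames = []
for i in range(5):
    f = np.zeros((4, 4, 3), dtype=np.uint8)
    f[:2] = 100 + i
    frames.append(f)

raw_ppg = extract_raw_ppg(frames, [mask] * 5)
```

Only the masked rows contribute, so the background pixels do not dilute the signal.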

3. Temporal Filtering — magnify_colour_ma()

The raw PPG is noisy and has a slow-moving baseline (lighting drift, movement). Two moving-average steps clean it:

ppg = ppg - moving_average(ppg, n_bg_ma=90)   # high-pass: remove baseline drift
ppg = moving_average(ppg, n_smooth_ma=6)       # low-pass: smooth out noise

This is a bandpass filter implemented as two cascaded moving averages:

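  • Subtracting a wide (90-frame) moving average attenuates frequencies below roughly fps/90 Hz, e.g. about 0.33 Hz at 30 fps (breathing, lighting changes).
  • Applying a narrow (6-frame) moving average suppresses high-frequency noise above roughly fps/6 Hz, e.g. about 5 Hz at 30 fps.

Both cutoffs are approximate: an N-frame moving average has its first spectral null at fps/N and its -3 dB point near 0.44·fps/N, with a slow roll-off in between.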

The result is then normalized to a target amplitude delta:

ppg_filtered = delta * ppg / max(|ppg|)
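A self-contained NumPy sketch of these steps on a 1-D signal. This is not the repo's magnify_colour_ma(): the name magnify_colour_ma_sketch, the centred window, and the edge padding are assumptions chosen so that the output has the same length as the input.

```python
import numpy as np

def moving_average(x, n):
    """Moving average of width n, same length as x (edges padded by repetition)."""
    x = np.asarray(x, dtype=np.float64)
    padded = np.pad(x, (n // 2, n - 1 - n // 2), mode="edge")
    return np.convolve(padded, np.ones(n) / n, mode="valid")

def magnify_colour_ma_sketch(ppg, delta=1.0, n_bg_ma=90, n_smooth_ma=6):
    """High-pass, low-pass, then scale so the peak absolute value equals delta."""
    ppg = np.asarray(ppg, dtype=np.float64)
    ppg = ppg - moving_average(ppg, n_bg_ma)   # remove baseline drift
    ppg = moving_average(ppg, n_smooth_ma)     # smooth out noise
    peak = np.max(np.abs(ppg))
    return delta * ppg / peak if peak > 0 else ppg

# Synthetic raw PPG at 30 fps: offset + slow linear drift + 1.2 Hz pulse (72 BPM).
fps = 30
t = np.arange(300) / fps
raw = 100 + 0.5 * t + 0.5 * np.sin(2 * np.pi * 1.2 * t)
filtered = magnify_colour_ma_sketch(raw)

# The strongest spectral component of the result should be the pulse.
freqs = np.fft.rfftfreq(len(filtered), d=1 / fps)
dominant_hz = freqs[np.argmax(np.abs(np.fft.rfft(filtered)))]
```

The first and last n_bg_ma/2 frames are affected by the padding, so very short clips will show edge artefacts.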

4. "White" Channel — ppg_w_ma

ppg_w_ma = mean(ppg_rgb_ma, axis=1)   # average of filtered B, G, R

This collapses the three RGB channels into a single scalar per frame — a luminance-like "white" PPG signal. This is what gets amplified back onto the video.

5. Eulerian Color Magnification (the Re-Rendering)

In the second pass over the video, each frame is reconstructed with the PPG signal amplified back onto skin pixels:

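# For each frame i (frame_yuv as a float array of shape (H, W, 3)):
boost = skin_key * ppg_w_ma[i]                                        # (H, W), zero outside the skin mask
colours_w = np.stack([np.zeros_like(boost), boost, boost], axis=-1)   # [Y=0, U=signal, V=signal]
output = frame_yuv + colours_w * 10000                                # amplify and add in YUV space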

Key details:

  • The Y (luminance) component is left at 0 — only chrominance (U, V) is boosted. This avoids brightness flicker and emphasizes the color shift caused by blood.
  • The multiplier 10000 is the amplification factor.
  • Only pixels inside skin_key receive the boost (others get 0).
  • The result is converted back to BGR for display/saving.

This is a simplified form of Eulerian Video Magnification (Wu et al., 2012): instead of spatially decomposing into pyramids, it uses a single spatial region (skin mask) and temporal bandpass (moving averages).
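The re-rendering step for a single frame can be sketched as below. The function name magnify_frame is hypothetical, and the rounding and clipping to the 0–255 range are assumptions: this README does not state what value range the frames and the PPG signal use, so the example picks a small PPG value that keeps the result within 8 bits.

```python
import numpy as np

def magnify_frame(frame_yuv, skin_key, ppg_value, gain=10000.0):
    """Add gain * ppg_value to the U and V channels of skin pixels only."""
    out = frame_yuv.astype(np.float32)
    boost = skin_key.astype(np.float32) * ppg_value * gain  # zero outside the mask
    out[..., 1] += boost   # U
    out[..., 2] += boost   # V; Y (channel 0) is left untouched
    # Assumption: 8-bit output, so round and clip before converting back.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# 2x2 neutral-grey frame with a single skin pixel at the top-left.
frame_yuv = np.full((2, 2, 3), (100, 128, 128), dtype=np.uint8)
skin_key = np.array([[True, False], [False, False]])
out = magnify_frame(frame_yuv, skin_key, ppg_value=0.001)
```

Here the skin pixel's U and V move from 128 to 138 while its luminance and all non-skin pixels stay unchanged.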

6. Optional: Welch Spectral Estimation (Welch_cuda())

A sliding 256-frame window with Welch's method estimates the power spectral density of the PPG signal. The dominant frequency peak in the 0.8–3 Hz range gives the heart rate in Hz (×60 = BPM). This is implemented but commented out.
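As a CPU-only illustration of the idea (not the repo's Welch_cuda()), the sketch below uses scipy.signal.welch to produce a single estimate for a whole clip rather than a sliding one. The function name estimate_bpm and the synthetic signal are hypothetical.

```python
import numpy as np
from scipy.signal import welch

def estimate_bpm(ppg, fps, nperseg=256, band=(0.8, 3.0)):
    """Heart rate in BPM from the strongest PSD peak inside the cardiac band."""
    freqs, psd = welch(ppg, fs=fps, nperseg=min(nperseg, len(ppg)))
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[in_band][np.argmax(psd[in_band])]
    return 60.0 * peak_hz

# Synthetic 20 s signal at 30 fps: 1.25 Hz pulse (75 BPM) plus noise.
fps = 30
rng = np.random.default_rng(0)
t = np.arange(600) / fps
ppg = np.sin(2 * np.pi * 1.25 * t) + 0.2 * rng.standard_normal(t.size)
bpm = estimate_bpm(ppg, fps)
```

With 256-frame segments at 30 fps the frequency resolution is about 0.12 Hz, so the estimate is only accurate to roughly ±7 BPM unless the segment length is increased or the peak is interpolated.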

Pipeline Summary

Video frames
  → YUV conversion
  → Chroma-key skin mask (UV distance)
  → Spatial mean of skin pixels per frame → raw PPG time-series
  → Temporal bandpass (subtract slow MA, apply fast MA) → filtered PPG
  → Average RGB channels → scalar "white" PPG
  → Multiply back onto skin pixels in UV space × 10000
  → Output: video with amplified blood flow color

Key Parameters

| Parameter | Default | Purpose |
|---|---|---|
| chroma_similarity | 100 (= 10²) | Skin detection threshold on squared UV distance |
| n_bg_ma | 90 | High-pass: frames for baseline removal |
| n_smooth_ma | 6 | Low-pass: frames for noise smoothing |
| delta | 1 | Normalized PPG amplitude |
| Amplification | 10000 | Multiplier when adding signal back to frames |

Future Directions for Arterial Localization

  1. Per-pixel PPG instead of spatial average: The current code averages all skin pixels into one scalar. To locate arteries, you need the PPG signal per pixel (or per small patch). Compute magnify_colour_ma on a per-pixel or patch-grid basis.
  2. Narrow bandpass around cardiac frequency: Use ~0.8–2.5 Hz (48–150 BPM). The moving-average bandpass is simple but imprecise — consider a Butterworth or FIR filter for tighter control.
  3. Amplitude map: The amplitude of the per-pixel PPG correlates with local blood volume pulsation. Arteries are expected to show higher amplitude than veins or capillary beds. Render this as a heatmap overlay.
  4. Phase map: Arterial sites pulse earlier than venous. A phase-delay map (via cross-correlation or Hilbert transform against a reference PPG) can further disambiguate arteries.
  5. GPU acceleration: The current code uses CuPy/cuSignal. For a Gradio app, consider whether GPU is available or if NumPy/SciPy suffices for shorter clips.
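Items 1 to 3 could be prototyped on the CPU along the following lines. This is a rough sketch, not part of the current code: the function names, the 8-pixel patch size, the third-order Butterworth filter, and the use of the standard deviation as the amplitude measure are all assumptions, and the input is a single-channel video array.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def patch_means(frames, patch=8):
    """(N, H, W) video -> (N, H // patch, W // patch) array of patch means."""
    n, h, w = frames.shape
    gh, gw = h // patch, w // patch
    cropped = frames[:, :gh * patch, :gw * patch].astype(np.float64)
    return cropped.reshape(n, gh, patch, gw, patch).mean(axis=(2, 4))

def cardiac_bandpass(x, fps, low=0.8, high=2.5, order=3):
    """Zero-phase Butterworth bandpass along the time axis (axis 0)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fps, output="sos")
    return sosfiltfilt(sos, x, axis=0)

def amplitude_map(frames, fps, patch=8):
    """Per-patch pulsation amplitude: std of the band-passed patch signal."""
    return cardiac_bandpass(patch_means(frames, patch), fps).std(axis=0)

# Synthetic 10 s clip at 30 fps: weak 1.2 Hz pulse everywhere, strong in one patch.
fps = 30
rng = np.random.default_rng(0)
t = np.arange(300) / fps
pulse = np.sin(2 * np.pi * 1.2 * t)[:, None, None]
frames = 100 + 0.5 * rng.standard_normal((300, 16, 16)) + 0.2 * pulse
frames[:, :8, :8] += 1.8 * pulse   # the "arterial" patch

amp = amplitude_map(frames, fps)
strongest = np.unravel_index(np.argmax(amp), amp.shape)
```

The resulting amp array can be upsampled to the frame size and rendered as a heatmap overlay; a phase map would need an additional reference signal and is not covered here.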

Credits

License

AGPL-3.0

About

Visualizing blood flow from a camera feed using remote photoplethysmography and automatic skin segmentation powered by SAM 3.
