Filtering
Apply motion-based filtering to clips and aesthetic filtering to frames to prune low-quality assets during curation.
How it Works
Filtering runs in two passes that balance speed and quality:
- Motion pass (fast): The pipeline decodes lightweight motion vectors and computes motion scores to drop static or near‑static clips at the first filtering stage. This step adds `decoded_motion_data` per clip, then writes `motion_score_global_mean` and `motion_score_per_patch_min_256`. Clips below thresholds move to `video.filtered_clips`, and `video.clip_stats.num_filtered_by_motion` increments.
- Aesthetic pass (model based): Upstream, the pipeline extracts frames using the `sequence` policy at a chosen `target_fps`. The aesthetic stage reads `extracted_frames[sequence-<target_fps>]`, produces an `aesthetic_score`, and removes clips below the threshold. These clips move to `video.filtered_clips`, and `video.clip_stats.num_filtered_by_aesthetic` increments.
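
The two passes can be pictured with a minimal, self-contained sketch. The `Clip` and `Video` classes below are hypothetical stand-ins rather than the library's own data model, and the sketch assumes a clip is dropped when either motion score falls below its threshold. Only the field names and counters come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    # Hypothetical stand-in for a pipeline clip; scores are assumed precomputed.
    name: str
    motion_score_global_mean: float
    motion_score_per_patch_min_256: float
    aesthetic_score: float


@dataclass
class Video:
    clips: list = field(default_factory=list)
    filtered_clips: list = field(default_factory=list)
    clip_stats: dict = field(
        default_factory=lambda: {
            "num_filtered_by_motion": 0,
            "num_filtered_by_aesthetic": 0,
        }
    )


def _apply(video, keep, counter):
    """Move clips that fail `keep` to filtered_clips and bump the counter."""
    kept = []
    for clip in video.clips:
        if keep(clip):
            kept.append(clip)
        else:
            video.filtered_clips.append(clip)
            video.clip_stats[counter] += 1
    video.clips = kept


def motion_pass(video, global_mean_threshold=0.00098,
                per_patch_min_256_threshold=0.000001):
    # Assumption: a clip passes only if both scores meet their thresholds.
    _apply(
        video,
        lambda c: c.motion_score_global_mean >= global_mean_threshold
        and c.motion_score_per_patch_min_256 >= per_patch_min_256_threshold,
        "num_filtered_by_motion",
    )


def aesthetic_pass(video, score_threshold=3.5):
    _apply(
        video,
        lambda c: c.aesthetic_score >= score_threshold,
        "num_filtered_by_aesthetic",
    )


video = Video(
    clips=[
        Clip("static", 0.0001, 0.0, 5.0),  # fails the motion pass
        Clip("ugly", 0.01, 0.001, 2.0),    # passes motion, fails aesthetic
        Clip("good", 0.01, 0.001, 4.5),    # passes both
    ]
)
motion_pass(video)      # cheap pass runs first
aesthetic_pass(video)   # model-based pass only sees the survivors
print([c.name for c in video.clips], video.clip_stats)
```

Running the cheap motion pass first means the model-based pass only scores clips that already survived, which is the speed and quality balance described above.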
Before You Start
Motion decoding and aesthetic scoring operate on clip buffers. You must run clipping and encoding first so each clip has a valid buffer (bytes).
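
A quick pre-flight check can catch clips that reach filtering without encoded bytes. This is a sketch only: the `buffer` attribute name and the `SimpleNamespace` stand-ins are assumptions for illustration, not the library's clip type.

```python
from types import SimpleNamespace


def clips_missing_buffer(clips):
    """Return clips whose buffer is absent or empty (attribute name assumed)."""
    return [
        clip
        for clip in clips
        if not isinstance(getattr(clip, "buffer", None), (bytes, bytearray))
        or len(clip.buffer) == 0
    ]


clips = [
    SimpleNamespace(name="a", buffer=b"\x00\x01"),
    SimpleNamespace(name="b", buffer=None),  # clipping or encoding was skipped
    SimpleNamespace(name="c", buffer=b""),   # encoding produced no bytes
]
missing = clips_missing_buffer(clips)
print([c.name for c in missing])
```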
Quickstart
Use the pipeline stages or the example script flags to enable motion and aesthetic filtering.
Pipeline Stage

    from nemo_curator.pipeline import Pipeline
    from nemo_curator.stages.video.filtering.motion_filter import (
        MotionVectorDecodeStage,
        MotionFilterStage,
    )
    from nemo_curator.stages.video.filtering.clip_aesthetic_filter import (
        ClipAestheticFilterStage,
    )

    pipe = Pipeline(name="filtering_examples")

    # Motion filtering
    pipe.add_stage(
        MotionVectorDecodeStage(target_fps=2.0, target_duration_ratio=0.5, num_cpus_per_worker=4.0)
    )
    pipe.add_stage(
        MotionFilterStage(
            score_only=False,
            global_mean_threshold=0.00098,
            per_patch_min_256_threshold=0.000001,
            motion_filter_batch_size=64,
            num_gpus_per_worker=0.5,
            verbose=True,
        )
    )

    # Aesthetic filtering (assumes frames extracted upstream)
    pipe.add_stage(
        ClipAestheticFilterStage(
            model_dir="/models",
            score_threshold=3.5,
            reduction="min",
            target_fps=1.0,
            num_gpus_per_worker=0.25,
            verbose=True,
        )
    )

    pipe.run()

Script Flags

    # Motion filtering
    python tutorials/video/getting-started/video_split_clip_example.py \
      ... \
      --motion-filter enable \
      --motion-decode-target-fps 2.0 \
      --motion-decode-target-duration-ratio 0.5 \
      --motion-decode-cpus-per-worker 4.0 \
      --motion-global-mean-threshold 0.00098 \
      --motion-per-patch-min-256-threshold 0.000001 \
      --motion-score-batch-size 64 \
      --motion-score-gpus-per-worker 0.5

    # Aesthetic filtering
    python tutorials/video/getting-started/video_split_clip_example.py \
      ... \
      --aesthetic-threshold 3.5 \
      --aesthetic-reduction min \
      --aesthetic-gpus-per-worker 0.25

Filtering Options
Motion Filtering
Motion filtering is a two‑step process: first decode motion vectors, then filter clips based on motion scores.
- Add `MotionVectorDecodeStage` to sample motion vectors from each clip.

      from nemo_curator.stages.video.filtering.motion_filter import MotionVectorDecodeStage

      decode = MotionVectorDecodeStage(
          target_fps=2.0,
          target_duration_ratio=0.5,
          num_cpus_per_worker=4.0,
      )

  This step adds `decoded_motion_data` to each clip, or records an error in `clip.errors`.

- Add `MotionFilterStage` to compute motion scores and filter out low‑motion clips.

      from nemo_curator.stages.video.filtering.motion_filter import MotionFilterStage

      motion = MotionFilterStage(
          score_only=False,
          global_mean_threshold=0.00098,
          per_patch_min_256_threshold=0.000001,
          motion_filter_batch_size=64,
          num_gpus_per_worker=0.5,
          verbose=True,
      )

  - Adds `motion_score_global_mean` and `motion_score_per_patch_min_256` to each clip.
  - Moves filtered clips to `video.filtered_clips` and increments `video.clip_stats.num_filtered_by_motion`.
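
When choosing thresholds, it can help to run the motion stage with `score_only=True` first and inspect how many clips each candidate threshold would remove. The sketch below assumes the scores have already been collected into a plain Python list. The sample values and the `count_below` helper are made up for illustration.

```python
def count_below(scores, threshold):
    """Count scores that fall strictly below a candidate threshold."""
    return sum(1 for s in scores if s < threshold)


# Made-up motion_score_global_mean values gathered in score-only mode.
global_means = [0.0002, 0.0009, 0.0011, 0.004, 0.02]

for candidate in (0.0005, 0.00098, 0.002):
    dropped = count_below(global_means, candidate)
    print(f"threshold={candidate}: would drop {dropped} of {len(global_means)} clips")
```

With these sample values the three candidates would drop 1, 2, and 3 clips respectively.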
Parameters
MotionVectorDecodeStage
| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_cpus_per_worker` | float | 6.0 | CPU cores reserved per worker for decoding motion vectors. |
| `target_fps` | float | 2.0 | Target frames per second for sampling motion vectors. |
| `target_duration_ratio` | float | 0.5 | Fraction of each clip’s duration to decode for motion analysis. |
| `verbose` | bool | False | Log warnings and per‑clip issues during decoding. |
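
As a rough worked example of how `target_fps` and `target_duration_ratio` interact, the sketch below estimates how many frames are sampled for a clip. It assumes the decoded span is simply duration times ratio and that frames are taken at a uniform rate, which may not match the decoder's exact behavior. `estimated_samples` is a hypothetical helper.

```python
def estimated_samples(clip_duration_s, target_fps=2.0, target_duration_ratio=0.5):
    """Approximate sampled-frame count under the uniform-sampling assumption."""
    decoded_seconds = clip_duration_s * target_duration_ratio
    return int(decoded_seconds * target_fps)


# A 10-second clip: 10 s * 0.5 = 5 s decoded, and 5 s at 2 fps is about 10 samples.
print(estimated_samples(10.0))
```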
MotionFilterStage
| Parameter | Type | Default | Description |
|---|---|---|---|
| `score_only` | bool | False | Compute motion scores without filtering out clips. |
| `global_mean_threshold` | float | 0.00098 | Threshold on the global mean motion score; lower implies less motion. |
| `per_patch_min_256_threshold` | float | 0.000001 | Threshold on the minimum per‑patch score over 256 patches. |
| `motion_filter_batch_size` | int | 256 | Batch size for GPU computation; decrease to reduce memory usage. |
| `num_gpus_per_worker` | float | 0.0 | GPUs reserved per worker for motion scoring (0 uses CPU path). |
| `verbose` | bool | False | Log per‑clip decisions and scores. |
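
`motion_filter_batch_size` bounds how many clips are scored at once, so a smaller value lowers peak memory at the cost of more batches. A minimal sketch of that batching idea, using a hypothetical `batched` helper rather than the stage's internal code:

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


clip_ids = list(range(150))
sizes = [len(batch) for batch in batched(clip_ids, 64)]
print(sizes)  # [64, 64, 22]
```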
Aesthetic Filtering
Aesthetic filtering works best when you prepare frames first, then score clips using a CLIP‑based aesthetic model.
- Extract frames earlier in the pipeline. Use a frame extraction stage with a `sequence` policy and set a `target_fps` that matches the aesthetic stage. Refer to Frame Extraction for guidance.

      # Example: upstream frame extraction snippet (pseudocode)
      from nemo_curator.stages.video.frame_extraction import FrameExtractionStage

      frames = FrameExtractionStage(policy="sequence", target_fps=1.0)

  Frame Requirements:

  - Use `sequence` frame extraction policy.
  - Match `target_fps` here and in the aesthetic stage.
  - Ensure `clip.extracted_frames` contains frames for the signature `sequence-<target_fps>`.

- Add `ClipAestheticFilterStage` to score each clip and drop clips below a threshold.

      from nemo_curator.stages.video.filtering.clip_aesthetic_filter import ClipAestheticFilterStage

      aesthetic = ClipAestheticFilterStage(
          model_dir="/models",
          score_threshold=3.5,
          reduction="min",  # or "mean"
          target_fps=1.0,
          num_gpus_per_worker=0.25,
          verbose=True,
      )

  - Adds `aesthetic_score` to each clip.
  - Moves filtered clips to `video.filtered_clips` and increments `video.clip_stats.num_filtered_by_aesthetic`.
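
Because the aesthetic stage looks up frames by extraction policy and `target_fps`, a mismatch between the two stages can leave it without frames to score. The check below is a sketch over plain dictionaries. The dictionary layout and the `frames_match` helper are hypothetical and stand in for however you hold your stage settings.

```python
import math


def frames_match(extraction_cfg, aesthetic_cfg):
    """Return True when upstream extraction satisfies the aesthetic stage."""
    return extraction_cfg["policy"] == "sequence" and math.isclose(
        extraction_cfg["target_fps"], aesthetic_cfg["target_fps"]
    )


extraction = {"policy": "sequence", "target_fps": 1.0}
print(frames_match(extraction, {"target_fps": 1.0}))  # True
print(frames_match(extraction, {"target_fps": 2.0}))  # False
```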
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_dir` | str | "models/clip_aesthetic" | Directory for model weights; downloaded on each node if missing. |
| `score_threshold` | float | 0.5 | Minimum aesthetic score required to keep a clip. |
| `reduction` | str ("mean" or "min") | "min" | Aggregate frame‑level scores using mean or minimum. |
| `target_fps` | float | 1.0 | Frame sampling rate expected to match extracted frames. |
| `num_gpus_per_worker` | float | 0.25 | GPUs reserved per worker for aesthetic scoring. |
| `verbose` | bool | False | Log per‑clip aesthetic scores and decisions. |
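
The choice of `reduction` changes which clips survive. The sketch below reduces made-up frame-level scores to a single clip score and compares it with the threshold. `keep_clip` is a hypothetical helper, and it assumes a clip is kept when its reduced score is at or above `score_threshold`.

```python
from statistics import mean


def keep_clip(frame_scores, score_threshold=3.5, reduction="min"):
    """Reduce frame-level scores to one clip score and compare to the threshold."""
    if reduction == "min":
        clip_score = min(frame_scores)
    elif reduction == "mean":
        clip_score = mean(frame_scores)
    else:
        raise ValueError(f"unsupported reduction: {reduction!r}")
    return clip_score >= score_threshold


frame_scores = [4.2, 3.1, 5.0]  # made-up frame-level aesthetic scores
print(keep_clip(frame_scores, reduction="min"))   # False: worst frame is 3.1
print(keep_clip(frame_scores, reduction="mean"))  # True: average is about 4.1
```

`"min"` is the stricter option because a single weak frame rejects the clip, while `"mean"` tolerates a few weak frames when the rest score well.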