
Memory leak when remuxing segments with add_stream_from_template(..., opaque=False) #2135

@ncheng89


Summary

When repeatedly remuxing an input into many short MP4 segments using PyAV, RSS grows significantly if I create output streams via:

out.add_stream_from_template(template_stream, opaque=False)

However, using opaque=True keeps RSS stable over hundreds of segments.

This is reproducible on PyAV 16.1.0 with a long H.264/AAC MP4 input and does not require decoding/encoding (packet remux only).

Environment

PyAV: 16.1.0

Python: 3.11

OS: Linux

Input: long MP4 (H.264 + AAC), segmented into 10s chunks

Reproduction
The minimal script below continuously demuxes packets from an input container and muxes them into a sequence of 10-second output MP4 files. Each segment opens a new output container and creates its streams via add_stream_from_template.

Key difference: opaque=False vs opaque=True.

Repro code

import os

import av
import psutil

SEGMENT_DURATION = 10
OUTPUT_DIR = "segments"
INPUT_URL = "http://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4"
os.makedirs(OUTPUT_DIR, exist_ok=True)

in_container = av.open(INPUT_URL)
video_stream = in_container.streams.video[0]
audio_stream = in_container.streams.audio[0] if in_container.streams.audio else None

segment_index = 0
video_pts = 0
audio_pts = 0
segment_start_time = None

def start_new_segment(opaque_flag: bool):
    global out_container, out_video_stream, out_audio_stream, video_pts, audio_pts

    path = os.path.join(OUTPUT_DIR, f"segment_{segment_index:06d}.mp4")
    out_container = av.open(path, mode="w")

    out_video_stream = out_container.add_stream_from_template(
        template=video_stream,
        rate=video_stream.average_rate,
        opaque=opaque_flag,
    )
    out_video_stream.time_base = video_stream.time_base

    out_audio_stream = None
    if audio_stream:
        out_audio_stream = out_container.add_stream_from_template(
            audio_stream,
            opaque=opaque_flag,
        )
        out_audio_stream.time_base = audio_stream.time_base

    video_pts = 0
    audio_pts = 0

    rss = psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024
    print(f"seg={segment_index}, rss={rss:.2f}MB, opaque={opaque_flag}, av={av.__version__}")

# CHANGE THIS FLAG:
OPAQUE_FLAG = False  # True stays stable

start_new_segment(OPAQUE_FLAG)

for packet in in_container.demux((video_stream, audio_stream) if audio_stream else (video_stream,)):
    if packet.pts is None:
        continue

    if packet.stream.type == "video":
        if segment_start_time is None:
            segment_start_time = float(packet.pts * packet.time_base)
        current_time = float(packet.pts * packet.time_base)

        if current_time - segment_start_time >= SEGMENT_DURATION:
            out_container.close()
            segment_index += 1
            segment_start_time = current_time
            start_new_segment(OPAQUE_FLAG)

        packet.pts = video_pts
        packet.dts = video_pts
        packet.stream = out_video_stream
        out_container.mux(packet)
        try:
            packet.unref()
        except Exception:
            pass

        video_pts += 1

    elif packet.stream.type == "audio" and out_audio_stream is not None:
        packet.pts = audio_pts
        packet.dts = audio_pts
        packet.stream = out_audio_stream
        out_container.mux(packet)
        try:
            packet.unref()
        except Exception:
            pass

        audio_pts += packet.duration or 0

out_container.close()
in_container.close()

Observed behavior

With opaque=False, RSS grows quickly across segments: in the run logged below it climbs from ~103MB at segment 0 to ~209MB at segment 30, and in longer runs it can keep climbing.

With opaque=True, RSS remains almost flat over hundreds of segments (stable within ~1MB).

Expected behavior

For pure remuxing (no encode/decode), opaque=False and opaque=True should not exhibit such a large difference in memory behavior. At minimum, RSS should not keep growing segment after segment when containers are closed.

Additional findings

I tried patching add_stream_from_template:

Current PyAV code (opaque=False branch):

codec_obj = Codec(template.codec_context.codec.name, "w")

If I change it to:

codec_obj = Codec(template.codec_context.codec.name, "r")

then the remux RSS becomes stable even with opaque=False.

However, this breaks encoding use-cases: using add_stream_from_template(..., opaque=False) for actual encoding fails with:

ValueError(22, 'Invalid argument', 'avcodec_send_frame()')

So simply switching "w" → "r" is not a correct fix, but it strongly suggests the memory growth is related to the "w" (encoder) path / codec context creation in the opaque=False branch.
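One way to double-check that the growth sits in native allocations rather than in Python objects is to log the Python heap next to RSS at each segment boundary. The following is only a sketch: rss_mb, python_heap_mb and memory_report are hypothetical helper names (not PyAV APIs), and reading /proc/self/statm assumes Linux, as in my environment.

```python
import os
import tracemalloc


def rss_mb():
    """Current resident set size in MB, read from /proc/self/statm (Linux only)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024 / 1024


def python_heap_mb():
    """MB currently allocated through Python's allocator, as seen by tracemalloc."""
    current, _peak = tracemalloc.get_traced_memory()
    return current / 1024 / 1024


def memory_report(label):
    """Format one log line with both numbers for a given label (e.g. a segment index)."""
    return f"{label}: rss={rss_mb():.2f}MB, python_heap={python_heap_mb():.2f}MB"


# Tracing must be started before the allocations of interest happen.
tracemalloc.start()

if __name__ == "__main__":
    print(memory_report("startup"))
```

If python_heap stays flat while rss keeps climbing, the memory is being held outside the Python heap, which would be consistent with the codec-context explanation above.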

Hypothesis / suggestion

For remuxing, add_stream_from_template might not need to allocate a new AVCodecContext at all. A “copy codecpar only” fast path (e.g. create stream + avcodec_parameters_copy and avoid encoder/decoder context allocation) could avoid the memory growth and also keep opaque=False meaningful for encoding-related workflows.
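Until that is resolved, a remux-only pipeline can sidestep the growth by always passing opaque=True, which stayed flat in my measurements. A minimal sketch of such a helper follows; add_remux_streams is a hypothetical name, and the only PyAV call it relies on is add_stream_from_template(template, opaque=True) as used in the repro script.

```python
def add_remux_streams(out_container, templates):
    """Create one output stream per input stream for packet-only remuxing.

    Returns a dict mapping each input stream's index to its output stream.
    Entries that are None (e.g. a missing audio stream) are skipped.
    """
    mapping = {}
    for template in templates:
        if template is None:
            continue
        # opaque=True is the variant that kept RSS stable in this report.
        out_stream = out_container.add_stream_from_template(template, opaque=True)
        # Keep the input time base, as the repro script does.
        out_stream.time_base = template.time_base
        mapping[template.index] = out_stream
    return mapping
```

In the repro script this would replace the two add_stream_from_template calls in start_new_segment, for example streams = add_remux_streams(out_container, [video_stream, audio_stream]). It is not suitable when the output stream is used for actual encoding.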

Output with opaque=False:

root@1dee0f1ab738:/opt# python3 t.py
seg=0, rss=103.49MB, opaque=False, av=16.1.0
seg=1, rss=134.75MB, opaque=False, av=16.1.0
seg=2, rss=166.95MB, opaque=False, av=16.1.0
seg=3, rss=174.98MB, opaque=False, av=16.1.0
seg=4, rss=187.02MB, opaque=False, av=16.1.0
seg=5, rss=193.11MB, opaque=False, av=16.1.0
seg=6, rss=195.13MB, opaque=False, av=16.1.0
seg=7, rss=187.11MB, opaque=False, av=16.1.0
seg=8, rss=191.12MB, opaque=False, av=16.1.0
seg=9, rss=193.13MB, opaque=False, av=16.1.0
seg=10, rss=193.14MB, opaque=False, av=16.1.0
seg=11, rss=203.15MB, opaque=False, av=16.1.0
seg=12, rss=193.25MB, opaque=False, av=16.1.0
seg=13, rss=199.27MB, opaque=False, av=16.1.0
seg=14, rss=199.28MB, opaque=False, av=16.1.0
seg=15, rss=199.29MB, opaque=False, av=16.1.0
seg=16, rss=203.36MB, opaque=False, av=16.1.0
seg=17, rss=195.34MB, opaque=False, av=16.1.0
seg=18, rss=205.42MB, opaque=False, av=16.1.0
seg=19, rss=205.43MB, opaque=False, av=16.1.0
seg=20, rss=205.43MB, opaque=False, av=16.1.0
seg=21, rss=207.45MB, opaque=False, av=16.1.0
seg=22, rss=195.32MB, opaque=False, av=16.1.0
seg=23, rss=203.32MB, opaque=False, av=16.1.0
seg=24, rss=205.34MB, opaque=False, av=16.1.0
seg=25, rss=195.32MB, opaque=False, av=16.1.0
seg=26, rss=203.33MB, opaque=False, av=16.1.0
seg=27, rss=209.35MB, opaque=False, av=16.1.0
seg=28, rss=201.33MB, opaque=False, av=16.1.0
seg=29, rss=207.33MB, opaque=False, av=16.1.0
seg=30, rss=209.34MB, opaque=False, av=16.1.0

Output with opaque=True:

root@1dee0f1ab738:/opt# python3 t.py
seg=0, rss=104.32MB, opaque=True, av=16.1.0
seg=1, rss=104.55MB, opaque=True, av=16.1.0
seg=2, rss=104.55MB, opaque=True, av=16.1.0
seg=3, rss=104.55MB, opaque=True, av=16.1.0
seg=4, rss=104.55MB, opaque=True, av=16.1.0
seg=5, rss=104.55MB, opaque=True, av=16.1.0
seg=6, rss=104.55MB, opaque=True, av=16.1.0
seg=7, rss=104.55MB, opaque=True, av=16.1.0
seg=8, rss=104.56MB, opaque=True, av=16.1.0
seg=9, rss=104.56MB, opaque=True, av=16.1.0
seg=10, rss=104.56MB, opaque=True, av=16.1.0
seg=11, rss=104.56MB, opaque=True, av=16.1.0
seg=12, rss=104.56MB, opaque=True, av=16.1.0
seg=13, rss=104.56MB, opaque=True, av=16.1.0
seg=14, rss=104.56MB, opaque=True, av=16.1.0
seg=15, rss=104.56MB, opaque=True, av=16.1.0
seg=16, rss=104.56MB, opaque=True, av=16.1.0
seg=17, rss=104.56MB, opaque=True, av=16.1.0
seg=18, rss=104.56MB, opaque=True, av=16.1.0
seg=19, rss=104.56MB, opaque=True, av=16.1.0
seg=20, rss=104.56MB, opaque=True, av=16.1.0
seg=21, rss=104.56MB, opaque=True, av=16.1.0
seg=22, rss=104.56MB, opaque=True, av=16.1.0
seg=23, rss=104.56MB, opaque=True, av=16.1.0
seg=24, rss=104.56MB, opaque=True, av=16.1.0
seg=25, rss=104.56MB, opaque=True, av=16.1.0
seg=26, rss=104.56MB, opaque=True, av=16.1.0
seg=27, rss=104.57MB, opaque=True, av=16.1.0
seg=28, rss=104.57MB, opaque=True, av=16.1.0
seg=29, rss=104.57MB, opaque=True, av=16.1.0
seg=30, rss=104.57MB, opaque=True, av=16.1.0
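To compare the two runs above numerically, the logged lines can be summarized with a small stdlib-only helper. This is just a sketch: summarize_rss is a hypothetical name, and the regular expression assumes the exact line format printed by the repro script.

```python
import re

# Matches lines such as "seg=3, rss=174.98MB, opaque=False, av=16.1.0".
LINE_RE = re.compile(r"seg=(\d+), rss=([\d.]+)MB, opaque=(True|False)")


def summarize_rss(log_text):
    """Summarize one run's RSS samples (in MB) from the script's log output."""
    samples = [float(m.group(2)) for m in LINE_RE.finditer(log_text)]
    if not samples:
        raise ValueError("no 'seg=..., rss=...MB' lines found")
    first, last = samples[0], samples[-1]
    return {
        "segments": len(samples),
        "first": first,
        "last": last,
        "peak": max(samples),
        "growth": round(last - first, 2),
    }


if __name__ == "__main__":
    # First and last lines of each run, copied from the logs above.
    run_false = (
        "seg=0, rss=103.49MB, opaque=False, av=16.1.0\n"
        "seg=30, rss=209.34MB, opaque=False, av=16.1.0\n"
    )
    run_true = (
        "seg=0, rss=104.32MB, opaque=True, av=16.1.0\n"
        "seg=30, rss=104.57MB, opaque=True, av=16.1.0\n"
    )
    print("opaque=False growth (MB):", summarize_rss(run_false)["growth"])
    print("opaque=True growth (MB):", summarize_rss(run_true)["growth"])
```

Feeding it the full output of each run also gives the peak, which is useful because RSS with opaque=False fluctuates between segments rather than rising monotonically.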
