Time
Overview
Time is expressed as integer multiples of arbitrary units of time called a time_base. There are different contexts that have different time bases: Stream has Stream.time_base, CodecContext has CodecContext.time_base, and Container has av.TIME_BASE.
>>> fh = av.open(path)
>>> video = fh.streams.video[0]
>>> video.time_base
Fraction(1, 25)
Attributes that represent time on those objects will be in that object’s time_base:
>>> video.duration
168
>>> float(video.duration * video.time_base)
6.72
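The same conversion can be kept exact by staying in rational arithmetic until the last step. The following is a minimal sketch using only the standard library; to_seconds is a hypothetical helper (not part of PyAV), and the None check assumes that a timestamp may be unset:

```python
from fractions import Fraction


def to_seconds(timestamp, time_base):
    """Convert an integer timestamp in units of time_base to exact seconds.

    Hypothetical helper, not a PyAV function. Returns None when the
    timestamp is unset (an assumption about how missing values appear).
    """
    if timestamp is None:
        return None
    return timestamp * time_base


# Values taken from the example above.
duration = 168
time_base = Fraction(1, 25)

seconds = to_seconds(duration, time_base)
print(seconds, float(seconds))
```

Keeping the result as a Fraction avoids accumulating floating point error when several timestamps are added or compared; convert to float only for display.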
Packet has a Packet.pts and Packet.dts (“presentation” and “decode” time stamps), and Frame has a Frame.pts (“presentation” time stamp). Both have a time_base attribute, which defaults to the time base of the object that handles them. For packets, that is the stream. For frames, it is the stream when decoding and the codec context when encoding (which is strange, but it is what it is).
In many cases a stream has a time base of 1 / frame_rate, and then its frames have incrementing integers for times (0, 1, 2, etc.). Those frames take place at pts * time_base, or 0 / frame_rate, 1 / frame_rate, 2 / frame_rate, etc.
>>> p, f = get_nth_packet_and_frame(fh, skip=1)
>>> p.time_base
Fraction(1, 25)
>>> p.dts
1
>>> f.time_base
Fraction(1, 25)
>>> f.pts
1
For convenience, Frame.time is a float in seconds:
>>> f.time
0.04
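The relationship between frame index, pts, and seconds for a fixed frame rate can be sketched without opening a file. This is an illustration only: the frame rate of 25 matches the examples above, and seconds_to_pts is a hypothetical helper, not a PyAV function:

```python
from fractions import Fraction

frame_rate = 25                        # assumed fixed frame rate
time_base = Fraction(1, frame_rate)    # stream time base of 1 / frame_rate

# With this time base, pts values are simply frame indices,
# and each frame takes place at pts * time_base seconds.
times = [pts * time_base for pts in range(4)]
print([float(t) for t in times])


def seconds_to_pts(seconds, time_base):
    """Hypothetical helper: the last timestamp at or before `seconds`."""
    return int(Fraction(seconds) // time_base)


pts_at_one_second = seconds_to_pts(1, time_base)
print(pts_at_one_second)
```

Going from seconds back to a pts is the direction needed when a position given in seconds has to be expressed in a stream's own units.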
FFmpeg Internals
Note
Time in FFmpeg is not 100% clear to us (see Authority of Documentation). At times the FFmpeg documentation and canonical-seeming posts in the forums appear contradictory. We’ve experimented with it, and what follows is the picture that we are operating under.
Both AVStream and AVCodecContext have a time_base member. However, they are used for different purposes, and (this author finds) it is too easy to abstract the concept too far.
When there is no time_base (such as on AVFormatContext), there is an implicit time_base of 1/AV_TIME_BASE.
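As an illustration of the implicit time base: AV_TIME_BASE is 1000000 in FFmpeg, so such values are in microseconds. In the sketch below the duration value is made up for the example, chosen to match the 6.72 second stream above:

```python
from fractions import Fraction

AV_TIME_BASE = 1_000_000                        # FFmpeg constant: units per second
IMPLICIT_TIME_BASE = Fraction(1, AV_TIME_BASE)  # i.e. microseconds

# Hypothetical container-level duration, as it might appear on an
# AVFormatContext; the number itself is invented for this example.
container_duration = 6_720_000

seconds = container_duration * IMPLICIT_TIME_BASE
print(float(seconds))
```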
Encoding
For encoding, you (the PyAV developer / FFmpeg “user”) must set AVCodecContext.time_base, ideally to the inverse of the frame rate. (The library docs say to do so if your frame rate is fixed; we’re not sure what to do if it is not.) You may also set AVStream.time_base as a hint to the muxer. After you open all the codecs and call avformat_write_header, the stream time base may change, and you must respect it. We don’t know whether the codec time base may change too, so we make the safer assumption that it may, and respect it as well.
You then prepare AVFrame.pts in AVCodecContext.time_base. The encoded AVPacket.pts is simply copied from the frame by the library, and so is still in the codec’s time base. You must rescale it to AVStream.time_base before muxing (as all stream operations assume the packet time is in the stream’s time base).
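The rescaling step is plain rational arithmetic. Below is a minimal pure-Python sketch of it, under stated assumptions: rescale_q is a hypothetical stand-in written for this example (it is not the FFmpeg function and not a PyAV API), its rounding (nearest, halfway cases away from zero) is intended to mirror the default of FFmpeg’s av_rescale_q, and the two time bases are example values rather than anything a particular muxer is known to choose:

```python
import math
from fractions import Fraction


def rescale_q(value, src_time_base, dst_time_base):
    """Convert an integer timestamp from src_time_base to dst_time_base.

    Hypothetical helper. Rounds to the nearest integer, with halfway
    cases rounded away from zero (assumed to match av_rescale_q).
    """
    exact = value * src_time_base / dst_time_base   # exact rational result
    sign = -1 if exact < 0 else 1
    # floor(|x| + 1/2) rounds to nearest, ties away from zero.
    return sign * math.floor(abs(exact) + Fraction(1, 2))


# Example values only: a codec time base of 1 / frame_rate and a
# stream time base such as a muxer might settle on after the header
# is written.
codec_time_base = Fraction(1, 25)
stream_time_base = Fraction(1, 90000)

# Frame pts in the codec time base are just frame indices here.
frame_pts = [0, 1, 2, 3]
packet_pts = [rescale_q(p, codec_time_base, stream_time_base) for p in frame_pts]
print(packet_pts)
```

When the destination time base is not an integer multiple of the source, the result is rounded, so rescaling back and forth is not guaranteed to return the original value.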
Decoding
Everything is in AVStream.time_base because we don’t have to rebase it into the codec time base (it generally seems to be the case that AVCodecContext doesn’t really care about your timing; I wish there were a way to assert this without reading every codec).