Per segment chunks #8272

Merged: 139 commits, Sep 24, 2024
Commits
b32a9eb
Update frame provider and media cache
zhiltsov-max Jul 25, 2024
cb4ff93
t
zhiltsov-max Jul 25, 2024
d49233c
t
zhiltsov-max Jul 30, 2024
146a896
Support static chunk building, fix av memory leak, add caching media …
zhiltsov-max Aug 1, 2024
52d1bac
Refactor static chunk generation - extract function, revise threading
zhiltsov-max Aug 2, 2024
0c53436
Refactor and fix task chunk creation from segment chunks, any storage
zhiltsov-max Aug 2, 2024
c166123
Fix chunk number validation
zhiltsov-max Aug 5, 2024
630c97e
Enable formatting for updated components
zhiltsov-max Aug 5, 2024
8d710e7
Remove the checksum field
zhiltsov-max Aug 5, 2024
654a827
Be consistent about returned task chunk types (allow video chunks)
zhiltsov-max Aug 6, 2024
12e5f2a
Support iterator input in video chunk writing
zhiltsov-max Aug 6, 2024
a79a681
Fix type annotation
zhiltsov-max Aug 6, 2024
d5118a2
Refactor video reader memory leak fix, add to reader with manifest
zhiltsov-max Aug 6, 2024
1b429cf
Disable threading in video reading in frame provider
zhiltsov-max Aug 6, 2024
d512312
Fix keyframe search
zhiltsov-max Aug 6, 2024
167ee12
Return frames as generator in dynamic chunk creation
zhiltsov-max Aug 6, 2024
88a9cb2
Update chunk requests in UI
zhiltsov-max Aug 7, 2024
30bf8fd
Update cache indices in FrameDecoder, enable video play
zhiltsov-max Aug 7, 2024
ee3c905
Fix frame retrieval for video
zhiltsov-max Aug 7, 2024
dc03220
Fix frame reading in updated dynamic cache building
zhiltsov-max Aug 7, 2024
4bb8a74
Fix invalid frame quality
zhiltsov-max Aug 9, 2024
f7d2c4c
Fix video reading in media_extractors - exception handling, frame mis…
zhiltsov-max Aug 9, 2024
34d9ca0
Allow disabling static chunks, add seamless switching
zhiltsov-max Aug 9, 2024
8c97967
Extend code formatting
zhiltsov-max Aug 9, 2024
a0fd0ba
Rename function argument
zhiltsov-max Aug 9, 2024
c0480c9
Rename configuration parameter
zhiltsov-max Aug 9, 2024
5caf283
Add av version comment
zhiltsov-max Aug 12, 2024
efbe3a0
Refactor av video reading
zhiltsov-max Aug 12, 2024
fb1284d
Fix manifest access
zhiltsov-max Aug 12, 2024
8edcfc5
Add migration
zhiltsov-max Aug 12, 2024
51a7f83
Update downloading from cloud storage for packed data in task creation
zhiltsov-max Aug 12, 2024
5a2a746
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 12, 2024
65e4174
Update changelog
zhiltsov-max Aug 12, 2024
61f1735
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 12, 2024
34f972f
Update migration name
zhiltsov-max Aug 12, 2024
2bb2b17
Polish some code
zhiltsov-max Aug 12, 2024
3788917
Fix frame retrieval by id
zhiltsov-max Aug 12, 2024
f695ae1
Remove extra import
zhiltsov-max Aug 12, 2024
14a9033
Fix frame access in gt jobs
zhiltsov-max Aug 12, 2024
e8bebe9
Fix frame access in export
zhiltsov-max Aug 12, 2024
bbef52f
Fix frame iteration for frame step and excluded frames, fix export in…
zhiltsov-max Aug 12, 2024
3d5bb52
Remove unused import
zhiltsov-max Aug 13, 2024
0e9c5c8
Fix error check in test
zhiltsov-max Aug 13, 2024
351bdc8
Fix cleanup in test
zhiltsov-max Aug 13, 2024
a71852c
Add handling for disabled static cache during task creation
zhiltsov-max Aug 13, 2024
d90ca0d
Refactor some code
zhiltsov-max Aug 13, 2024
03e749a
Fix downloading for cloud data in task creation
zhiltsov-max Aug 13, 2024
c0822a0
Fix preview reading for projects
zhiltsov-max Aug 13, 2024
56d413f
Fix failing sdk tests
zhiltsov-max Aug 13, 2024
48f4794
Fix other failing sdk tests
zhiltsov-max Aug 13, 2024
5c0cc1a
Improve logging for migration
zhiltsov-max Aug 14, 2024
5abd891
Fix invalid starting index
zhiltsov-max Aug 14, 2024
749b970
Fix frame reading in lambda functions
zhiltsov-max Aug 14, 2024
9105cd3
Fix unintended frame indexing changes
zhiltsov-max Aug 14, 2024
8dafcbe
Fix various indexing errors in media extractors
zhiltsov-max Aug 14, 2024
4cbf82f
Fix temp resource cleanup in server tests
zhiltsov-max Aug 14, 2024
88c34a3
Refactor some code
zhiltsov-max Aug 15, 2024
b0fd006
Remove duplicated tests
zhiltsov-max Aug 15, 2024
2eac04a
Remove extra change
zhiltsov-max Aug 15, 2024
640518c
Fix method name, remove extra method
zhiltsov-max Aug 15, 2024
3a246b3
Remove some shared code in tests, add temp data cleanup
zhiltsov-max Aug 15, 2024
a0704f4
Add checks for successful task creation in tests
zhiltsov-max Aug 15, 2024
cf026ef
Fix invalid variable access in test
zhiltsov-max Aug 15, 2024
f73cef3
Update default cache location in test checks
zhiltsov-max Aug 15, 2024
258c800
Update manifest validation logic, allow manifest input in any task da…
zhiltsov-max Aug 16, 2024
b3ae317
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 16, 2024
5e89ef4
Add task chunk caching, refactor chunk building
zhiltsov-max Aug 16, 2024
c5edcda
Refactor some code
zhiltsov-max Aug 16, 2024
7f5c722
Refactor some code
zhiltsov-max Aug 16, 2024
daf4035
Improve parameter name
zhiltsov-max Aug 16, 2024
8c1b82c
Fix function call
zhiltsov-max Aug 16, 2024
f172865
Add basic test set for meta, frames, and chunks reading in tasks
zhiltsov-max Aug 16, 2024
aacceee
Move class declaration for pylint compatibility
zhiltsov-max Aug 16, 2024
c8dbb7c
Add missing original chunk type field in job responses
zhiltsov-max Aug 16, 2024
6b9a3e9
Add tests for job data access
zhiltsov-max Aug 16, 2024
f5661e4
Update test assets
zhiltsov-max Aug 16, 2024
754757f
Clean imports
zhiltsov-max Aug 16, 2024
0c001a5
Python 3.8 compatibility
zhiltsov-max Aug 16, 2024
a9390eb
Python 3.8 compatibility
zhiltsov-max Aug 17, 2024
d2b1385
Python 3.8 compatibility
zhiltsov-max Aug 17, 2024
c9a5e31
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 19, 2024
621afa7
Add logging into shell command runs, fix invalid redis-cli invocation…
zhiltsov-max Aug 19, 2024
e40ffd1
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 19, 2024
08c9f01
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 19, 2024
92a19f4
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 21, 2024
441d0e7
Allow calling flushall in redis in helm tests
zhiltsov-max Aug 21, 2024
0963f94
Update comment
zhiltsov-max Aug 21, 2024
0d78e63
Update redis cleanup command
zhiltsov-max Aug 21, 2024
f53948d
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
e69f2b7
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
1a9a813
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 23, 2024
e4db8ad
Reuse _get
zhiltsov-max Aug 28, 2024
b1c54f9
Make get_checksum private
zhiltsov-max Aug 28, 2024
5312b00
Add get_raw_data_dirname to the Data model
zhiltsov-max Aug 28, 2024
3c117fe
Make SegmentFrameProvider available in make_frame_provider
zhiltsov-max Aug 28, 2024
98eff81
Remove extra variable
zhiltsov-max Aug 28, 2024
316ec78
Include both cases of CVAT_ALLOW_STATIC_CACHE in CI checks
zhiltsov-max Aug 28, 2024
ebed825
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Aug 28, 2024
92f6083
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Aug 28, 2024
2b6e987
Remove extra import
zhiltsov-max Aug 28, 2024
f67a1a2
Update changelog
zhiltsov-max Sep 5, 2024
d72fe85
Refactor cache keys in media cache
zhiltsov-max Sep 5, 2024
d5bfb88
Refactor selective segment chunk creation
zhiltsov-max Sep 5, 2024
c5a1197
Remove the breaking change in the chunk retrieval API, add a new inde…
zhiltsov-max Sep 6, 2024
a5cf3b7
Update UI to use the new chunk index parameter
zhiltsov-max Sep 7, 2024
069f48c
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 7, 2024
cfdde3f
Update test initialization
zhiltsov-max Sep 7, 2024
843b957
Update changelog
zhiltsov-max Sep 7, 2024
feb92cd
Add backward compatibility for chunk "number" in GT jobs, remove plac…
zhiltsov-max Sep 9, 2024
2424f2b
Update UI to support job chunks with non-sequential frame ids
zhiltsov-max Sep 9, 2024
fe60bdf
Fix job frame retrieval
zhiltsov-max Sep 9, 2024
6ddb6bf
Fix 3d task chunk writing
zhiltsov-max Sep 9, 2024
4fa7b97
Fix frame retrieval in UI
zhiltsov-max Sep 10, 2024
32f1be2
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 10, 2024
0e95b40
Fix chunk availability check
zhiltsov-max Sep 11, 2024
21135b7
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 11, 2024
b311f1e
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 11, 2024
79bb1f7
Remove array comparisons
zhiltsov-max Sep 12, 2024
55a8424
Update validateFrameNumbers
zhiltsov-max Sep 12, 2024
add5ae6
Use builtins for range and binary search, convert frame step into a c…
zhiltsov-max Sep 12, 2024
643d998
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 12, 2024
df90b33
Fix cached chunk indicators in frame player
zhiltsov-max Sep 12, 2024
6ccb7db
Fix chunk predecode logic
zhiltsov-max Sep 13, 2024
1fb68bc
Rename chunkNumber to chunkIndex where necessary
zhiltsov-max Sep 13, 2024
92d0c7a
Fix potential prefetch problem with reverse playback
zhiltsov-max Sep 13, 2024
67c1650
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 13, 2024
3cdc4dc
Move env variable into docker-compose.yml
zhiltsov-max Sep 16, 2024
716042e
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 16, 2024
19279c7
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 16, 2024
bc5ed39
Fix invalid cached chunk display in GT jobs
zhiltsov-max Sep 17, 2024
08ddd28
Fix invalid task preview generation
zhiltsov-max Sep 17, 2024
1d969bd
Refactor CS previews, context image chunk generation, media cache cre…
zhiltsov-max Sep 17, 2024
e2cba8c
Merge remote-tracking branch 'origin/zm/job-chunks' into zm/job-chunks
zhiltsov-max Sep 17, 2024
d135475
Remove extra import
zhiltsov-max Sep 17, 2024
a1638c9
Fix CS preview in response
zhiltsov-max Sep 17, 2024
fc89c01
Add reverse migration
zhiltsov-max Sep 17, 2024
c6e65f6
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 17, 2024
118828a
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 23, 2024
3eea60d
Merge branch 'develop' into zm/job-chunks
zhiltsov-max Sep 24, 2024
Refactor av video reading
zhiltsov-max committed Aug 12, 2024
commit efbe3a00b69ffbe0f99bf121e06f122750c359c4
195 changes: 93 additions & 102 deletions cvat/apps/engine/media_extractors.py
@@ -18,7 +18,10 @@
 from contextlib import ExitStack, closing, contextmanager
 from dataclasses import dataclass
 from enum import IntEnum
-from typing import Any, Callable, Iterable, Iterator, Optional, Protocol, Sequence, Tuple, TypeVar, Union
+from typing import (
+    Any, Callable, ContextManager, Generator, Iterable, Iterator, Optional, Protocol,
+    Sequence, Tuple, TypeVar, Union
+)

 import av
 import av.codec
@@ -505,6 +508,43 @@ def extract(self):
         if not self.extract_dir:
             os.remove(self._zip_source.filename)

+class _AvVideoReading:
+    @contextmanager
+    def read_av_container(self, source: Union[str, io.BytesIO]) -> av.container.InputContainer:
+        if isinstance(source, io.BytesIO):
+            source.seek(0) # required for re-reading
+
+        container = av.open(source)
+        try:
+            yield container
+        finally:
+            # fixes a memory leak in input container closing
+            # https://github.com/PyAV-Org/PyAV/issues/1117
+            for stream in container.streams:
+                context = stream.codec_context
+                if context and context.is_open:
+                    context.close()
+
+            if container.open_files:
+                container.close()
+
+    def decode_stream(
+        self, container: av.container.Container, video_stream: av.video.stream.VideoStream
+    ) -> Generator[av.VideoFrame, None, None]:
+        demux_iter = container.demux(video_stream)
+        try:
+            for packet in demux_iter:
+                yield from packet.decode()
+        finally:
+            # av v9.2.0 seems to have a memory corruption or a deadlock
+            # in exception handling for demux() in the multithreaded mode.
+            # Instead of breaking the iteration, we iterate over packets till the end.
+            # Fixed in av v12.2.0.
+            if av.__version__ == "9.2.0" and video_stream.thread_type == 'AUTO':
+                exhausted = object()
+                while next(demux_iter, exhausted) is not exhausted:
+                    pass
+
 class VideoReader(IMediaReader):
     def __init__(
         self,
@@ -567,72 +607,45 @@ def iterate_frames(
             if self.allow_threading:
                 video_stream.thread_type = 'AUTO'

-            exception = None
-            frame_number = 0
-            for packet in container.demux(video_stream):
-                try:
-                    for frame in packet.decode():
-                        if frame_number == next_frame_filter_frame:
-                            if video_stream.metadata.get('rotate'):
-                                pts = frame.pts
-                                frame = av.VideoFrame().from_ndarray(
-                                    rotate_image(
-                                        frame.to_ndarray(format='bgr24'),
-                                        360 - int(video_stream.metadata.get('rotate'))
-                                    ),
-                                    format ='bgr24'
-                                )
-                                frame.pts = pts
+            frame_counter = itertools.count()
+            with closing(self._decode_stream(container, video_stream)) as stream_decoder:
+                for frame, frame_number in zip(stream_decoder, frame_counter):
+                    if frame_number == next_frame_filter_frame:
+                        if video_stream.metadata.get('rotate'):
+                            pts = frame.pts
+                            frame = av.VideoFrame().from_ndarray(
+                                rotate_image(
+                                    frame.to_ndarray(format='bgr24'),
+                                    360 - int(video_stream.metadata.get('rotate'))
+                                ),
+                                format ='bgr24'
+                            )
+                            frame.pts = pts

-                            if self._frame_size is None:
-                                self._frame_size = (frame.width, frame.height)
+                        if self._frame_size is None:
+                            self._frame_size = (frame.width, frame.height)

-                            yield (frame, self._source_path[0], frame.pts)
+                        yield (frame, self._source_path[0], frame.pts)

-                            next_frame_filter_frame = next(frame_filter_iter, None)
+                        next_frame_filter_frame = next(frame_filter_iter, None)

-                        if next_frame_filter_frame is None:
-                            return
+                    if next_frame_filter_frame is None:
+                        return

-                        frame_number += 1
-                except Exception as e:
-                    if av.__version__ == "9.2.0":
-                        # av v9.2.0 seems to have a memory corruption
-                        # in exception handling for demux() in the multithreaded mode.
-                        # Instead of breaking the iteration, we iterate over packets till the end.
-                        # Fixed in v12.2.0.
-                        exception = e
-                        if video_stream.thread_type != 'AUTO':
-                            break
-
-            if exception:
-                raise exception
-
-    def __iter__(self):
+    def __iter__(self) -> Iterator[Tuple[av.VideoFrame, str, int]]:
         return self.iterate_frames()

     def get_progress(self, pos):
         duration = self._get_duration()
         return pos / duration if duration else None

-    @contextmanager
-    def _read_av_container(self):
-        if isinstance(self._source_path[0], io.BytesIO):
-            self._source_path[0].seek(0) # required for re-reading
-
-        container = av.open(self._source_path[0])
-        try:
-            yield container
-        finally:
-            # fixes a memory leak in input container closing
-            # https://github.com/PyAV-Org/PyAV/issues/1117
-            for stream in container.streams:
-                context = stream.codec_context
-                if context and context.is_open:
-                    context.close()
+    def _read_av_container(self) -> ContextManager[av.container.InputContainer]:
+        return _AvVideoReading().read_av_container(self._source_path[0])

-            if container.open_files:
-                container.close()
+    def _decode_stream(
+        self, container: av.container.Container, video_stream: av.video.stream.VideoStream
+    ) -> Generator[av.VideoFrame, None, None]:
+        return _AvVideoReading().decode_stream(container, video_stream)

     def _get_duration(self):
         with self._read_av_container() as container:
@@ -717,20 +730,13 @@ def __init__(self, manifest_path: str, source_path: str, *, allow_threading: boo

         self.allow_threading = allow_threading

-    @contextmanager
-    def _read_av_container(self):
-        container = av.open(self._source_path)
-        try:
-            yield container
-        finally:
-            # fixes a memory leak in input container closing
-            # https://github.com/PyAV-Org/PyAV/issues/1117
-            for stream in container.streams:
-                context = stream.codec_context
-                if context and context.is_open:
-                    context.close()
+    def _read_av_container(self) -> ContextManager[av.container.InputContainer]:
+        return _AvVideoReading().read_av_container(self._source_path)

-            container.close()
+    def _decode_stream(
+        self, container: av.container.Container, video_stream: av.video.stream.VideoStream
+    ) -> Generator[av.VideoFrame, None, None]:
+        return _AvVideoReading().decode_stream(container, video_stream)

     def _get_nearest_left_key_frame(self, frame_id: int) -> tuple[int, int]:
         nearest_left_keyframe_pos = bisect(
@@ -763,40 +769,25 @@ def iterate_frames(self, *, frame_filter: Iterable[int]) -> Iterable[av.VideoFra

             container.seek(offset=start_decode_timestamp, stream=video_stream)

-            frame_number = start_decode_frame_number - 1
-            for packet in container.demux(video_stream):
-                try:
-                    for frame in packet.decode():
-                        if frame_number == next_frame_filter_frame:
-                            if video_stream.metadata.get('rotate'):
-                                frame = av.VideoFrame().from_ndarray(
-                                    rotate_image(
-                                        frame.to_ndarray(format='bgr24'),
-                                        360 - int(video_stream.metadata.get('rotate'))
-                                    ),
-                                    format ='bgr24'
-                                )
-
-                            yield frame
-
-                            next_frame_filter_frame = next(frame_filter_iter, None)
-
-                        if next_frame_filter_frame is None:
-                            return
-
-                        frame_number += 1
-                except Exception as e:
-                    if av.__version__ == "9.2.0":
-                        # av v9.2.0 seems to have a memory corruption
-                        # in exception handling for demux() in the multithreaded mode.
-                        # Instead of breaking the iteration, we iterate over packets till the end.
-                        # Fixed in v12.2.0.
-                        exception = e
-                        if video_stream.thread_type != 'AUTO':
-                            break
-
-            if exception:
-                raise exception
+            frame_counter = itertools.count(start_decode_frame_number - 1)
+            with closing(self._decode_stream(container, video_stream)) as stream_decoder:
+                for frame, frame_number in zip(stream_decoder, frame_counter):
+                    if frame_number == next_frame_filter_frame:
+                        if video_stream.metadata.get('rotate'):
+                            frame = av.VideoFrame().from_ndarray(
+                                rotate_image(
+                                    frame.to_ndarray(format='bgr24'),
+                                    360 - int(video_stream.metadata.get('rotate'))
+                                ),
+                                format ='bgr24'
+                            )
+
+                        yield frame
+
+                        next_frame_filter_frame = next(frame_filter_iter, None)
+
+                    if next_frame_filter_frame is None:
+                        return

 class IChunkWriter(ABC):
     def __init__(self, quality, dimension=DimensionType.DIM_2D):