fix(video): cleanup temp files on extraction failure#295
fix(video): cleanup temp files on extraction failure#295sairin1202 merged 1 commit intoNevaMind-AI:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a resource leak in the video frame extraction utilities by adding cleanup logic for temporary files and directories when ffmpeg subprocess operations fail. The implementation adds exception handlers for CalledProcessError and TimeoutExpired to clean up temp resources before re-raising the errors.
Changes:
- Added
shutilimport for recursive directory removal - Introduced
created_temp_fileandcreated_temp_dirflags to track auto-created temporary resources - Added cleanup logic in exception handlers to remove temp files/directories on CalledProcessError and TimeoutExpired
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| shutil.rmtree(output_dir_obj) | ||
| msg = "Video processing timed out" | ||
| logger.exception(msg) | ||
| raise RuntimeError(msg) from e |
There was a problem hiding this comment.
The cleanup logic only handles CalledProcessError and TimeoutExpired exceptions, but there are other exceptions that can be raised within the try block that would leave temp directories uncleaned:
- RuntimeError raised at line 202-203 when output file is not created
- ValueError from float() conversion at line 172 (if duration is invalid)
These exceptions will bypass cleanup and leave the temp directory on disk. Consider adding a broader exception handler (e.g., a bare except clause or catching Exception) to ensure cleanup happens for all error paths, while still re-raising the original exception.
| raise RuntimeError(msg) from e | |
| raise RuntimeError(msg) from e | |
| except Exception: | |
| if created_temp_dir and output_dir_obj.exists(): | |
| shutil.rmtree(output_dir_obj) | |
| logger.exception("Unexpected error during video processing") | |
| raise |
| if output_path is None: | ||
| with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp_file: | ||
| output_path = tmp_file.name | ||
| created_temp_file = True | ||
| output_path_obj = VideoFrameExtractor._resolve_output_path(output_path) |
There was a problem hiding this comment.
The created_temp_file flag is set after creating the temp file but before _resolve_output_path is called. If _resolve_output_path raises a ValueError (e.g., if the path starts with '-'), the temp file will be created but the flag won't help cleanup since output_path_obj is not yet defined. Consider moving the flag assignment after output_path_obj is successfully resolved, or add cleanup in a finally block to handle this edge case.
| if output_path is None: | |
| with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp_file: | |
| output_path = tmp_file.name | |
| created_temp_file = True | |
| output_path_obj = VideoFrameExtractor._resolve_output_path(output_path) | |
| output_path_obj: Path | None = None | |
| if output_path is None: | |
| with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp_file: | |
| output_path = tmp_file.name | |
| created_temp_file = True | |
| try: | |
| output_path_obj = VideoFrameExtractor._resolve_output_path(output_path) | |
| except ValueError: | |
| # If we created a temporary file and resolving the output path fails, | |
| # clean up the temporary file to avoid leaking it. | |
| if created_temp_file and output_path is not None: | |
| try: | |
| Path(output_path).unlink() | |
| except OSError: | |
| # Best-effort cleanup; ignore errors from unlinking. | |
| pass | |
| raise |
| output_dir_obj = VideoFrameExtractor._ensure_safe_cli_path(Path(output_dir)) | ||
| output_dir_obj.mkdir(parents=True, exist_ok=True) |
There was a problem hiding this comment.
The created_temp_dir flag is set after creating the temp directory but before _ensure_safe_cli_path and mkdir are called. If either of these operations raises an exception (e.g., ValueError from _ensure_safe_cli_path if the path starts with '-', or OSError from mkdir), the temp directory will be created but won't be cleaned up since output_dir_obj might not be defined or the exception handling only exists within the try block. Consider moving the flag assignment after output_dir_obj is successfully created, or add cleanup in a finally block to handle this edge case.
| output_dir_obj = VideoFrameExtractor._ensure_safe_cli_path(Path(output_dir)) | |
| output_dir_obj.mkdir(parents=True, exist_ok=True) | |
| try: | |
| output_dir_obj = VideoFrameExtractor._ensure_safe_cli_path(Path(output_dir)) | |
| output_dir_obj.mkdir(parents=True, exist_ok=True) | |
| except Exception: | |
| if created_temp_dir and output_dir is not None: | |
| shutil.rmtree(output_dir, ignore_errors=True) | |
| raise |
| except subprocess.CalledProcessError as e: | ||
| if created_temp_file and output_path_obj.exists(): | ||
| output_path_obj.unlink() | ||
| msg = f"ffmpeg/ffprobe failed: {e.stderr}" | ||
| logger.exception(msg) | ||
| raise RuntimeError(msg) from e | ||
| except subprocess.TimeoutExpired as e: | ||
| if created_temp_file and output_path_obj.exists(): | ||
| output_path_obj.unlink() | ||
| msg = "Video processing timed out" | ||
| logger.exception(msg) | ||
| raise RuntimeError(msg) from e |
There was a problem hiding this comment.
The cleanup logic only handles CalledProcessError and TimeoutExpired exceptions, but there are other exceptions that can be raised within the try block that would leave temp files uncleaned:
- RuntimeError raised at line 100-102 when output file is not created
- ValueError from float() conversion at line 77 (if duration is invalid)
These exceptions will bypass cleanup and leave the temp file on disk. Consider adding a broader exception handler (e.g., a bare except clause or catching Exception) to ensure cleanup happens for all error paths, while still re-raising the original exception.
|
seems good! thx for contribution |
Description
This PR addresses a potential resource leak in
src/memu/utils/video.py.Currently, if the
ffmpegsubprocess fails (e.g., timeout, corrupt file, or missing codec), theRuntimeErroris raised, but the temporary directory and files created before the call are left on the disk.Changes
try...exceptblock.shutil.rmtree(temp_dir)andtemp_file.unlink()in the exception handler to ensure artifacts are removed even when errors occur.Why
To prevent disk clutter and ensure the library behaves like a good citizen, especially in long-running agentic workflows where these errors might accumulate over time.