Automating AI Video Upscaling with Real-ESRGAN and Docker

If you have a library of old, low-resolution videos that you want to restore, manually processing them frame-by-frame or using GUI tools can be tedious. In this post, I'm sharing a "set it and forget it" solution using Docker and Real-ESRGAN.

This setup uses an NVIDIA GPU to automate the entire pipeline: splitting the video into frames, upscaling them with AI, and merging them back into a high-quality video file.

Prerequisites

  • A machine with an NVIDIA GPU.
  • Docker Desktop or Docker Engine installed.
  • NVIDIA Container Toolkit (to allow Docker to access your GPU).
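
Before building anything, it's worth verifying that Docker can actually reach the GPU. A quick sanity check (the CUDA image tag here is just an example; any CUDA base image works):

```shell
# If the NVIDIA Container Toolkit is set up correctly, this prints the
# familiar nvidia-smi table from inside a throwaway container.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If this fails with an error about the `--gpus` flag or a missing runtime, fix the toolkit installation before continuing.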

1. The Environment: Dockerfile

First, we need a consistent environment. This Dockerfile builds on an NVIDIA CUDA base image, installs ffmpeg and the Vulkan loader (libvulkan1), and pre-downloads the Real-ESRGAN executable at build time so we don't have to fetch it on every container start.

Dockerfile

# Use a base image with CUDA drivers and Ubuntu to support GPU
FROM hub.aiursoft.com/aiursoft/internalimages/nvidia:latest

# Set environment variables to avoid interactive prompts during the build
ENV DEBIAN_FRONTEND=noninteractive

# Update package lists and install required dependencies: wget, unzip, ffmpeg, and vulkan loader
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    unzip \
    ffmpeg \
    libvulkan1 \
    && rm -rf /var/lib/apt/lists/*

# Set the working directory
WORKDIR /app

# Download, unzip, and setup Real-ESRGAN during the image build
# This is more efficient than downloading every time the container starts
RUN wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    unzip realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    rm realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    chmod +x realesrgan-ncnn-vulkan
		
# Copy the automation script to the container's /app/ directory
COPY process_videos.sh .
# Grant execution permissions to the script
RUN chmod +x process_videos.sh

# Define mount points for external directory mapping
VOLUME /input
VOLUME /output

# Set our processing script as the container's entrypoint
# This script will execute automatically when the container starts
ENTRYPOINT ["/app/process_videos.sh"]

2. The Logic: Processing Script

This bash script handles the heavy lifting. It scans the /input folder for videos, and for every file it finds, it performs three steps:

  1. Extract: Uses ffmpeg to dump all frames to a temp folder.
  2. Upscale: Runs realesrgan-ncnn-vulkan on that folder to produce frames 4x larger in each dimension.
  3. Assemble: Stitches the new frames back together, copying the original audio track, using hardware encoding (h264_nvenc) if available or falling back to CPU encoding (libx264).
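
One small detail worth calling out before the script itself: the frame filenames use a zero-padded `%08d` pattern so that plain lexicographic ordering matches numeric ordering, which is what directory-based tools like the upscaler rely on. A quick illustration in bash:

```shell
# Zero padding keeps frame names sortable: without it, "in_100.png"
# would sort *before* "in_2.png" lexicographically.
printf 'in_%08d.png\n' 2     # in_00000002.png
printf 'in_%08d.png\n' 100   # in_00000100.png
```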

process_videos.sh

#!/bin/bash

# Set input and output directories
INPUT_DIR="/input"
OUTPUT_DIR="/output"

# Ensure the output directory exists
mkdir -p "$OUTPUT_DIR"

echo "Starting video enhancement tasks..."
# Enable nullglob to handle cases where no files match a pattern
shopt -s nullglob

# Find all supported video files
video_files=("$INPUT_DIR"/*.mp4 "$INPUT_DIR"/*.mov "$INPUT_DIR"/*.avi)

# Check if any video files were found
if [ ${#video_files[@]} -eq 0 ]; then
    echo "No supported video files found in $INPUT_DIR (.mp4, .mov, .avi)."
    exit 0
fi

echo "Found ${#video_files[@]} video files. Starting processing..."

# Loop through the found video files
for INPUT_VIDEO in "${video_files[@]}"; do
    # Check if the file exists and is a regular file
    if [ ! -f "$INPUT_VIDEO" ]; then
        continue
    fi

    echo "--- Processing: $INPUT_VIDEO ---"

    # --- Configuration ---
    SCALE=4 # Upscaling scale
    MODEL="realesrgan-x4plus" # Model to use
    GPU_ID=0 # GPU index to use
    REALESRGAN_EXEC="/app/realesrgan-ncnn-vulkan" # Path to Real-ESRGAN executable

    # Create a temporary working directory with separate folders for
    # source frames and upscaled frames, so the upscaler's input
    # directory contains nothing but images
    WORK_DIR=$(mktemp -d)
    mkdir -p "$WORK_DIR/in" "$WORK_DIR/out"
    echo "Working directory: $WORK_DIR"

    # Define output video path
    BASE_NAME=$(basename "$INPUT_VIDEO")
    OUTPUT_VIDEO="$OUTPUT_DIR/${BASE_NAME%.*}_enhanced_x${SCALE}.mp4"

    # --- 1. Extract Frames ---
    echo "Step 1: Splitting video into frames..."
    ffmpeg -i "$INPUT_VIDEO" -qscale:v 1 -qmin 1 -vsync 0 "$WORK_DIR/in/in_%08d.png"
    if [ $? -ne 0 ]; then
        echo "Error: ffmpeg failed to split '$INPUT_VIDEO'."
        rm -rf "$WORK_DIR" # Clean up temp files
        continue # Continue to the next video
    fi

    # --- 2. Process Frames with AI Model ---
    echo "Step 2: Upscaling frames using Real-ESRGAN..."
    $REALESRGAN_EXEC \
        -i "$WORK_DIR/in" \
        -o "$WORK_DIR/out" \
        -n "$MODEL" \
        -s "$SCALE" \
        -g "$GPU_ID" \
        -f png
    if [ $? -ne 0 ]; then
        echo "Error: Real-ESRGAN failed to process frames for '$INPUT_VIDEO'."
        rm -rf "$WORK_DIR"
        continue
    fi

    # --- 3. Assemble Video ---
    echo "Step 3: Assembling processed frames into video..."
    # r_frame_rate may be fractional (e.g. 30000/1001); ffmpeg's -framerate accepts that form directly
    FPS=$(ffprobe -v error -select_streams v:0 -of default=noprint_wrappers=1:nokey=1 -show_entries stream=r_frame_rate "$INPUT_VIDEO")
    # Attempt to use NVIDIA GPU hardware encoding
    # (-y overwrites any partial output left by a failed earlier run;
    #  -pix_fmt yuv420p keeps the result playable on most devices)
    ffmpeg -y -framerate "$FPS" -i "$WORK_DIR/out/in_%08d.png" \
           -i "$INPUT_VIDEO" -map 0:v:0 -map 1:a:0? -c:a copy \
           -c:v h264_nvenc -preset slow -cq 18 -pix_fmt yuv420p "$OUTPUT_VIDEO"
    if [ $? -ne 0 ]; then
        echo "Warning: NVIDIA hardware encoding (h264_nvenc) failed. Trying CPU encoding (libx264)..."
        # Fallback to CPU encoding if GPU encoding fails
        ffmpeg -y -framerate "$FPS" -i "$WORK_DIR/out/in_%08d.png" \
               -i "$INPUT_VIDEO" -map 0:v:0 -map 1:a:0? -c:a copy \
               -c:v libx264 -preset medium -crf 22 -pix_fmt yuv420p "$OUTPUT_VIDEO"
    fi

    # --- 4. Cleanup ---
    echo "Step 4: Cleaning up temporary files..."
    rm -rf "$WORK_DIR"

    echo "--- Finished processing: $INPUT_VIDEO. Output file: $OUTPUT_VIDEO ---"
done

echo "All videos processed."
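
One practical caveat: the `mktemp -d` directory lives inside the container's writable layer, and lossless PNG frames add up fast, so make sure the Docker host has room. A rough back-of-envelope estimate (numbers hypothetical):

```shell
# A 10-minute clip at 30 fps yields 18000 source frames, plus the same
# number of upscaled frames at 16x the pixel count (4x per dimension).
duration_min=10
fps=30
frames=$(( duration_min * 60 * fps ))
echo "$frames frames"   # 18000 frames
```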

How to Run It

1. Build the Image

In the directory containing the Dockerfile and the process_videos.sh script, run:

docker build -t video-enhancer .

2. Run the Container

Prepare a folder named videos_to_process with your source files and an empty enhanced_videos folder. Then run:

docker run --gpus all --rm \
  -v "$(pwd)/videos_to_process:/input" \
  -v "$(pwd)/enhanced_videos:/output" \
  video-enhancer

The container will spin up, process every video in the input folder, save the 4x upscaled versions to the output folder, and then shut down cleanly. Happy upscaling!
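
To sanity-check a result, ffprobe can report the new resolution (the filename below is just an example of the script's naming scheme):

```shell
# Prints "width,height" of the first video stream, e.g. 2560,1440
# for a source that was 640x360.
ffprobe -v error -select_streams v:0 -show_entries stream=width,height \
  -of csv=p=0 enhanced_videos/myclip_enhanced_x4.mp4
```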