Automating AI Video Upscaling with Real-ESRGAN and Docker

If you have a library of old, low-resolution videos that you want to restore, manually processing them frame-by-frame or using GUI tools can be tedious. In this post, I'm sharing a "set it and forget it" solution using Docker and Real-ESRGAN.

This setup uses an NVIDIA GPU to automate the entire pipeline: splitting the video into frames, upscaling them with AI, and merging them back into a high-quality video file.

Prerequisites

  • A machine with an NVIDIA GPU.
  • Docker Desktop or Docker Engine installed.
  • NVIDIA Container Toolkit (to allow Docker to access your GPU).
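
Before building anything, it's worth verifying that Docker can actually reach the GPU. A quick sanity check (the CUDA image tag here is just an example; any CUDA base image works):

```shell
# If the NVIDIA Container Toolkit is set up correctly, this prints the
# familiar nvidia-smi table from inside a throwaway container.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If this fails with an error about the `--gpus` flag or a missing runtime, fix the toolkit installation before continuing.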

1. The Environment: Dockerfile

First, we need a consistent environment. This Dockerfile builds on an NVIDIA CUDA base image, installs ffmpeg and the Vulkan loader (libvulkan1), and pre-downloads the Real-ESRGAN executable at build time so we don't have to fetch it on every container start.

Dockerfile

# Use a base image with CUDA drivers and Ubuntu to support GPU
FROM hub.aiursoft.com/aiursoft/internalimages/nvidia:latest

# Set environment variables to avoid interactive prompts during the build
ENV DEBIAN_FRONTEND=noninteractive

# Update package lists and install required dependencies: wget, unzip, ffmpeg, and vulkan loader
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    unzip \
    ffmpeg \
    libvulkan1 \
    && rm -rf /var/lib/apt/lists/*

# Set the working directory
WORKDIR /app

# Download, unzip, and setup Real-ESRGAN during the image build
# This is more efficient than downloading every time the container starts
RUN wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    unzip realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    rm realesrgan-ncnn-vulkan-20220424-ubuntu.zip && \
    chmod +x realesrgan-ncnn-vulkan
		
# Copy the automation script to the container's /app/ directory
COPY process_videos.sh .
# Grant execution permissions to the script
RUN chmod +x process_videos.sh

# Define mount points for external directory mapping
VOLUME /input
VOLUME /output

# Set our processing script as the container's entrypoint
# This script will execute automatically when the container starts
ENTRYPOINT ["/app/process_videos.sh"]

2. The Logic: Processing Script

This bash script handles the heavy lifting. It scans the /input folder for videos, and for every file it finds, it performs three steps:

  1. Extract: Uses ffmpeg to dump all frames to a temp folder.
  2. Upscale: Runs realesrgan-ncnn-vulkan on that folder to produce frames 4x larger in each dimension.
  3. Assemble: Stitches the new frames back together, copying the original audio track, using hardware encoding (h264_nvenc) if available or falling back to CPU encoding (libx264).
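
One small detail worth calling out before the script itself: the frame filenames use a zero-padded `%08d` pattern so that plain lexicographic ordering matches numeric ordering, which is what directory-based tools like the upscaler rely on. A quick illustration in bash:

```shell
# Zero padding keeps frame names sortable: without it, "in_100.png"
# would sort *before* "in_2.png" lexicographically.
printf 'in_%08d.png\n' 2     # in_00000002.png
printf 'in_%08d.png\n' 100   # in_00000100.png
```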

process_videos.sh

#!/bin/bash

# Set input and output directories
INPUT_DIR="/input"
OUTPUT_DIR="/output"

# Ensure the output directory exists
mkdir -p "$OUTPUT_DIR"

echo "Starting video enhancement tasks..."
# Enable nullglob to handle cases where no files match a pattern
shopt -s nullglob

# Find all supported video files
video_files=("$INPUT_DIR"/*.mp4 "$INPUT_DIR"/*.mov "$INPUT_DIR"/*.avi)

# Check if any video files were found
if [ ${#video_files[@]} -eq 0 ]; then
    echo "No supported video files found in $INPUT_DIR (.mp4, .mov, .avi)."
    exit 0
fi

echo "Found ${#video_files[@]} video files. Starting processing..."

# Loop through the found video files
for INPUT_VIDEO in "${video_files[@]}"; do
    # Check if the file exists and is a regular file
    if [ ! -f "$INPUT_VIDEO" ]; then
        continue
    fi

    echo "--- Processing: $INPUT_VIDEO ---"

    # --- Configuration ---
    SCALE=4 # Upscaling scale
    MODEL="realesrgan-x4plus" # Model to use
    GPU_ID=0 # GPU index to use
    REALESRGAN_EXEC="/app/realesrgan-ncnn-vulkan" # Path to Real-ESRGAN executable

    # Create a temporary working directory with separate folders for
    # source frames and upscaled frames, so the upscaler's input
    # directory contains nothing but images
    WORK_DIR=$(mktemp -d)
    mkdir -p "$WORK_DIR/in" "$WORK_DIR/out"
    echo "Working directory: $WORK_DIR"

    # Define output video path
    BASE_NAME=$(basename "$INPUT_VIDEO")
    OUTPUT_VIDEO="$OUTPUT_DIR/${BASE_NAME%.*}_enhanced_x${SCALE}.mp4"

    # --- 1. Extract Frames ---
    echo "Step 1: Splitting video into frames..."
    ffmpeg -i "$INPUT_VIDEO" -qscale:v 1 -qmin 1 -vsync 0 "$WORK_DIR/in/in_%08d.png"
    if [ $? -ne 0 ]; then
        echo "Error: ffmpeg failed to split '$INPUT_VIDEO'."
        rm -rf "$WORK_DIR" # Clean up temp files
        continue # Continue to the next video
    fi

    # --- 2. Process Frames with AI Model ---
    echo "Step 2: Upscaling frames using Real-ESRGAN..."
    $REALESRGAN_EXEC \
        -i "$WORK_DIR/in" \
        -o "$WORK_DIR/out" \
        -n "$MODEL" \
        -s "$SCALE" \
        -g "$GPU_ID" \
        -f png
    if [ $? -ne 0 ]; then
        echo "Error: Real-ESRGAN failed to process frames for '$INPUT_VIDEO'."
        rm -rf "$WORK_DIR"
        continue
    fi

    # --- 3. Assemble Video ---
    echo "Step 3: Assembling processed frames into video..."
    # r_frame_rate may be fractional (e.g. 30000/1001); ffmpeg's -framerate accepts that form directly
    FPS=$(ffprobe -v error -select_streams v:0 -of default=noprint_wrappers=1:nokey=1 -show_entries stream=r_frame_rate "$INPUT_VIDEO")
    # Attempt to use NVIDIA GPU hardware encoding
    # (-y overwrites any partial output left by a failed earlier run;
    #  -pix_fmt yuv420p keeps the result playable on most devices)
    ffmpeg -y -framerate "$FPS" -i "$WORK_DIR/out/in_%08d.png" \
           -i "$INPUT_VIDEO" -map 0:v:0 -map 1:a:0? -c:a copy \
           -c:v h264_nvenc -preset slow -cq 18 -pix_fmt yuv420p "$OUTPUT_VIDEO"
    if [ $? -ne 0 ]; then
        echo "Warning: NVIDIA hardware encoding (h264_nvenc) failed. Trying CPU encoding (libx264)..."
        # Fallback to CPU encoding if GPU encoding fails
        ffmpeg -y -framerate "$FPS" -i "$WORK_DIR/out/in_%08d.png" \
               -i "$INPUT_VIDEO" -map 0:v:0 -map 1:a:0? -c:a copy \
               -c:v libx264 -preset medium -crf 22 -pix_fmt yuv420p "$OUTPUT_VIDEO"
    fi

    # --- 4. Cleanup ---
    echo "Step 4: Cleaning up temporary files..."
    rm -rf "$WORK_DIR"

    echo "--- Finished processing: $INPUT_VIDEO. Output file: $OUTPUT_VIDEO ---"
done

echo "All videos processed."
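
One practical caveat: the `mktemp -d` directory lives inside the container's writable layer, and lossless PNG frames add up fast, so make sure the Docker host has room. A rough back-of-envelope estimate (numbers hypothetical):

```shell
# A 10-minute clip at 30 fps yields 18000 source frames, plus the same
# number of upscaled frames at 16x the pixel count (4x per dimension).
duration_min=10
fps=30
frames=$(( duration_min * 60 * fps ))
echo "$frames frames"   # 18000 frames
```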

How to Run It

1. Build the Image

In the directory containing the Dockerfile and the process_videos.sh script, run:

docker build -t video-enhancer .

2. Run the Container

Prepare a folder named videos_to_process with your source files and an empty enhanced_videos folder. Then run:

docker run --gpus all --rm \
  -v "$(pwd)/videos_to_process:/input" \
  -v "$(pwd)/enhanced_videos:/output" \
  video-enhancer

The container will spin up, process every video in the input folder, save the 4x upscaled versions to the output folder, and then shut down cleanly. Happy upscaling!
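
To sanity-check a result, ffprobe can report the new resolution (the filename below is just an example of the script's naming scheme):

```shell
# Prints "width,height" of the first video stream, e.g. 2560,1440
# for a source that was 640x360.
ffprobe -v error -select_streams v:0 -show_entries stream=width,height \
  -of csv=p=0 enhanced_videos/myclip_enhanced_x4.mp4
```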