SBS to 2D Video Converter with BGM (now version 6.2)

doting.puppet · April 9, 2025, 3:29am

I’ve been working on a project for the past month, and I finally feel like I have something worth sharing with the community.

Sometimes, I just need a 2D version of a VR video, but every software solution I’ve found so far has been more complex than I need. So, I decided to try my hand at creating a Python script that would handle the process according to my needs.

Introducing “SBS to 2D Video Converter with BGM.”

This is essentially a GUI/Frontend for FFMPEG, designed to manage the variables I find most useful when converting VR to 2D.

The script assumes you’re converting MP4 files, and that your machine supports CUDA acceleration. I plan to revise it to support CPU conversion in the future. Currently, it probably only works “properly” in a Windows environment. Unfortunately, I don’t have a Mac to test on, and if you’re using a Linux distro, you probably don’t need my help anyway.
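For the planned CPU fallback, one possible approach is to ask FFmpeg which encoders it was built with and choose accordingly. The sketch below is not part of the script: `pick_video_encoder` and `list_encoders` are hypothetical helpers, and `libx264` is only an assumed choice of software encoder. Keep in mind that an encoder being listed means FFmpeg was built with it, not that a compatible GPU is actually present.

```python
import shutil
import subprocess


def pick_video_encoder(encoders_output):
    """Return 'h264_nvenc' if FFmpeg lists it, else 'libx264' (assumed CPU fallback)."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D h264_nvenc  NVIDIA NVENC H.264 encoder"
        if len(fields) >= 2 and fields[1] == "h264_nvenc":
            return "h264_nvenc"
    return "libx264"


def list_encoders():
    """Run `ffmpeg -encoders` and return its text output ('' if FFmpeg is not on PATH)."""
    if not shutil.which("ffmpeg"):
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    return result.stdout
```

Calling `pick_video_encoder(list_encoders())` would give the value to pass after `-c:v`. When the fallback is chosen, the `-hwaccel cuda` arguments would presumably need to be dropped as well.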

Upon launching the script, a dialog will appear, allowing you to enter values within the following generally usable constraints:

Field of View (100-150, default: 125)
Camera Pitch (-90 to +90)
Initial Duration (0-360 seconds, default: 10 seconds; a short clip is handy for quickly drafting settings to see how they look, and 0 converts the full video)
Bitrate (1-80 Mbps, default: 6)
Background Audio File (This feature was inspired by Maechoon, who often adds background music to their compilation videos, and I’ve come to like it. It probably only works reliably with AAC-encoded audio. Feel free to try MP3 or WAV; it might work.)
Background Audio Volume (percent, default: 5)
Select Eye View (default: left)
Target Video Suffix (default: _2D_Converted)
Append Conversion Settings (Adds the current values to the output video’s filename so you can easily compare different iterations of FOV, pitch, etc.)
Output Width (default: 1920)
Output Height (default: 1080)

Output Width/Height:
It’s probably best to keep the aspect ratio at 16:9. I haven’t tested other resolutions or ratios extensively.
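If you change the width, a small helper can work out the matching 16:9 height. This is only an illustration; `height_for_16_9` is a hypothetical function, not something in the script. It rounds down to an even number because H.264 encoders typically require even dimensions with the `yuv420p` pixel format the script uses.

```python
def height_for_16_9(width):
    """Return the 16:9 height for a given width, rounded down to an even number."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = width * 9 // 16
    # Drop the last pixel row if the result is odd
    return height - (height % 2)
```

For example, a width of 1280 gives a height of 720.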

How to Use:

Select your initial options.
Click “Select Source Video.”
The script will run FFMPEG with the selected settings.
If successful, a popup will appear with a button to view the converted video in your default media player for MP4 files.
If the output looks good, click “Convert Full Video,” and FFMPEG will export the entire clip with your selected settings.
If the video needs adjustments, simply close the success dialog and tweak the values. The “Run Again” option will activate, allowing you to generate another draft.

To Do List:

Looking into other audio formats for BGM mixing
Adding more simple camera controls such as Yaw and Roll
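The `v360` filter call in the script already passes `yaw=0` and `roll=0`, so exposing those two values may mostly be a matter of adding entry fields and substituting them into the filter string. A minimal sketch, using a hypothetical `build_v360_filter` helper that mirrors the filter string from the script; the -180 to +180 limits are an assumption, not something tested:

```python
def build_v360_filter(fov, pitch, yaw=0, roll=0, width=1920, height=1080, eye_view="left"):
    """Build the crop + v360 filter string with adjustable yaw and roll."""
    # Assumed limits; raises ValueError with the field name, like the script's other checks
    for name, value in (("Yaw", yaw), ("Roll", roll)):
        if not -180 <= value <= 180:
            raise ValueError(name)
    crop_x = "0" if eye_view == "left" else "iw/2"  # Select half of SBS view
    return (
        f"crop=w=iw/2:h=ih:x={crop_x}:y=0,"
        "v360=hequirect:flat:in_stereo=2d:out_stereo=2d:iv_fov=180:ih_fov=180:"
        f"d_fov={fov}:pitch={pitch}:yaw={yaw}:roll={roll}:"
        f"w={width}:h={height}:interp=lanczos:reset_rot=1"
    )
```

The returned string would take the place of the value currently passed after `-vf`.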

Disclaimer:

I’m not a professional Python user, and I don’t fully understand FFMPEG, so I can’t guarantee that this script will meet all your needs. However, since it’s written in Python, it’s fully editable for end-users. I’ve tried to document what each section of the code does, but I might have missed a few details.

Convert_SBS_to_2D_Mix_BGM_GUI_v6.2.py

import os
import shutil
import subprocess
import sys  # Needed for the sys.platform check in show_success_dialog
import threading
import tkinter as tk
from tkinter import filedialog, messagebox

# GUI application for converting SBS 3D videos to 2D with optional background music
class VideoConverterApp:
    def __init__(self, master):
        # Create the menu bar
        menubar = tk.Menu(master)

        # File menu
        file_menu = tk.Menu(menubar, tearoff=0)
        file_menu.add_command(label="Reset", command=self.reset_to_defaults)
        file_menu.add_separator()
        file_menu.add_command(label="Exit", command=master.quit)
        menubar.add_cascade(label="File", menu=file_menu)

        # Help menu
        help_menu = tk.Menu(menubar, tearoff=0)
        help_menu.add_command(label="Dependencies", command=self.show_help_dialog)
        menubar.add_cascade(label="Help", menu=help_menu)

        master.config(menu=menubar)
        self.last_input_path = None
        self.last_settings = None
        self.master = master
        master.title("SBS to 2D Video Converter with BGM")

        # Default parameter values
        self.default_fov = 125
        self.default_pitch = 0
        self.default_duration = 10
        self.default_volume = 5
        self.default_tgt_suffix = "_2D_Converted"
        self.default_width = 1920
        self.default_height = 1080
        self.default_bitrate = 6

        # UI state variables
        self.eye_view = tk.StringVar(value="left")  # Select which half of SBS to use
        self.append_settings = tk.BooleanVar(value=False)  # Append settings to filename

        # GUI layout and input fields
        tk.Label(master, text="Field of View (100-150):").grid(row=0, sticky='e')
        self.fov_entry = tk.Entry(master)
        self.fov_entry.insert(0, str(self.default_fov))
        self.fov_entry.grid(row=0, column=1)

        tk.Label(master, text="Camera Pitch (-90 to +90):").grid(row=1, sticky='e')
        self.pitch_entry = tk.Entry(master)
        self.pitch_entry.insert(0, str(self.default_pitch))
        self.pitch_entry.grid(row=1, column=1)

        tk.Label(master, text="Clip Duration (0-360s, 0 = full video):").grid(row=2, sticky='e')
        self.duration_entry = tk.Entry(master)
        self.duration_entry.insert(0, str(self.default_duration))
        self.duration_entry.grid(row=2, column=1)

        tk.Label(master, text="Bitrate (1-80 Mbps):").grid(row=3, sticky='e')
        self.bitrate_entry = tk.Entry(master)
        self.bitrate_entry.insert(0, str(self.default_bitrate))
        self.bitrate_entry.grid(row=3, column=1)

        tk.Label(master, text="Background Audio File:").grid(row=4, sticky='e')
        self.bg_audio_path = tk.Entry(master)
        self.bg_audio_path.grid(row=4, column=1)
        tk.Button(master, text="Browse", command=self.browse_audio).grid(row=4, column=2)

        tk.Label(master, text="Background Audio Volume %:").grid(row=5, sticky='e')
        self.bg_volume = tk.Entry(master)
        self.bg_volume.insert(0, str(self.default_volume))
        self.bg_volume.grid(row=5, column=1)

        tk.Label(master, text="Select Eye View:").grid(row=6, sticky='w', columnspan=2)
        tk.Radiobutton(master, text="Left Eye", variable=self.eye_view, value="left").grid(row=7, column=0, sticky='w')
        tk.Radiobutton(master, text="Right Eye", variable=self.eye_view, value="right").grid(row=7, column=1, sticky='w')

        tk.Label(master, text="Target Video Suffix:").grid(row=8, sticky='e')
        self.tgt_suffix = tk.Entry(master)
        self.tgt_suffix.insert(0, self.default_tgt_suffix)
        self.tgt_suffix.grid(row=8, column=1)

        tk.Checkbutton(master, text="Append Conversion Settings", variable=self.append_settings).grid(row=9, column=0, columnspan=2, sticky='w')

        tk.Label(master, text="Output Width:").grid(row=10, sticky='e')
        self.width_entry = tk.Entry(master)
        self.width_entry.insert(0, str(self.default_width))
        self.width_entry.grid(row=10, column=1)

        tk.Label(master, text="Output Height:").grid(row=11, sticky='e')
        self.height_entry = tk.Entry(master)
        self.height_entry.insert(0, str(self.default_height))
        self.height_entry.grid(row=11, column=1)

        tk.Button(master, text="Select Source Video", command=self.select_video).grid(row=12, columnspan=3, pady=(10, 0))
        self.run_again_button = tk.Button(master, text="Run Again", command=self.run_again, state=tk.DISABLED)
        self.run_again_button.grid(row=13, columnspan=3, pady=(0, 10))

    def browse_audio(self):
        # Opens file dialog to select audio file
        path = filedialog.askopenfilename(filetypes=[("Audio files", "*.aac *.mp3 *.wav")])
        if path:
            self.bg_audio_path.delete(0, tk.END)
            self.bg_audio_path.insert(0, path)

    def select_video(self):
        # Opens file dialog to select a video file
        path = filedialog.askopenfilename(filetypes=[("MP4 files", "*.mp4")])
        if path:
            self.last_input_path = path
            self.run_again_button.config(state=tk.NORMAL)
            self.process_video(path)

    def get_video_duration(self, video_path):
        # Uses ffprobe to get the full duration of the input video
        try:
            result = subprocess.run(
                ["ffprobe", "-v", "error", "-show_entries",
                 "format=duration", "-of", "default=noprint_wrappers=1:nokey=1", video_path],
                stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
            )
            return float(result.stdout.strip())
        except Exception as e:
            messagebox.showerror("Duration Error", f"Could not fetch video duration:\n{str(e)}")
            return None

    def build_ffmpeg_command(self, input_path, output_path, duration, fov, pitch, width, height, bg_audio, bg_volume, bitrate, eye_view):
        # Constructs the ffmpeg command with all necessary filters and settings
        command = ['ffmpeg', '-y', '-hwaccel', 'cuda', '-ss', '0', '-t', str(duration), '-i', input_path]

        # Add audio mixing if a background audio file is provided
        if bg_audio:
            command.extend(['-i', bg_audio])
            audio_filter = f"[1:a]volume={bg_volume}[a1];[0:a][a1]amix=inputs=2:duration=first[aout]"
        else:
            audio_filter = "[0:a]anull[aout]"

        crop_x = "0" if eye_view == "left" else "iw/2"  # Select half of SBS view

        command.extend([
            '-vf', f'crop=w=iw/2:h=ih:x={crop_x}:y=0,v360=hequirect:flat:in_stereo=2d:out_stereo=2d:iv_fov=180:ih_fov=180:d_fov={fov}:pitch={pitch}:yaw=0:roll=0:w={width}:h={height}:interp=lanczos:reset_rot=1',
            '-map', '0:v:0',
            '-c:v', 'h264_nvenc',
            '-b:v', f"{bitrate}M",
            '-pix_fmt', 'yuv420p',
            '-filter_complex', audio_filter,
            '-map', '[aout]',
            '-c:a', 'aac',
            '-b:a', '192k',
            # No inner quotes: list arguments reach ffmpeg verbatim, without a shell
            '-metadata', 'comment=SBS to 2D Video Converter with BGM',
            output_path
        ])
        return command

    def show_success_dialog(self, output_path):
        # Dialog shown after successful conversion
        def open_file():
            try:
                os.startfile(output_path)
            except AttributeError:
                subprocess.run(['open' if sys.platform == 'darwin' else 'xdg-open', output_path])

        def convert_full_video():
            if self.last_settings:
                full_duration = self.get_video_duration(self.last_settings["input_path"])
                if full_duration:
                    self.process_video(self.last_settings["input_path"], override_duration=full_duration)

        success_win = tk.Toplevel(self.master)
        success_win.title("Success")
        tk.Label(success_win, text=f"Converted successfully: {output_path}").pack(padx=10, pady=10)
        tk.Button(success_win, text="Open File", command=open_file).pack(pady=(0, 5))
        tk.Button(success_win, text="Convert Full Video", command=convert_full_video).pack(pady=(0, 10))

    def show_console_output(self, process, output_path):
        # Show the ffmpeg stdout output in real time in a new window
        console_win = tk.Toplevel(self.master)
        console_win.title("FFmpeg Progress")
        text = tk.Text(console_win, wrap='word', height=25, width=100)
        text.pack(fill='both', expand=True)

        def read_output():
            for line in iter(process.stdout.readline, ''):
                if not line:
                    break
                text.insert(tk.END, line)
                text.see(tk.END)
            process.stdout.close()
            process.wait()
            if process.returncode == 0:
                self.show_success_dialog(output_path)
            else:
                messagebox.showerror("FFmpeg Error", f"Process failed with code {process.returncode}.")

        threading.Thread(target=read_output, daemon=True).start()

    def reset_to_defaults(self):
        # Reset all input fields to default values
        self.fov_entry.delete(0, tk.END)
        self.fov_entry.insert(0, str(self.default_fov))

        self.pitch_entry.delete(0, tk.END)
        self.pitch_entry.insert(0, str(self.default_pitch))

        self.duration_entry.delete(0, tk.END)
        self.duration_entry.insert(0, str(self.default_duration))

        self.bitrate_entry.delete(0, tk.END)
        self.bitrate_entry.insert(0, str(self.default_bitrate))

        self.bg_audio_path.delete(0, tk.END)

        self.bg_volume.delete(0, tk.END)
        self.bg_volume.insert(0, str(self.default_volume))

        self.eye_view.set("left")
        self.tgt_suffix.delete(0, tk.END)
        self.tgt_suffix.insert(0, self.default_tgt_suffix)

        self.append_settings.set(False)

        self.width_entry.delete(0, tk.END)
        self.width_entry.insert(0, str(self.default_width))

        self.height_entry.delete(0, tk.END)
        self.height_entry.insert(0, str(self.default_height))

    def show_help_dialog(self):
        # Lists the external tools the script depends on
        messagebox.showinfo("Dependencies",
            "This tool requires the following dependencies:\n"
            "- FFmpeg and ffprobe (must be accessible via system PATH)\n"
            "- CUDA-compatible GPU"
        )

    def run_again(self):
        # Re-run with the last used video path
        if self.last_input_path:
            self.process_video(self.last_input_path)

    def process_video(self, input_path, override_duration=None):
        # Main function to process the video and start ffmpeg
        try:
            if not shutil.which("ffmpeg"):
                raise EnvironmentError("FFmpeg is not installed or not in PATH.")

            fov = int(self.fov_entry.get())
            if not (100 <= fov <= 150): raise ValueError("FOV")
            pitch = int(self.pitch_entry.get())
            if not (-90 <= pitch <= 90): raise ValueError("Pitch")
            width = int(self.width_entry.get())
            height = int(self.height_entry.get())
            duration = override_duration if override_duration is not None else int(self.duration_entry.get())
            if override_duration is None:
                if not (0 <= duration <= 360):
                    raise ValueError("Duration")
                if duration == 0:
                    full = self.get_video_duration(input_path)
                    if full is None:
                        return  # Error already shown
                    duration = int(full)
            bg_volume = float(self.bg_volume.get()) / 100.0
            tgt_suffix = self.tgt_suffix.get()
            bg_audio = self.bg_audio_path.get().strip() or None
            bitrate = int(self.bitrate_entry.get())
            if not (1 <= bitrate <= 80): raise ValueError("Bitrate")

            eye_view = self.eye_view.get()
            append_settings = self.append_settings.get()

            if bg_audio and not os.path.isfile(bg_audio):
                raise FileNotFoundError("The background audio file was not found.")

            self.last_settings = {
                "input_path": input_path,
                "fov": fov,
                "pitch": pitch,
                "width": width,
                "height": height,
                "bg_audio": bg_audio,
                "bg_volume": bg_volume,
                "tgt_suffix": tgt_suffix,
                "eye_view": eye_view
            }

            source_folder = os.path.dirname(input_path)
            converted_folder = os.path.join(source_folder, 'Converted')
            os.makedirs(converted_folder, exist_ok=True)

            # Generate output filename with optional parameter summary
            base_filename = os.path.splitext(os.path.basename(input_path))[0] + tgt_suffix
            if append_settings:
                base_filename += f"_FOV-{fov}_Pitch-{pitch}_Time-{duration}_View-{eye_view}_Bitrate-{bitrate}"
            base_filename += ".mp4"

            output_path = os.path.join(converted_folder, base_filename)

            # Build and execute ffmpeg command
            command = self.build_ffmpeg_command(
                input_path, output_path, duration, fov, pitch, width, height,
                bg_audio, bg_volume, bitrate, eye_view
            )

            process = subprocess.Popen(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                text=True,  # Text mode; same effect as universal_newlines=True
                bufsize=1
            )

            self.show_console_output(process, output_path)

        except ValueError as ve:
            messagebox.showerror("Invalid Input", f"Please check your {ve} input.")
        except FileNotFoundError as fnfe:
            messagebox.showerror("File Error", str(fnfe))
        except EnvironmentError as ee:
            messagebox.showerror("Environment Error", str(ee))
        except Exception as e:
            messagebox.showerror("Error", f"Something went wrong:\n{str(e)}")

# Entry point to launch the GUI
def launch_gui():
    root = tk.Tk()
    app = VideoConverterApp(root)
    root.mainloop()

if __name__ == '__main__':
    launch_gui()

Rose · April 9, 2025, 4:48am

The major lacking feature is VR2Normal’s transitions; being able to export with pans and zooms.
Would love a simplified GUI for that feature specifically and I don’t feel that encoding SBS to 2D without pans and zooms has ever been too daunting a task.

VladTheImplier · April 9, 2025, 1:45pm

Should make a second post with an example and put it in Software

doting.puppet · April 9, 2025, 2:00pm

Sounds good, I’ll do that.

It needs a fair amount of polishing up. I still have all kinds of potentially breaking code in here.

EDIT: On the plus side, I get to see Hikaru Nagi all day long as I work on this…

doting.puppet · April 9, 2025, 4:33pm

Unfortunately, adding panning and zooming is still beyond my skill set.

To be fair, the origin of this script was to help me with the conversion of specific POV scenes in JAV-VR.