Commit 6e93712b authored by Muhammad Hazimi Bin Yusri ("init commit")

README.md:
# Immersive Audio-Visual VR Scene Reproduction
This project aims to reconstruct 3D geometry and acoustic properties of environments from a single 360° image for plausible audio-visual VR reproduction. It builds on the seminal work of Dr. Hansung Kim and extends research done by University of Southampton 4th year CS and EE students.
## Project Structure
The repository is structured as follows:
- `360monodepth/`: Submodule for monocular 360° depth estimation (Docker-based)
- `Atiyeh-RIR-evaluation-Matlab/`: MATLAB scripts for room impulse response (RIR) audio analysis
- `AVVR-Papers/`: Related research papers
- `edgenet360/`: Submodule for mesh generation (WSL-based)
- `Data/`: Directory for input images
- `Output/`: Directory for generated meshes in .obj format
- `Intern-logs/`: Weekly logs from internship work, including the AudioResult Excel spreadsheet
- `Internship-Report.pdf`: 10-week internship technical report
- `material_recognition/`: Submodule for material recognition using Dynamic Backward Attention Transformer
- `RIR_Analysis/`: Python notebook for sine sweep generation and deconvolution, by Mona
- `scripts/`: Automation and integration scripts
- `Unity/`:
- `AV-VR/`: Main Unity project folder, extending GDP work
- `S3A/`: Dr. Hansung's original Unity project for reference (Steam Audio integration, sound source positioning)
## Key Files
- `scripts/config.ini`: Modify the values in this file to match your system
- `scripts/GUI.py`: Main script to run after following the setup instructions
- `AVVR-Papers/report.pdf`: The 23/24 GDP group's report
- `Manual.docx` / `Manual.pdf`: User manual provided by the GDP group
- `Intern-logs/Internship-Report.pdf`: 10-week internship technical report
- `.gitignore`: Lists files and directories to be ignored by Git
- `.gitmodules`: Defines submodule configurations
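The keys read from `scripts/config.ini` by the pipeline scripts are `condaDir`, `wslAnacondaDir`, `materialEnv`, `edgeNetEnv`, and `unityEnv` (parsed by `combined.bat`) plus `imageId` (parsed by the PowerShell scripts in `scripts/360monodepthexecution`). A sketch of the expected shape, with placeholder values that will differ on your system:

```ini
; scripts/config.ini - plain key=value lines, no [section] headers
; (combined.bat splits each line on "=", so avoid spaces around it)
condaDir=C:\Users\you\anaconda3
wslAnacondaDir=/home/you/anaconda3/bin
materialEnv=dbat
edgeNetEnv=tf2
unityEnv=unity
imageId=360monodepth
```

The environment names here are placeholders; use whatever names your `environment.yml`, `tf2_new_env.yml`, and `unity_conda_env.yml` files actually create.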
## Getting Started (Setup)
1. Clone the project repository:
```cmd
git clone https://git.soton.ac.uk/gdp-project-4/AVVR-Pipeline-GDP4.git
cd AVVR-Pipeline-GDP4
```
2. Update submodules:
```cmd
git submodule update --init --recursive
```
3. Set up environments:
a. For material recognition/DBAT (uses conda):
```cmd
cd material_recognition\Dynamic-Backward-Attention_Transformer
conda env create -f environment.yml
```
Create the checkpoint folders:
```cmd
mkdir checkpoints\dpglt_mode95\accuracy checkpoints\swin_pretrain
```
Download the pre-trained [checkpoints](https://drive.google.com/file/d/1ov6ol7A4NU8chlT3oEwx-V51gbOU7GGD/view?usp=sharing) into `checkpoints\dpglt_mode95\accuracy`, and [swin_tiny_patch4_window7_224.pth](https://storage.openvinotoolkit.org/repositories/open_model_zoo/public/2022.1/swin-tiny-patch4-window7-224/?sort_by=NEW2OLD) into `checkpoints\swin_pretrain`.
b. For blenderFlip.py (uses conda):
```cmd
cd scripts
conda env create -f unity_conda_env.yml
```
c. For edgenet360 (uses WSL):
- Install WSL and Anaconda following these instructions: https://info.stat.cmu.edu/index.php?title=Windows_Subsystem_for_Linux_(WSL)_and_Python
- Make sure the `wsl` command invoked from cmd is the distribution with Anaconda installed
- Then create the tf2 environment:
```cmd
cd edgenet360
conda env create -f tf2_new_env.yml
```
- Download the weights from [here](https://gitlab.com/UnBVision/edgenet360/-/tree/master/weights?ref_type=heads) and place them in the `edgenet360/weights` folder if they are not already there
d. For 360monodepth (uses Docker):
- Install Docker
- Build and run the Docker container:
```cmd
cd 360monodepth
docker build -t 360monodepth .
docker run -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 360monodepth sh -c "cd /monodepth/code/python/src; python3 main.py --expname test_experiment --blending_method all --grid_size 8x7"
```
4. Configure paths:
Edit `scripts/config.ini` to set the binary directories for Anaconda on Windows and in WSL.
5. Run the main ML pipeline:
```cmd
cd scripts
python GUI.py
```
6. GUI choices
![GUI](Intern-Logs/Readme_Images/GUI.png)
- Tick "Create a depth map" to run depth estimation; tick "Include Top" for a mesh (.obj) with a ceiling.
- Choose an image from one of the scene folders (KT, ST, UL, MR, LR) in `edgenet360/Data`
- The pipeline should run for about 5-15 minutes depending on system spec.
- The resulting .obj will be written to the `edgenet360/Output` folder as `final_output_scene_mesh.obj`
Refer to `Manual.pdf` for detailed prerequisites and setup instructions for the ML pipeline and Unity VR rendering, if needed.
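Note that `scripts/config.ini` is parsed as plain `key=value` lines: `combined.bat` splits each line on `=`, and the PowerShell helpers match lines like `imageId=...`, so the file has no `[section]` headers. A minimal Python sketch of that parsing convention (the `load_config` helper is illustrative, not part of the repo):

```python
def load_config(text):
    """Parse plain key=value lines, ignoring blanks and comment lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if "=" in line:
            # Split on the first "=" only, mirroring the batch/PowerShell parsing
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config

# Placeholder values; your paths will differ
sample = """\
condaDir=C:\\Users\\me\\anaconda3
wslAnacondaDir=/home/me/anaconda3/bin
imageId=360monodepth
"""
cfg = load_config(sample)
print(cfg["imageId"])  # -> 360monodepth
```

Because the batch parser uses `delims==`, keeping values free of stray spaces around `=` is the safest format.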
## Pipeline Overview
![image](Intern-Logs/Readme_Images/Pipeline-Overview.png)
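As a rough textual companion to the overview image, the stage order below is taken from `combined.bat` and `masterscript.ps1` in this repo; the grouping into seven stages is illustrative:

```python
# Sketch of the pipeline stage order, as orchestrated by
# scripts/GUI.py -> combined.bat (masterscript.ps1 drives the Docker steps).
PIPELINE_STAGES = [
    ("360monodepth", "estimate a 360 depth map in Docker (masterscript.ps1)"),
    ("split_img.py", "split the 360 image into patches"),
    ("train_sota.py --test", "material recognition (DBAT) on the patches"),
    ("combine_img.py", "merge patch predictions into material.png"),
    ("enhance360.py / infer360.py", "EdgeNet360 depth enhancement and mesh inference (WSL)"),
    ("replace.py", "mesh post-processing"),
    ("blenderFlip.py", "flip normals and export final_output_scene_mesh.obj"),
]

for i, (tool, purpose) in enumerate(PIPELINE_STAGES, start=1):
    print(f"{i}. {tool}: {purpose}")
```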
## Video Demonstration
[KT and ST scene demo](https://youtu.be/CDhq749k1hQ)
## Contributing
To contribute to this project:
1. Fork the necessary submodules if you are making changes to them.
2. Create an issue describing the changes you propose.
3. Submit a pull request referencing the issue.
Please ensure your code adheres to the project's coding standards and includes appropriate documentation.
## Acknowledgements
This work is based on Dr. Hansung Kim's research at the University of Southampton and extends the Group Design Project by 4th year CS and EE students.
For more information on the foundational work, please visit:
- [3D Kim VR Research (EUSIPCO)](http://3dkim.com/research/VR/EUSIPCO.html)
- [3D Kim VR Research Main Page](http://3dkim.com/research/VR/index.html)
[Github repo link for previous work TBD]
## Future Work
- Enhance monodepth depth image to fit better with EdgeNet360
- Remove unnecessary files to reduce git repository size
- Export the whole pipeline as a single executable with no prerequisites or setup (ambitious goal)
## License
Currently, this project is under the MIT License. However, considering that it builds upon existing research work, we are reviewing the most appropriate license that respects the original contributions while allowing for further research and development.
Note: The license may be subject to change in the future to better align with research and collaborative requirements.
# scripts/360monodepthexecution/dockercopy.ps1
# Read Docker image ID from config.ini
$configPath = "..\config.ini"
$imageNameOrId = (Get-Content $configPath | Where-Object { $_ -match "imageId=" }).Split('=')[1].Trim()
# Docker container name or ID
$containerNameOrId = docker ps --filter "ancestor=$imageNameOrId" --format "{{.ID}}"
# Path to the file you want to copy to the container
$localFilePath = "rgb.jpg"
# Destination path inside the container
$containerDestinationPath = "/monodepth/data/erp_00/"
# Delete any existing images from input
docker exec -it $containerNameOrId bash -c "find $containerDestinationPath -type f \( -name '*.jpg' -o -name '*.png' \) -exec rm -f {} +"
# Copy the file to the Docker container
docker cp "$localFilePath" "$containerNameOrId`:$containerDestinationPath"
"../../../data/erp_00/$localFilePath None" | Out-File -FilePath "erp_00_data.txt" -Encoding ascii
docker exec -it $containerNameOrId rm /monodepth/data/erp_00_data.txt
docker cp "erp_00_data.txt" "$containerNameOrId`:/monodepth/data/"
Remove-Item erp_00_data.txt
# scripts/360monodepthexecution/executemonodepth.ps1
# Read Docker image ID from config.ini
$configPath = "..\config.ini"
$imageNameOrId = (Get-Content $configPath | Where-Object { $_ -match "imageId=" }).Split('=')[1].Trim()
# Grabbing the running container ID for the executed image
$containerId = docker ps --filter "ancestor=$imageNameOrId" --format "{{.ID}}"
docker exec -it $containerId bash -c "cd code/python/src && python3 main.py --expname test_experiment --blending_method all --grid_size 8x7"
# scripts/360monodepthexecution/extractDepthMap.ps1
# Read Docker image ID from config.ini
$configPath = "..\config.ini"
$imageNameOrId = (Get-Content $configPath | Where-Object { $_ -match "imageId=" }).Split('=')[1].Trim()
# Get the directory of the current script
$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
# Calculate paths
$rootDir = Split-Path -Parent (Split-Path -Parent $scriptDir)
$outputDir = Join-Path $rootDir "edgenet360\Data\Input"
# Docker container name or ID
$containerNameOrId = docker ps --filter "ancestor=$imageNameOrId" --format "{{.ID}}"
$outputPath = Join-Path $outputDir "depth_e.png"
Remove-Item $outputPath -ErrorAction SilentlyContinue
docker cp "$containerNameOrId`:/monodepth/results/test_experiment/000_360monodepth_midas2_frustum.png" $outputPath
# scripts/360monodepthexecution/masterscript.ps1
# Launch the interactive Docker container in a separate PowerShell window
Invoke-Expression 'cmd /c start powershell -Command { .\triggerinteractive.ps1 }'
Start-Sleep -Seconds 10
Start-Process powershell "-File .\dockercopy.ps1" -Wait
Start-Process powershell "-File .\executemonodepth.ps1" -Wait
Start-Process powershell "-File .\extractDepthMap.ps1" -Wait
Start-Process powershell "-File .\stopDockerContainer.ps1" -Wait
# scripts/360monodepthexecution/stopDockerContainer.ps1
# Read Docker image ID from config.ini
$configPath = "..\config.ini"
$imageNameOrId = (Get-Content $configPath | Where-Object { $_ -match "imageId=" }).Split('=')[1].Trim()
# Docker container name or ID
$containerNameOrId = docker ps --filter "ancestor=$imageNameOrId" --format "{{.ID}}"
# Stops the container
docker stop $containerNameOrId
# Removes the container
docker rm $containerNameOrId
# scripts/360monodepthexecution/triggerinteractive.ps1
# Read Docker image ID from config.ini
$configPath = "..\config.ini"
$imageNameOrId = (Get-Content $configPath | Where-Object { $_ -match "imageId=" }).Split('=')[1].Trim()
# Run image in interactive mode
docker run -it $imageNameOrId bash
# EPN3D dataset configuration (YAML)
NAME: EPN3D
N_POINTS: 2048
CATEGORY_FILE_PATH: ./data/EPN3D/EPN3D.json
PARTIAL_POINTS_PATH: ./data/EPN3D/%s/partial/%s.npy
COMPLETE_POINTS_PATH: ./data/EPN3D/%s/complete/%s.npy
# scripts/GUI.py
import tkinter as tk
import tkinter.filedialog
import subprocess
import sys
import time
from threading import Thread
import shutil
import os

# Get the directory of the current script
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
# Get the root directory (AVVR-Pipeline-Internship)
ROOT_DIR = os.path.dirname(SCRIPT_DIR)

file_path = None
createDepth = "0"

def shift_image_selection():
    # This function can be used if you want to perform any action when the checkbox is clicked
    pass

def copy_intermediary_outputs():
    source_folder = os.path.join(ROOT_DIR, "edgenet360", "Data", "Input")
    destination_folder = os.path.join(ROOT_DIR, "edgenet360", "Output")
    files_to_copy = ["depth_e.png", "enhanced_depth_e.png", "material.png", "rgb.png"]
    for file_name in files_to_copy:
        source_path = os.path.join(source_folder, file_name)
        destination_path = os.path.join(destination_folder, file_name)
        try:
            shutil.copy(source_path, destination_path)
            print(f"Copied {file_name} to {destination_folder}")
        except FileNotFoundError:
            print(f"Warning: {file_name} not found in {source_folder}")

def select_Image(event):
    global file_path
    file_path = tkinter.filedialog.askopenfilename()
    file_path = os.path.normpath(file_path)
    select_button.configure(text="Selected", bg="red")
    label.configure(text="Image is selected. Press run to create scene.")

def depthmap_creation():
    print("Checked ", check.get())

def stanfordRoom_selection():
    if checkStanford.get() == 1:
        global stanford_frame
        stanford_frame = tk.Frame(window)
        stanford_frame.pack(fill=tk.X, padx=5, pady=5)
        global labelRoomArea
        labelRoomArea = tk.Label(stanford_frame, text="Please Input Room Area: ")
        labelRoomArea.pack(side="left")
        global stanford_text
        stanford_text = tk.Entry(stanford_frame)
        stanford_text.pack(side="left", fill=tk.X, expand=True)
    else:
        stanford_frame.pack_forget()
        select_button.pack(side="top", fill=tk.X, expand=True, padx=5, pady=5)
        run_button.pack(side="top", fill=tk.X, expand=True, padx=5, pady=5)

def run_Image(event):
    if checkStanford.get() == 0:
        label.configure(text="Pipeline is running. Creating scene...", height=15)
    else:
        label.configure(text="Pipeline is running for Stanford2D3D dataset. Creating scene...", height=15)
        labelRoomArea.configure(text="Room Area Running : ")
        stanford_text.configure(state="disabled")
    select_button.pack_forget()
    run_button.pack_forget()
    depth_check.pack_forget()
    include_top_check.pack_forget()
    stanford_check.pack_forget()
    shift_image_check.pack_forget()
    threading()

def runProcess():
    global file_path
    include_top_option = "y" if include_top.get() == 1 else ""
    shift_image_option = "y" if shift_image.get() == 1 else ""
    try:
        if checkStanford.get() == 0:
            combined_bat = os.path.join(SCRIPT_DIR, "combined.bat")
            print(f"Attempting to run: {combined_bat}")
            print(f"With arguments: {file_path}, {str(check.get())}, {include_top_option}, {shift_image_option}")
            p = subprocess.Popen(
                [combined_bat, file_path, str(check.get()), include_top_option, shift_image_option],
                stdout=sys.stdout)
            p.communicate()
        else:
            temp = os.path.split(file_path)
            suffices = temp[-1].split("_")
            camera_pos = str(suffices[1])
            room_name = suffices[2] + "_" + suffices[3]
            room_area = stanford_text.get()
            print(room_area, room_name, camera_pos)
            combined_stanford_bat = os.path.join(SCRIPT_DIR, "combined_stanford.bat")
            p = subprocess.Popen(
                [combined_stanford_bat, file_path, camera_pos, str(room_area), room_name],
                stdout=sys.stdout)
            p.communicate()
        copy_intermediary_outputs()
        label.configure(text="Pipeline execution complete, check output folder.")
    except Exception as e:
        print(f"An error occurred: {e}")
        label.configure(text=f"An error occurred: {e}")
    try:
        labelRoomArea.pack_forget()
        stanford_text.pack_forget()
    except Exception as e:
        print(e)

def threading():
    thread1 = Thread(target=runProcess)
    thread1.start()

window = tk.Tk()
window.title("Immersive VR scene creator")

check = tk.IntVar()
checkStanford = tk.IntVar()
include_top = tk.IntVar()
shift_image = tk.IntVar()

label = tk.Label(
    text="Please Input a RGB image for scene creation",
    foreground="black",
    background="white",
    width=50,
    height=10,
)
select_button = tk.Button(
    text="Select",
    width=50,
    height=5,
    bg="green",
    fg="white",
)
run_button = tk.Button(
    text="Run",
    width=50,
    height=5,
    bg="green",
    fg="white",
)

depth_check = tk.Checkbutton(window, text='Create a depth map (360MonoDepth)', variable=check, onvalue=1, offvalue=0, command=depthmap_creation)
stanford_check = tk.Checkbutton(window, text='Run for Stanford2D3D dataset', variable=checkStanford, onvalue=1, offvalue=0, command=stanfordRoom_selection)
include_top_check = tk.Checkbutton(window, text='Include Top in Mesh', variable=include_top, onvalue=1, offvalue=0)
shift_image_check = tk.Checkbutton(window, text='Shift input image', variable=shift_image, onvalue=1, offvalue=0, command=shift_image_selection)

label.pack()
depth_check.pack()
stanford_check.pack()
include_top_check.pack()
shift_image_check.pack()
select_button.pack()
run_button.pack()

select_button.bind('<Button-1>', select_Image)
run_button.bind('<Button-1>', run_Image)

window.mainloop()
# scripts/blenderFlip.py
"""Script used to automate flipping mesh normals such that the mesh is visible in Unity."""
import bpy
import pathlib
import sys
from _ctypes import ArgumentError

def generate_flipped_mesh(obj_file_path):
    # Delete the default cube object
    bpy.ops.object.select_all(action='DESELECT')
    bpy.data.objects['Cube'].select_set(True)
    bpy.ops.object.delete()
    # Import mesh
    str_filepath = obj_file_path.as_posix()
    bpy.ops.wm.obj_import(filepath=str_filepath, import_vertex_groups=True)
    # Flip normals
    bpy.ops.object.editmode_toggle()
    bpy.ops.mesh.select_all(action='SELECT')
    bpy.ops.mesh.flip_normals()
    # Translate object to "floor" height
    for obj in bpy.context.selected_objects:
        mtx_w = obj.matrix_world
        z_diff = min((mtx_w @ v.co).z for v in obj.data.vertices)
        mtx_w.translation.z -= z_diff
    # Save the object
    output_file_path = obj_file_path.with_stem("final_output_scene_mesh")
    str_filepath = output_file_path.as_posix()
    bpy.ops.wm.obj_export(filepath=str_filepath,
                          export_material_groups=True)

def get_filepath(input_str):
    candidate = pathlib.Path(*input_str).resolve()
    if candidate.exists():
        return candidate
    raise ArgumentError("File not found.")

if __name__ == "__main__":
    # Get filepath to obj file
    if len(sys.argv) > 2:
        raise ArgumentError("Supply one argument (folder path) or none.")
    obj_file_path = get_filepath(sys.argv[1:])
    generate_flipped_mesh(obj_file_path)
:: scripts/combined.bat
@echo off
SETLOCAL ENABLEDELAYEDEXPANSION
:: Input parameters
set inputFilePath=%~1
set depthVar=%2
set includeTop=%3
set shiftImage=%4
:: Read config file
for /f "tokens=1,2 delims==" %%a in (config.ini) do (
    if "%%a"=="condaDir" set condaDir=%%b
    if "%%a"=="wslAnacondaDir" set wslAnacondaDir=%%b
    if "%%a"=="materialEnv" set materialEnv=%%b
    if "%%a"=="edgeNetEnv" set edgeNetEnv=%%b
    if "%%a"=="unityEnv" set unityEnv=%%b
)
:: Directory structure
set scriptDir=%~dp0
set workingDir=%scriptDir%..
set monoDepthDir=%workingDir%\scripts\360monodepthexecution
set outputDir=%workingDir%\edgenet360\Output
set materialRecogDir=%workingDir%\material_recognition\Dynamic-Backward-Attention-Transformer
set edgeNetDir=%workingDir%\edgenet360
:: File paths
set checkpointFile=%materialRecogDir%\checkpoints\dpglt_mode95\accuracy\epoch=126-valid_acc_epoch=0.87.ckpt
set shiftedImage=%scriptDir%shifted_t.png
set monoDepthImage=%monoDepthDir%\rgb.jpg
:: Ensure the input file exists
if not exist "%inputFilePath%" (
    echo Error: Input file does not exist: %inputFilePath%
    exit /b 1
)
:: Check if the checkpoint file exists
if not exist "%checkpointFile%" (
    echo Error: Checkpoint file not found: %checkpointFile%
    echo Please ensure the checkpoint file is in the correct location.
    exit /b 1
)
:: Shift the image if the option is selected
if /I "%shiftImage%"=="y" (
    echo Shifting the input image...
    call "%condaDir%\condabin\activate.bat" base
    python "%scriptDir%shifter.py" "%inputFilePath%" "%shiftedImage%"
    call "%condaDir%\condabin\deactivate.bat"
    set "processFilePath=%shiftedImage%"
) else (
    set "processFilePath=%inputFilePath%"
)
echo Processing file: %processFilePath%
:: Copy the input file to 360monodepthexecution directory as rgb.jpg
copy "%processFilePath%" "%monoDepthImage%" /Y
if errorlevel 1 (
    echo Error: Failed to copy file to 360monodepthexecution directory.
    exit /b 1
)
:: Run depth estimation if required
if %depthVar%==1 (
    echo Running depth estimation...
    pushd "%monoDepthDir%"
    powershell.exe -File "masterscript.ps1"
    popd
)
:: Splitting 360 Image
echo Splitting 360 image...
pushd "%materialRecogDir%"
call "%condaDir%\condabin\activate.bat" base
python split_img.py "%processFilePath%"
call "%condaDir%\condabin\deactivate.bat"
popd
:: Running Material Recognition
echo Running material recognition...
pushd "%materialRecogDir%"
call "%condaDir%\condabin\activate.bat" %materialEnv%
python train_sota.py --data-root "./datasets" --batch-size 1 --tag dpglt --gpus 1 --num-nodes 1 --epochs 200 --mode 95 --seed 42 --test "%checkpointFile%" --infer "%materialRecogDir%/split_output/"
if errorlevel 1 (
    echo Error: Material recognition failed. Please check the output above for more details.
    call "%condaDir%\condabin\deactivate.bat"
    exit /b 1
)
call "%condaDir%\condabin\deactivate.bat"
popd
:: Combining Material Recognition output
pushd "%materialRecogDir%"
echo Combining material recognition output...
call "%condaDir%\condabin\activate.bat" base
python "%materialRecogDir%\combine_img.py"
call "%condaDir%\condabin\deactivate.bat"
popd
:: Run EdgeNet
echo Running EdgeNet...
pushd "%edgeNetDir%"
set "includeTopFlag="
if /I "%includeTop%"=="y" set "includeTopFlag=--include_top y"
wsl bash -c "source %wslAnacondaDir%/activate %edgeNetEnv% && python enhance360.py Input depth_e.png rgb.png enhanced_depth_e.png && python infer360.py Input enhanced_depth_e.png material.png rgb.png Input %includeTopFlag%"
wsl bash -c "source %wslAnacondaDir%/deactivate"
popd
:: Run mesh splitting
echo Running mesh splitting...
pushd "%edgeNetDir%"
call "%condaDir%\condabin\activate.bat" base
python "%edgeNetDir%\replace.py"
call "%condaDir%\condabin\deactivate.bat"
popd
:: Run Blender flip
echo Running Blender flip...
pushd "%edgeNetDir%"
call "%condaDir%\condabin\activate.bat" %unityEnv%
python "%scriptDir%blenderFlip.py" "%outputDir%\Input_prediction_mesh.obj"
call "%condaDir%\condabin\deactivate.bat"
popd
:: Clean up temporary files
::if exist "%shiftedImage%" del "%shiftedImage%"
::if exist "%monoDepthImage%" del "%monoDepthImage%"
echo Processing complete.
ENDLOCAL
pause