NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3D Gaussian Reconstruction

Published in IEEE TRANSACTIONS ON MEDICAL IMAGING, 2024

Figure: Visualization of the NeeCo model.

Computer vision-based technologies significantly enhance the automation capabilities of robotic-assisted minimally invasive surgery by advancing tool tracking, detection, and localization. However, these techniques require large amounts of training data and are constrained by the scarcity of high-quality labeled image datasets. The highly dynamic nature of surgical scenes poses a further challenge to image synthesis methods. This work introduces a novel method based on 3D Gaussian Splatting to overcome the scarcity of surgical image datasets. We propose a dynamic 3D Gaussian model to represent dynamic surgical scenes, enabling the rendering of surgical instruments from unseen viewpoints and with unseen deformations against real tissue backgrounds. Using a dynamic training adjustment strategy, we address the challenges posed by poorly calibrated camera poses in real-world dynamic scenes. In addition, we propose a method based on the dynamic Gaussians to automatically generate annotations for our synthetic data. To evaluate the method, we construct a new dataset of 7 scenes and 14,000 frames that records tool and camera motion as well as articulation of the tool jaw, with an ex-vivo porcine model as the background. Using this dataset, we synthetically replicate the deformed instruments of the ground-truth data, allowing direct comparison of synthetic image quality. Experimental results show that our method generates photo-realistic labeled image datasets (29.87 PSNR). We further compare U-Net and YOLO models trained on real, synthetic, and mixed synthetic images, respectively, by assessing their performance on an unseen real-world image dataset. Our results show that models trained on synthetic images and models trained on real images differ by less than 1.5% across various metrics, while the model trained on the mixed synthetic dataset improves performance by nearly 10%.
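To make the two core ideas above more concrete, the sketch below illustrates, under our own simplified assumptions rather than as the paper's implementation, (1) a set of 3D Gaussians whose instrument subset is deformed by a small time-conditioned MLP while the tissue background stays static, and (2) automatic label generation by projecting the deformed instrument Gaussians into the image plane to form an approximate segmentation mask. All names (`DeformationField`, `render_instrument_mask`, the camera intrinsics, etc.) are hypothetical; a real pipeline would splat full anisotropic 2D covariances instead of points.

```python
# Minimal illustrative sketch (PyTorch), not the authors' code.
import torch
import torch.nn as nn


class DeformationField(nn.Module):
    """Tiny MLP mapping (canonical position, time) -> position offset."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, xyz: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) canonical Gaussian centres, t: scalar time in [0, 1]
        t_col = t.reshape(1, 1).expand(xyz.shape[0], 1)
        return self.net(torch.cat([xyz, t_col], dim=-1))


def deform_instrument(xyz, is_instrument, field, t):
    """Apply the time-conditioned offset only to instrument Gaussians;
    the (static) tissue background keeps its canonical positions."""
    offset = field(xyz, t)
    return xyz + offset * is_instrument.unsqueeze(-1)


def render_instrument_mask(xyz_cam, is_instrument, fx, fy, cx, cy, height, width):
    """Project instrument Gaussian centres (already in camera coordinates)
    with a pinhole model and rasterise them into a coarse binary mask,
    giving a free segmentation label for the rendered frame."""
    mask = torch.zeros(height, width)
    pts = xyz_cam[is_instrument & (xyz_cam[:, 2] > 1e-6)]
    u = (fx * pts[:, 0] / pts[:, 2] + cx).round().long()
    v = (fy * pts[:, 1] / pts[:, 2] + cy).round().long()
    keep = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    mask[v[keep], u[keep]] = 1.0
    return mask


if __name__ == "__main__":
    n = 500
    xyz = torch.rand(n, 3) * 2 - 1          # canonical Gaussian centres
    is_instr = torch.rand(n) < 0.3          # ~30% belong to the instrument
    field = DeformationField()

    t = torch.tensor(0.5)                   # normalised timestamp of the target frame
    xyz_t = deform_instrument(xyz, is_instr, field, t)

    # Assume the camera looks down +z from 3 units behind the scene.
    xyz_cam = xyz_t + torch.tensor([0.0, 0.0, 3.0])
    label = render_instrument_mask(xyz_cam, is_instr, 300.0, 300.0, 128.0, 128.0, 256, 256)
    print(label.shape, int(label.sum()))
```

In this toy setup the deformation field would be optimised jointly with the Gaussian parameters against the captured frames; once trained, sweeping the time input and the camera pose yields novel instrument states and viewpoints, and the projected instrument Gaussians provide the corresponding labels at no extra annotation cost.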