
Fast and Accurate 4D Modeling of Large Multi-Camera Sequences

Abstract : Recent advances in acquisition and processing technologies have led to the fast growth of a major branch of media production: volumetric video. In particular, the rise of virtual and augmented reality fuels an increased need for content suited to these new media, including 3D content obtained from real scenes, as the ability to record a live performance and replay it from any given point of view allows the user to experience a realistic and immersive environment. This manuscript presents the problem of 4D shape reconstruction from multi-view RGB images, which is one way to create such content. We especially target real-life performance capture, containing complex surface details. Typical challenges in these capture situations include smaller visual projection areas of the objects of interest, due to the wider fields of view needed to capture motion; occlusion and self-occlusion of several subjects interacting together; lack of texture content typical of real-life subject appearance and clothing; and motion blur with fast-moving subjects, as in sports action scenes. An essential aspect that can still be improved is the fidelity and quality of the recovered shapes, which is our goal in this work. We present a full reconstruction pipeline suited to this scenario, to which we contributed in many respects. First, multi-view stereo (MVS) based methods have attained a good level of quality with pipelines that typically comprise feature extraction, matching stages and 3D shape inference. Interestingly, very recent works have re-examined stereo and MVS by introducing features and similarity functions automatically inferred using deep learning, the main promise of this type of method being to include better data-driven priors. In a first contribution, we examine whether these improvements transfer to the more general and complex case of live performance capture, where a diverse set of additional difficulties arises. 
We then explain how to use this learning strategy to robustly build a shape representation from which a 3D model can be extracted. Once we obtain this representation at every frame of the captured sequence, we discuss how to exploit temporal redundancy for precision refinement by propagating shape details through adjacent frames. In addition to being beneficial to many dynamic multi-view scenarios, this also enables larger scenes, where the increased precision can compensate for the reduced spatial resolution per image frame. The source code implementing the different reconstruction methods is released to the public as open source software.
Keywords : Kinovis, Modeling, 4D

Cited literature : 141 references

Contributor : Abes Star
Submitted on : Wednesday, March 18, 2020 - 5:07:07 PM
Last modification on : Sunday, April 17, 2022 - 3:26:30 AM


Version validated by the jury (STAR)


  • HAL Id : tel-02435385, version 2



Vincent Leroy. Fast and Accurate 4D Modeling of Large Multi-Camera Sequences. Artificial Intelligence [cs.AI]. Université Grenoble Alpes, 2019. English. ⟨NNT : 2019GREAM042⟩. ⟨tel-02435385v2⟩


