Scene Representation and 3D Mesh Coding
For widespread use of 3D video objects in interactive applications, the
scene description needs to be standard-conformant. Since MPEG-4 already
provides a number of functionalities for synthetic 3D objects, we used
MPEG-4 SNHC elements for geometry and texture description. Furthermore,
the developed view-dependent multi-texturing was integrated into MPEG-4
AFX, so that the entire scene description is done in MPEG-4. For
transmission purposes, the different parts of the developed scene
description have to be coded. Here, we investigated available
technology and implemented a 3D mesh coding scheme for compression of
the object geometry.
Scene Representation
Once a 3D video object has been constructed from a number of original
cameras, its 3D geometry and the texture information from all cameras
can be assembled into a standardized scene description, as shown in
Fig. 1.
Fig. 1: Scene Description Overview
Here, the scene description contains the geometry information as a mesh
or wireframe sequence with a single mesh for each time instance.
Furthermore, a number of original textures are added together with the
associated camera vectors to enable view-dependent object rendering.
All components are described by the MultiTexture node that was
integrated into MPEG-4 AFX to provide standard conformity. Fig. 1 also
shows the underlying coding tools: D3DMC, the newly developed mesh
coder, for the geometry and the state-of-the-art video codec H.264/AVC
for efficient video coding.
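To illustrate how camera vectors can drive view-dependent rendering, the
following minimal sketch computes one blend weight per original texture
for a given viewing direction. The weighting rule (clamped cosine between
viewing direction and camera vector) and the function names are
assumptions made for illustration only; the text above does not specify
how the MultiTexture node weights the textures.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / n for c in v)


def texture_weights(view_dir, camera_vectors):
    """Blend weight per original texture for one viewing direction.

    Assumption: the weight is proportional to the clamped cosine between
    the viewing direction and each camera vector. This is an illustrative
    rule, not the one defined by the scene description.
    """
    v = normalize(view_dir)
    raw = []
    for cam in camera_vectors:
        c = normalize(cam)
        cos_angle = sum(a * b for a, b in zip(v, c))
        raw.append(max(cos_angle, 0.0))  # cameras facing away contribute nothing
    total = sum(raw)
    if total == 0.0:
        # No camera faces the viewer: fall back to equal weights.
        return [1.0 / len(raw)] * len(raw)
    return [w / total for w in raw]
```

With this rule, a texture whose camera vector coincides with the viewing
direction dominates the blend, and the weights always sum to one.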
3D Mesh Coding
In the developed scene description, the geometry consists of a mesh
sequence with time intervals of constant mesh topology, provided the
object's 3D motion is limited. The meshes of each such time interval
are grouped into a group of meshes (GOM) with constant wireframe
connectivity. To exploit the spatial and temporal coherence in such
GOMs, we developed a predictive mesh coding structure and named it
D3DMC (differential 3D mesh coding).
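The grouping into GOMs can be sketched as follows. This is a minimal
illustration, assuming each mesh is given as a (vertices, faces) pair and
that "constant connectivity" means an identical face list; the data
layout and the function name group_into_goms are hypothetical.

```python
def group_into_goms(meshes):
    """Split a mesh sequence into runs of identical connectivity.

    Each mesh is a (vertices, faces) pair, where faces is a list of
    vertex-index tuples. A new GOM starts whenever the face list differs
    from that of the preceding mesh.
    """
    goms = []
    for mesh in meshes:
        _, faces = mesh
        if goms and goms[-1][-1][1] == faces:
            goms[-1].append(mesh)   # same connectivity: extend current GOM
        else:
            goms.append([mesh])     # connectivity changed: open a new GOM
    return goms
```

Only the vertex positions vary inside a GOM, which is what makes
vertex-wise temporal prediction possible.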
Fig. 2: Block diagram of the D3DMC encoder.
Fig. 2 shows a block diagram of the encoder. It contains MPEG-4 3DMC as
a fallback mode, enabled through the Intra/Inter switch, which is set
to one of the two modes per 3D mesh. The Intra mode is used, for
instance, when the first mesh of a GOM is encoded, i.e. when no
prediction from previously transmitted meshes is used. Additionally,
the encoder control can assign the Intra mode in any other case, e.g.
if the prediction error is too large. This fallback mode provides
backward compatibility with 3DMC and ensures that the new algorithm
never performs worse than the state of the art.
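A per-mesh mode decision of this kind might look like the sketch below.
The criterion (largest absolute coordinate difference against a
threshold max_error) and the name choose_mode are assumptions for
illustration; the actual encoder control may use a different measure,
for example a rate comparison.

```python
def choose_mode(current, previous_decoded, max_error):
    """Return "intra" or "inter" for one mesh.

    current and previous_decoded are lists of (x, y, z) vertices;
    previous_decoded is None for the first mesh of a GOM.
    max_error is a hypothetical threshold on the prediction error.
    """
    if previous_decoded is None:
        return "intra"              # first mesh of a GOM: nothing to predict from
    if not current or len(current) != len(previous_decoded):
        return "intra"              # no vertex-wise correspondence available
    worst = max(
        max(abs(c - p) for c, p in zip(cv, pv))
        for cv, pv in zip(current, previous_decoded)
    )
    return "intra" if worst > max_error else "inter"
```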
The new predictive mode is a classical DPCM loop with arithmetic coding
of the residuals. First, the previously decoded mesh is subtracted from
the current mesh to be encoded. This step can only be done if
time-consistent meshes with the same connectivity are available, and
therefore we have constrained the mesh extraction process as described
above. In the next step, a spatial clustering algorithm is applied to
the difference vectors in order to compute one representative for a
number of vectors. Finally, the residual signal is passed to an
arithmetic coder for further lossless compression.
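The closed prediction loop can be sketched as follows. This is a
simplified illustration and not the D3DMC implementation: uniform scalar
quantization with a hypothetical step size stands in for the spatial
clustering step, the first mesh is passed through unchanged instead of
being coded with 3DMC, and the arithmetic coder is omitted. The point of
the sketch is that the encoder predicts from the decoded mesh, so that
quantization errors do not accumulate over time.

```python
def encode_gom(meshes, step):
    """Closed-loop DPCM over one GOM.

    meshes is a list of meshes, each a list of (x, y, z) vertices with
    identical vertex order. Returns (first_mesh, residuals), where
    residuals holds one list of integer triples per predicted mesh.
    """
    first = [tuple(v) for v in meshes[0]]   # stand-in for the Intra-coded mesh
    decoded = first
    residuals = []
    for mesh in meshes[1:]:
        # Difference to the previously DECODED mesh, then quantization.
        q = [tuple(round((c - p) / step) for c, p in zip(cv, pv))
             for cv, pv in zip(mesh, decoded)]
        residuals.append(q)
        # Reconstruct exactly as the decoder will.
        decoded = [tuple(p + k * step for p, k in zip(pv, kv))
                   for pv, kv in zip(decoded, q)]
    return first, residuals


def decode_gom(first, residuals, step):
    """Invert encode_gom: add each dequantized residual to the previous mesh."""
    meshes = [first]
    for q in residuals:
        prev = meshes[-1]
        meshes.append([tuple(p + k * step for p, k in zip(pv, kv))
                       for pv, kv in zip(prev, q)])
    return meshes
```

In this sketch the per-coordinate reconstruction error of every mesh
stays within half the step size, regardless of the GOM length.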
Fig. 3: Original mesh (left); reconstruction error and error
distribution using 3DMC at 274 kBit/s (middle) and D3DMC at 253 kBit/s
(right).
Fig. 3 illustrates the reconstruction error. The original mesh is shown
on the left. The other images show the reconstruction error of the
standard MPEG-4 3DMC coder (middle) and of D3DMC (right). Both color
scales have a maximum value of 0.03. The reconstruction error for D3DMC
stays at very low values, smaller than 0.0075, as indicated by the blue
and green colors in Fig. 3 (right).