Abstract: In teleconferencing applications, animated 3D heads can replace the usual video channels. This offers high compression opportunities, as well as the freedom of virtual spaces: a virtual place can be composed on screen, where 3D representations of the participants debate. This thesis introduces an ad hoc rendering algorithm that can be applied to photorealistic 3D heads, at the expense of slightly limited viewing angles. Fast rendering is achieved on modest computers as well as on 2D virtual machines. An architecture for automatic control of the virtual cameras and of the broadcast view is also proposed. The cameras produce synthetic views and react to speaking events. Switching between various (partial) camera views is intended to make the debate more attractive and intelligible. More participants can take part simultaneously without their image size being reduced too much. The automatic scheme, free of any manual interface, lets the user talk naturally and concentrate on the discussion. These components have been implemented in a prototype. Several sound-scene scenarios have been tested, playing with the association of image and sound in a spatial-sound environment, where events from inside or outside the viewed area can be simulated. A hybrid video-based and 3D-based solution to the face-animation problem is also defended. Partial live-performance images of the eye, mouth and eyebrow regions are inlaid on the clone's surface texture; in this way, interpreting the broadcast video-like expressions is left to the spectator. Another prototype has been built to test real-time visual empathy.