Online summarization of complex videos, interactive content-based image retrieval, and computer-aided medical diagnosis based on image analysis

Walid Barhoumi

Abstract

The research presented for this habilitation centers on the analysis of visual information and is organized along three axes:

■ Video analysis: this axis concerns the analysis and interpretation of complex videos. In particular, we focused on automatic tools for the compact and representative description of video content. In this context, we proposed selecting on the fly a relatively small number of keyframes that summarize the salient visual content of a video shot. The proposed technique relies on the spatial segmentation of each frame to detect important events, so that each keyframe depicts one important event. However, with a moving camera, background modeling requires compensating for camera motion. To this end, we proposed a multi-primitive approach to image alignment whose main originality lies in using region matching to pre-estimate the camera motion and then to restrict the search space of candidate correspondences when matching interest points. The quality of the experimental results allowed us to propose numerous downstream applications (tracking of moving objects, invisible watermarking of videos, etc.). Nevertheless, when analyzing the results of our mosaicing approach on long videos, we observed that summarizing the entire content of a video in a single mosaic can make that mosaic extremely large, because the distortion of the motion model grows rapidly with any change in the camera's rotation angle or scale factor. We therefore proposed summarizing the visual content of a complex video in multiple mosaics, optimizing the online choice of the reference image so as to reduce the size of the mosaics.

■ Visual information retrieval: this second axis addresses content-based image indexing and retrieval in large general-purpose databases.
The objective is to reduce the semantic gap by working at the region level, or even the object level, while allowing learning through relevance feedback techniques. Based on the assumption that any region could be useful in the retrieval process, all regions of the image are considered. A complete graph is then defined for each image, in which each node represents a coarsely segmented fuzzy region and is characterized by two properties: the low-level descriptors of the region and a weight reflecting its importance within the image. Furthermore, spatial information is incorporated into the graph structure by characterizing each edge between two regions with two quintuplets describing the inter-region spatial arrangement in both directions. The user can also carry out positive and/or negative relevance feedback iterations to get as close as possible to their need. To this end, we introduced a mechanism that adapts the weights of the regions of the query image according to the user's feedback. In addition, on the assumption that co-segmentation can provide effective solutions for content-based image retrieval and classification, especially during relevance feedback iterations, we explored this field of research. Specifically, we proposed integrating spatial information to avoid false detections and the effects of noise, through the fuzzy classification of local entropy, in order to reduce the ambiguity in assigning a pixel to a histogram bin.

■ Medical imaging: the work on this axis mainly targets diagnostic assistance based on medical images and the automatic annotation of digital patient records.
Three classes of architectures have been used in the literature on image-based computer-aided medical diagnosis: those based on classification, those based on registration, and those based on content-based image retrieval. These three approaches have often been studied separately despite their commonalities and complementarities. We therefore proposed an architecture that jointly fuses the three approaches using the theory of evidence, treating the opinion of each approach as an uncertain source on the malignancy of the organ under study. Furthermore, in the context of melanoma diagnosis, we proposed a structural method for the automatic detection of the pigment network in dermoscopic images. The main contribution lies in the fuzzy evaluation of the degree to which a hole belongs to the pigment network, which makes it possible to keep as many candidates as possible and to postpone the decision until more information is available. In addition, with the aim of improving the quality of texture-based image segmentation, we first proposed a segmentation technique that highlights two levels of texture perception. After over-segmentation of the image, a statistical analysis determines the nature of the texture of each region (fine vs. coarse). A wavelet analysis is then carried out, adapting the choice of sub-band to the nature of the texture. Finally, in the context of computer-assisted surgery, we studied the reconstruction of 3D models from 2D data (multi-view or multi-modality). We proposed an improvement to the voxel coloring method. In addition to integrating geometric information through hysteresis thresholding, which takes into account the connectivity of the colored voxels, our contribution lay in the dynamic and fully automatic choice of thresholds.
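To make the fusion of the three diagnostic approaches more concrete, the following is a minimal sketch, assuming that "theory of evidence" refers to Dempster-Shafer theory and that the sources are combined with Dempster's rule of combination. The abstract does not specify the combination rule, the frame of discernment, or any mass values, so the two-hypothesis frame, the function name `combine`, and all numbers below are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical frame of discernment: the organ is either benign or malignant.
BENIGN = frozenset({"benign"})
MALIGNANT = frozenset({"malignant"})
UNKNOWN = BENIGN | MALIGNANT  # mass placed here expresses ignorance


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = (
                combined.get(intersection, 0.0) + mass_a * mass_b
            )
        else:
            # Contradictory evidence accumulates as conflict.
            conflict += mass_a * mass_b
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize by the non-conflicting mass.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Illustrative (made-up) opinions of the three approaches on one case.
classification = {MALIGNANT: 0.6, BENIGN: 0.1, UNKNOWN: 0.3}
registration = {MALIGNANT: 0.5, BENIGN: 0.2, UNKNOWN: 0.3}
retrieval = {MALIGNANT: 0.4, BENIGN: 0.3, UNKNOWN: 0.3}

fused = combine(combine(classification, registration), retrieval)
for hypothesis, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 3))
```

Keeping part of each source's mass on the whole frame (`UNKNOWN`) is what lets a source express uncertainty rather than a hard decision. Dempster's rule is associative, so the order in which the three sources are combined does not change the fused result.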


