Abstract : Today some embedded systems still do not integrate their own floating-point unit, for area, cost, or energy consumption constraints. However, this kind of architectures is widely used in application domains highly demanding on floating-point calculations (multimedia, audio and video, or telecommunications). To compensate this lack of floating-point hardware, floating-point arithmetic has to be emulated efficiently through a software implementation. This thesis addresses the design and implementation of an efficient software support for IEEE 754 floating-point arithmetic on embedded integer processors. More specifically, it proposes new algorithms and tools for the efficient generation of fast and certified programs, allowing in particular to obtain C codes of very low latency for polynomial evaluation in fixed-point arithmetic. Compared to fully hand-written implementations, these tools allow to significantly reduce the development time of floating-point operators. The first part of the thesis deals with the design of optimized algorithms for some binary floating-point operators, and gives details on their software implementation for the binary32 floating-point format and for some embedded VLIW integer processors like those of the STMicroelectronics ST200 family. In particular, we propose here a uniform approach for correctly-rounded roots and their reciprocals, and an extension to division. Our approach, which relies on the evaluation of a single bivariate polynomial, allows higher ILP-exposure than previous methods and turns out to be particularly efficient in practice. This work allowed us to produce a fully revised version of the FLIP library, leading to significant gains compared to the previous version. The second part of the thesis presents a methodology for automatically and efficiently generating fast and certified C codes for the evaluation of bivariate polynomials in fixed-point arithmetic. In particular, it consists of some heuristics for computing highly parallel, low-latency evaluation schemes, as well as some techniques to check if those schemes remain efficient on a real target, and accurate enough to ensure correct rounding of the underlying operator implementations. This approach has been implemented in the software tool CGPE (Code Generation for Polynomial Evaluation). We have used our tool to quickly generate and certify significant parts of the codes of FLIP.