Skip to Main content Skip to Navigation


Aroua Briki 1
Lab-STICC - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance
Abstract : Error correcting codes i.e. LDPC (Low Density Parity Check) and Turbo-codes are the foundation of communication. Standards like digital video broadcasting (DVB), high-speed wired links (ADSL...), wireless accesses (WiMAX, Wifi...), and telecommunications systems (HSPA, LTE...) all rely on it. LDPC and Turbo Codes are well-known, near Shannon limit, coding/decoding approaches that are able to achieve very low bit error rates for low Signal-to-Noise Ratio (SNR) applications. Decoding principle is based on message passing algorithm in which different processing elements iteratively exchange information in order to improve the error correction performance of these codes. In order to design high data rate applications, parallel architecture are needed. To increase memory bandwidth, main memory is divided into different memory banks to provide concurrent parallel access to all the processing elements. This allows to reduce the latency and thus to increase the throughput of decoders. Typical parallel decoder architecture includes processing elements connected through a dedicated Interconnection Network to memory banks and a dedicated Control Unit that drives the architecture. The network interleaves the data exchanged by the processing elements according to a rule named interleaving law or permutation law defined by the standard or the application to design. Unfortunately, depending on both interleaving law and memory mapping (i.e. data placement in memory banks), different processing elements may try to simultaneously access the same memory bank which results in memory conflicts. Three kinds of solution exist to avoid or minimize memory access conflicts: (1) Define an interleaving law that automatically maps data in different memory banks so that all processing elements can access them without any conflict at each time instance, but only when the designer is free to choose the permutation law; (2) Simply store data elements in different memory banks without considering conflicting accesses and then use different complex topologies and additional buffers to mange conflicts on runtime. This increases the cost and latency of the system. (3) Use memory mapping algorithms to map data in different memory banks so that each processing elements can access them without any conflict. This kind of approach results in a non-optimized architectures. In addition, ROM resources are needed in the controller to store the memory mapping (i.e. control words to address memory banks and to drive interconnection networks). Unfortunately, cost of the controller is not considered for optimization in any state of the art approaches. Our proposed approach is based on two main steps: first, starting from the set of constraints (i.e. the interleaving law, the parallelism and the targeted interconnection network), it generates a conflict free memory mapping (through a mapping algorithm based on memory constraint relaxation) and an Address Conflict Graph ACG. At this step, the data are assigned to memory banks (i.e. Bank Mapping), but their addresses in banks are still unknown (i.e. Address Mapping is not yet defined). In ACG, vertices represent data accesses in their respective memory banks, and edges represent conflicts between these accesses (e.g. write a new data at a currently used address). The memory mapping is performed thanks to a dedicated heuristic that aims to both generate a conflict free memory mapping and minimize the number of conflict in the generated ACGs. Indeed, when two data are stored in the same memory bank, and if their respective "lifetimes" in this memory bank are overlapping, two different memory locations (i.e. addresses) are required. Hence the bank mapping step strongly impacts the final ACGs, by generating more or less address conflicts. Then, memory mapping information and ACG are used during the Address Mapping step to explore the memory addressing in order to optimize the final cost of the memory controller. Finally, our design flow generates the final VHDL architecture with a conflict free memory mapping and the associated optimized control units. 8 Our approach has been applied to explore the design space of several test cases. The resulting architectures respect the designer architectural constraints in any case and the controllers are strongly optimized.
Complete list of metadatas

Cited literature [42 references]  Display  Hide  Download
Contributor : Cyrille Chavet <>
Submitted on : Tuesday, January 14, 2014 - 5:39:22 PM
Last modification on : Wednesday, October 14, 2020 - 4:09:34 AM
Long-term archiving on: : Tuesday, April 15, 2014 - 4:28:02 PM


  • HAL Id : tel-00931009, version 1


Aroua Briki. UNE NOUVELLE APPROCHE DE PLACEMENT DE DONNEES EN MEMOIRE : APPLICATION A LA CONCEPTION D'ARCHITECTURES D'ENTRELACEURS PARALLELES. Traitement du signal et de l'image [eess.SP]. Université de Bretagne Sud, 2013. Français. ⟨tel-00931009⟩



Record views


Files downloads