Abstract: Performance requirements and constraints on design cost and power consumption still require that significant parts of Digital Signal Processing (DSP) systems be implemented using dedicated hardware blocks. To deal with SoC challenges, designers use system-level descriptions and co-design techniques, and re-use IP cores. Unfortunately, the main problem when re-using pre-designed hardware accelerators arises from their integration, and more particularly from their communication features. System integrators can use a standard interface such as VCI, proposed by VSIA. However, in DSP applications, in addition to the protocol aspects, the SoC designer also has to synchronize the components and buffer data to ensure correct system behavior and to meet timing constraints. IPs are indeed delivered at the RTL level, which is, following the VSIA taxonomy, the highest abstraction level for synthesizable IP models (soft cores). Although such a description may be parameterizable, it relies on a fixed architectural model with very restricted customization capabilities. This lack of flexibility of RTL IPs is especially true for the communication unit, whose sequence orders and timing requirements are fixed. IPs are hence connected to the SoC bus through specific interfaces (wrappers) that adapt the system communication features to the IP requirements. Unfortunately, this adaptation increases the final SoC area and decreases system performance. In some cases, the I/O timing requirements cannot be met because of the wrapper overhead, which can cause the SoC design to fail.
We propose an approach based on high-level synthesis under constraints, starting from a behavioral specification of the IP. We aim to synthesize the IP optimally by taking into account, in its specification, the system integration constraints: application rate, technology, bus format, and I/O timing properties specified by timing frames of transfers. We consider variable but bounded timing constraints in real-time DSP applications in order to handle non-determinism in transfer times, which can originate from (1) computation performed by other system components and/or (2) transfer delay and protocol overhead.
Our methodology raises the abstraction level of synthesizable IP models by introducing the concept of behavioral IP, described as an algorithm and specified in an HDL. Starting from the system description and its architecture model, the integrator refines and specifies, for each bus or port that connects the IP to the SoC components, the I/O protocols, the data sequence orders and the timing information of transfers. The virtual component specification is modeled by a Signal Flow Graph (SFG). We first generate an intermediate Algorithmic Constraint Graph (ACG) from the operator latencies and the data dependencies expressed in the SFG. Having described the IP behavior and the IP design constraints in a formal model, we then analyze the feasibility of the rate with respect to the data dependencies of the algorithm and the technological constraints. This analysis checks the ACG for positive cycles to ensure that the constraint graph is feasible, without considering input arrival dates. In order to support the features of communication architectures specific to DSP applications, we define a formal model named IOCG (I/O Constraint Graph) that supports the expression of integration constraints for each bus (or port) that connects the IP to the SoC components. It allows the integrator (1) to specify transfer-related timing constraints such as ordered transactions, relative timing specifications and min-max delays, (2) to include architecture features and (3) to express non-determinism in the data transfer time. Finally, we generate a Global Constraint Graph (GCG) by merging the ACG with the IOCG. Merging is done by mapping the vertices and associated constraints of the IOCG onto the input and output vertex sets of the ACG. A minimum timing constraint on an output vertex of the IOCG (earliest date for data transfer) is transformed in the GCG into a maximum timing constraint (latest date for data computation/production).
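The positive-cycle feasibility check mentioned above can be illustrated with a minimal sketch. It is not the implementation used in this work: it assumes the usual constraint-graph encoding in which an edge (u, v, w) means t[v] >= t[u] + w and a maximum delay is written as a backward edge with negative weight. The vertex names, the latencies, the helper `merge` and the choice of a Bellman-Ford longest-path pass are all assumptions made for illustration.

```python
from typing import List, Tuple

# (src, dst, weight) encodes the constraint t[dst] >= t[src] + weight.
# A maximum delay "t[v] <= t[u] + d" is written as the backward edge (v, u, -d).
Edge = Tuple[str, str, int]


def has_positive_cycle(vertices: List[str], edges: List[Edge]) -> bool:
    """Return True if the constraint graph contains a positive-weight cycle."""
    # Longest-path Bellman-Ford. Starting every vertex at 0 acts like a
    # virtual source connected to all vertices with weight 0.
    dist = {v: 0 for v in vertices}
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w > dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return False  # dates converged: the constraints are satisfiable
    # If an edge can still be relaxed, some cycle has positive total weight.
    return any(dist[u] + w > dist[v] for u, v, w in edges)


def merge(acg: List[Edge], iocg: List[Edge]) -> List[Edge]:
    """Hypothetical merge: both graphs are assumed to share I/O vertex names,
    so the global graph is simply the union of their edges."""
    return acg + iocg


# Illustrative data only: a three-operation chain with assumed latencies.
VERTICES = ["in0", "mul", "add", "out0"]
ACG = [("in0", "mul", 1), ("mul", "add", 2), ("add", "out0", 1)]
IOCG_LOOSE = [("out0", "in0", -5)]  # out0 at most 5 cycles after in0
IOCG_TIGHT = [("out0", "in0", -3)]  # out0 at most 3 cycles after in0

print(has_positive_cycle(VERTICES, merge(ACG, IOCG_LOOSE)))  # False
print(has_positive_cycle(VERTICES, merge(ACG, IOCG_TIGHT)))  # True
```

In this toy case the computation chain needs 4 cycles from in0 to out0, so a 5-cycle maximum delay leaves the graph free of positive cycles, whereas a 3-cycle maximum delay creates a cycle of weight +1 and the constraint set is reported as infeasible.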
With this formal description of the set of constraints, we analyze the consistency of the IP design constraints with those of the algorithm. Consistency analysis refers to the dynamic behavior of the GCG. The GCG is the entry point of the IP design task: it is used to synthesize the processing, memory, control and communication units that compose the IP architecture.
We applied our method to DSP and telecommunication applications. A first experiment was carried out on an FFT example. Under the experimental conditions, the gain is about 20% for operators and 7% for registers, compared to an HLS approach that ignores I/O constraints. A second experiment uses a Discrete Cosine Transform (DCT) to compare the results obtained with the integration approach we propose against those of wrapper-based methods. For the considered example, the communication register gain varies from -2% to 88% for a constant I/O rate. The last experiment, carried out in an industrial partnership, has shown the applicability of our methodology to a complex behavioral IP (Maximum A Posteriori, MAP) in a real-time Turbo decoding application.