# Conception et test de systèmes CMOS fiables et tolérants aux pannes (Design and Test of Reliable, Fault-Tolerant CMOS Systems)

T. Calin

#### To cite this version:

T. Calin. Conception et test de systèmes CMOS fiables et tolérants aux pannes. Micro et nanotechnologies/Microélectronique. Institut National Polytechnique de Grenoble - INPG, 1999. Français. NNT: . tel-00163765

HAL Id: tel-00163765
https://theses.hal.science/tel-00163765
Submitted on 18 Jul 2007

# INSTITUT NATIONAL POLYTECHNIQUE DE GRENOBLE

**THESIS** submitted for the degree of **DOCTEUR DE l'INPG**

Specialty: Microelectronics

prepared at the **TIMA** laboratory within the doctoral school "ELECTRONIQUE, ELECTROTECHNIQUE, AUTOMATIQUE, TELECOMMUNICATIONS, SIGNAL"

publicly presented and defended by **Teodor CALIN** on 8 November 1999

Thesis supervisor: **Michael NICOLAIDIS**

#### **JURY**

| Member                | Role     |
|-----------------------|----------|
| M. Pierre GENTIL      | Chair    |
| M. Jean GASIOT        | Reviewer |
| M. Pierre MARCHAL     | Reviewer |
| M. Laroussi BOUZAIDA  | Examiner |
| M. Raoul VELAZCO      | Examiner |
| M. Michael NICOLAIDIS | Examiner |

## Acknowledgments

The research presented in this dissertation was carried out in the Reliable Integrated Systems group (Systèmes Intégrés Sûrs) of the TIMA Laboratory, headed by Mr. Bernard COURTOIS.

I first thank Mr. Bernard COURTOIS for welcoming me into his laboratory.

I thank Mr. Michael NICOLAIDIS for supervising my work and for the trust he placed in me throughout these years of research.

I wish to assure Mr. Raoul VELAZCO of my gratitude for his invaluable contribution: his constant support, his effective participation, and his friendship.

I extend my sincere thanks to Mr. Laroussi BOUZAIDA, Manager of Test Solutions at ST Microelectronics, for his interest in this work and for agreeing to serve on the thesis jury.

My thanks also go to Mr. Jean GASIOT, Professor at Université Montpellier II, and Mr. Pierre MARCHAL, Research Expert at CSEM Neuchatel, for their careful and critical reading of the dissertation as reviewers of this thesis.

I warmly thank Mr. Pierre GENTIL, Professor at the Institut National Polytechnique de Grenoble, for his interest in this work and for agreeing to chair the thesis jury.

This thesis work would not have been as rewarding without the many people who created a scientific and human environment of high quality: thank you to all the members of the laboratory.

#### ABSTRACT

This thesis proposes new design and test methods for integrated CMOS systems that increase reliability and fault tolerance in deep-submicron technologies, in response to the growing number of defects that escape manufacturing test and to the increased sensitivity to upsets caused by cosmic rays. To improve fault detection in complex CMOS circuits, high-speed, high-sensitivity built-in current sensors operating at low supply voltage are proposed. The $I_{DDQ}$ current measurement algorithms, developed in parallel, are analyzed and optimized in synergy with low-power design techniques.
The use of current sensors was extended to on-line testing, which makes it possible to detect permanent faults in critical applications and to correct errors in SRAM memories through parity coding. This approach was validated by radiation tests on prototype circuits. A technology-independent design strategy for upset-immune CMOS circuits was then developed, based on local redundancy techniques. It was validated experimentally by radiation tests on prototype circuits fabricated in commercial 1.2, 0.8 and 0.25 micron CMOS technologies. The implemented hardening techniques were analyzed using built-in test methods and pulsed laser equipment. Topology-related error mechanisms and upset sensitivity were identified and characterized. In response, specific design rules were developed, leading to topological hardening against upsets. A library of hardened sequential cells was developed for use in a modem ASIC dedicated to the experimental NANOSAT satellite, which is scheduled to be placed in orbit in 2001.

## EXTENDED PRESENTATION OF THE THESIS

Current CMOS VLSI technologies make it possible to manufacture complex integrated circuits comprising several hundred million transistors, operating at ever lower voltages and at clock speeds in the GHz range. The ever higher integration density of these devices involves strong electric fields, which can affect circuit reliability and also increases sensitivity to contamination, dimensional variations, noise, and electrical disturbances.

The cost and complexity of defect detection and diagnosis methods grow in proportion to die size and integration density. Moreover, advances in the precision of manufacturing methods fail to keep pace with the increase in defect rate induced by submicron-scale integration. This is why effective detection of manufacturing defects by electrical testing becomes increasingly difficult as circuit complexity grows [75][77][95].

Furthermore, CMOS ICs fabricated in deep-submicron (DSM) technologies must also cope with reduced operating margins for timing, power, noise, etc. Since these circuits pass manufacturing tests with an ever larger proportion of "hidden" defects, these combined effects can significantly broaden the range of failures: transient faults, performance degradation, and catastrophic failures.

Transient faults are mainly due to critical operating rates, noise, and radiation from the package or the environment [22][24]. These failures increase significantly with integration density, higher operating speeds, and ever lower supply voltages, leading to potentially destructive effects in the design of systems-on-chip. Consequently, both hidden permanent faults, induced by small manufacturing defects, and transient faults, associated with ever narrower technology and performance margins, are becoming a threat to applications of DSM CMOS circuits.

"Soft" defect mechanisms, which lead to intermediate voltage levels, and delay faults are either undetected or difficult to detect by voltage-based test methods. All of this can have a significant impact on high-performance products, technology development, production yields, and system reliability.

Under these conditions, the systematic use of complementary test methodologies such as IDDQ testing is a necessity for DSM CMOS in order to detect "soft" faults effectively. In addition, yield improvement based on redundancy from the design stage onward, as well as the implementation of fault-tolerant circuit architectures, becomes both mandatory and economically justifiable for applications using high-performance, high-reliability systems-on-chip.

If quiescent current (IDDQ) testing is regarded as an essential means of detecting soft faults in submicron CMOS technologies, two conflicting observations emerge. On the one hand, the accuracy of the quiescent current measurement must be improved in order to capture the full range of defect mechanisms. On the other hand, the detection capability of IDDQ testing is drastically reduced by the increase in subthreshold leakage current of defect-free devices, which masks the small currents generated by defects.
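
The masking effect can be illustrated with a small numerical sketch. It is not taken from the thesis: the single pass/fail limit, the function names `iddq_limit` and `defect_detected`, and all current values are hypothetical assumptions chosen for illustration only. A die is rejected when its measured quiescent current exceeds a limit placed above the worst-case fault-free leakage; as the leakage and its spread grow, the same defect current no longer crosses the limit.

```python
# Illustrative sketch only: single-threshold IDDQ pass/fail decision.
# All names and numeric values are hypothetical assumptions, not thesis data.

def iddq_limit(max_fault_free_leakage_a: float, guard_band: float = 0.1) -> float:
    """Pass/fail limit placed just above the worst-case fault-free leakage."""
    return max_fault_free_leakage_a * (1.0 + guard_band)


def defect_detected(die_leakage_a: float, defect_current_a: float, limit_a: float) -> bool:
    """A defect is flagged only if the total quiescent current exceeds the limit."""
    return die_leakage_a + defect_current_a > limit_a


DEFECT = 5e-6  # assumed 5 uA defect current

# Assumed low-leakage process: fault-free dies draw between 0.1 uA and 1 uA.
low_limit = iddq_limit(1e-6)
print(defect_detected(0.1e-6, DEFECT, low_limit))   # True: defect stands out

# Assumed leaky process: fault-free dies draw between 10 uA and 100 uA.
high_limit = iddq_limit(100e-6)
print(defect_detected(10e-6, DEFECT, high_limit))   # False: defect is masked
```

In this simplified picture, the die-to-die spread of the fault-free leakage, rather than its absolute value, is what hides the defect, which is one reason to measure the current per partition or differentially.
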

This research proposes an uncompromising approach to CMOS VLSI design with respect to test, reliability, and fault tolerance. In this approach, the complexity of the means required for system detection and diagnosis, and the high cost of on-chip redundancy, should be an incentive to find better and more efficient solutions for meeting these fundamental objectives, not a justification for accepting products of high cost, low quality, and low reliability.

Several strategic directions are explored in order to design reliable, fault-tolerant submicron CMOS devices:

- on-chip current monitoring techniques, both for efficient "off-line" IDDQ testing and for concurrent "on-line" detection of transient faults;
- design of upset-immune CMOS systems for space applications or radiation environments, using technology-independent approaches based on redundancy;
- techniques for system validation, fault diagnosis, and distributed redundancy, using on-chip fault-injection simulation mechanisms; and
- analysis of the topology sensitivity of fault-tolerant CMOS architectures, using external stimulation by a precisely focused pulsed laser beam.

We explored system design strategies using multi-constraint approaches that combine low-power design and built-in IDDQ testing. They rely on power-supply switching circuits and embedded current monitors. Their cost can thus be shared, which makes system implementation more cost-effective while improving efficiency and performance.

Our research also focuses on the design of upset-immune memories for nuclear and space applications, and on its synergy effects with respect to fault tolerance, reliability, and extended system functionality. In our study, redundancy-based logic design techniques are correlated with topology-based physical design strategies aimed at improving immunity. Extended functional capabilities can be added, such as coding and security functions, based on redundant SRAM cell design, system reconfiguration, and adaptive control strategies for performance and power consumption. One of our main objectives is to develop design strategies for test, reliability, and fault tolerance that also extend system functionality.

Two built-in test approaches are considered to ensure tolerance to transient faults: one uses current-mode fault injection, and the other is based on logic control of the redundant elements. External injection of transient faults is then used to qualify upset hardening. This method uses a focused pulsed laser beam of controlled intensity to stimulate logic transitions in order to identify and qualify the upset-sensitive areas of the circuit layout.

The contributions of this thesis can be summarized as follows:

- 1. Design of built-in current sensors for deep-submicron CMOS technologies, usable at low voltage with high detection speed and sensitivity.
- 2.
Development and validation of IDDQ current monitoring algorithms based on asynchronous built-in sensors for the design of upset-tolerant SRAM memories, using transient fault detection and adaptive IDDQ testing.
- 3. Design of upset-immune, transient-fault-tolerant CMOS system architectures using local redundancy techniques. Validation on prototype circuits fabricated in 1.2, 0.8 and 0.25 µm CMOS technologies.
- 4. Design of a library of upset-hardened sequential cells in a $0.6~\mu m$ CMOS technology.
- 5. Design and application of on-chip test and validation strategies for fault-tolerant CMOS architectures using built-in current pulse generators.
- 6. Analysis of the topology-related upset sensitivity of hardened flip-flops and memory cells using a pulsed laser beam.
- 7. Development of a topological model of charge collection at pairs of sensitive nodes in memory cells hardened against upsets by local redundancy.

The six chapters of the thesis are briefly described below.

Chapter 1 analyzes the limitations and constraints of IDDQ testing and of upset-hardening techniques in submicron CMOS, and states the purpose of this thesis work. The design of built-in current sensors is detailed in Chapter 2. Typical architectures of testable, fault-tolerant systems are presented in Chapter 3. Two prototype circuits are described: a self-checking multiplier with built-in synchronous current sensors and an upset-tolerant static RAM using asynchronous current sensors. Chapter 4 introduces transient-fault-tolerant CMOS system architectures using local redundancy techniques.
It describes the design of prototype circuits using two new approaches: memory cells with dual feedback and storage elements with dynamic redundancy. The results of their validation by heavy-ion testing are presented and analyzed. Chapter 5 describes methods for the diagnosis and qualification of fault-tolerant CMOS systems using a focused pulsed laser beam. Topology-related upset generation mechanisms are identified and analyzed, and we then propose physical design optimization techniques for upset hardening. Chapter 6 presents the general conclusions of our work and the prospects for using these results in research and in industrial and space applications.

#### Chapter 2

This chapter presents the basic principles of using built-in current sensors for IDDQ testing of CMOS circuits, and the design of current sensors optimized for submicron CMOS. New approaches to built-in current testing are described: adaptive IDDQ testing, synergy with low-power design, and differential IDDQ testing.

The test methodology for CMOS integrated circuits based on observing quiescent supply currents provides high fault coverage, in particular for oxide and interconnect defects that induce intermediate voltage levels not detected by conventional logic tests. IDDQ testing of complex submicron CMOS circuits imposes specific constraints, related on the one hand to the increase in quiescent currents (Fig. 1) and on the other hand to the decrease in the fault current levels to be detected. In addition, IDDQ measurement accuracy decreases and test time increases as supply voltages fall and dynamic operating currents rise.

Figure 1.
Increase in leakage currents in submicron CMOS (data: Keshavarzi et al., 1998)

To address the problem of discriminating IDDQ faults, several alternatives have been proposed:

- the use of multi-threshold CMOS circuits and adaptive $V_T$ voltages to reduce leakage currents during IDDQ testing;
- partitioning and isolating the supply voltages of the circuit's functional blocks;
- IDDQ testing at low temperature;
- differential IDDQ current measurement techniques.

The greatest potential for increasing the effectiveness of IDDQ testing is offered by built-in current sensors. The high cost of this method and the circuit performance degradation it induces have restricted its use to specific application domains in which cost constraints are outweighed by reliability and fault-tolerance constraints, for example the medical and nuclear fields. The approach described in this chapter aims, on the one hand, to optimize the design of built-in current sensors for the best trade-off between measurement accuracy, test time, design effort, and ease of calibration and use, and, on the other hand, to minimize implementation cost.

#### Modeling and optimized design of built-in current sensors

A built-in current sensor (BICS) consists of three elements, as shown in Figure 2:

- a current sensing and current-to-voltage conversion element;
- a comparator with threshold $V_{\text{REF}}$;
- a switch that bypasses the sensor and passes the dynamic currents during the circuit's logic transition phase.

It can be defined as an intelligent power-supply interconnect element comprising a switch that measures the quiescent current when it is open.

Figure 2.
Built-in current sensor (BICS): block diagram

A CMOS circuit is characterized by two operating regimes: switching and quiescent. However, the analysis of built-in current sensors reveals a third, intermediate operating regime: the transition/recovery phase between switching and quiescence. This third regime is characterized by a stable logic state of the circuit and by the gradual decrease of the transient currents charging and discharging the circuit's internal nodes. To optimize the design and operating parameters of the current sensor, we also modeled the BICS as a current generator in each of the three regimes, as shown in Figure 3.

Figure 3. Model of the circuit and of the built-in current sensor using current sources for the three operating regimes

The flow of the dynamic currents $I_{TR}$ during the circuit's logic transition phase, established by activating the bypass switch BS, is represented by the current generator $I_{B1}$. The circuit that senses the quiescent current $I_{SSQ}$ is modeled by a current generator whose value $I_{LIM}$ equals the test limit. Finally, for the circuit's recovery phase, characterized by the exponential current $I_{DECAY}$, we define a delayed bypass regime for the current generator $I_{B2}$, and an operating mode dedicated specifically to the BICS circuit. The resulting BICS model is in effect a dynamically controlled impedance, which we represent by current generators in each of the three operating regimes.
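
The three-regime model can be sketched numerically. This is a minimal illustration under assumed conditions, not the model used in the thesis: the supply current is taken to be constant during switching and to decay as a single exponential afterward, and the names `supply_current`, `settle_time`, `regime` and all parameter values are hypothetical. The sketch estimates how long after a transition the decaying current $I_{DECAY}$ stays above the test limit $I_{LIM}$, that is, the earliest moment at which a quiescent measurement is meaningful.

```python
import math

# Illustrative sketch only; parameter values are assumptions, not thesis data.
I_TR = 10e-3      # assumed dynamic current during switching (A)
I_PEAK = 1e-3     # assumed current at the start of the recovery phase (A)
TAU = 20e-9       # assumed decay time constant of the recovery phase (s)
I_Q = 1e-6        # assumed fault-free quiescent current (A)
I_LIM = 5e-6      # test limit (A)
T_SWITCH = 10e-9  # assumed duration of the switching phase (s)


def supply_current(t: float) -> float:
    """Piecewise supply current: switching, then exponential recovery toward I_Q."""
    if t < T_SWITCH:
        return I_TR
    return I_Q + (I_PEAK - I_Q) * math.exp(-(t - T_SWITCH) / TAU)


def settle_time(i_lim: float = I_LIM) -> float:
    """Time after the end of switching at which the current falls to i_lim."""
    if i_lim <= I_Q:
        raise ValueError("limit must exceed the quiescent current")
    return TAU * math.log((I_PEAK - I_Q) / (i_lim - I_Q))


def regime(t: float) -> str:
    """Classify time t into one of the three operating regimes."""
    if t < T_SWITCH:
        return "switching"
    if t < T_SWITCH + settle_time():
        return "recovery"
    return "quiescent"


t_settle = settle_time()
print(f"recovery lasts about {t_settle * 1e9:.1f} ns")
print(regime(5e-9), regime(50e-9), regime(1e-6))
```

Under these assumptions the recovery phase is several time constants long, which suggests why handling it with a dedicated delayed bypass, instead of simply waiting, can shorten the test cycle.
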

This modeling approach lets us rigorously match the operation of the circuit under test (CUT) and of the BICS, so that both models can be estimated and simulated accurately and the BICS design, sizing, and timing parameters can then be optimized. To reduce IDDQ test time while lowering the sensor's switching noise and optimizing measurement accuracy, we therefore used a second, delayed bypass circuit (characterized by the current $I_{B2}$ for the BICS) during the gradual passage from the dynamic regime to the quiescent regime (characterized by $I_{DECAY}$ for the CUT).

The BICS bypass switch for the CUT's dynamic transition regime is the most constraining element, since it has a decisive impact on the cost of the BICS and on circuit performance. Our approach takes into account circuit partitions of different sizes, complexity, and operating characteristics. In Chapter 2 we describe and analyze a design method for scalable MOS transistor switches. The sizing and layout placement algorithms are based on estimates of the average and peak dynamic currents of each partition or sub-partition, and on the limits set for admissible noise level and speed degradation. Using different locations and optimized layouts for the first switch on the one hand, and for the delayed switch and the comparator on the other, lets us further optimize the switching and measurement regimes and further reduce the noise generated.

Two current measurement techniques are analyzed in Chapter 2, one with resistive I/V conversion and the other with capacitive I/V conversion.

The linear resistive sensors, and the previously described sensors with parameterized non-linear characteristics using diodes, transistors, and current mirrors, provide fast I/V conversion. They are generally used in combination with synchronous comparators, whose response time is on the order of a few nanoseconds. However, they require the quiescent state to be established before the measurement phase.

Capacitive current sensors, by contrast, perform the I/V conversion by integrating the IDDQ currents. They achieve better measurement accuracy, given precise, high-value integration capacitances, at the expense of longer measurement time. Current integration can be optimized by using only the intrinsic parasitic capacitance of the virtual supply node as the I/V conversion element, provided that a differential test/calibration procedure using a reference current source is put in place.

The existence of a parasitic capacitance and an intrinsic leakage current that are difficult to estimate and compensate for each circuit partition at the current measurement stage considerably reduces the operating speed and accuracy of BICSs. To compensate for this drawback, we opted for a more realistic model of the current sensor: an I-C combination of the parasitic capacitance $C_S$ and a current source $I_{LIM}$ supplied by the delayed bypass circuit BS2 of the BICS, as shown in Figure 4.

Figure 4. BICS circuit with dual bypass switching

The dual-switch BICS architecture lets us reduce the noise induced by the switching of the first switch BS1, and consequently increase the speed and accuracy of the current measurement. The comparator threshold $V_{REF}$ is set by the maximum admissible noise level on the virtual supply node.
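
The I-C model lends itself to a simple worked example. The sketch below is an idealized illustration under stated assumptions, not the thesis implementation: the bypass BS2 is treated as an ideal current source $I_{LIM}$, the circuit current is constant during the measurement, and the function name `time_to_trip` and the numeric values are hypothetical. Under these assumptions the virtual supply node charges at a rate $(I - I_{LIM})/C_S$, so it reaches $V_{REF}$ only if the circuit current $I$ exceeds the test limit.

```python
from typing import Optional

# Illustrative sketch only; values are assumptions, not thesis data.
C_S = 10e-12    # assumed parasitic capacitance of the virtual supply node (F)
V_REF = 0.1     # assumed comparator threshold (V)
I_LIM = 5e-6    # test limit supplied by the delayed bypass (A)


def time_to_trip(i_circuit: float) -> Optional[float]:
    """Time for the virtual node to reach V_REF, or None if it never does.

    Idealized: constant currents, ideal I_LIM source, node starting at 0 V.
    """
    net = i_circuit - I_LIM          # current left over to charge C_S
    if net <= 0:
        return None                  # at or below the limit: node never reaches V_REF
    return C_S * V_REF / net         # t = C * dV / I


print(time_to_trip(1e-6))    # None: below the test limit
print(time_to_trip(15e-6))   # about 1e-07 s for a 10 uA excess
```

The example also shows the accuracy/speed trade-off in this idealized picture: the closer the fault current is to $I_{LIM}$, the longer the node takes to reach the threshold.
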

The equivalent current sensor is an $R_S$-$C_S$ circuit whose resistance is:

$$R_S = V_{REF}/I_{LIM}$$

The comparator must also meet the specific constraints of submicron CMOS circuits: operation at low supply voltage, low input noise and offset, together with high response speed and improved accuracy.

#### Synchronous and asynchronous current-controlled comparators

Chapter 2 describes two families of fast comparators optimized for built-in current sensors in submicron CMOS. The first family, operating synchronously, uses a current-controlled sense amplifier with regenerative feedback. A single clock signal is needed for the precharge and evaluation functions, as shown in Figure 5. The circuit has low sensitivity to process variations: a 10% mismatch in the gains of the PMOS transistors of the input differential amplifier corresponds to a comparator offset of 1.6 mV. We validated the comparator's performance on a synchronous BICS implemented in a prototype on-line self-checking multiplier described in Chapter 3.

Figure 5. Synchronous current-controlled comparator

The second family of comparators is based on an asynchronous configuration with controlled current mirrors. The previously described comparators with two current mirrors ([139],[143],[148]) require a high input voltage level to operate, limited by the threshold voltage of the NMOS transistors. A new comparator structure with three current mirrors is proposed that eliminates this drawback. The circuit operates with current-type or voltage-type input and output signals.
The comparator, shown in Figure 6, uses a pair of current mirrors M1-M2, controlled at the source of the NMOS mirror M1 by an external signal ($V_{IN}$ or $I_{IN}$), and a third mirror M3.

Figure 6. Source-controlled current-mirror comparator

The switching threshold is given by one of the following two relations:

$$\begin{split} &I_{IN} > I_{LIM} - I_{B}\\ &V_{IN} > V_{COMP} + \Delta V \end{split}$$

The comparator is activated by controlling the bias current $I_{\text{BIAS}}$. The accuracy and speed of the comparator are determined by the ratio $I_{LIM}/I_{BIAS}$ of the currents of mirrors M2 and M3, and by the input parameters $C_S$, $I_{IN}$ and $V_S$. The detection threshold for IDDQ fault currents is thus kept at values below 0.1 V. The new comparator also satisfies the constraints of low-voltage, low-power operation: the comparator of Figure 6 continues to operate even when the supply voltage is lowered to close to twice the threshold voltage of the MOS transistors. The current mirror M3 is switched dynamically and serves both as the second bypass switch and as the $I_{LIM}$ generator defining the comparator threshold.

#### Synergy between low-power design and built-in IDDQ testing

Since the absolute cost of implementing built-in IDDQ testing is significant, its impact on product cost can be reduced by sharing it with other functions whose building blocks are supply isolation switches and built-in current comparators. To this end, a synergy is proposed between low-power design and built-in IDDQ testing.
The supply switches, which account for the major part of the BICS implementation cost, can be used to control the distribution of supply voltages to the circuit's functional modules in order to implement low-power operating modes. The use of built-in current comparators is also proposed in the next chapter for on-line detection of transient faults in high-safety, high-reliability applications.

#### Multi-threshold IDDQ testing

An innovative method that increases the flexibility of IDDQ testing and extends the functionality of the BICS circuit is presented at the end of Chapter 2. It consists in using multi-threshold comparators to provide multi-bit coded information for a predefined set of IDDQ current ranges. This information can then be used to diagnose functional blocks of integrated systems whose means of investigation are limited. One envisaged application is the detailed characterization of reusable IP cores with limited testability and diagnosability, which are routinely transferred and migrated between technologies. Other applications include dynamic selection of IDDQ test limits and adaptive optimization of test sequences.

Multi-output comparators are parallel comparator structures that share the same input signal and operate in a correlated manner, with progressive detection thresholds in predefined ratios. Figure 7 shows a comparator configuration with three current thresholds and parallel outputs.

Figure 7. Triple current comparator with parallel outputs

A prototype circuit designed to validate the new built-in current sensor architectures is briefly described at the end of Chapter 2.
The area cost of integrating IDDQ testing is 8.5%, and the induced degradation of switching speed is only 1.4%. The IDDQ test rate at a 5 µA threshold is 2.5 MHz for a supply voltage of 3 V. Several IDDQ test strategies are explored to improve the ability to discriminate small IDDQ fault currents, using a double IDDQ measurement on two adjacent circuits or functional blocks, or at two different temperatures and supply voltages.

#### Chapter 3

This chapter describes current monitoring techniques for on-line IDDQ testing and for the detection/correction of transient faults in CMOS systems subject to disturbances. These techniques extend the use of built-in current sensors over the circuit's entire lifetime, whether for on-line testing, detection and correction of transient failures, or adaptive performance control. Two different approaches are analyzed: on the one hand, synchronous current sensors for on-line testing of combinational circuits, to detect physical faults, noise-related transient faults, and delay faults; on the other hand, asynchronous current sensors for sequential circuits, here for detecting and correcting upsets in high-density static RAMs.

#### On-line testing with built-in current sensors

We developed an experimental prototype of a combinational circuit monitored by synchronous BICS circuits, in order to evaluate both the effectiveness of fault detection by on-line current monitoring and the implementation cost and impact on operating speed, power dissipation, and noise immunity.
An 8-bit self-checking multiplier with dual-rail coding, followed by an output checker circuit for detecting logic faults, is used as the test vehicle. The two modules, the multiplier and the dual-rail checker, are monitored by two synchronous BICS circuits, as shown in Figure 8. A fault simulation circuit, built from NMOS transistor bridges controlled by two external signals FI1 and FI2, is inserted into the internal structure of the multiplier on the longest delay path. This circuit lets us test the detection sensitivity for resistive bridges and delay faults. Performance analysis shows that the maximum operating speed of the circuit is 25 MHz. The impact on the multiplier's operating speed is limited to 20% for a fault current detection threshold of 60 µA. The synchronous activation of the two current sensors is pipelined between the multiplier and the checker.

Figure 8. Block diagram of the self-checking multiplier with on-line current testing

For the second approach, a generic system structure with autonomous control of built-in current testing is adopted. It is based on detecting logic transitions to distinguish the circuit's active state from its quiescent state and then to control the operation of the BICS circuits. Asynchronous current sensors are characterized by a measurement phase of indefinite duration, which allows on-line detection of transient faults through the disturbances of quiescent supply currents caused by radiation-induced upsets.

#### Upset-tolerant CMOS RAMs with built-in current sensors

Transistor scaling in advanced submicron CMOS technologies leads to a quadratic increase in sensitivity to radiation-induced upsets, so that terrestrial applications are also affected, with a non-negligible upset rate.
At the same time, the effectiveness of upset-hardening techniques drops dramatically in submicron CMOS, where they also have a major impact on speed, power dissipation and integration density. Moreover, experimental studies ([22][23]) show that transient faults account for more than 80% of the faults in integrated systems. Most causes of transient faults generated by internal or external sources (noise, electromagnetic coupling, etc.) can be eliminated, with the exception of radiation effects.

The use of memory systems with error-detecting and error-correcting codes [33] is limited by the constraints imposed by error latency. Indeed, in systems with large memories and high access rates, the time interval between two successive error-detection cycles increases. The probability of multiple upsets in the same word then rises, which makes the error-correcting code ineffective.

The use of current sensors for the detection and correction of upsets in CMOS RAMs was proposed in [31]. It uses asynchronous BICS circuits to detect the upset in each memory column, and parity coding for error correction. Starting from this approach, we developed an asynchronous BICS circuit structure that provides tolerance to transient faults in fast, high-density submicron CMOS memories. We validated the approach on a prototype memory that was designed and subjected to radiation tests.

The in-depth study of the operating characteristics of CMOS static RAMs, presented in detail in chapter 3, shows that the memory cell array allows the implementation of BICS circuits with high sensitivity and low cost. The dynamic power dissipated during operation is drawn essentially by the control and addressing circuits.
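The error-latency argument made above for error-correcting codes can be illustrated with a short calculation. The sketch below assumes, as a simplification that is not stated in the source, that upsets strike a given word as a Poisson process with a constant rate, and that a single-error-correcting code fails as soon as two or more upsets accumulate in the word between two checks. The function name `p_multiple_upsets` and its parameters are hypothetical.

```python
import math

def p_multiple_upsets(rate_per_word, interval):
    """Probability that one word collects two or more upsets between checks.

    Assumption (not from the source): Poisson arrivals at a constant rate.
    rate_per_word and interval must use consistent time units.
    """
    x = rate_per_word * interval  # expected number of upsets in the word
    # P(N >= 2) = 1 - P(N = 0) - P(N = 1) = 1 - exp(-x) * (1 + x)
    return 1.0 - math.exp(-x) * (1.0 + x)
```

For small values of the product `rate_per_word * interval`, the result is close to half its square, so doubling the interval between checks roughly quadruples the probability of an uncorrectable word under these assumptions.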
The small dynamic currents in the memory cell array, which result from accessing a single cell per column during a write or read, are compared with the large, very short transient currents generated by the ionization caused by the impact of an energetic particle and by the resulting spurious flip of a memory cell. SPICE electrical simulation of a memory cell during write, read and upset shows that the equivalent charge of the transient current pulse associated with an upset is larger than the equivalent charge of the write current pulses, and comparable to that of the read currents.

The supply bus of each memory column is isolated and monitored by two BICS circuits in order to detect the transient currents induced by upsets (see Fig. 9). The currents of positive upsets, associated with the $V_{SS}$ to $V_{DD}$ transition of the drain junction of a PMOS transistor in the off state, are larger on the $V_{SS}$ supply line and are therefore more easily detected by the BICS<sub>L</sub> circuit. The currents of negative upsets, associated with the $V_{DD}$ to $V_{SS}$ transition of the drain junction of an NMOS transistor in the off state, are larger on the $V_{DD}$ supply line. They are therefore more easily detected by the BICS<sub>H</sub> circuit.

Figure 9. Block diagram of a memory column monitored by two BICS circuits

Each of the two RC detection circuits integrates short current pulses with an amplitude of several mA and a duration of less than one nanosecond. To avoid false alarms caused by write or read access currents, the sensitivity of the BICS circuits is lowered for each column during the access interval by reducing the equivalent resistances $R_H$, $R_L$ with bypass transistors.
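The effect of the sense resistance on detection sensitivity can be sketched with a first-order model. The code below assumes a rectangular current pulse injected into a sense resistance in parallel with the parasitic capacitance of the virtual supply node; this lumped model, the function name and the component values used in the note that follows are illustrative assumptions, not figures taken from the source.

```python
import math

def peak_perturbation(i_pulse, t_pulse, r_sense, c_par):
    """Peak voltage excursion (V) on the virtual supply node.

    Assumed model: rectangular pulse of amplitude i_pulse (A) and duration
    t_pulse (s) driving r_sense (ohm) in parallel with c_par (F).
    """
    tau = r_sense * c_par  # integration time constant of the RC detector
    # The node charges towards i_pulse * r_sense while the pulse lasts,
    # so the excursion is largest at the end of the pulse.
    return i_pulse * r_sense * (1.0 - math.exp(-t_pulse / tau))
```

In this model the peak never exceeds the injected charge divided by the capacitance, and it decreases when the resistance is lowered. That is the behaviour exploited when the bypass transistor reduces the sense resistance during an access: the same pulse produces a smaller excursion, so access currents are less likely to cross the comparator threshold.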
The parasitic capacitances $C_H$, $C_L$ integrate the transient current pulses, which reduces the amplitude and lengthens the duration of the disturbance at the virtual supply node $V_{DD}'/V_{SS}'$. Two comparators quickly detect disturbances that exceed the reference threshold and drive an asynchronous error latch. The activation of a global error signal then triggers a system interrupt when an upset occurs. An error-detection sequence identifies the affected column. It is followed by a routine that reads the words of this column in order to restore the correct information in every row with a parity error.

A prototype RAM (Fig. 10) was designed in a 0.8 µm CMOS/epi technology and used to validate and optimize the upset detection and correction technique described above.

Figure 10. Block diagram of the upset-tolerant static CMOS RAM

Isolating the virtual supply lines $V_{DD}'$, $V_{SS}'$ from the silicon substrate in order to insert the current sensors carries a significant area cost (5 to 10%) in a technology with two metal levels. However, in dense CMOS technologies with more than three metal interconnect levels and stacked contacts, this cost becomes negligible. Substrate isolation also reduces the sensitivity of the circuit to "latchup". Internal dynamic biasing of the substrate after isolation then makes it possible to increase the performance of RAMs built in deep submicron CMOS technologies, as described in chapter 2.

Memory arrays with a central supply bus and mid-array sense amplifiers allow a compact layout with a single row of paired BICS circuits.
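The correction routine described above, in which the flagged column is scanned and every row with a parity error is repaired, can be sketched as follows. This is a minimal illustration that assumes one even-parity bit per row, stored outside the flagged column and itself unaffected; the data layout and the function name `correct_column` are hypothetical and do not describe the prototype's actual implementation.

```python
def correct_column(rows, parities, flagged_col):
    """Repair the flagged column in every row whose parity check fails.

    rows        : list of words, each a list of 0/1 bits (modified in place)
    parities    : stored even-parity bit of each row (assumed intact)
    flagged_col : column index reported by the current sensor
    Returns the indices of the rows that were corrected.
    """
    corrected = []
    for r, (word, parity) in enumerate(zip(rows, parities)):
        if sum(word) % 2 != parity:
            # The sensor localizes the upset to one column, so a row with a
            # parity error must hold its flipped bit in that column.
            word[flagged_col] ^= 1
            corrected.append(r)
    return corrected
```

Parity alone only detects an error in a word; it is the column information supplied by the sensor that turns detection into correction in this sketch.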
A RAM array with a peripheral supply bus uses two rows of BICS circuits placed at the two ends of the array, one connected to V<sub>DD</sub> and adjacent to the precharge circuits, the other connected to V<sub>SS</sub> and adjacent to the sense amplifiers. For an SRAM column of 512 memory cells, the area added by a dual BICS in the central topology is less than 1.6%, and the area added by two BICS inserted at the ends in the peripheral topology is 2.7%.

Heavy-ion tests of the prototype showed effective detection and correction of upsets. A high rate of false alarms, above 80%, was also observed. Their mechanism is related to disturbances caused by collected charges that remain below the upset threshold. Calibrating the upset detection threshold of the BICS circuits lets us minimize the false-alarm rate, but not eliminate it, because the distribution of collected charges is continuous around the critical threshold Q<sub>C</sub>. Indeed, the disturbance generated by an upset at the critical charge is smaller than the disturbance induced by a false alarm due to a lower collected charge. Mixed electrical and device simulations in 2D and 3D with the DAVINCI tool, carried out in cooperation with the IXL laboratory in Bordeaux, show that charge is evacuated from the struck node faster when the memory cell flips than in the case of a false alarm. In addition, large random fluctuations of the upset and false-alarm rates are related to the energy distribution of the incident particles in a given radiation environment, to the distributions of impact locations and of impact times relative to the operating cycles of the RAM, and to the statistical variations in the sensitivity of each RAM cell. These results allowed us to calibrate the BICS circuits more accurately and to understand the false-alarm rates in detail.
In order to characterize the operation of the BICS circuits of the prototype RAM in detail, in a deterministic and quantitative manner, we developed a technique for injecting current pulses of programmable amplitude and duration into predefined memory cells. For this purpose, we implemented in the prototype memory a module for on-line self-test and calibration of the BICS circuits. It consists of two current pulse generators whose polarity, amplitude (0 to 3 mA) and duration (0 to 3 ns) can be selected (see Figure 11).

Figure 11. Schematic of the programmable current pulse generator

We thus "emulated" the upset phenomenon in two memory cells, simultaneously and independently. This method allowed us to evaluate both the sensitivity threshold of the BICS circuits and the effectiveness of the correction algorithm. Controlling the triggering instant of the pulse generator also allowed us to simulate the asynchronous behaviour of logic upsets with respect to the clock signal, and then to characterize the immunity of the memory cells to dynamic upsets. Calibrating the current-sensing resistances R<sub>H</sub>, R<sub>L</sub> through the control of the bypass transistor lets us adapt the sensitivity threshold of the BICS dynamically. Upsets can then be detected even in the column being written or read, where the currents of the upset-induced disturbances are superimposed on the write/read currents.
However, simulating upsets by injecting current pulses through the two added generators cannot provide a correct estimate of the equivalent critical charge. First, the parasitic capacitances added to the memory cell modify its upset behaviour. Second, the duration and profile of the charge injected at the node by the generated current are not adequate to reproduce the charge collection that follows the ionization caused by a particle impact.

We thus demonstrated experimentally the reliability and robustness of the upset detection and correction algorithm over the whole supply-voltage range of the prototype circuit, from 3.5 to 6 V. However, reproducible calibration of the detection threshold in all operating regimes, and control of its drift with total dose, require the development of algorithms for adaptive control of the BICS sensitivity.

Using different detection thresholds lets us characterize and evaluate the radiation environment through the dynamic variations of the upset and false-alarm rates. Strategies for scheduling critical tasks and for reconfiguration can be implemented on the basis of the dynamic evaluation of the energy distributions of the particle rates and of on-line testing for the detection of permanent faults.

#### Chapter 4.

This chapter presents design techniques for CMOS systems that tolerate upset-induced transient faults by using memory cells with local redundancy. These techniques avoid the demanding design and calibration of the current sensors described in the previous chapter, while providing a fast and reliable implementation of upset hardening in commercial CMOS technologies.
We adapted and optimized these techniques for two purposes: the automatic synthesis of upset-immune RAMs in advanced submicron CMOS, and the implementation of complex sequential circuits with severe constraints on fault latency, switching speed and dependability.

At the beginning of the chapter we briefly describe the radiation environment in space and its influence on the reliability of advanced CMOS systems. Three mechanisms of radiation interaction with silicon CMOS circuits are relevant: atomic displacement, ionization, and nuclear interaction with neutrons and charged particles. The permanent, cumulative effects of these interactions are variations of the electrical parameters of the transistors, such as threshold voltage, leakage currents and carrier mobility. They are mainly due to charge trapping in the oxide and to the generation of interface states. Their impact on embedded applications is a degradation of performance and a reduction of mission time. The spurious transient effects are upsets and "latch-up", which cause loss of information and thermal dissipation phenomena that can be catastrophic and destructive.

We analyze the basic mechanisms of logic upsets caused by the impact of high-energy particles in the sensitive regions of CMOS circuits. We then briefly describe the techniques used to model and predict in-orbit upset rates for space applications, and the methods used to test the upset tolerance of hardened CMOS circuits. These methods are based on simulating the space environment in a particle accelerator, in order to evaluate the performance and reliability of the circuits under terrestrial conditions. They were used to validate the prototypes designed in this thesis.
The random nature of upsets in time and space is related to the distribution of the energies, trajectories, deposited charges and impact times of the particles. This makes the simulation of the characteristics of the radiation environment particularly difficult, complex and costly, and severely limits its accuracy.

The following parameters define the upset tolerance of a CMOS circuit:

- a) The LET ("Linear Energy Transfer") threshold of an incident particle, where the LET measures the energy deposited per unit length of the particle's path in a material,
- b) The critical charge $Q_C$, collected at the sensitive node following the ionization caused by the impact, that generates an upset in the circuit,
- c) The upset cross-section, which is the ratio of the number of upsets detected to the number of monoenergetic ions incident per unit area.

We then briefly describe the main techniques used to harden CMOS circuits against upsets, with their advantages and limitations. Improvements at the fabrication-process level, such as the use of insulating substrates or of thin, highly doped epitaxial substrates, reduce ionization and charge collection and thereby raise the LET threshold and the critical charge. Their cost is generally high, and their effectiveness decreases as transistors are scaled down in submicron CMOS.

Design techniques for hardening CMOS circuits against upsets fall into three groups:

- a) system-level hardening techniques, which ensure correct operation of the system in the presence of upsets ("upset tolerance"),
- b) parametric hardening techniques, which reduce the sensitivity of the circuit nodes to upsets ("upset hardening"),
- c) logic hardening techniques, which prevent upsets caused by arbitrary amounts of charge deposited at single sensitive nodes ("upset immunity").
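The three parameters listed earlier for upset tolerance (LET, critical charge and cross-section) can be related through a short numerical sketch. The cross-section function follows the definition given above. The LET-to-charge conversion is not given in the source: it uses commonly quoted properties of silicon (a density of about 2.33 g/cm³ and about 3.6 eV per electron-hole pair), which are assumptions here, as are all function and constant names.

```python
ELECTRON_CHARGE_C = 1.602e-19      # elementary charge, in coulombs
SI_DENSITY_MG_PER_CM3 = 2330.0     # assumed density of silicon
EV_PER_EH_PAIR = 3.6               # assumed pair-creation energy in silicon

def upset_cross_section(n_upsets, fluence_per_cm2):
    """Cross-section in cm^2: upsets counted per unit of ion fluence."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return n_upsets / fluence_per_cm2

def deposited_charge_pc(let_mev_cm2_per_mg, path_um):
    """Charge (pC) generated along path_um micrometres of silicon."""
    # MeV.cm^2/mg -> eV per micrometre: multiply by the density (MeV/cm),
    # by 1e6 (eV per MeV) and by 1e-4 (cm per micrometre).
    ev_per_um = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3 * 1e6 * 1e-4
    pairs = ev_per_um * path_um / EV_PER_EH_PAIR
    return pairs * ELECTRON_CHARGE_C * 1e12
```

With these assumed constants, one unit of LET corresponds to roughly 0.01 pC per micrometre of track. Comparing the charge generated over the collection depth with the critical charge $Q_C$ gives a first estimate of the LET threshold, and the cross-section then measures how much sensitive area the circuit exposes above that threshold.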
These techniques provide the different levels of protection required by embedded electronics, so as to ensure the reliability, performance and dependability of the application throughout the space mission.

System-level upset tolerance is generally achieved through information coding, using error-detecting and error-correcting codes. Specialized processors and algorithms perform either a periodic exhaustive scan of the memory arrays and system registers, or an error location and correction sequence triggered at the moment of the upset by built-in current sensors, as shown in the previous chapter.

System-level upset tolerance can also rely on massive redundancy strategies, here using triple modular redundancy and majority-voting circuits. However, the implementation cost is very high, and the reliability of the system depends on a hardened design of the voting circuits.

Parametric hardening techniques, used to reduce the upset sensitivity of CMOS circuits, are commonly based on a combination of design methods and fabrication processes, in order to minimize the area cost and optimize performance. Resistive hardening (see Figure 12) consists in adding decoupling resistors between the inverters of the CMOS latch. They delay the propagation of the disturbance from the struck node, which allows the unaffected node to restore the correct state once the disturbance has ended.

Figure 12. Resistive upset hardening of a CMOS memory cell. The multiple combinations of decoupling resistors are indicated.
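The majority-voting principle behind triple modular redundancy, mentioned above, can be sketched in a few lines. The example operates bitwise on three integer copies of the same word; it is a generic illustration of the voting function only, and the function name is hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise majority of three redundant copies of a word."""
    # A result bit is 1 when at least two of the three copies agree on 1.
    return (a & b) | (a & c) | (b & c)
```

Any single corrupted copy is outvoted by the two others, which is why the voter itself becomes the critical element, as the text notes.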
Resistive hardening, commonly used in the fabrication of hardened CMOS circuits for nuclear and space applications, provides upset immunity over a limited range of disturbances, which may match the constraints of the radiation environment of the application. However, the propagation delays added to the circuit degrade the write access time, especially at low temperature. The added RC elements are either junctions and channels of transistors of various geometries added to the layout, or parasitic elements, or elements that result from changes in material properties. In this case, the decoupling resistors are obtained through specific doping steps that allow a controlled increase of the resistivity of the polysilicon interconnects.

The effectiveness of resistive hardening drops drastically in submicron CMOS. This is due to the reduction of node capacitances, of the critical charge and of the supply voltage. Ever higher decoupling resistor values are then required, which significantly affects fabrication yield and circuit performance.

Logic hardening techniques use local redundancy of the memory cells to provide total immunity to upsets. They are based on duplicating the stored information and on using feedback circuits to restore the correct state. Logic hardening is an optimal solution for critical applications of high complexity and processing speed implemented in submicron CMOS technology.
Our approach consists in developing and optimizing local redundancy techniques for the fast and reliable design of upset-immune submicron CMOS systems, with the following characteristics in particular:

- Minimal impact on circuit performance
- Design independent of the fabrication process
- Minimal area cost
- Stable performance, reliability and dependability with total dose
- Simple, automatic conversion of existing circuits into an upset-immune version.

The different structures of redundant, upset-immune CMOS latches are presented and analyzed in chapter 4. They are grouped into two classes:

- Asymmetric latches, which use two different latch circuits, one for access and the other providing upset immunity (see Fig. 13)
- Symmetric latches, which use two similar hardened latch circuits that provide parallel access and immunity (see Fig. 14).

Two basic techniques are used in these approaches to make a latch circuit immune to upsets:

- The use of NMOS-only and PMOS-only inverters, which avoids charge collection at the output because the struck drain/substrate junctions are at zero bias.
- The use of asymmetric inverters and feedback circuits, pseudo-PMOS (strong P/weak N) or pseudo-NMOS (strong N/weak P), which avoids an intermediate output state through an adequate size ratio between the two conducting transistors. The dominant PMOS transistors are shaded in Figures 13 and 14.

Figure 13. Asymmetric CMOS latches with an upset-immune PMOS latch: (a) Rockett latch [62] and (b) HIT1 latch [61]

The use of asymmetric inverters increases the circuit size, the dissipated power, and the sensitivity of performance to variations in process, temperature, supply voltage and total dose.
The effects of total dose on the transistor parameters (high leakage currents, variations of threshold voltage and transconductance) modify the asymmetry characteristics of the inverters, which can lead to a loss of upset immunity. The dominance of the large transistors is relative and evolves over time with total dose through the following phenomena:

- the decrease in transconductance and the increase in threshold voltage of the PMOS transistors,
- the decrease in threshold voltage and the increase in transconductance and leakage current of the NMOS transistors.

These phenomena also depend on the logic state of the transistors, which produces an imprint effect of a logic state stored for a long time in a radiation environment. NMOS-only and PMOS-only inverters also have the drawback of producing degraded logic levels and large leakage currents.

Figure 14. Symmetric upset-immune CMOS latches using two latch circuits for access and storage: (a) Whitaker latch [71] and (b) Liu latch [63]

We developed a new structure of redundant, upset-hardened CMOS latch that relies on a new hardening principle in order to eliminate these drawbacks. The new DICE ("Dual Interlocked storage Cell") structure uses four CMOS inverters with crossed feedback, which form two antiparallel NMOS and PMOS rings and four half-latches (see Fig. 15). The upset immunity of each of the four symmetric nodes is obtained through the dual feedback from its two adjacent nodes: one of them controls the conducting state of one transistor, and the other controls the off state of the complementary transistor. The storage and isolation functions are performed by alternating groups of two half-latch circuits. Transistor sizing and parameter variations have very little influence on the performance of the DICE latch.
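The dual-feedback principle of the DICE cell described above can be explored with a logic-level sketch. The model below makes strong simplifying assumptions that are not part of the source: nodes take only the values 0 and 1, the pull-up of node i is taken to be gated by the preceding node and its pull-down by the following node, a floating node or a node in contention keeps its previous level, and a particle strike clamps the struck nodes to the flipped value until the rest of the cell has settled. All names are hypothetical.

```python
def step(state, held=()):
    """One synchronous update of the four nodes of the abstract cell."""
    new = list(state)
    for i in range(4):
        if i in held:
            continue                        # node clamped by the strike
        pmos_on = state[(i - 1) % 4] == 0   # pull-up gated by previous node
        nmos_on = state[(i + 1) % 4] == 1   # pull-down gated by next node
        if pmos_on and not nmos_on:
            new[i] = 1
        elif nmos_on and not pmos_on:
            new[i] = 0
        # both off (floating) or both on (contention): keep the old level
    return tuple(new)

def settle(state, held=(), max_steps=16):
    """Iterate until the node values stop changing."""
    for _ in range(max_steps):
        nxt = step(state, held)
        if nxt == state:
            break
        state = nxt
    return state

def strike(state, nodes):
    """Flip the given nodes, hold them while the cell reacts, then release."""
    hit = tuple(1 - v if i in nodes else v for i, v in enumerate(state))
    return settle(settle(hit, held=nodes))
```

In this abstraction, `strike((0, 1, 0, 1), {0})` settles back to the stored state for any single node, whereas flipping the two nodes that hold the same value, for example `{0, 2}`, leaves the cell in the opposite state. The sketch says nothing about transistor sizing or analog levels; it only illustrates why one disturbed node cannot propagate through both of its neighbours.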
This insensitivity allows a fast and reliable implementation of DICE memory cells without changing the transistor sizes of pre-existing circuits.

Figure 15. Isolation of the active CMOS half-latches by dual feedback (a,b). Upset-immune DICE latch (c)

The merging of two adjacent memory cells and the reconfiguration of their interconnections into a DICE structure is shown in Fig. 16 at schematic level and in Fig. 17 at layout level. The DICE conversion of CMOS SRAM cell arrays is simple and ready for automation. It consists in modifying the interconnections, with no impact on performance but at the cost of halving the storage capacity. The area loss due to the added crossed interconnections, visible in Fig. 17, disappears in submicron CMOS technologies with stacked contacts and multiple metal levels.

To validate these new upset-hardened design methods, we designed prototype circuits in commercial 1.2 µm, 0.8 µm and 0.25 µm CMOS technologies. They consist of static RAMs and shift-register blocks. We describe the implementation of DICE memory arrays and registers on the three prototype circuits and give the results of the heavy-ion validation tests. The implementation details of the prototypes are presented in appendices B, C and D.

The tests of the first two prototypes, built in 1.2 and 0.8 µm CMOS technologies on epitaxial substrate, showed that the upset tolerance of the hardened memory cells is limited for incident particles with an LET of 50 MeV cm²/mg. This performance limitation is then analyzed with additional tests that use laser pulses to simulate, in a deterministic manner, the effect of particle impacts on the sensitive regions of the circuit. Upset mechanisms related to the transistor topology were thus revealed.
These mechanisms are related to the logic-state flipping of the two nodes that store the same information in the two latch circuits of a hardened cell, following the collection of charge due to the impact of a single particle.

Figure 16. Schematic conversion of adjacent SRAM cells (a) into a DICE configuration (b)

Figure 17. Layout conversion of adjacent SRAM cells (a) into a DICE configuration (b)

Adequate hardening strategies were developed to counter these "dual" upset phenomena. The laser-beam tests of the hardened circuits were carried out in collaboration with the Aerospace Corporation in Los Angeles, California, and with the IXL laboratory in Bordeaux. Our detailed analysis of the results is presented in chapter 5.

Testing the redundant functions is essential to evaluate and guarantee the dependability of complex integrated systems that use DICE architectures in critical applications. Indeed, permanent faults at single nodes of DICE memory cells are masked by the redundancy and are not covered by conventional memory tests. Yet the presence of such permanent faults destroys the upset immunity.

We developed a simple and reliable test method to detect the upset vulnerability of DICE memory cells in the presence of a permanent fault. This method emulates logic faults equivalent to upsets by means of a partial write of one of the two latches of the DICE cell. To perform this partial write, the address decoders are modified to allow independent access to each of the node pairs of a hardened memory cell. An extension of this method allows the implementation of multiport memory architectures and the development of coding, synchronization and access-control algorithms that ensure the integrity and security of the information.
The test method has a low implementation cost and a high level of security. It provides full coverage of permanent faults with a compact test pattern, without affecting the dynamic performance of the circuit.

The third prototype circuit, built in a 0.25 $\mu$m CMOS technology in cooperation with CERN in Geneva and tested with heavy ions at the 88-inch cyclotron of the Lawrence Berkeley Laboratories in California, showed an upset tolerance above 89 MeV cm²/mg.

We then devised design techniques for DICE latches in sequential circuits dedicated to the control modules of submicron CMOS integrated systems, and we developed a library of upset-hardened sequential cells in a 0.6 µm CMOS technology. The preliminary specifications of the elements of the hardened cell library are given in appendix E.

#### Chapter 5.

Qualifying hardened circuits for radiation environments through heavy-ion tests in an accelerator is costly and excessively long. It does not allow the causes of upset sensitivity to be diagnosed, nor the corresponding remedies to be identified. For this purpose, we developed a deterministic test method that uses picosecond laser pulses focused on the silicon surface, which makes it possible to simulate ionization phenomena equivalent to those generated by heavy-ion impacts. We were thus able to reveal and analyze the upset mechanisms specific to particular circuit topologies, at precise locations and at well-defined instants.

At the beginning of the chapter, we give the results of the study of the characteristics of laser ionization in silicon.
The linear correlation between the energy of the laser pulses and the LET threshold of heavy ions is presented. We also show the main differences, which are related to the limited penetration depth of the laser beam, to its blocking by the opaque layers of the metal interconnects, to the radial distribution of the ionization density, and to the diameter of the laser spot on the circuit surface. A comparative estimate of the effects of laser beams and heavy ions shows an equivalence between an energetic particle of LET = 5 MeV cm²/mg and a laser pulse of 1 GW/cm² at a wavelength $\lambda$ = 1.06 $\mu m$, with a spot area of 1 $\mu m^2$ and a reference energy of 100 pJ. A positioning accuracy of 0.1 $\mu m$ of the laser beam at the impact site, together with the practical and fast change of the equivalent LET threshold obtained by controlling the intensity of the laser pulses, ensures good accuracy in the diagnostic analysis of the sensitive regions of the circuit.

A first set of laser tests was carried out on an existing prototype circuit that comprises four arrays of upset-hardened registers using the different redundancy techniques described previously. Heavy-ion qualification of this circuit is limited to LET values below 25 MeV cm²/mg because of its sensitivity to "latchup". The laser tests, carried out in cooperation with the Aerospace Corporation in California, allowed us first to identify the sensitive regions and the "latchup" mechanisms, and then to characterize the upset tolerance of the different hardened CMOS latch structures over a wide range of impact energies. The characteristics of these laser tests are as follows: beam wavelength 600 nm, pulse duration 10 ps, positioning accuracy 0.1 $\mu$m, diameter of the impact zone 2 $\mu$m.
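The linear correspondence between laser pulse energy and equivalent LET can be written as a one-line conversion. The sketch below uses the single equivalence point quoted above (100 pJ for LET = 5 MeV cm²/mg, at a wavelength of 1.06 µm and a spot of 1 µm²) and assumes strict proportionality through the origin, which goes beyond what the source states. It ignores the differences listed in the text, such as shadowing by metal and the limited penetration depth, and it is not calibrated for the 600 nm tests. The names are hypothetical.

```python
REF_ENERGY_PJ = 100.0        # reference pulse energy quoted in the text
REF_LET_MEV_CM2_MG = 5.0     # equivalent LET quoted for that energy

def equivalent_let(pulse_energy_pj):
    """Equivalent LET (MeV cm^2/mg), assuming strict proportionality."""
    if pulse_energy_pj < 0:
        raise ValueError("pulse energy must be non-negative")
    return REF_LET_MEV_CM2_MG * pulse_energy_pj / REF_ENERGY_PJ
```

Under this assumption, sweeping the pulse energy amounts to sweeping the equivalent LET, which is what makes the threshold of a sensitive site quick to locate compared with changing ion species in an accelerator.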
We used a sequence of two test algorithms: an algorithm that detects the regions sensitive to upsets and to "latchup", and an algorithm that characterizes the upset sensitivity of the detected regions:

- The detection algorithm scans the surface of the circuit in repetitive mode, at high speed, with a constant impact-energy increment.
- The characterization algorithm then identifies the extent of each sensitive region and the laser energy thresholds that trigger the event.

This laser energy threshold measurement is made at different locations around the detected sensitive site. The intensity of the laser beam is increased in small steps in triggered mode in order to detect the flipping threshold. Thanks to the repetitive structure of the circuit and to the good reproducibility of the measurements, the variations of the energy thresholds of equivalent sites in different memory cells are below +/- 1 pJ for the tests carried out.

This study first led to the identification of the sensitive regions, and then to the characterization of the "latchup" mechanism. The analysis clarifies the limits of conventional "latchup" protection methods in digital CMOS design for applications in a radiation environment. We observe that the use of spaced substrate contacts is not sufficient to counter the parasitic thyristor phenomenon (see Fig. 18). This in turn makes it possible to quantify the criteria for the spacing, placement and insertion density of well and substrate taps.

Figure 18. "Latchup"-sensitive region (edge A-A) in the Rockett memory cell.
The arrow indicates the laser impact site and the activated junctions of the parasitic thyristor

The laser-pulse test sequences for the detection and location of upsets in the hardened register arrays revealed a "dual" upset mechanism. It is related to the simultaneous collection of charge at pairs of sensitive nodes of a redundant memory cell that is theoretically immune to any spurious flip. The HIT1 cell (see Fig. 19) has four upset sensitivity peaks, identified for the two pairs of simultaneously sensitive junctions, A-Q\* and B-Q (see Fig. 20). Simulation analysis of the critical charges at the two node pairs reveals an asymmetric sensitivity of the two nodes, with a high ratio $Q_{CP}/Q_{CS} > 4$ between the critical charge of the primary node (A, B) and that of the secondary node (Q\*, Q). This sensitivity ratio depends on the transistor dimensions, the node capacitances, the inverter types (NMOS, PMOS or CMOS) and the feedback structure.

Figure 19. The four upset-sensitive sites in the HIT1 cell. Sites 1 and 4 correspond to the "low" state at nodes Q and B. Sites 2 and 3 correspond to the "high" state at nodes Q and B.

Figure 20. Electrical schematic of the HIT1 cell.

The sensitivity curves of the sensitive node pairs (see Fig. 21) show that the plane is divided into two regions:

- one where the combinations of collected charges correspond to impacts without upset (bottom left),
- one where the combinations of collected charges correspond to impacts with upset (top right).

Total dose shifts the characteristic curves shown in Figure 21 towards lower charges, which progressively weakens the upset tolerance.

Figure 21.
Critical charge distribution for "dual" upsets in the HIT1 and LIU memory cells.

This analysis method, which is fast, accurate and inexpensive compared with heavy-ion testing, yields detailed sensitivity maps of impact-sensitive areas with representative topologies, so that the best topological hardening methods can be identified. It thus offers a complete and reliable means of validating hardened circuits in submicron CMOS technology, provided that the laser spot size gives an acceptable excitation resolution relative to the transistor dimensions, so that topological sensitivities can be analyzed precisely. Electrical and device simulations with HSPICE and PISCES/LUMINOUS confirmed the results of the laser experiments.

This investigation of topology-related "dual" upset mechanisms then led to optimal solutions, in terms of cost and performance, for ensuring a high degree of upset immunity in critical applications. Spacing and isolating the pairs of simultaneously sensitive nodes, typically (but not exclusively) by inserting substrate biasing diffusions, reduces to insignificant levels the probability that the impact of a single energetic particle leads to critical charge being collected at two nodes simultaneously.

Topological analysis of the dual-upset sensitivity of DICE cells, using pulsed laser tests, shows that they have twice as many sensitive node pairs as asymmetric redundant cells. We also identified node pairs sensitive to "dual" upsets in the access transistor structure.
We studied two characteristic topologies in detail, and developed and analyzed specific hardening methods for them. The analysis of hardened DICE cells optimized in a 0.8 µm epitaxial CMOS technology, with pairs of sensitive drains spaced apart and isolated, revealed a previously unreported upset mechanism caused by strong impact-induced ionization at the shared well/substrate junction (see Fig. 22). The characteristic laser energy threshold for this upset mechanism is very high, exceeding the energies equivalent to the LET of the most energetic particles. However, this threshold drops significantly in advanced submicron CMOS technologies. A possible remedy is a "Split Well" technique that isolates the wells of the two transistors.

Figure 22. Dual upset mechanism following ionization of the well/substrate junction in an optimized DICE cell (metal layers not shown). DICE\_OPT\_0, DICE\_OPT\_1: laser impact areas (edge of the N-well); D1-D2, D1\*-D2\*: sensitive drains; BBD: isolation diffusion ("Bulk Biasing Diffusion").

We then developed a topological charge collection model that accounts for the phenomena that are decisive for "dual" upsets: diffusion and conduction through the parasitic bipolar transistors. This model allows a rigorous and realistic calculation of the charge collected at the sensitive nodes of the circuit, for all equivalent impact points and their corresponding angles of incidence. It uses charge collection efficiency functions, $f_c$, for each junction whose electrical, technological and dimensional parameters are known.
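As a rough illustration of how collection-efficiency functions of this kind might be used, the sketch below estimates the charge collected at two drain nodes after a single impact. The exponential decay of $f_c$ with distance, the diffusion length, the gain factor, the "both nodes must reach their critical charge" criterion and all numeric values are assumptions made for illustration only; they are not the fitted model or data of this thesis.

```python
import math

def collection_efficiency(distance_um, diffusion_length_um=5.0):
    """Hypothetical f_c: fraction of the deposited charge reaching a junction
    located distance_um from the impact point (assumed exponential decay)."""
    return math.exp(-distance_um / diffusion_length_um)

def collected_charge(deposited_fc, distance_um, gain=1.0):
    """Charge (fC) collected at one node: deposited charge times f_c times an
    assumed multiplicative amplification factor (1.0 means no amplification)."""
    return deposited_fc * collection_efficiency(distance_um) * gain

def dual_node_upset(deposited_fc, dist_primary_um, dist_secondary_um,
                    qcrit_primary_fc, qcrit_secondary_fc):
    """Simplified criterion (assumption): a dual upset needs the critical
    charge to be collected at BOTH nodes from the same impact."""
    q_primary = collected_charge(deposited_fc, dist_primary_um)
    q_secondary = collected_charge(deposited_fc, dist_secondary_um)
    return q_primary >= qcrit_primary_fc and q_secondary >= qcrit_secondary_fc

# Hypothetical numbers: 500 fC deposited, critical charges of 200 fC and 50 fC.
close_drains = dual_node_upset(500.0, 1.0, 2.0, 200.0, 50.0)
spaced_drains = dual_node_upset(500.0, 1.0, 15.0, 200.0, 50.0)
print("adjacent drains upset:", close_drains)
print("spaced drains upset:", spaced_drains)
```

With these made-up values, moving the secondary drain away from the impact point is enough to keep its collected charge below the critical value, which is the intuition behind spacing and isolating sensitive node pairs.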
Laser tests make it possible to evaluate the variation of $f_c$ as a function of the equivalent distances $X_d$ of the impact points, for a given junction area, perimeter, doping and depth. A charge collection amplification function is then defined to account for conduction through the parasitic bipolar transistor. Concurrent collection, which occurs simultaneously at the two drain nodes, and the influence of other elements of adjacent circuits enter the model as multiplicative charge collection functions. Figure 23 shows the typical dual upset mechanisms schematically.

Figure 23. Dual-node charge collection mechanisms: (a) diffusion + drift/funneling (plasma conduction), (b) diffusion, (c) diffusion + bipolar amplification, (d) diffusion + direct MOS channel conduction.

The parameters of the charge collection model for a given technology can therefore be derived from laser test results on dedicated prototype circuits. The model leads to a set of design rules for topological upset hardening. These rules in turn support methods for topological analysis and hardening optimization, and allow transistor spacing and isolation algorithms to be built into CAD tools for cell library synthesis and for technology migration of hardened circuits.

Future research will address the detailed characterization of dynamic upsets in submicron CMOS circuits with high logic activity, using synchronous tests in which laser pulses are triggered in time correlation with the circuit clock pulses.

This thesis contributes to the development of design techniques for reliable, fault-tolerant submicron CMOS systems.
Several methods were analyzed, tested and optimized: built-in $I_{DDQ}$ testing, on-line detection and correction of transient faults, upset-immune CMOS design, and the validation and qualification of redundant, fault-tolerant CMOS architectures. The goal was to address the main sources of errors in CMOS systems: parametric faults and transient faults. The study showed how difficult it is to meet reliability and fault-tolerance constraints in advanced submicron CMOS, and highlighted the need to combine methods synergistically to counteract the complex effects of the environment and of evolving fabrication technologies.

Built-in current sensors with an optimized design improve fault detection while meeting speed and power dissipation targets. In addition, we propose a technique for modeling built-in current sensors as controlled power supply interconnect elements. The design of BICS for periodic or on-line detection of permanent and transient faults can benefit from design automation and from the synthesis of built-in current sensors as parameterized macrocells, which can make their impact on circuit performance negligible. Adaptive calibration techniques are needed to optimize sensitivity and reduce the effect of false alarms.

Upset hardening methods for advanced submicron CMOS circuits, based on local redundancy, are described in this thesis. They were validated experimentally by heavy-ion tests and optimized through laser characterization tests.
This opens the way to a substantial reduction in application development time and in the resources needed to validate upset-hardened circuits, while minimizing the impact of hardening on speed, power consumption, noise immunity and implementation cost. A library of hardened sequential cells was developed in cooperation with CNM in Barcelona, for use in the design of a modem ASIC for the Spanish experimental satellite NANOSAT. This work enables the development of generic tools for layout synthesis and direct layout conversion for CMOS RAMs and upset-hardened library cells with redundant architectures.

## **CONTENTS**

| No. | Title |
|----|----|
| 1. | INTRODUCTION |
| 2. | ON-CHIP CURRENT SENSORS FOR SUBMICRON CMOS |
| | 2.1 I<sub>DDQ</sub> Testing: Performance and Limitations |
| | 2.2 Built-In Current Sensors For I<sub>DDQ</sub> Testing |
| | 2.2.1 Bypass Switch Design |
| | 2.2.2 Dual Bypass Circuit Configuration |
| | 2.2.3 BICS Comparator Design |
| | 2.2.4 Sense Amplifier Design for BICS Implementation |
| | 2.2.4.1 Voltage-Mode Sense Amplifier |
| | 2.2.4.2 Source-Controlled Sense Amplifier |
| | 2.2.4.3 Current-Controlled Sense Amplifier |
| | 2.2.4.4 Buffered Latch Sense Amplifier |
| | 2.2.5 Current-Mode Amplifier as BICS Comparator |
| | 2.2.6 Current-Mode Comparator Design for BICS |
| | 2.2.6.1 Source-Controlled CMOS Current Comparator |
| | 2.2.6.2 Multiple-Output BICS Circuit |
| | 2.2.6.3 Multistage Source-Controlled Current Comparator |
| | 2.3 Prototype Circuit Design |
| | 2.4 I<sub>DDQ</sub> Measurement Algorithm and Hardware Cost |
| | 2.5 Dual Measurement Techniques for I<sub>DDQ</sub> Testing |
| | 2.6 Conclusions |
| 3. | ON-LINE CURRENT MONITORING TECHNIQUES |
| | 3.1 Introduction |
| | 3.2 High-Speed BICS Circuit Design |
| | 3.3 Current-Monitored Self-Checking Multiplier Design |
| | 3.4 Transient Fault Detection in CMOS Static RAM |
| | 3.4.1 On-Chip Current Testing in Static RAMs |
| | 3.4.2 Transient Fault Tolerant SRAM Design |
| | 3.4.2.1 Transient Currents in Static RAM Cells |
| | 3.4.2.2 Upset-Induced Currents |
| | 3.4.2.3 Read/Write Currents |
| | 3.4.3 Asynchronous BICS Design |
| | 3.4.4 Design Optimization of the SEU-Tolerant SRAM |
| | 3.4.5 Mixed-Mode Simulation of Memory Cell Upset |
| | 3.4.6 Test and Characterization Techniques Using Current Injection |
| | 3.4.7 Radiation Test Results |
| | 3.5 Conclusions |
| 4. | FAULT-TOLERANT CMOS ARCHITECTURES USING LOCAL REDUNDANCY TECHNIQUES |
| | 4.1 Introduction |
| | 4.2 Radiation-Induced Reliability Failures in Deep Submicron CMOS |
| | 4.2.1 Space Radiation Environment for Microelectronics |
| | 4.2.2 Radiation Effects on Advanced CMOS ICs |
| | 4.2.3 SEU Modeling and Rate Prediction |
| | 4.2.4 SEU Testing |
| | 4.2.5 Upset-Tolerant Design vs. Transient Fault Tolerance |
| | 4.3 SEU Hardening Techniques |
| | 4.3.1 CMOS Circuit Design for SEU Hardness |
| | 4.3.2 CMOS Logic Design for SEU Immunity |
| | 4.4 SEU Immune Redundant Latch Using Dual Node Control |
| | 4.4.1 Dual Interlocked Storage Cell Design |
| | 4.4.2 Memory Array Configuration Using DICE Cells |
| | 4.4.3 DICE Latch Design |
| | 4.4.4 Dual-Port Memory and Register Design Using DICE Cell |
| | 4.5 Upset-Immune Flip-Flop Design Using Timed Access Techniques |
| | 4.5.1 Flip-Flop Design with Sequential Access Control |
| | 4.5.2 Prototype Chip Description and Radiation Test Results |
| | 4.5.3 Upset-Immune Sequential Cell Library Design |
| | 4.6 Conclusions |
| 5. | DIAGNOSIS AND QUALIFICATION TESTING OF FAULT-TOLERANT CMOS ARCHITECTURES |
| | 5.1 Introduction |
| | 5.2 Pulsed Laser Testing |
| | 5.2.1 Laser Generated Charge Tracks in Silicon |
| | 5.2.2 Laser vs. Ion Charge Collection |
| | 5.3 SEE Hardness Analysis using a Pulsed Laser |
| | 5.3.1 Pulsed Laser Analysis of HIT Chip |
| | 5.3.2 Latchup Diagnostic |
| | 5.3.3 SEU Sensitivity Analysis |
| | 5.4 Design Analysis and Optimization of SEU-Hardened CMOS Latches |
| | 5.4.1 Upset Sensitivity Analysis |
| | 5.4.2 Prototype Circuit Description |
| | 5.4.3 Heavy Ion and Pulsed Laser Test Results |
| | 5.4.4 Upset-Sensitive Topologies in Non-Optimized DICE Cells |
| | 5.4.5 Topology Optimization for Upset Immunity |
| | 5.4.6 Upset Mechanism in the Optimized DICE Cell |
| | 5.4.7 Topological Modeling of Dual-Node Charge Collection |
| | 5.5 Conclusion |
| 6. | CONCLUDING REMARKS AND FUTURE WORK |
| | REFERENCES |
| | ANNEXES |
| | A. SEU-TOLERANT SRAM USING CURRENT MONITORING |
| | B. SEU-TOLERANT SRAM USING DICE REDUNDANT CELL |
| | C. UPSET-HARDENED SHIFT REGISTER ARRAY |
| | D. SEU-IMMUNE REGISTER CELL IN 0.25µm CMOS |
| | E. UPSET-HARDENED CMOS FLIP-FLOP LIBRARY |

# **List of Figures**

| Figure | Caption |
|----|----|
| Fig. 1.1 | CMOS subthreshold leakage increase with scaling |
| Fig. 1.2 | Multiple-constraint design techniques in deep submicron CMOS |
| Fig. 2.1 | BICS-monitored circuit under test |
| Fig. 2.2 | Simplified power model for a CMOS circuit partition monitored by BICS |
| Fig. 2.3 | Dual bypass circuit principle (a) and schematic (b) |
| Fig. 2.4 | Single and dual bypass switch operation waveforms |
| Fig. 2.5 | CMOS Inverter Operation as a Voltage Amplifier |
| Fig. 2.6 | Conventional voltage latch sense amplifier for BICS |
| Fig. 2.7 | Source-controlled current mode sense amplifier |
| Fig. 2.8 | BICS Circuit Design Using a Current-Controlled Sense Amplifier |
| Fig. 2.9 | Buffered Latch Sense Amplifier |
| Fig. 2.10 | Basic Current Mode Amplifier Schematic |
| Fig. 2.11 | CMOS Current Comparator Operating as BICS |
| Fig. 2.12 | Low Input Voltage Current Mirror Design [142] |
| Fig. 2.13 | Source-Controlled Current Comparator circuit |
| Fig. 2.14 | Operating output characteristics of the current-mode comparator |
| Fig. 2.15 | Current comparator with enable control |
| Fig. 2.16 | Current Gain Characteristics of the Source-Controlled Comparator |
| Fig. 2.17 | Propagation Delay Characteristics of the Source-Controlled Comparator |
| Fig. 2.18 | Source-Controlled Current Comparator circuit operating as BICS |
| Fig. 2.19 | Multiple-Output Comparator with Sequential Output Stages |
| Fig. 2.20 | Multiple-Output Parallel Comparator Configuration |
| Fig. 2.21 | Multiple-Output Sequential/Parallel Comparator Configuration |
| Fig. 2.22 | BICS placement options on prototype circuit layout |
| Fig. 2.23 | Simulated BICS performance characteristics |
| Fig. 3.1 | Generic circuit architecture for on-line current monitoring |
| Fig. 3.2 | Block diagram of the current-monitored self-checking multiplier |
| Fig. 3.3 | Microphotograph of the self-checking multiplier prototype using BICS |
| Fig. 3.4 | Transient currents induced by (a) positive and (b) negative upset currents |
| Fig. 3.5 | Current and voltage waveforms for the critical positive and negative upsets |
| Fig. 3.6 | Transient supply currents in memory cells during read/write operation |
| Fig. 3.7 | Dual BICS circuit block diagram |
| Fig. 3.8 | Schematic of the dual asynchronous BICS circuit |
| Fig. 3.9 | Current mirror controlled by transient upset current I<sub>u</sub> |
| Fig. 3.10 | SPICE simulation of BICS operation for positive and negative upset currents |
| Fig. 3.11 | SRAM supply noise variation with upset current amplitude |
| Fig. 3.12 | Simulated BICS delay characteristics for positive and negative upset detection |
| Fig. 3.13 | Block diagram of the upset-tolerant CMOS SRAM |
| Fig. 3.14 | Memory column with asynchronous BICS circuit |
| Fig. 3.15 | Power supply isolation for current monitoring in CMOS SRAM column |
| Fig. 3.16 | Circuit area microphotograph showing the asynchronous current sensors |
| Fig. 3.17 | Operating waveforms of the current-monitored SRAM cell under upsets |
| Fig. 3.18 | Chip photograph of the current-monitored SRAM prototype |
| Fig. 3.19 | SRAM cell with current sensing elements used for mixed mode simulation |
| Fig. 3.20 | Mixed Mode 3D Simulation results |
| Fig. 3.21 | Schematic of the transient current pulse generator |
| Fig. 3.22 | Simulated operating waveforms for the built-in current pulse generator |
| Fig. 3.23 | Measured SRAM performance characteristics |
| Fig. 3.24 | SRAM access time output waveforms observed on a digital oscilloscope |
| Fig. 3.25 | Operating waveforms for upset detection and correction |
| Fig. 3.26 | Flowchart of the error correction sequence |
| Fig. 4.1 | Critical charge variation with technology feature size |
| Fig. 4.2 | Effects of scaling on switching charge of a minimum size CMOS inverter |
| Fig. 4.3 | Sample particle fluence distribution for a 400 km Low Earth Orbit (LEO) |
| Fig. 4.4 | Typical measured upset sensitivity characteristic |
| Fig. 4.5 | Upset hardening of a CMOS latch with added resistive and capacitive elements |
| Fig. 4.6 | Generic block diagram of an upset-immune redundant storage cell |
| Fig. 4.7 | Upset-immune NMOS latch (a) and PMOS latch (b) schematics |
| Fig. 4.8 | Common-mode write operation for ratioed NMOS and PMOS latches |
| Fig. 4.9 | Upset-immune pseudo-CMOS latch configurations |
| Fig. 4.10 | SEU-immune cells using a redundant slave latch |
| Fig. 4.11 | Upset-tolerant storage cells using PMOS-NMOS latch pairs |
| Fig. 4.12 | Modified Rockett cell (a) and HIT2 cell (b) |
| Fig. 4.13 | Upset-immune storage cell with dual-node interlock control |
| Fig. 4.14 | Basic inverter latch schematic (a), half-latch (b) and DICE cell (c) |
| Fig. 4.15 | DICE storage cell with single-ended (a) and differential (b) write access |
| Fig. 4.16 | Spice simulation waveforms: positive upset pulse (a), negative upset pulse (b) |
| Fig. 4.17 | Conversion of two standard 6-transistor SRAM cells to a DICE cell |
| Fig. 4.18 | Layout of a standard SRAM cell (a) and the equivalent DICE cell (b) |
| Fig. 4.19 | DICE latch with passgate (a) and clocked inverter (b) access circuit |
| Fig. 4.20 | DICE dual-port register latch with coincidence access |
| Fig. 4.21 | Dual-port RS latch with set operation at access coincidence |
| Fig. 4.22 | Write-Transfer redundant flip-flop circuit (WT-HIT2) |
| Fig. 4.23 | DICE flip-flop with sequential access and single-node (X1) master section |
| Fig. 4.24 | DICE flip-flop with sequential access and dual-node (X1-X3) master section |
| Fig. 5.1 | Block diagram of the laser system used for SEE analysis [59] |
| Fig. 5.2 | Layout of the HIT chip |
| Fig. 5.3 | Laser-induced latchup site in the peripheral read circuit |
| Fig. 5.4 | Layout drawing of the read circuit with the latchup site (x) |
| Fig. 5.5 | Rockett cell structure sensitive to SEL |
| Fig. 5.6 | Layout drawing of the latchup sensitive area of the Rockett cell |
| Fig. 5.7 | Laser-induced SEU sites in the HIT1 cell |
| Fig. 5.8 | Microphotograph of SEU site 1 shown in Fig. 5.7 |
| Fig. 5.9 | Schematic diagram of the HIT1 cell |
| Fig. 5.10 | Cross-sectional view of the dual-node upset-sensitive area |
| Fig. 5.11 | Two laser-induced SEU sites in LIU cell |
| Fig. 5.12 | Schematic diagram of LIU cell |
| Fig. 5.13 | Charge collection distribution for dual node upsets at HIT1 and LIU cells |
| Fig. 5.14 | DICE storage elements: memory cell (a) and latch (b) |
| Fig. 5.15 | Measured characteristic of DICE SRAM effective cross section |
| Fig. 5.16 | Upset-sensitive dual drain configurations |
| Fig. 5.17 | Simplified layout of DICE_OPT cell showing the upset-sensitive areas |
| Fig. 5.18 | Sample dual drain topology for charge collection modeling |
| Fig. 5.19 | Charge collection functions: asymmetric (a) and symmetric (b) storage cells |
| Fig. 5.20 | Dual-node charge collection mechanisms |

## **List of Tables**

| Table | Caption |
|----|----|
| Table 4.1 | Natural space radiation environment and its effects on semiconductors |
| Table 4.2 | Dual-node perturbation analysis for DICE cell write operation |
| Table 5.1 | Laser-induced SEP test results |
| Table 5.2 | Heavy ion test results for DICE SRAM |
| Table 5.3 | Laser test results for DICE-REG chip |

# Chapter 1

### Introduction

Current VLSI technology allows the manufacture of complex ICs with hundreds of millions of transistors operating with increasingly low-voltage, noise-sensitive signals at clock speeds close to the GHz range. Device operation is continuously pushed toward higher packing density in order to satisfy increasing performance constraints. This induces strong electric fields that may affect IC reliability, and also increases device sensitivity to contamination, size variations, noise and electrical perturbation. The severe stress induced on transistor structures, isolation oxides and metal interconnects increases the defect rate due to imperfections in the fabrication process and amplifies the negative impact of these defects on the manufacturing yield.

The cost and complexity of defect detection and diagnosis grow proportionally with chip size and density. At the same time, progress in refining manufacturing process accuracy can no longer keep pace with the increase in defect rates at submicron integration densities. Thus, effective detection of manufacturing-induced defects through electrical testing or accelerated stress screening tends to become unmanageable at high circuit complexity [75][77][95]. Moreover, high performance ICs manufactured in deep submicron (DSM) CMOS must satisfy increasingly tight operating margins for timing, power, noise, etc.
Since they pass the manufacturing tests with higher "hidden" defect escape rates, these combined effects may lead to a significant increase in field failure rates: transient faults, performance degradation and catastrophic failures. Transient faults, which accounted for more than 90% of field failures in earlier technologies, are mainly caused by critical timing, noise, and package- or environment-induced particle radiation [22-24]. These failure modes increase significantly at higher integration densities, faster operating speeds and lower supply voltages, with potentially destructive effects in system-on-chip designs.

As a consequence, both hidden permanent faults induced by small manufacturing defects and transient faults due to tight technology and performance margins have become a main threat to DSM CMOS circuit applications. Soft failure mechanisms leading to intermediate voltage levels and delay faults are undetected or hard to detect by voltage-mode testing. This may have a significant impact on high-performance product and technology development, production yield and system reliability.

Under these conditions, the systematic use of complementary test methodologies such as $I_{DDQ}$ testing to detect soft failures becomes mandatory for DSM CMOS. In addition, the use of redundancy-based yield-enhancement techniques at the design stage and the implementation of fault-tolerant circuit architectures become both mandatory and economically justifiable for high-performance, high-reliability system-on-chip applications. Since low chip failure rates, e.g., 0.1% in 10 years, are required for the new generation of VLSI chips with 100-million transistor densities, the transistor failure rate goal is staggering: one failure per $10^{12}$ transistor-years. The age of the universe is only $10^{10}$ years! [155].
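The failure-rate goal quoted above follows from simple arithmetic, which the short sketch below reproduces. It assumes, as a first-order simplification not stated in the text, that a chip fails as soon as any one transistor fails and that failures are rare and independent; the function name is illustrative.

```python
def transistor_failure_rate_goal(chip_failure_fraction, lifetime_years,
                                 transistors_per_chip):
    """Allowed failures per transistor-year for a given chip-level target.

    First-order estimate: the chip failure fraction is spread evenly over
    all transistor-years accumulated by one chip during its lifetime.
    """
    transistor_years = transistors_per_chip * lifetime_years
    return chip_failure_fraction / transistor_years

# 0.1% of chips failing in 10 years, 100 million transistors per chip.
goal = transistor_failure_rate_goal(0.001, 10, 100_000_000)
print(f"{goal:.0e} failures per transistor-year")
print(f"about one failure per {1 / goal:.0e} transistor-years")
```

Each chip accumulates $10^{8} \times 10 = 10^{9}$ transistor-years, so a chip-level budget of $10^{-3}$ leaves $10^{-12}$ failures per transistor-year, matching the figure in the text.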
When considering $I_{DDQ}$ testing as the only means to effectively detect soft failures for quality and reliability improvement in deep submicron CMOS, we observe two main conflicting issues. On the one hand, increased accuracy is required for quiescent current measurement to provide adequate coverage of the new defect mechanisms. On the other hand, the fault coverage of $I_{DDQ}$ testing is drastically reduced by the increased subthreshold leakage in fault-free devices, which masks the small defect-induced currents. This is illustrated in Figure 1.1 by the exponential increase of the static subthreshold leakage, by more than five decades, when a typical CMOS process is scaled down from $1\,\mu m$ to $0.18\,\mu m$.

Fig. 1.1 CMOS subthreshold leakage increase with scaling (data from Keshavarzi et al., 1998)

Moreover, the higher measurement sensitivity and accuracy required to detect smaller faulty currents are drastically limited by the high-noise measurement environment, due to a corresponding reduction of the supply voltage by a factor of three. The exponential increase of static power also becomes a significant constraint on the circuit power budget for portable applications [4-5]. This compares unfavorably with dynamic power consumption, which benefits from a square-law decrease as the supply voltage is reduced.

This research work proposes a non-euphemistic view of CMOS VLSI design for test, reliability and fault tolerance: the complexity of the means required for fault detection, diagnosis and system recovery, and the high cost of embedded redundancy, should be an incentive to find better, more effective ways of reaching these fundamental objectives, and not a motivation to accept cost-effective products with low quality and reliability.
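The contrast drawn above between static and dynamic power can be made concrete with a small calculation. The sketch below assumes the usual first-order relation in which dynamic power is proportional to $C V^2 f$, holds capacitance and frequency constant, and takes "five decades" of leakage increase literally; these simplifications and the function names are assumptions for illustration.

```python
def dynamic_power_ratio(supply_scale):
    """Relative dynamic power when only the supply voltage is scaled
    (square-law dependence, capacitance and frequency held constant)."""
    return supply_scale ** 2

def leakage_ratio(decades):
    """Relative static leakage after an increase of the given number of decades."""
    return 10.0 ** decades

dynamic = dynamic_power_ratio(1.0 / 3.0)  # supply voltage reduced by a factor of three
static = leakage_ratio(5)                 # leakage up by five decades

print(f"dynamic power scales by {dynamic:.3f}")
print(f"static leakage scales by {static:.0e}")
print(f"leakage-to-dynamic ratio grows by about {static / dynamic:.1e}")
```

Under these assumptions the supply reduction lowers dynamic power roughly ninefold, while the leakage that $I_{DDQ}$ testing must see through grows by a factor of about $10^5$.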
Several main strategies for reliable and fault-tolerant design in deep submicron CMOS are explored in this document:

- on-chip current monitoring techniques for both effective off-line $I_{DDQ}$ testing and concurrent on-line transient fault detection,
- upset-immune CMOS system design for space and radiation environment applications using technology-independent, redundancy-based approaches,
- system validation, fault diagnosis and redundancy assessment techniques using on-chip fault-injection simulation mechanisms, and
- topology-related sensitivity analysis of fault-tolerant CMOS architectures using external stimulation with an accurately focused pulsed laser beam.

Multiple-constraint approaches to system design are explored using innovative design techniques for test, reliability and fault tolerance devised to cope with both transient and permanent faults in deep submicron CMOS. We investigate correlated design techniques for low power and $I_{DDQ}$ testability, with synergetic effects as symbolically represented in Figure 1.2. They rely on the use of controlled power supply switching circuits and embedded current monitors. Their cost may thus be shared, making system implementation more affordable, with improved effectiveness and performance.

Figure 1.2 Multiple-constraint design techniques in deep submicron CMOS

Single event upsets (SEU) are a radiation-induced hazard that is increasingly dominant in high density submicron CMOS ICs, particularly in space-borne applications [32][79][92][98]. They are more difficult to avoid than transient faults induced by other sources, e.g., noise, signal coupling and electromagnetic interference. The critical charge, i.e., the amount of charge collected at a sensitive node from radiation-induced local ionisation that is sufficient to produce an upset, decreases as the square of the feature size [83]. This dependence is similar for various technologies such as bipolar, CMOS/bulk, CMOS/SOI or GaAs.
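To give a feel for this scaling trend, the sketch below reads the statement above as $Q_{crit} \propto L^2$, i.e., critical charge shrinking with the square of the feature size $L$. The reference point (100 fC at 1.0 µm) is a hypothetical value chosen only to make the numbers readable; it is not taken from the thesis or from [83].

```python
def scaled_critical_charge(q_ref_fc, l_ref_um, l_um):
    """Critical charge (fC) at feature size l_um, assuming Q_crit scales with
    the square of the feature size from a reference point (q_ref_fc, l_ref_um)."""
    return q_ref_fc * (l_um / l_ref_um) ** 2

# Hypothetical reference: 100 fC at 1.0 um.
for feature_um in (1.0, 0.5, 0.25):
    q_crit = scaled_critical_charge(100.0, 1.0, feature_um)
    print(f"{feature_um:4.2f} um -> {q_crit:6.2f} fC")
```

Under this assumption, halving the feature size divides the critical charge by four, which is why upset sensitivity rises so quickly with scaling.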
SEU-hardened design techniques currently applied to commercial designs suffer either drastically reduced effectiveness or an unacceptable degradation of performance when applied to deep submicron CMOS technologies. The use of polysilicon resistive interconnects on the feedback paths allows a perturbation to be removed before the stored state is reversed, and increasing the capacitance at the sensitive storage nodes correspondingly increases the critical charge for upsets. However, in order to be effective in DSM CMOS, these techniques would inherently induce a large degradation of memory access time. Furthermore, the reduced node capacitance implies the use of resistor values in the MΩ range, whose accuracy and reproducibility on high capacity RAMs are currently not technologically feasible.

System level design hardening based on digital coding techniques for error detection and correction (EDAC) [89] may also add unacceptable drawbacks due to error latency and performance degradation in time-critical applications. It adds significant system overhead and trades off system performance for reliability by performing the error detection/correction process only periodically, to reduce the impact on processing speed. System safety and upset tolerance can thus be lost as a result of error latency. Local EDAC processors with parallel, distributed operation may be devised to improve system safety and reduce error latency without significantly affecting the processing power. However, this quantitative improvement is obtained at the expense of high implementation cost.

Upset-hardened processes such as silicon on insulator (SOI) also show a significant degradation of their upset hardness in deep submicron CMOS. The collected charge at the upset-sensitive nodes in SOI transistors grows linearly with feature size reduction due to the additional effects of bipolar amplification and direct channel conduction.
Design hardening techniques at circuit level, based on storage latch duplication and state-restoring feedback circuits, can be developed to achieve immunity to upsets. Their use avoids the error latency and performance loss of system-level hardening solutions. Redundancy-based design hardening could represent a viable alternative for achieving upset immunity in deep submicron CMOS.

Our research is focused on upset-immune memory design for nuclear and space applications, and its synergetic effects on fault tolerance, reliability and extended system functionality. In our study, redundancy-based logical design techniques for upset tolerance in CMOS static RAMs are correlated with topology-related physical design strategies for improved immunity. Expanded functional capabilities can be added, such as coding and security functions based on redundant SRAM cell design, and system reconfiguration and adaptive performance monitoring strategies based on supply current monitoring.

One of our main goals is to develop highly effective design strategies for test, reliability and fault tolerance that may also be used to expand system functionality. This avoids treating their cost and performance constraints as separate from, and opposed to, the same constraints of the basic system functionality.

Innovative test techniques based on fault injection are described in this work in order to test, assess and characterize the redundancy properties of fault-tolerant system designs. Two on-chip test approaches are considered, one using current-mode fault injection and the other based on logic control of the redundant elements. An extensive analysis is also performed using external injection of transient faults for upset hardness assessment and validation. This method uses a focused laser pulse of controlled intensity to inject transient faults at accurately defined locations on the circuit's topology.
Contributions to knowledge, described in this thesis, are summarized as follows:

1. Design of high performance current sensors for low-voltage, high-speed and high sensitivity operation in deep submicron CMOS.
2. Development and validation of on-chip current monitoring algorithms using asynchronous Built-In Current Sensors (BICS) for upset-tolerant SRAM design using transient fault detection and adaptive $I_{DDQ}$ testing.
3. Design of upset-immune, transient fault tolerant CMOS system architectures using local redundancy techniques and their validation on circuit prototypes using commercial 1.2, 0.8 and 0.25µm CMOS processes.
4. Design of an upset-hardened standard cell library of sequential storage elements in 0.6µm CMOS.
5. Design and application of on-chip test and validation strategies for fault-tolerant CMOS architectures.
6. Analysis of topology-related upset sensitivity in SEU-hardened CMOS storage elements using a pulsed laser beam.
7. Development of a topological model for dual-node charge collection phenomena in upset-immune storage cell designs based on local redundancy.

The outline of the thesis includes, in the first part, detailed analysis and experimentation of current monitoring techniques for reliable and fault-tolerant CMOS ASIC design. Performance and limitation issues for I<sub>DDQ</sub> testing in deep submicron CMOS are briefly described in Chapter 2. High performance on-chip current sensor designs are presented and synergetic approaches to I<sub>DDQ</sub> measurement techniques are investigated. Online current monitoring techniques using these current sensor designs for concurrent error detection in fault-tolerant CMOS system architectures are presented in Chapter 3. Two prototype circuit designs are described, a current-monitored self-checking multiplier and an upset-tolerant static RAM, and experimental test and characterization results are provided.
Chapter 4 introduces fault-tolerant CMOS system architectures using local redundancy techniques for transient fault immunity. Detailed design techniques and implementation alternatives are demonstrated. Three prototype chip designs are described and their validation results with heavy ion testing are presented and analyzed. Chapter 5 describes diagnosis and qualification testing techniques for fault-tolerant CMOS system architectures using a focused pulse laser beam. Topology-related upset mechanisms are identified and corresponding physical design optimization techniques are derived.

# Chapter 2

## **On-Chip Current Sensors for Submicron CMOS**

#### 2.1 I<sub>DDQ</sub> Testing: Performance and Limitations

The CMOS IC test methodology based on the observation of the quiescent current on power supply lines allows a good coverage of physical defects such as gate oxide shorts, floating gates and bridging faults, which are not very well modeled by the classical fault models, or are undetectable by conventional logic tests. In addition, $I_{DDQ}$ testing can be used as a reliability predictor due to its ability to detect defects that do not yet involve faulty circuit behavior, but could be transformed into functional failures at an early stage of circuit life.

Testing the low quiescent supply current ($I_{DDQ}$) of static CMOS digital circuits complements conventional voltage testing in CMOS technologies for production quality and reliability improvement, design validation and failure analysis [1-3]. It basically consists in measuring the low supply current drawn in good devices at the end of the switching phase, and before new logic transitions are applied at their inputs. $I_{DDQ}$ testing is based on the premise that CMOS circuits draw extremely low leakage currents when no transistors are switching. The ideal circuit is a fully complementary and fully static CMOS design.
In practice, designs that are not ideal are tested for $I_{DDQ}$ failures, provided that the deviation from the ideal CMOS behavior has minimum negative impact on the highest test coverage attainable for a given design. DFT tools for $I_{DDQ}$ testability may be employed to detect high-$I_{DDQ}$ operating conditions, in order to either modify the design or adequately select the $I_{DDQ}$ test patterns to ensure highest test quality and effectiveness.

Conventional voltage-mode test methods for CMOS logic circuits are designed to cover faults that can be mapped onto the classical stuck-at fault model, which alters the logic state at the internal circuit nodes and can be propagated and observed at the circuit's outputs. They provide poor coverage of physical defects such as bridging faults, gate oxide shorts, parasitic transistor leakage and defective p-n junctions, which account for the largest part of the manufacturing defects in high density CMOS processes. A vast majority of physical defects and most of the classical stuck-at logic faults typically cause significant supply currents that can be easily detected by $I_{DDQ}$ tests [2][26][28].

The effectiveness of $I_{DDQ}$ testing for obtaining highly reliable products makes it an essential component of the CMOS IC testing process. Other particular advantages include easy test generation [6][7] and very short test pattern requirements, since basically, any single test pattern covers 50% of the potential stuck-at faults at the internal circuit nodes.

The limitations of $I_{DDQ}$ testing are related to the test application process for complex CMOS VLSI circuits. A reduced $I_{DDQ}$ measurement sensitivity is observed at low supply voltage, where the circuit's supply lines, which serve as ultra-sensitive test observation points, are perturbed by increased noise and parasitics. This restrains the $I_{DDQ}$ test effectiveness in detecting low faulty currents and significantly increases test duration.
Another critical issue concerns the threshold limit setting for $I_{DDQ}$ testing of complex CMOS circuits with high cumulative leakage currents in good devices. Test limits that are too loose may reduce the fault coverage, while test thresholds that are too tight, combined with large inaccuracies in current measurement, increase the reject rate and reduce the test yield.

Two alternatives have been investigated by the research community to increase $I_{DDQ}$ measurement speed and accuracy. First, the design of built-in current sensors (BICS) has been explored, for fast and accurate on-chip $I_{DDQ}$ measurement. On-chip BICS designs operate as high accuracy analog circuits with severe drawbacks when embedded in complex, high performance CMOS ASICs. They imply high development and implementation costs, significant IC design modification, large area overhead and degraded operating speed. An alternative solution has been adopted by the QTAG group formed by academic research units and several major IC and test equipment manufacturers [9][15][16]. It consists in developing a dedicated, external $I_{DDQ}$ monitor as a companion circuit that monitors the DUT supply lines during the testing process.

Deep submicron scaling of the state-of-the-art CMOS processes has revived the on-chip $I_{DDQ}$ test challenge, since it leads to a significant increase of complex failure modes which are detectable with $I_{DDQ}$ testing but escape the voltage-mode testing. Smaller device geometry, added layers and lower supply voltages create the conditions for complex defect behavior that cannot be modeled by stuck-at faults. As the operation speeds increase and the noise margins decrease, the effects of these spurious faults are becoming stronger, so that even small delay variations may induce functional failures. Advanced synthesis and optimization techniques required to achieve high speed operation make all paths in complex CMOS ICs become near critical.
As a consequence, the large majority of smaller defects, previously tolerated by less dense processes, may result in timing faults. Path delay test pattern generation is already a very complex task, and the huge number of signal paths in a complex circuit makes this process computationally infeasible. Sematech experiments confirm these trends, showing that performance faults become critical, while defects not covered by known fault models are observed. In this context, $I_{DDQ}$ testing is becoming mandatory as an efficient means to detect a large part of such defects.

In contrast to the increased need for higher sensitivity of the $I_{DDQ}$ testing process to improve fault coverage, discriminating small faulty $I_{DDQ}$ values becomes infeasible due to a drastic increase of the fault-free $I_{DDQ}$. Since deep submicron CMOS technologies are forced to shift to lower supply voltages to reduce power and improve reliability, the resulting threshold voltage reduction multiplies the subthreshold leakage by several orders of magnitude. The switching power is reduced quadratically with respect to $V_{DD}$, but a reduction in $V_t$ to compensate for speed causes an exponential increase in leakage current. The ratio of leakage power to dynamic switching power therefore increases drastically, particularly for low switching activity applications. High leakage levels in large CMOS ICs mask the smaller faulty currents induced by physical defects, making $I_{DDQ}$ testing impractical [2].

Several alternative solutions have been proposed to overcome this limitation. Sachdev considered in [130] the use of variable threshold voltage circuits to reduce the subthreshold leakage currents during I<sub>DDQ</sub> measurement. To realize this, both the p and n wells should be separated from GND and VDD, respectively, and these two regions should be reverse-biased via distinct pins to increase transistor threshold voltage.
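The scaling trend described above (quadratic reduction of switching power with $V_{DD}$, exponential growth of leakage as $V_t$ is lowered) can be illustrated with a minimal sketch. It assumes the textbook subthreshold model in which leakage changes by one decade per subthreshold swing $S = n\,(kT/q)\ln 10$. The ideality factor `n = 1.5`, the temperature and all function names are illustrative assumptions, not values taken from this work.

```python
import math

K_OVER_Q = 8.617333e-5  # Boltzmann constant over electron charge, in V/K


def subthreshold_swing(temp_k=300.0, n=1.5):
    """Subthreshold swing S in volts per decade (n is an assumed ideality factor)."""
    return n * K_OVER_Q * temp_k * math.log(10.0)


def leakage_ratio(delta_vt, temp_k=300.0, n=1.5):
    """Factor by which subthreshold leakage grows when V_t is lowered by delta_vt volts."""
    return 10.0 ** (delta_vt / subthreshold_swing(temp_k, n))


def switching_power_ratio(vdd_new, vdd_old):
    """Dynamic power scales as V_DD^2 at fixed switched capacitance and frequency."""
    return (vdd_new / vdd_old) ** 2


if __name__ == "__main__":
    # Hypothetical scaling step: V_DD from 5.0 V to 2.5 V, V_t lowered by 0.3 V.
    print("switching power ratio:", switching_power_ratio(2.5, 5.0))
    print("leakage growth factor:", leakage_ratio(0.3))
```

Under these assumed values the switching power drops by a factor of four while the leakage grows by more than three decades, which is the masking effect that makes single-threshold $I_{DDQ}$ limits hard to set.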
Cooling the IC under test, proposed by Szekely et al. [157], also reduces the CMOS subthreshold leakage by two orders of magnitude for a temperature reduction of 100°C. However, in order for low temperature testing to be effective, the temperature would have to be brought very low. This significantly increases test costs and may cause problems in future processing.

#### 2.2 Built-In Current Sensors for I<sub>DDQ</sub> Testing

The highest potential increase in the $I_{DDQ}$ measurement accuracy is observed if Built-In Current Sensors (BICS) [10-14][17-19][43][127] are employed. For this reason, we have investigated the use of BICS circuits as a possible solution to the $I_{DDQ}$ testing dilemma for deep submicron CMOS ICs. A novel BICS circuit design has subsequently been elaborated, with fast and accurate operation at low supply voltage within low power and low noise constraints.

A BICS-monitored circuit, presented in Figure 2.1, can be seen as a system composed of four parts:

- the circuit under test (CUT);
- a bypass device, to provide a high conductance path for the high current transients between the supply line $V_{SS}$ and the virtual supply line $V_S$ of the CUT during normal switching operation;
- a sensing element with high impedance characteristic to detect the fault-induced $I_{DDQ}$ currents when the bypass device is deactivated;
- a fast and accurate detection circuit, usually a comparator.

A BICS circuit can thus be defined as a power supply switch with specific loading and dynamic switching constraints and with fast, accurate *off* state leakage detection capability.

Fig. 2.1 BICS-monitored circuit under test

In order to allow easier integration of the BICS insertion strategies into the power supply distribution optimization process, we have developed a hierarchical power modeling approach for CUT and BICS to support $I_{DDQ}$ test optimization, analysis and integration.
A simplified graphical representation of the devised power model for BICS-monitored CUT partitions is defined in Figure 2.2 for a $V_{SS}$-referenced BICS implementation.

Fig. 2.2 Simplified power model for a CMOS circuit partition monitored by BICS

Both CUT supply current behavior and BICS operation are modeled using three distinct time-dependent current sources. These artificially split current components in the model characterize the three phases of synchronous logic switching operation in a fault-free circuit: the transient switching current I<sub>TR</sub>, the current decay I<sub>DECAY</sub> and the quiescent state leakage I<sub>SSQ</sub>. The three current components are approximated in the fault-free CUT model by a switching current pulse generator of rectangular shape, an exponential decay pulse generator and a dc leakage current source, respectively. Additional current generators are used to model fault-induced transient currents and pattern-dependent faulty leakage currents. Supply current parameters are obtained from simulation results using power analysis/estimation tools for good device behavior and fault simulation algorithms for typical transient and I<sub>DDQ</sub> fault models.

The mean value and duration of the $I_{TR}$ component, as well as its largest values for worst-case input patterns and internal logic states, can be obtained by CUT power analysis, which requires a technology-dependent gate level circuit description. For high-level design representation, supply current estimation algorithms can alternatively be used to provide a CUT model description with ratio-based, relative accuracy, whose absolute values are implementation-dependent. Typical supply current characteristics for different circuit functions and design styles can thus be generated by referencing them to simulation and measurement results of an initial library of circuits and process/design rules and models.
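The three-component fault-free current model described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names, the parameter names (`i_tr`, `t_tr`, `i_decay0`, `tau`, `i_ssq`, `i_lim`) and the example values are hypothetical, and the decay is taken as a single exponential starting when the rectangular switching pulse ends.

```python
import math


def supply_current(t, i_tr, t_tr, i_decay0, tau, i_ssq):
    """Fault-free CUT supply current (A) at time t (s) after the clock edge.

    Rectangular switching pulse i_tr for 0 <= t < t_tr, then an exponential
    decay starting at i_decay0 with time constant tau, on top of a dc
    quiescent leakage i_ssq.
    """
    if t < 0.0:
        return i_ssq
    if t < t_tr:
        return i_tr + i_ssq
    return i_decay0 * math.exp(-(t - t_tr) / tau) + i_ssq


def settle_time(t_tr, i_decay0, tau, i_ssq, i_lim):
    """Earliest time at which the fault-free current has dropped to the limit i_lim."""
    margin = i_lim - i_ssq
    if margin <= 0.0:
        raise ValueError("test limit must exceed the fault-free leakage")
    if i_decay0 <= margin:
        return t_tr
    return t_tr + tau * math.log(i_decay0 / margin)


if __name__ == "__main__":
    # Hypothetical partition: 10 mA pulse for 1 ns, 1 mA decay with 0.5 ns
    # time constant, 1 uA leakage, 10 uA test limit.
    t_meas = settle_time(1e-9, 1e-3, 0.5e-9, 1e-6, 10e-6)
    print("earliest measurement time (ns):", t_meas * 1e9)
```

In such a model the settling time gives a lower bound on when the bypass device may be released for measurement, which is the kind of synchronization requirement the matched CUT and BICS models are meant to expose.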
The BICS circuit model is a dynamically controlled power interconnect impedance, whose resistive component is represented by three current generators. The first two currents, $I_{B1}$ and $I_{B2}$, model the bypass switch operation and correspond to the CUT switching transition $I_{TR}$ and the quiescent state settling $I_{DECAY}$, respectively. The third current, $I_{LIM}$, models the maximum leakage current $I_{SSQ}$ allowed in good devices during the current measurement phase. The BICS model is analogous to the CUT power model representation in order to allow us to accurately define the parameter matching and synchronization requirements for the two models.

Both models also include parasitic capacitance components that do not imply matching constraints or the analysis of their time-dependent variation. They are globally represented as an equivalent capacitance $C_S$ at the virtual supply node in Figure 2.2. Inductive parasitic effects are neglected since they have no significant relevance at this stage. They are to be considered in the CUT power supply distribution model at the upper hierarchical, global circuit partitioning level.

The bypass switch represents a key element in the BICS design process, with highest implementation cost and performance impact, due to its inherently large size and fast switching requirements. Several types of high current bypass devices can be employed in CMOS: bipolar vertical and lateral transistors, "self-switching" p-n junction diodes and MOS transistors. Using a large MOS transistor as a bypass device represents in most cases the best cost/performance tradeoff in order to achieve a low voltage drop, and thus to avoid performance degradation.
Less area-hungry solutions are eliminated from our choice as being less effective for low voltage and high speed operation: the diode junctions [11][39] induce a large voltage drop, resulting in significant performance penalty, and bipolar transistors [17] implemented with lateral structures in conventional CMOS processes have large switching time and poor current driving capability.

A low-cost solution consists of using on-chip current sensors with external, off-chip bypass elements that are used during the testing process and subsequently disabled at chip assembly via bonding options. However, such an alternative does not eliminate the adverse effects of pin interface parasitics on $I_{DDQ}$ test speed and accuracy. It also adds significant I/O pad area that may easily exceed the bypass switch area, limits partition count and may drastically reduce the effectiveness and the flexibility of the on-chip current measurement.

Our approach for the BICS circuit design process considers circuit partitions of variable size, complexity and operating characteristics. This means that each current measurement unit and each controlled bypass switch of a BICS circuit may be designed to operate with different parasitic input load capacitance, timing and synchronization requirements and with various ranges and distributions of dispersion of switching current amplitude and duration. The CUT partition size will determine the maximum leakage currents to be monitored by BICS. Its functional characteristics will determine the transient currents to be supplied by the power switch bypassing the current monitor and the bypass clock timing and synchronization constraints.

Since external I<sub>DDQ</sub> testing becomes ineffective in deep submicron CMOS due to the high subthreshold leakage, the BICS approach offers flexible means to partition a circuit into smaller parts of any desirable size, which can be separately monitored with higher test speed, accuracy and fault coverage.
Such a partitioning allows us to reduce the subthreshold leakage of each monitored CUT to levels that do not mask the currents induced by physical defects.

Two basic BICS functional blocks are identified in Figure 2.1 for a current-monitored CUT partition: a leakage current monitor and a controlled supply voltage switch. The former implements the on-chip I<sub>DDQ</sub> testing process and the latter connects the CUT to the global supply line during normal operation and disconnects it during idle state periods that are used to perform the $I_{DDQ}$ test. The main partitioning constraints are defined by $I_{DDQ}$ fault simulation for the test function and by power modeling and simulation for normal/test mode control. In contrast to previous approaches for CUT partitioning that consider current measurement based criteria, we view this process as being closely related to power management, architectural and performance optimization functions as currently implied in complex system-chip design strategies based on reusable IP cores. This signifies that specific CUT partitioning for $I_{DDQ}$ testing is to be avoided as far as possible, for cost, performance and flexibility considerations.

Two basic types of current sensing element configurations, with resistive and capacitive characteristics, respectively, have been generally considered for BICS implementation. Linear resistors as well as their nonlinear and parametric extensions to diodes, transistors and current mirrors allow fast I/V conversion. They can be used in conjunction with sampling-mode strobed comparators to implement synchronous BICS operation [8][10-13]. Capacitive current sensors are employed as integrators for I/V conversion [37-39], with higher achievable accuracy at lower speed.
However, the distinction between resistive and capacitive current sensing is relevant only when referred to the current measurement principle employed, since both types of current sensing devices are inherently present at the virtual supply node. The intrinsic leakage current and parasitic capacitance contributions of both CUT and BICS at the current sensing node affect both resistive and capacitive BICS behavior and accuracy, and are difficult to estimate and calibrate accurately. We have therefore adopted a more realistic R-C (in fact, I-C) model for the current sensing element, composed of a current generator I<sub>LIM</sub> and a capacitance C<sub>S</sub>. This implies that resistive and capacitive sensing may occur as particular, ideal cases. The maximum voltage drop at the virtual supply node, V<sub>REF</sub>, is considered as an additional design constraint. A resistive current sensing characteristic is thus defined by $R_S = V_{REF}/I_{LIM}$.

The parasitic capacitance C<sub>S</sub> at the virtual supply node limits the attainable speed and accuracy for both resistive and capacitive $I_{DDQ}$ measurement. The higher the capacitance, the longer the test time, since it takes more time for the faulty current to charge this capacitance to a given voltage level detectable by the comparator.

The current monitoring circuit implementation is the last part in the BICS design process, since its operation and performance are significantly influenced by the bypass switch design. Its main design constraints are simple, reliable operation with low silicon area. It should also satisfy various conflicting requirements, such as low supply voltage levels vs. low noise, low input offset operation and fast response vs. high accuracy. A wide range of voltage and current comparator configurations have been proposed for I<sub>DDQ</sub> detection.
Voltage amplifier stages [34-35][46], operational amplifier based integrators [38] and sense amplifier latches [10] have limited speed and accuracy at low supply voltage operation. Current mode approaches that have been previously described use either current mirrors [14][19] or current conveyors [44] that lack both high speed and low voltage capabilities.

In this work we describe two new families of fast and sensitive comparators for BICS operation. They are designed to cope with the previously mentioned requirements for deep submicron CMOS operation and to allow flexible selection of power supply range, operating speed and I<sub>DDQ</sub> threshold limit. The first family of comparators is based on current mode sense amplifier circuit design and the other one employs source-controlled current mirrors.

#### 2.2.1 Bypass Switch Design

The performance penalty induced by BICS insertion can be eliminated by using large MOS transistors as bypass devices. Such transistors may also operate as selective power disconnect switches in multi-threshold CMOS (MTCMOS) technologies to enable high performance and low power operation. In an MTCMOS process, high threshold transistor switches reduce the leakage current of the low-threshold CMOS logic in the high-speed switched CUT partition. The static power dissipation is reduced by several orders of magnitude, thus ensuring high test coverage.

Accurate sizing of the bypass transistor requires close consideration of both input vector patterns and internal circuit structure [158][159]. Our approach for bypass transistor sizing employs optimization algorithms based on both logic and electrical simulation of the CUT. We have adopted and modified a practical design method for power switch sizing proposed by Mutoh et al. in [160] and referred to as the average current method (ACM).
It determines the minimum transistor size for the allowed speed penalty SP, taking into account the voltage drop $\Delta V$ at the power switch and assuming a constant mean operating current. The speed penalty, defined as the ratio of the delay times at $V_{DD} - \Delta V$ and $V_{DD}$, is expressed as

$$SP = \frac{\tau(V_{DD} - \Delta V)}{\tau(V_{DD})} \cong \left(\frac{V_{DD} - V_{th}}{V_{DD} - V_{th} - \Delta V}\right)^{\alpha - 1}$$ (2.1)

where $\alpha$ is 2 in the Shockley model and $V_{th}$ is the threshold voltage for the logic gates in the circuit.

The "on"-state NMOS bypass power switch in series with the CUT can be approximated by replacing it with a single linear resistor R. During normal circuit operation, the virtual ground node is close to real ground, so $V_{ds}$ of the bypass transistor is small and the resistive approximation is very accurate. The width W of the power switch is determined considering its normalized resistance R' and the maximum voltage drop $\Delta V$ induced by the switching current I:

$$W = \frac{R'}{\Delta V} \cdot I \quad ; \quad R' = R \cdot W$$ (2.2)

Replacing $\Delta V$ from (2.2) in Eq. (2.1) leads to a sizing equation for W:

$$SP = \frac{1}{\left(1 - \frac{I}{W} \cdot \frac{R'}{V_{DD} - V_{th}}\right)^{\alpha - 1}}$$ (2.3)

$$W = \frac{1}{1 - 1/\sqrt[\alpha - 1]{SP}} \cdot \frac{R'}{V_{DD} - V_{th}} \cdot I$$ (2.4)

In Eq. (2.4), the average current I considered in [158] is replaced in our case by a weighted average of the peak current $I_p$ for the worst-case delay transitions in order to achieve a more conservative design for speed and noise immunity.

We developed a dual bypass switch architecture to reduce the noise induced by the commutation of the inherently large MOS transistor switch (first bypass), thus increasing the current measurement speed and accuracy.

#### 2.2.2 Dual Bypass Circuit Configuration

During normal circuit operation, the bypass switch is permanently driven into the "on" state.
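Returning briefly to the bypass transistor sizing of Section 2.2.1, the following sketch evaluates the width W for a target speed penalty. It assumes a speed penalty of the form $SP = ((V_{DD} - V_{th})/(V_{DD} - V_{th} - \Delta V))^{\alpha - 1}$ with $\Delta V = R' I / W$, and solves it for W. The function names and all numerical values (normalized resistance, current, supply and threshold voltages) are hypothetical and chosen only for illustration.

```python
def speed_penalty(width, current, r_norm, vdd, vth, alpha=2.0):
    """Speed penalty SP for a bypass switch of the given width.

    r_norm is the normalized on-resistance R' = R * W (ohm * um when width
    is in um); current is the switching current I in amperes.
    """
    headroom = vdd - vth
    delta_v = r_norm * current / width  # voltage drop across the switch
    if delta_v >= headroom:
        raise ValueError("voltage drop exceeds the available gate overdrive")
    return (headroom / (headroom - delta_v)) ** (alpha - 1.0)


def bypass_width(sp, current, r_norm, vdd, vth, alpha=2.0):
    """Minimum switch width W for an allowed speed penalty sp > 1."""
    if sp <= 1.0:
        raise ValueError("speed penalty must be greater than 1")
    if alpha <= 1.0:
        raise ValueError("alpha must be greater than 1")
    root = sp ** (-1.0 / (alpha - 1.0))  # 1 / (alpha-1)-th root of SP
    return (r_norm * current / (vdd - vth)) / (1.0 - root)


if __name__ == "__main__":
    # Hypothetical case: 5 % speed penalty, 10 mA current, R' = 2000 ohm*um,
    # V_DD = 2.5 V, V_th = 0.5 V, alpha = 2 (Shockley model).
    w = bypass_width(1.05, 10e-3, 2000.0, 2.5, 0.5)
    print("bypass width (um):", w)
    print("check SP:", speed_penalty(w, 10e-3, 2000.0, 2.5, 0.5))
```

Using a weighted peak current instead of the average current, as done in this work, simply means passing a larger value for `current`, which scales the resulting width linearly.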
When an $I_{DDQ}$ test cycle is activated, after the CUT switching transition phase has passed and the CUT quiescent state has settled (i.e., when the decay current has reached a sufficiently low value to become negligible with respect to the current measurement range), the bypass transistor is switched off in order to perform the $I_{DDQ}$ measurement. This commutation may induce a large switching current noise through capacitive charge injection phenomena at the virtual ground node. The ground bounce thus generated can trigger the BICS comparator, and thus be interpreted as a current produced by a defect. Otherwise, it may require an added delay, and hence a test speed reduction, in order to avoid an erroneous test decision.

Typically, a compensation circuit is implemented in order to reduce the effects of charge injection at the virtual supply node due to power transistor switching [38]. Since accurate sizing of the compensation circuit and noise-free operation of the bypass switch are hardly attainable, we have added a secondary, delayed bypass transistor to achieve low noise and high speed BICS operation.

The main bypass switch $MB_1$ shown in Figure 2.3 has a charge injection compensation circuit consisting of a MOS transistor connected in a capacitance configuration $C_{BC}$ (i.e., with drain and source shorted), driven by the inverted bypass clock. The delayed bypass switch is represented by transistor $MB_2$, which is progressively deactivated by a control circuit consisting of transistors M1-M5. The transistor $MB_2$ connects the virtual ground to the real ground during $MB_1$ activation and shortly after it has been turned off. This improves the compensation by reducing the $MB_1$ switching bounce that cannot be effectively compensated by the charge injection circuit $C_{BC}$.
Note that it is both difficult to provide a very good compensation for this noise by means of a charge injection scheme, and hard to estimate the parasitic inductance and capacitance in order to provide such a compensation. The connection of the virtual ground to the real ground through the second bypass transistor is therefore an efficient complement to the charge injection compensation circuit.

The second bypass drives the residual decay current to ground at the end of the switching phase, thus allowing a faster $I_{DDQ}$ measurement. Since the transient decay currents and the switching noise are relatively small, we use for $MB_2$ an NMOS transistor of moderate size (e.g., 30µm in the considered example). After $MB_1$ is switched off, $MB_2$ is smoothly deactivated by the slow discharge of its gate node (capacitance C1) through the equivalent resistor formed by M4 and M5. Thus, $MB_2$ speeds up the transient current decay and drastically reduces the switching noise injected by $MB_1$ at the $V_S$ node.

Fig. 2.3 Dual bypass circuit principle (a) and schematic (b)

At the end of the second bypass period, $MB_2$ acts as a current mirror output that subtracts the $I_{DDQ}$ test limit current $I_{LIM}$ from the virtual supply node.

Fig. 2.4 Single and dual bypass switch operation waveforms

SPICE simulation results presented in Figure 2.4 show the virtual ground voltage waveforms for the carry-save adder with single and dual bypass circuit. In this simulation, the $C_{BC}$ circuit is not accurately sized. It provides an overcompensation that adds a positive, slowly discharging offset voltage at the virtual supply node. Of course, in the ideal conditions corresponding to the simulated circuit, this compensation could be performed very accurately, but this does not model a typical real case. The second simulation waveform illustrates the efficiency of adding the second bypass circuit when the first bypass transistor, deactivated at t=6ns, is not fully compensated.
The injected offset voltage charge is effectively removed in a couple of nanoseconds by the delayed bypass circuit. It should be noted that the dual bypass switch principle can equally be used with any $I_{DDQ}$ measurement circuit to reduce switching noise and to increase the current measurement speed and accuracy.

#### 2.2.3 BICS Comparator Design

A comparator circuit design for fast and reliable $I_{DDQ}$ detection in deep submicron CMOS must also satisfy tighter low voltage, low power and low noise constraints than previously reported BICS circuits. Hence, a new comparator structure has been developed for the purposes of our BICS design. The reasons for proposing this new design are the following:

- First of all, a small current flow from the virtual ground to the real ground should be maintained throughout the $I_{DDQ}$ testing period. This is important when multiple CUT partitions of different sizes are tested in parallel. The time interval necessary for the faulty $I_{DDQ}$ to charge the virtual ground capacitance of a large CUT partition to a level detectable by the BICS may be large enough for the leakage current of a smaller, fault-free CUT partition to charge its virtual ground to a detectable level, thus providing an erroneous fault indication. This problem can be eliminated by using different measurement delays for each BICS circuit, but the added circuit complexity and the switching noise make this solution impractical. We solve this problem by allowing a partition-dependent, predefined offset current flow $I_{LIM}$ between the virtual and the real ground in each CUT partition. $I_{LIM}$ acts as an implicit test limit for $I_{DDQ}$, and compensates both the maximum fault-free $I_{DDQ}$ currents (in fact, $I_{SSQ}$ in our case) and the noise-induced currents in good devices.
This allows us to reduce the effects of various noise sources on BICS accuracy;
- Fault-free quiescent currents close to the I<sub>DDQ</sub> test limit should keep the virtual ground node of a CUT at low voltage levels. Since the raised virtual ground voltage of a fault-free CUT represents the 'low' voltage level of its outputs, this in turn may increase the subthreshold leakage currents in other CUT partitions whose inputs are driven by those raised logic 'low' levels. This propagated leakage current amplification phenomenon may finally lead to erroneous I<sub>DDQ</sub> fault decisions. Hence, it is useful to avoid increasing the voltage level on the outputs of a CUT to values that could induce I<sub>DDQ</sub> current amplification effects in a subsequent block. Otherwise, it would not be possible to define accurately controlled test limits for parallel, multiple-CUT I<sub>DDQ</sub> testing in low voltage, deep submicron CMOS ICs. While some BICS solutions perform I<sub>DDQ</sub> fault detection at virtual ground voltage levels of 0.7V, in our design this is done at a threshold level lower than 0.1V.
- Current monitoring operation with gradually increased accuracy, thus allowing:
  - BICS output sampling at various instants during the sensing period, where each sampling instant allows us to detect a progressively lower $I_{DDQ}$ current level. $I_{DDQ}$ fault grading can thus be performed, allowing fast and reliable failure analysis;
  - User access to running I<sub>DDQ</sub> tests with adaptively variable speed and accuracy. A speed-accuracy trade-off is required in situations where it is mandatory to test the circuit at high speed and an increase of the threshold limit of the I<sub>DDQ</sub> currents detectable by BICS is acceptable;
  - User decision at any time on the I<sub>DDQ</sub> current levels that are considered as faulty. This can be done again by adapting the test speed.
It allows the chip manufacturer to design and manufacture the circuit, perform $I_{DDQ}$ measurement experiments with our BICS, and then determine the $I_{DDQ}$ current level to be used for the PASS/FAIL decision. Such an approach gives the best trade-off between defect level and yield, both for in-house designed circuits and for hard IP core qualification.
- Propose a current comparator that is easy to integrate in low power, low voltage technologies with a single power supply.

Typically, an inverter biased in the active region close to the logic threshold voltage acts as a comparator formed by two controlled current generators $I_1$, $I_2$ biased close to the equal current point (Fig. 2.5). An input voltage variation increases one of the currents and decreases the other. Since the inverter works in the active region, a high gain is obtained. The output currents in the active region are given by the equations:

$$I_1 = K_1 W_1 (V_{in} - V_{T1})^2 (1 + \lambda_1 \cdot V_{out}) \tag{2.5}$$

$$I_2 = K_2 W_2 (V_{DD} - V_{in} - |V_{T2}|)^2 (1 + \lambda_2 \cdot (V_{DD} - V_{out})) \tag{2.6}$$

The equal current condition $I_1 = I_2$ defines the input voltage threshold reference $V_R$ required to bias the inverter at maximum output transconductance:

$$V_R \approx \frac{V_{DD} + m \cdot V_{T1} - |V_{T2}|}{1 + m}, \tag{2.7}$$

where $V_{T1}$, $V_{T2}$ are the threshold voltages of the nMOS and pMOS transistors, respectively, $m = \sqrt{K_1 W_1 / K_2 W_2}$ is a current gain ratio factor, and the output transconductance effects (the $\lambda_1$, $\lambda_2$ terms) have been neglected.
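As a quick numerical check of equations (2.5) to (2.7), the following minimal sketch computes $V_R$ and confirms that the two square-law currents are equal at that bias point when the $\lambda$ terms are neglected. The function names and the parameter values are illustrative assumptions, not data from this work.

```python
import math


def inverter_threshold(vdd, vt_n, vt_p_abs, k1w1, k2w2):
    """Switching threshold V_R of equation (2.7), lambda terms neglected."""
    m = math.sqrt(k1w1 / k2w2)  # current gain ratio factor
    return (vdd + m * vt_n - vt_p_abs) / (1.0 + m)


def drain_currents(vin, vdd, vt_n, vt_p_abs, k1w1, k2w2):
    """Square-law currents I1 (nMOS) and I2 (pMOS) of (2.5)-(2.6), lambda = 0."""
    i1 = k1w1 * (vin - vt_n) ** 2
    i2 = k2w2 * (vdd - vin - vt_p_abs) ** 2
    return i1, i2


if __name__ == "__main__":
    # Hypothetical values: 3.3 V supply, nMOS twice as strong as the pMOS.
    v_r = inverter_threshold(3.3, 0.7, 0.9, 2e-4, 1e-4)
    i1, i2 = drain_currents(v_r, 3.3, 0.7, 0.9, 2e-4, 1e-4)
    print(f"V_R = {v_r:.4f} V, I1 = {i1:.3e} A, I2 = {i2:.3e} A")
```

With equal $K W$ products the same function reduces to the midpoint expression $V_{DD}/2 + (V_{T1} - |V_{T2}|)/2$.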
Figure 2.5 CMOS Inverter Operation as a Voltage Amplifier

For a symmetrically sized inverter (i.e., $K_1W_1 = K_2W_2$), the reference threshold voltage $V_R$ is close to the supply voltage midpoint:

$$V_R \approx \frac{V_{DD}}{2} + \frac{V_{T1} - |V_{T2}|}{2} \tag{2.8}$$

Statically, we suppose the inverter is biased at the reference threshold voltage $V_R$, and we add a dynamic voltage variation $\Delta V_{in}$ at its input, $V_{in} = V_R + \Delta V_{in}$. This input signal produces a positive variation $\Delta I_1$ and a negative variation $\Delta I_2$ of the transistor drain currents

$$I_1 + \Delta I_1 = K_1 W_1 (V_R + \Delta V_{in} - V_{T1})^2 \tag{2.9}$$

$$I_2 - \Delta I_2 = K_2 W_2 (V_{DD} - V_R - \Delta V_{in} - |V_{T2}|)^2 \tag{2.10}$$

leading to a variation $\Delta V_{out}$ of the inverter output voltage:

$$\Delta V_{out} = \Delta I_1 \cdot R_{o2} = \Delta I_2 \cdot R_{o1} \tag{2.11}$$

$R_{o1}$ and $R_{o2}$ are the equivalent output impedances of the nMOS and pMOS transistors, respectively, which are considered constant for small input variations. The maximum inverter gain can be approximated as the sum of the two transistor gains, $A_n$ and $A_p$:

$$A = \frac{\Delta V_{out}}{\Delta V_{in}} = A_n + A_p = K_1 W_1 R_{o2} \cdot (V_{in} - V_{T1}) + K_2 W_2 R_{o1} \cdot (V_{DD} - V_{in} - |V_{T2}|) \tag{2.12}$$

The inverter gain $A$ depends on the size and transconductance of each transistor, on their output impedances and on the saturation level $(V_{in} - V_T)$, which is related to the supply voltage value. The load capacitance at the output, not considered in this study, in theory influences only the response time, but its dynamic variations may have observable effects on comparator accuracy. An analytical expression for the inverter gain in the submicron region can also be obtained if transistor models appropriate to submicron behavior are used. Lower transistor gains and smaller variations with the transconductance and the supply voltage levels are then observed.
However, typical gains of several tens to hundreds may still be obtained.

The CMOS inverter can therefore be used as a compact, fast and accurate comparator [138]. One drawback is that it requires accurate dynamic bias in the metastable region (i.e., at the switching point). This implies dynamic operation of the comparator, controlled by a precharge clock, and high power dissipation due to the large current passing through the conducting transistors. The most important drawback of inverter-based BICS operation is that it works optimally with dc input signals close to the supply voltage midpoint, whereas the low voltage range of the input signal detected by the $I_{DDQ}$ sensor is located at one of the two ends of the supply voltage range. Input voltage level translators complicate the $I_{DDQ}$ measurement process and introduce additional sources of error.

On the other hand, the inverter-based comparator has single-ended operation with an implicit reference, $V_R$. A differential comparator function is required for BICS operation, to allow accurate setting of the $I_{DDQ}$ test threshold limit. Differential sense amplifiers based on cross-coupled inverter pairs with regenerative switching operation have been proposed for fast BICS implementation [11,17,122]. They typically induce high switching noise and power dissipation, and require a critically symmetric design to reduce the offsets induced by transistor mismatch, which limit their accuracy. Typical sense amplifier designs previously proposed for BICS implementation are reviewed and analyzed below, and optimized designs are described for improved accuracy and high speed.

#### 2.2.4 Sense Amplifier Design for BICS Implementation

We have investigated the performance of various differential sense amplifier designs [144-147,149-150] developed for fast bit line sensing in high density memory applications.
The impact of technology scaling in the deep submicron regime on sense amplifier circuit optimization for high sensitivity, speed and low power dissipation has been analyzed following the algorithm described by Sakurai in [145].

We describe and analyze in this section four sense amplifier circuit designs for BICS operation in deep submicron CMOS. The conventional voltage latch sense amplifier, presented here as a reference, has been previously reported by other authors in several BICS implementations [11,13,17,122]. The three other designs use current mode sensing. A current-controlled sense amplifier BICS was experimentally characterized on a test chip prototype, as described in Chapter 3.

#### 2.2.4.1 Voltage-Mode Sense Amplifier

Typical CMOS sense amplifier (SA) designs are built around a voltage-sensitive latch structure formed by two cross-coupled CMOS inverters. In a voltage-sensitive latch, the differential input lines are connected by transmission gates directly to the intermediate nodes of the sense amplifier, as shown in Figure 2.6. This circuit has inherent speed and accuracy limitations due to the charging time associated with the input capacitances, since the input nodes and the output nodes of the SA circuit coincide. A two-phase control timing circuit is required to separate the sensing and signal amplification phases $\phi_1$, $\phi_2$ for reliable and accurate operation. During the sample phase $\phi_1$, the voltage $V_{IN}$ generated by the $I_{DDQ}$ current at the sensing element $R_s$ is compared to a threshold voltage reference $V_{REF}$. A small delay between the sampling pulse $\phi_1$ and the BICS evaluation period $\phi_2$ is used to reduce the effects of the switching noise on sensor accuracy. Matching of transmission gate conductance and switching characteristics is thus of less concern for BICS sensitivity and accuracy [151-154].
The sense amplifier operates under asymmetric capacitive load conditions that affect BICS accuracy and speed. The large parasitic capacitance of the virtual ground node of the monitored circuit partition, connected at the BICS sensing node, cannot be matched by the significantly smaller capacitance of the reference voltage source connected at the opposite node. An attempt to match BICS capacitive loads is presented in [122]; it adopts as a reference the sensing node voltage of a second circuit partition deemed fault-free. However, poor means of estimating and controlling the intrinsic supply node capacitance of a partition prevented a significant improvement in BICS accuracy. On the other hand, the differential BICS technique was intended to reduce the effects of transient decay currents on BICS accuracy by selecting equal size partitions with similar activity levels and switching characteristics, which is difficult to achieve for general CMOS circuit functions.

Figure 2.6 Conventional voltage latch sense amplifier for BICS.

A specific concern when designing a sense amplifier circuit for BICS operation is matching its region of maximum input sensitivity, which is typically located at the supply voltage midpoint, with the input voltage range, located close to either $V_{DD}$ or $V_{SS}$. For the BICS circuit in Figure 2.6, the switching noise generated at $V_{IN}$ or $V_{REF}$ during sensing is typically larger than the threshold voltage $V_{TN}$ of the unbiased NMOS transistors M1, M2. The current-mode sense amplifier circuits adopted for our BICS implementation strategy offer improved speed and sensitivity for low voltage and low power operation.

#### 2.2.4.2 Source-Controlled Sense Amplifier

A sense amplifier that operates in current mode with separated inputs and outputs eliminates the transmission gates, reduces the timing and clocking requirements and improves the dynamic operation [147].
The BICS circuit, shown in Fig. 2.7, connects the input current $I_{IN}$ and the reference current $I_{REF}$ to the source nodes of the NMOS transistors M1, M2 in the latch. Two matched NMOS transistors M7, M8 biased in the linear region act as current sensing resistive elements connecting the two nodes to ground.

Figure 2.7 Source-controlled current mode sense amplifier.

A precharge transistor switch M5 shorts the two nodes during the active precharge clock phase $\phi_{pr}$ and generates a difference current $\Delta I = I_{IN} - I_{REF}$ between the sensing node and the reference node. This current is then amplified and integrated as a differential output voltage on the parasitic output node capacitances $C_o$:

$$\left| \frac{d\Delta V_o}{dt} \right| = 2 \cdot \frac{\Delta I}{C_o} \tag{2.13}$$

During the sense signal amplification phase, this voltage is further amplified by the regenerative process in the cross-coupled inverter latch structure. This process is activated on the negative edge of the activation clock phase $\phi_a$. The response time of this circuit is independent of the absolute values and the mismatch of the input capacitances at the sensing and reference nodes. This allows us to implement effective BICS sizing algorithms without affecting the dynamic performance of the sense amplifier.

#### 2.2.4.3 Current-Controlled Sense Amplifier

The current sensing accuracy of the two comparator circuits previously described relies on accurate parameter and topology matching of several pairs of transistors, i.e., the two inverter transistor pairs, the switching transmission gates and the NMOS transistor pair used as current sensing resistors. We have adapted a current-controlled sense amplifier (CC-SA) design [145-146] for optimized BICS operation with higher gain and reduced sensitivity to transistor parameter mismatch.
A similar sense amplifier design has been reported for a high density CMOS DRAM circuit [145], using 0.5µm MOSFETs and showing a 1.5ns delay at a $V_{DD}$ of 1V. The BICS circuit is presented in Figure 2.8. The cross-coupled inverter latch circuit M3-M6 is serially controlled in current mode, as in the case of the source-controlled sense amplifier (SC-SA), but this time on the $V_{DD}$ side. The resistive bias control circuit of the SC-SA is replaced by a differential amplifier with transistors M1, M2. The gates of the two transistors are controlled by the input and the reference signal nodes, respectively. This differential gain stage is activated by a PMOS switch that acts as a dynamic bias current source controlled by the single-phase clock $\phi$. During the precharge phase, transistors M7, M8 reset the latch outputs to $V_{SS}$ and the M9 transistor switch disconnects the SA from the power supply. Turning on M9 activates a fast direct sensing and amplification of the differential signal at the inputs of the M1-M2 transistor pair. The two transistors act as a differential amplifier with high impedance loads consisting of the unbiased latch inverters. The gate capacitances of transistors M1 and M2 ensure the dc isolation of the comparator from the sensing and reference nodes, respectively. This allows us to design the comparator based on accurate matching of two pMOSFET transistor pairs, M1-M2 and M3-M4, for high speed and high sensitivity operation. The fast switching process of the differential amplifier reliably resolves small differential input signals $\Delta V = V_{IN} - V_{REF}$. The influence of parameter mismatch and parasitic capacitances in the latch inverters M3-M6 and the transmission gates M7-M8 is drastically reduced. The two currents that drive the switching process can be written as:

$$I_{D1} = \beta \left( V_A - |V_{T1}| - V_{IN} \right) \cdot V_A - \beta \frac{V_A^2}{2} \tag{2.14}$$

$$I_{D2} = \beta \left( V_A - |V_{T2}| - V_{REF} \right) \cdot V_A - \beta \frac{V_A^2}{2} \tag{2.15}$$

where $V_A$ represents the switched power supply voltage. The current difference $\Delta I$, integrated on the parasitic capacitors of the latch before the regenerative amplification process converts it to a large output voltage, is given by the equation:

$$\Delta I = I_{D2} - I_{D1} = \beta \cdot V_A \cdot (V_{IN} - V_{REF} + \Delta V_T) \tag{2.16}$$

where $\Delta V_T = |V_{T1}| - |V_{T2}|$ is the threshold voltage mismatch. The current gain mismatch has not been considered in the equations. Its influence on lowering the comparator sensitivity is diminished by the progressive activation of the differential stage at low currents, which exhibit reduced mismatch. Since both transistors M1 and M2 are conducting, the current flows through the two inverters that compose the latch. The small difference between the two currents is first converted to a small voltage difference between the two outputs, which is in turn amplified by the regenerative feedback latch. The current flows only during inverter switching, thus ensuring low power operation.

Figure 2.8 BICS Circuit Design Using a Current-Controlled Sense Amplifier

An analysis of the mismatch sensitivity to stochastic variations of the threshold voltage and transconductance of transistors M1, M2, carried out by HSPICE simulation, showed that a 10% current gain difference between the two transistors is equivalent to a threshold voltage mismatch of 1.6 mV. Taking into account the offset voltage induced by the M1-M2 threshold voltage imbalance is therefore very important for sense amplifier accuracy. Low threshold voltage variations may be obtained by using transistors with longer than minimum gate lengths, by placing the PMOS differential pair physically close together and by careful symmetrical layout.
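The step from (2.14) and (2.15) to (2.16) can be checked numerically. The sketch below is only an illustration of those three equations: the function names and all parameter values are assumptions chosen for the example, and the current gain mismatch is ignored exactly as in the equations.

```python
def branch_currents(beta, v_a, v_in, v_ref, vt1_abs, vt2_abs):
    """Currents I_D1 and I_D2 of equations (2.14) and (2.15)."""
    i_d1 = beta * (v_a - vt1_abs - v_in) * v_a - beta * v_a ** 2 / 2
    i_d2 = beta * (v_a - vt2_abs - v_ref) * v_a - beta * v_a ** 2 / 2
    return i_d1, i_d2


def delta_i(beta, v_a, v_in, v_ref, vt1_abs, vt2_abs):
    """Closed-form current difference of equation (2.16)."""
    delta_vt = vt1_abs - vt2_abs  # threshold voltage mismatch
    return beta * v_a * (v_in - v_ref + delta_vt)


def balancing_input(vt1_abs, vt2_abs):
    """Input difference V_IN - V_REF for which (2.16) gives zero current."""
    return -(vt1_abs - vt2_abs)


if __name__ == "__main__":
    # Hypothetical values: 10 mV input difference, 2 mV threshold mismatch.
    args = (1e-4, 2.0, 0.05, 0.04, 0.800, 0.802)
    i_d1, i_d2 = branch_currents(*args)
    print(i_d2 - i_d1, delta_i(*args))
```

The last helper shows why the threshold mismatch acts as an input offset: the comparator only balances when the input difference cancels $\Delta V_T$.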
We have implemented the current-controlled sense amplifier in a BICS design to monitor the $I_{DDQ}$ currents in an 8-bit parallel multiplier circuit. The prototype chip was fabricated using a 1.2 $\mu$m CMOS process from ES2/ATMEL. The circuit description and $I_{DDQ}$ test results are presented in Chapter 3. Its transistor count overhead is compensated by the area reduction obtained by using simple control circuits based on single-phase clock operation and by avoiding switched operation at the input and reference nodes. The CC-SA BICS demonstrated a slight but notable increase in input sensitivity at comparable speed, with a low overhead in area and transistor count that is compensated by its single-clock operation.

#### 2.2.4.4 Buffered Latch Sense Amplifier

An additional sense amplifier design has been analyzed for high speed, high sensitivity implementation. A drawback of the previously described CC-SA circuit is that its sensing delay depends significantly on the input voltage swing. The speed and sensitivity of this fast comparator with low level differential input signals can be further improved by using a buffered latch configuration. A buffered latch sense amplifier (BL-SA) circuit design exhibiting a 0.6ns simulated response time with a 10mV input voltage swing has been reported in a commercial high speed SRAM design [144]. The improved performance is obtained at the expense of added circuit complexity and transistor count.

A BICS circuit schematic using a BL-SA circuit as comparator is presented in Fig. 2.9. The buffered operation of the latch circuit increases the dynamic sensing gain. This increase in sensitivity is achieved by adding two inverter buffers, M7-M8 and M9-M10, that precharge the source and drain nodes of the input transistor pair M1-M2 to ground voltage level.
Hence, while the input transistors M1, M2 in the CC-SA circuit are biased at a large drain-source voltage during sensing, the input transistors in the BL-SA circuit are initially biased at $V_{DS} = 0$ V. The amplification phase is used first to charge the buffer nodes to a voltage difference equal to the p-channel MOS transistor threshold. Then, the control transistor with a larger gate voltage starts conducting and sets the output latch to the logic state corresponding to the sign of the input voltage difference. The sense amplifier delay depends significantly less on the input voltage swing. However, its response time is not improved compared to the voltage mode sense amplifier, and the added area overhead is higher than 50%. The comparison is performed for a $1.2\mu m$ CMOS design with p-MOSFET active sensing transistors, as shown in Figure 2.9.

Figure 2.9 Buffered Latch Sense Amplifier

### 2.2.5 Current-Mode Amplifiers as BICS Comparators

Current-mode amplifiers offer significant advantages compared to voltage amplifiers in low voltage, low power applications using deep submicron CMOS technologies [143]. Their low input impedance results in a substantial increase in signal bandwidth and power supply noise rejection. A current mirror load circuit (Figure 2.10) biases the n-channel transistor in weak inversion. This reduces power dissipation and maximizes the $g_{m1}$ current gain, thus ensuring reliable operation as a comparator with improved sensitivity. A high output resistance is set by $R_S \cdot (g_{m1} \cdot r_{ds1})$ in parallel with the output impedance $R_o$ of the current mirror output node:

$$R_L \approx R_S \cdot (g_{m1} \cdot r_{ds1}) \parallel R_o \tag{2.17}$$

This resistance, in conjunction with the lumped load capacitance $C_L$, limits the speed of the current mode comparator, whose response time is longer than that of a voltage mode comparator.
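To give an order of magnitude for the speed limit set by (2.17), the sketch below evaluates $R_L$ and the first-order time constant $R_L C_L$ at the output node. It assumes a single-pole response, and the function names and component values are hypothetical, chosen only for illustration.

```python
def parallel(r_a, r_b):
    """Equivalent resistance of two resistances in parallel."""
    return r_a * r_b / (r_a + r_b)


def output_resistance(r_s, g_m1, r_ds1, r_o):
    """Load resistance R_L of equation (2.17)."""
    return parallel(r_s * g_m1 * r_ds1, r_o)


def output_time_constant(r_s, g_m1, r_ds1, r_o, c_l):
    """First-order time constant R_L * C_L (single-pole assumption)."""
    return output_resistance(r_s, g_m1, r_ds1, r_o) * c_l


if __name__ == "__main__":
    # Hypothetical values: R_S = 1 kOhm, g_m1 = 100 uS, r_ds1 = R_o = 1 MOhm,
    # C_L = 100 fF.
    r_l = output_resistance(1e3, 1e-4, 1e6, 1e6)
    tau = output_time_constant(1e3, 1e-4, 1e6, 1e6, 100e-15)
    print(f"R_L = {r_l:.0f} ohm, tau = {tau * 1e9:.2f} ns")
```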
A current mode amplifier has an implicit threshold reference

$$V_R \approx V_{T1} + I_{BIAS} \cdot R_S + \sqrt{\frac{I_{BIAS}}{K_1 W_1}}, \tag{2.18}$$

close to the threshold voltage $V_{T1}$ of the amplifying transistor. This value is excessively high for low-voltage BICS operation, since it may drastically reduce the noise margins and degrade the operating speed of the circuit. In addition, the comparator threshold voltage $V_R$ has large process variations that cannot be accurately compensated at low voltage operation. Though inadequate as a low voltage current comparator for BICS, the current-mode amplifier performs very well as the final BICS comparator stage, where it operates as a fast inverter with low voltage, low power and low switching threshold.

Figure 2.10 Basic Current Mode Amplifier Schematic

### 2.2.6 Current-Mode Comparator Design for BICS

A practical solution to ensure high BICS accuracy with reproducible performance at low power, low voltage operation consists in employing CMOS current comparator circuits based on matched current mirrors [139][143][148]. Several variations of this circuit have been proposed for BICS implementation. Unfortunately, their applicability is quite limited, since they must rely on an increased $V_{DD}$ to compensate the supply voltage drop across the BICS circuits.

A basic CMOS current comparator configuration is shown in Figure 2.11. It allows a simple and compact BICS implementation in applications where supply noise levels larger than $V_{T1}$ can be tolerated. The measured $I_{SSQ}$ current is compared with a reference current $I_{REF}$ at the common output of two accurately matched current mirrors. The difference current is integrated on the load capacitance $C_L$. The negative voltage swing obtained at the output of the current comparator stage is subsequently amplified by a current-mode PMOS amplifier stage and converted to a full rail-to-rail logic transition at the output.
The inverting characteristic of the current comparator imposes the use of a p-MOSFET amplifying stage at the output, with inherently lower performance than the equivalent n-MOSFET amplifier.

Figure 2.11 CMOS Current Comparator Operating as BICS

Our goal is to devise a non-inverting current amplifier compatible with an NMOS current amplifier stage, to be optimized for low voltage, low power BICS operation. Another main goal is to reduce the input voltage level to values much lower than the threshold voltage $V_{T1}$ of the n-channel transistors. Current mirror circuit configurations with reduced voltage requirements have been recently reported [140-142]. They basically consist in adding buffer transistors and level shifters biased in the triode region in order to provide a low voltage input node. Their main concern is to preserve a linear transfer characteristic of the current mirror. The example shown in Figure 2.12 [142] adds a triode-region transistor $T_b$ to the current mirror, and the current input node is changed correspondingly. This leads to a current transfer function which is approximately linear, with an offset component due to the bias current:

$$I_{out} = I_{in} + \left( 1 + \frac{K_M W_M}{K_b W_b} \right) I_{bias} \tag{2.19}$$

Actually, two current mirrors, i.e., $I_{bias} - I_{out}$ and $I_{in} - I_{out}$, with different inputs and a common output, are superposed in the schematic of Figure 2.12. The input voltage level is reduced to the low drain voltage of transistor $T_{M1}$, thus improving BICS noise and performance parameters.

Figure 2.12 Low Input Voltage Current Mirror Design [142].

However, this adds dc power consumption and does not provide an adequate solution to the low voltage amplification of the p-MOSFET buffer stage required at its output. An improved, source-controlled current comparator design that provides an effective and elegant solution to these problems is described next.
#### 2.2.6.1 Source-Controlled CMOS Current Comparator

The schematic of the novel current amplifier is presented in Figure 2.13. The circuit consists of three current mirrors: a $V_{SS}$-referenced NMOS current mirror M2-M2', a $V_{DD}$-referenced PMOS current mirror M3-M3' and a floating NMOS current mirror M1-M1' that connects the outputs of the other two. The two NMOS current mirrors are dominant, i.e., they are biased at higher currents than the PMOS current mirror. Consequently, the comparator output, which is common to the PMOS and the floating NMOS mirrors, is initially at a low voltage level in the absence of a faulty $I_{DDQ}$ current at the input.

The comparator accepts both current and voltage inputs at the common output node of the two NMOS current mirrors. The circuit has current-mode internal operation, with either voltage (i.e., $V_{IN} - V_{COMP} \rightarrow V_{OUT}$) or current (i.e., $I_{IN} - I_{COMP} \rightarrow I_{OUT}$, where $I_{COMP} = I_{OFF} - I_{B}$) input and output. Two current comparator structures, M1-M3 and M2-M3, are activated depending on the input signal and the bias current values. An *RC* sensing circuit composed of the equivalent parasitic capacitance at the measurement node and the equivalent resistance of transistor M2 converts the input current $I_{IN}$ to a voltage level $V_{IN}$.

Figure 2.13 Source-Controlled Current Comparator circuit

The resulting signal, applied to the source node of transistor M1, controls its gate-source voltage, $V_{GS1}$. The gate of M1 is biased at a stable reference voltage $V_{REF}$ determined by the input parameters $V_{COMP}$ and $I_{S0}$ of the floating current mirror:

$$I_{S0} = K_1 W_1 \cdot (V_{REF} - V_{TN} - V_{COMP})^2 \tag{2.20}$$

The transistor sizes and the bias currents of the floating current mirror are larger than those of the PMOS current mirror M3-M3' (i.e., $I_S > I_{BIAS}$). Hence, transistor M1 is biased in the linear region.
Its drain current is given by the equation:

$$I_S = I_{S0} - 2 \cdot K_1 W_1 \cdot \Delta V_{IN} \cdot \left( V_{REF} - V_{TN} - V_{COMP} - \frac{\Delta V_{IN}}{2} \right) \tag{2.21}$$

where $\Delta V_{IN} = V_{IN} - V_{COMP}$ is the differential input voltage signal. Equation (2.21) expresses a subtractive current processing rule that leads to a positive discriminating signal $V_S$ at the output node when the threshold condition $I_S < I_B$ is attained. An equivalent negative transconductance is defined for the floating mirror transistor M1, and is approximated as:

$$G_{m1} = \frac{\Delta I_S}{\Delta V_{IN}} = -2 \cdot K_1 W_1 \cdot \left( V_{REF} - V_{TN} - V_{COMP} - \frac{\Delta V_{IN}}{2} \right) \tag{2.22}$$

At the initial current detection stage, the input comparator formed by the lower and the upper current mirrors, M2-M2' and M3-M3' respectively, is used to discriminate input currents larger than the threshold value $I_{LIM}$ defined by:

$$I_{LIM} = I_{OS} - I_B \tag{2.23}$$

As soon as the input condition $I_{IN} > I_{LIM}$ is fulfilled, $V_{IN}$ rises gradually, activating the output current comparator formed by the middle and the upper current mirrors. The rise of $V_{IN}$ occurs with a large time constant for low differential input currents $\Delta I_{IN} = I_{IN} - I_{LIM}$ and large parasitic capacitances at the sensing node. This slow detection process represents a physical limitation due to the size of the monitored circuit partition. The limitation can be counteracted in the system level $I_{DDQ}$ test implementation through a divide-and-conquer approach based on finer-granularity partitioning for current monitor insertion. Two amplification processes occur in the output comparator at $I_{DDQ}$ detection: first, a gradual transition of the M1 transistor drain current $I_S$ to the linear and, progressively, to the inversion regime; next, the integrative positive output voltage transition that starts at the $I_{OUT} > 0$ condition.
The output current may be approximated as:

$$I_{OUT} = I_B - I_S = 2 K_1 W_1 \cdot \Delta V_{IN} \cdot \left( V_{REF} - V_{TN} - V_{COMP} - \frac{\Delta V_{IN}}{2} \right) - (I_{S0} - I_B), \tag{2.24}$$

where $\Delta V_{IN}$ can be expressed as an integrative function of the input current:

$$\Delta V_{IN} = V_{IN} - V_{COMP} = \frac{1}{C} \int \left( I_{IN}(t) - (I_{OS} - I_B) \right) dt - V_{COMP} \tag{2.25}$$

As can be seen from equations (2.21) and (2.22), two current parameters are used to set the BICS sensitivity threshold $I_{LIM}$ given by (2.23) and the intrinsic comparator threshold current $I_{COMP}$:

$$I_{COMP} = I_{S0} - I_B \tag{2.26}$$

Thus, the BICS accuracy and the comparator performance are independently controlled through simple and reliable sizing of the current mirror transistors. The source-controlled current amplifier actually performs a sequence of three comparisons: $I_{IN} - I_{LIM}$, $V_{IN} - V_{COMP}$ and $I_{OUT} - I_{COMP}$.

If we consider a finite output resistance of the current mirror outputs, a voltage-mode definition of the output comparator threshold can be used, as shown graphically in Figure 2.14. The output characteristics of transistors M1 and M3 are represented there before and after the detection process. Inherently, the characteristic $I_B$ of transistor M3 is unchanged. The output current characteristic of transistor M1 is gradually changed from $I_S$ to $I_{S2}$, as modeled by equation (2.21).

Figure 2.14 Operating output characteristics of the current-mode comparator

In the absence of an input signal, the comparator has a negative output current $I_{OUT} < 0$ and a 'LOW' output voltage $V_S$, corresponding to the operating point A on the output I/V characteristic of transistors M1, M3 in Figure 2.14. An increase of $V_{IN}$ leads to a corresponding decrease of $V_{GS1}$ and $I_S$, and the operating point shifts from A to B on the graph.
The sensitivity threshold is given by the $\Delta V_{IN}$ variation (2.25) that produces a $V_S$ increase corresponding to the switching threshold voltage of the output buffer. Hence, the comparator detection process corresponds to the A-B transition of the operating point on the SPICE simulation graph, where the voltage level $V_S(B)$ at the final operating point B is higher than the switching threshold of the voltage comparator connected at the output.

The source-controlled comparator circuit we have analyzed operates basically asynchronously, which implies a dc current consumption due to $I_B$ and $I_{S0}$. Current consumption can be restricted to accurately defined time intervals, during the $I_{SSQ}$ measurement cycles only. An enable circuit is added, as shown in Figure 2.15, to validate the comparator operation by controlling the pMOSFET current mirror that supplies the BICS operating currents. A current mode NMOS amplifier stage M4 is added as output voltage buffer. The test result may be read using a clocked inverter or alternatively stored into a latch.

Figure 2.15 Current comparator with enable control

The low bias current limits the minimum propagation delay to about 50 nanoseconds. For low discrimination currents, the BICS response time doubles. Current gain and propagation delay characteristics for different $I_{SSQ}$ limits between 1 $\mu A$ and 10 $\mu A$ are presented in Figures 2.16 and 2.17. Higher current gains are obtained for lower current thresholds. The current gain characteristics saturate at about 25 dB for large input currents at variable BICS sensitivities.

Figure 2.16 Current Gain Characteristics of the Source-Controlled Comparator

Figure 2.17 Propagation Delay Characteristics of the Source-Controlled Comparator

The source-controlled comparator circuits in Figures 2.13 and 2.15 operate reliably for supply voltage ranges lower than the sum of the pMOSFET and nMOSFET transistor thresholds.
They have a noninverting transfer function, i.e., input current increases are converted to positive output switching currents $I_{out}$ and voltages $V_{out}$. This reduces the BICS power dissipation by lowering the voltage swing at the output, and increases the operating speed through the use of an optimized current-mode NMOS buffer amplifier.

Figure 2.18 Source-Controlled Current Comparator circuit operating as BICS

A comparator structure operating as BICS is shown in Figure 2.18. It has current mode operation enabled by an externally supplied bias current $I_{BIAS}$ and is controlled by a clock signal B\_CLK. The comparator supplies a voltage output $V_{ERR}$ and a current mode, global error detection line $I_{ERR}$. When $I_{BIAS}$ is absent, a power-save inactive mode is entered. The $I_{LIM}$ and $I_{SENS}$ currents are generated from $I_{BIAS}$ and define the comparator sensitivity through accurately selected transistor ratios in the internal current mirrors. Currents $I_1$ and $I_2$ in Figure 2.18 have values partially determined by $I_{BIAS}$ and the transistor sizes; their values also depend on the input leakage current $I_{IN}$ from the CUT. Transistor MB2 (the second bypass) drives a maximum current when the B\_CLK signal is at high logic level. When this signal goes to 0, blocking the first bypass switch, the current driving capability of MB2 is progressively reduced until a small current limit $I_1 = I_{LIM}$ is reached. This avoids leaving the virtual ground as a floating node, reduces noise sensitivity and drives the fault-free leakage current of the CUT to the real ground. An abnormal $I_{DDQ}$ ($I_{IN}$) level is detected by a small voltage increase at $V_S$, which triggers the source-controlled current comparator M6-M8. The positive voltage transition at node $V_0$ subsequently activates the output signals $V_{ERR}$ and $I_{ERR}$, thus resulting in fault detection.
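The detection step described above can be approximated, to first order, as the excess current $I_{IN} - I_{LIM}$ charging the sensing node capacitance up to the detection voltage. The sketch below uses that simplification to relate the output sampling time to the smallest detectable current. It assumes constant currents, a constant lumped capacitance and a fixed detection voltage; the function names and all numerical values are hypothetical and do not reproduce the simulated characteristics of this work.

```python
def detection_time(i_in, i_lim, c_s, v_th):
    """Time for the excess current to raise the sensing node by v_th.

    Returns infinity when i_in does not exceed the limit i_lim, i.e. when
    the bypass current absorbs the whole quiescent current.
    """
    excess = i_in - i_lim
    if excess <= 0:
        return float("inf")
    return c_s * v_th / excess


def min_detectable_current(t_sample, i_lim, c_s, v_th):
    """Smallest input current flagged as faulty when sampling at t_sample."""
    if t_sample <= 0:
        raise ValueError("t_sample must be positive")
    return i_lim + c_s * v_th / t_sample


if __name__ == "__main__":
    # Hypothetical partition: 100 pF virtual ground, 0.1 V detection level,
    # 1 uA limit current.
    for t in (0.1e-6, 1e-6, 10e-6):
        i_min = min_detectable_current(t, 1e-6, 100e-12, 0.1)
        print(f"sample at {t * 1e6:5.1f} us -> I_min = {i_min * 1e6:6.1f} uA")
```

In this simplified picture, sampling later lowers the detectable current toward $I_{LIM}$, which is the speed-accuracy trade-off listed among the design goals of Section 2.2.3.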
The current discrimination time can be minimized by reducing the input capacitance $C_0$ of transistor M12. However, comparator's speed and accuracy are mainly defined by the current ratio $I_{LIM}/I_{BIAS}$ and by the input parameters $C_S$ , $V_S$ and $I_{IN}$ . An increase of $V_S$ reduces the time required to detect the fault. In particular, we can select the $I_{IN}$ threshold ( $I_{SSQ}$ ) by selecting the time at which we read the comparator output $V_{ERR}$ . The virtual ground voltage level for which the comparator detects a faulty $I_{DDQ}$ is lower than 0.1V. The design also meets the previously mentioned low voltage operation requirements, since the comparator uses the same power supply level as the CUT, and since the current mode circuits can operate at very low supply voltages. In fact, the circuit of Fig 2.19 can accurately operate at supply voltages close to two MOS transistor thresholds. When both BICS speed and accuracy are traded off with implementation cost, circuit performance and test time, an important function may be added to compensate for these drawbacks: flexibility. BICS-based current measurement techniques may be adaptively reconfigured in application to serve additional interests and goals, besides the fault detection process. First, the bypass switch can be used as a power disconnect switch to save dc power. Next, we can use current measurement with BICS to monitor specific operating conditions (e.g., supply voltage ranges, timing modes, activity level, temperature) and performance margins (i.e., glitch detection and critical path delay) by setting predefined ranges of I<sub>DDQ</sub> test limits for different CUT operating conditions and input test patterns. Simple and reliable means to control both BICS intrinsic characteristics and the measurement sequence parameters can be implemented when source-controlled comparator circuits are employed. These control means may be classified in two groups: analog and digital. 
Analog BICS control means are basically used to program current thresholds, detection sensitivities, and operating currents. They can be effectively employed with source-controlled comparator BICS through fast and reliable control of the operating currents, i.e., $I_{BIAS}$ and $I_{OFF}$ , as previously described. $I_{DDQ}$ measurement sequences may be scheduled with different BICS sensitivities, speeds and power dissipation parameters and the results may be compared and analyzed. Digital BICS control techniques may be used to control the $I_{DDQ}$ measurement delay and activation time or to select a discrete range for the previous analog parameters. # 2.2.6.2 Multiple-Output BICS Circuits Adaptive selection of BICS activation time and measurement duration might significantly improve BICS performance, but they generally imply complex and potentially unreliable analog parameter control strategies. In this chapter we describe an alternative solution that does not rely upon an intimate knowledge or a detailed analysis of the monitored circuit's operation. It employs multiple-output BICS circuits that supply multiple-bit coded information for a predefined set of faulty current ranges. Multiple-output BICS circuits may be effectively used for design analysis, diagnosis and optimization at the product development stage. They can also provide powerful $I_{DDQ}$ signatures for fault diagnosis at production test and system-level application stages, to locate unexpected errors that produce unpredictable system behavior and are not covered by fault dictionaries. Multiple-output BICS circuits can be equally used with time-dependent single-output operation for dynamic optimization of $I_{DDQ}$ fault coverage within variable constraints of ASIC applications. Selected BICS outputs with different detection sensitivity are observed during specific functional test sequences characterized by higher operating speed or increased system noise. 
The multistage CMOS comparators we propose for multiple-output BICS implementation consist of a set of cascaded source-controlled CMOS current comparator stages. They operate at low supply voltage and operating currents with good accuracy and detection speed, thus being adequate for deep submicron CMOS implementation.

#### 2.2.6.3 Multistage Source-Controlled CMOS Comparators

Source-controlled CMOS comparator configurations can be easily extended to implement multiple-output, multiple-threshold comparator functions. Multistage comparator architectures can be viewed as compact parallel arrays of comparators sharing a single input and exhibiting a distinct set of detection thresholds. They can be used for test purposes to implement an information-based test redundancy, which consists in supplying additional information on the severity and the conditions of occurrence of detected failures. This allows us to implement specific improvements of the testing process:

- Higher test quality and reliability
- Dynamic selection of the $I_{DDQ}$ test thresholds
- Detailed $I_{DDQ}$ test characterization of reusable IP cores
- Adaptive $I_{DDQ}$ test pattern optimization
- Improved fault diagnosis
- Reduced analog control of the $I_{DDQ}$ test process.

Multiple-output CMOS comparators can be used to simplify the conventional $I_{DDQ}$ test synthesis, by using a single BICS design with multiple outputs of different sensitivities and selecting its sensitivity through simple configuration and interconnect options depending on application requirements.

Basically, two design techniques may be employed to achieve multiple-output operation: the sequential replication and the parallel replication of the output current comparator stage.
The two design techniques are based on common sharing of internal resources (i.e., input and reference current mirrors) by different comparator stages, but they differ essentially in the way they share their own function. A three-stage current comparator schematic using sequential replication of the output stage is presented in Figure 2.19.

Figure 2.19 Multiple-Output Source-Controlled Comparator with Sequential Output Stages

A basic comparator stage, consisting of the output transistors M1, M3 of an NMOS input current mirror and a PMOS reference current mirror, controls the n-MOSFET source node of a second comparator stage, M12-M32, which in turn controls the source node of a subsequent current comparator, M13-M33. The three comparator stages have different transistor sizes and operate in parallel. A comparator stage in the chain adds a voltage offset to the subsequent stage and a current offset to the previous stage.

The current comparison process consists of a timed detection sequence of switching operations. An overdrive current $I_{IN} - I_{REF}$ leads to a backward switching sequence of the comparators, starting with the last stage, which operates at lower voltage and lower current, and ending with the first stage, which operates at larger voltage and current. The three comparators implement a sequential, source-mode control algorithm. Each comparator stage has its distinct sensitivity (i.e., detection threshold) and speed (i.e., propagation delay). The bias current is distributed to the three comparator stages according to weighted ratios corresponding to their detection thresholds. The comparator stages in the chain have progressively decreasing speeds and increasing sensitivities. Thus, a large current overdrive may benefit from the high speed operation of the first stage, while a small current overdrive may benefit from the high sensitivity operation of the last stage.
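The multiple-threshold behavior described above can be illustrated with a purely behavioral sketch. It is not a circuit model: it ignores propagation delays and the voltage and current offsets that the stages impose on each other, and only shows how a set of stage thresholds maps an overdrive current to a multi-bit output. The function name and the threshold values are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence


def multistage_outputs(i_in: float, i_ref: float,
                       thresholds: Sequence[float]) -> List[bool]:
    """Return one flag per comparator stage.

    A stage flag is True when the overdrive current (i_in - i_ref)
    reaches that stage's detection threshold. Behavioral abstraction
    only: stage timing and inter-stage offsets are not modeled.
    """
    overdrive = i_in - i_ref
    return [overdrive >= t for t in thresholds]


# Hypothetical thresholds in amperes, ordered from the first stage
# (least sensitive) to the last stage (most sensitive).
THRESHOLDS = (8e-6, 4e-6, 1e-6)

codes = multistage_outputs(i_in=7e-6, i_ref=2e-6, thresholds=THRESHOLDS)
print(codes)  # [False, True, True]
```

With a 5 µA overdrive, only the two more sensitive stages report a detection, so the output reads as a thermometer code that grades the severity of the excess current.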
When designing a multiple-threshold comparator, we start from a single-stage comparator having the lowest sensitivity. Adding a number of subsequent stages to a current comparator according to the sequential control algorithm actually does not change its sensitivity and switching threshold, provided that the sum of the bias currents of the present and all the subsequent stages is conserved. The comparators switch in reverse order, starting with the last one, and their dc current is reduced to zero upon switching. Thus, the multistage comparator has reduced switching power dissipation. The first stage of the comparator has the highest threshold, and each of the subsequent stages actually divides to a fraction the sensitivity threshold of the preceding stage. Dividing the input threshold current into several subintervals correspondingly improves both the sensitivity and the detection accuracy. A bias current ratio between 1 and 5 is applied, i.e., a subsequent comparator stage may have a bias current equal or up to five times lower than the current stage. This actually corresponds to a five times decrease of the pMOS transistor width. Changing the threshold of an intermediate stage shifts all the subsequent thresholds, without affecting the preceding stages. The threshold-programming rule does not scale linearly with the ratio of the PMOS bias transistors. The nonlinear sizing algorithm of the multistage comparator, while somehow complicating its design, does not represent in fact a drawback. The addition of a linearization circuit does represent a drawback, since it adds unnecessary complexity and degrades comparator performance. Another important conclusion drawn from our analysis concerns the improved dynamic sensitivity of the multistage current comparators. The detection slope of the dc transfer characteristic in multistage comparators may vary with up to one decade when subsequent stages are added. 
Hence, single-output multistage comparators with improved sensitivity (e.g., two to three times improvement for a three-stage comparator) may be used instead of their single-stage counterparts.

Multistage CMOS current comparators with parallel configuration basically add output comparator stages and their corresponding output buffers, as shown in the three-stage example of Figure 2.20. The added comparator stages consist of accurately sized transistor pairs with independent operation. Parallel comparator stages do not exhibit improved performance with respect to their single-stage implementation. However, their linear operation is particularly useful in other applications not related to on-chip current monitoring. A multiple-output comparator schematic that combines the two approaches is presented in Figure 2.21.

Figure 2.20 Multiple-Output Parallel Comparator Configuration

Figure 2.21 Multiple-Output Comparator Configuration with Sequential ($V_{S1}-V_{S3}$) and Parallel ($V_{S1a}-V_{S3a}$) Output Stages

#### 2.3 Prototype Circuit Design

We have selected a 32-bit carry-lookahead adder for prototype experiments, as a circuit partition with realistic characteristics in terms of size and speed. This circuit is implemented using a 0.8 µm CMOS standard cell library from AMS. The selection of the target circuit function has significant impact on the results, which can be considered as optimistic or pessimistic depending on the circuit's characteristics. A slow circuit will give optimistic results on performance penalty and/or area cost required to eliminate the performance/area penalty. Similarly, a circuit with low node activity will give optimistic results on performance/area penalty. The carry-lookahead adder is selected since it is a fast design. At the same time, although the mean activity of the adder is much lower than the mean activity of a multiplier of similar size, its worst-case activity is similar to that of the multiplier.
Since our bypass MOS transistor is sized by considering the worst-case transitions of the circuits, the results will be rather on the pessimistic side. In order to avoid the performance penalty, an NMOS transistor was inserted on the low supply voltage line $V_{SS}$ . The choice of an NMOS transistor is preferred due to its lower 'on' resistance and smaller size compared to a corresponding PMOS transistor. The goal is to minimize the voltage drop it induces on the virtual ground $V_{S}$ when traversed by large CUT transient currents. The transistor width W is selected in a conservative manner, to keep the voltage drop $V_{Smax}$ under a predefined limit (e.g., $V_{Smax} = 100$ mV) for the worst case condition of the transient current (the peak current corresponding to the worst-case input transition). Two worst case input patterns have been selected, corresponding to the input transitions (A=0, B=0, Cin=0) -> (A=1, B=1, Cin=1) and (A=1, B=1, Cin=0) -> (A=0, B=1, Cin=1). The first pattern induces a maximum number of high to low node transitions within a short time period, thus maximizing the peak transient current at $V_{S}$ node. The second transition sensitizes the longest paths of the circuit and induces the slowest decay on the transient supply current, while maximizing the number of node transitions on these paths. The peak transient current obtained with SPICE simulation on the worst-case input patterns is 180mA. The bypass transistor size is chosen as a tradeoff between area, cost and performance degradation. We have chosen a bypass transistor with 3.6 $\Omega$ on-resistance, corresponding to a 1200 $\mu$ m transistor width. Its area represents an equivalent of 11% of the CUT area. The performance penalty for this choice is insignificant (0.4%). A smaller bypass NMOS transistor could also be selected comfortably (e.g., W=600 $\mu$ m) resulting in 1.4 % performance penalty and dividing by two the area overhead. 
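The width tradeoff quoted above can be reproduced with a first-order scaling sketch. It assumes, as a simplification not stated in the text, that the bypass on-resistance is inversely proportional to the transistor width and that its area overhead is proportional to the width; the function name is hypothetical. The reference point (1200 µm, 3.6 Ω, 11% of the CUT area) is taken from the prototype, while the scaled on-resistance is only an estimate under that assumption.

```python
def scale_bypass(width_um: float,
                 ref_width_um: float = 1200.0,
                 ref_ron_ohm: float = 3.6,
                 ref_area_pct: float = 11.0):
    """Scale the bypass switch from the prototype reference sizing.

    Assumptions (first order): R_on ~ 1/W and area overhead ~ W.
    Returns (on-resistance in ohms, area overhead in % of CUT area).
    """
    ratio = width_um / ref_width_um
    return ref_ron_ohm / ratio, ref_area_pct * ratio


ron, area = scale_bypass(600.0)
print(f"R_on = {ron:.1f} ohm, area overhead = {area:.1f} % of CUT")
```

Halving the width to 600 µm halves the area overhead, which matches the figure quoted in the text; the speed penalty is not modeled here because it comes from the SPICE simulations.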
The chip layout presented in Figure 2.22 shows several placement strategies for the current comparator and the delayed bypass switch in a "split topology" BICS configuration, in order to optimize current measurement speed and accuracy.

Figure 2.22 BICS placement options on the prototype circuit layout

Sizing the bypass switch for worst case input vector selection tends to be an extremely conservative approximation, since the current levels will usually not peak throughout the entire logic computation period. Moreover, the accurate determination of the worst case input vector for complex logic blocks can be a very difficult task. A less conservative method using average current for power switch sizing [158] leads to a similar area overhead in a multiple-threshold CMOS (MTCMOS) circuit designed for low power and operating at 1V. An alternative delay analysis tool is proposed in [159] to optimize the power switches in MTCMOS circuits for high operating frequency and reduced leakage power.

### 2.4 $I_{DDQ}$ Measurement Algorithm and Hardware Cost

To perform the $I_{DDQ}$ measurement, one has to wait for the residual switching current to decay to a value that does not mask the faulty quiescent current levels targeted by the designer. We have simulated the CUT with the BICS for several values of the faulty current varying from 2 $\mu$A to 20 $\mu$A. The simulation results for 5V and 3V supply voltages are presented in Figure 2.23. They show a worst-case detection speed of 100 to 400 ns at 5V, with less than 20% increases at 3V. A feedback circuit activates the bypass switch at error detection, allowing us to keep the voltage perturbation at the virtual supply node below 100 mV for fault-detecting $I_{DDQ}$ test cycles.

Figure 2.23 Simulated BICS performance characteristics at $V_{DD} = 5V$ (a) and $V_{DD} = 3V$ (b).

The obtained results show that the $I_{DDQ}$ test can be performed at a reasonable speed (e.g., 2.5 MHz for a 5 $\mu$A faulty $I_{DDQ}$).
This speed is quite good for manufacturing testing. $I_{DDQ}$ testing concurrent to the circuit's normal operation could offer additional benefits for test coverage and fault diagnosis, but the attainable $I_{DDQ}$ test speed does not allow this kind of test for typical cases. However, periodic on-line testing is possible, by reducing the clock rate of the circuit during the test phase. In this case, the designer can use the normal operation inputs to perform the $I_{DDQ}$ testing or apply specific $I_{DDQ}$ test vectors to increase test efficiency. One can use short $I_{DDQ}$ test phases within long operation cycles (e.g., a test phase of a few ms every few hours). Thus, performance degradation becomes insignificant. Note also that failure mechanisms develop slowly. Thus, by performing an $I_{DDQ}$ test every few hours, one could detect defects producing abnormal $I_{DDQ}$ currents before they are transformed into defects producing functional failure.

The proposed BICS structure occupies 14% of the CUT area in the described prototype. The main contribution to this cost comes from the bypass switch and its control buffers, which represent 11% of the CUT area. This cost was obtained for a circuit speed penalty of less than 0.4%. For a performance penalty of 1.4% the area of the first bypass is only 5.5% of the CUT area, resulting in a total cost of 8.5%.

In many low power applications (for instance, in portable systems) leakage current during idle periods may represent the most significant power dissipation component. Using large MOS power switches with high threshold voltages in dedicated, multiple-threshold CMOS (MTCMOS) processes is a viable solution to eliminate this dissipation, while at the same time they can also block the dynamic power consumption. In such cases, the MOS switches required for BICS-based $I_{DDQ}$ test implementation come for free.
Such high threshold MOS switches do not impact BICS measurement accuracy, which otherwise would be influenced by the high leakage current path added to the current sensing node by low threshold bypass MOS switches.

#### **Timed Integrator BICS Circuits**

The $I_{DDQ}$ measurement speed for both resistive and capacitive sensors is limited by the large parasitic capacitance at the sensing node. This capacitance cannot be accurately estimated and it increases with partition size. It may also vary significantly with the circuit's logic state during the $I_{DDQ}$ test. However, if the intrinsic parasitic capacitance at the virtual supply node is used as a current sensing 'device', without adding other R or C elements, the highest achievable BICS operating speed is obtained. In order to calibrate this inaccurate-capacitance integrator, a multiple-measurement sequence can be performed. For each measurement, both a different integration time interval and a different threshold limit for the comparator are dynamically selected. While producing a linear increase of the $I_{DDQ}$ test pattern length, this timed integration technique offers an elegant and accurate solution for BICS calibration and threshold limit setting in each partition of the circuit.

#### Dynamic Substrate Biasing During $I_{DDQ}$ Measurement

A dynamic biasing circuit technique has been used by Kuroda et al. to reduce the leakage currents by adjusting substrate bias with a feedback control circuit [156]. An on-chip generated reverse-bias voltage is applied to the substrate node of a circuit's partition in the standby mode. Leakage current monitor circuits can be used on each partition to control a substrate bias switch that applies the substrate reverse-bias voltage in the idle state. A similar back-gate biasing technique can be implemented for on-chip $I_{DDQ}$ measurement to increase the low-current fault detection sensitivity.
For each pattern, the circuit partition under test is switched to a low-leakage idle state with increased reverse-bias of the substrate. The active state is quickly restored after the current measurement has been performed.

#### Distributed Power Supply Control for Partitioned Functional Blocks

Power supply control/distribution techniques have been proposed in large memory arrays and complex ASICs, to reduce static and dynamic power dissipation in unselected blocks. All combinatorial logic is disconnected from the supply line. The sequential storage elements and the I/O circuits enter a power-save mode. Kawahara et al. [161] observed that the power supply control/distribution technique implies a self-reverse biasing of the bypass switch. This effectively reduces the subthreshold leakage currents in DRAM decoded/driver circuits.

Embedding BICS circuits in power supply controlled circuits can take advantage of the existing power supply disconnect switches to perform the current sensor bypass function at the virtual supply nodes. An effective power supply distribution strategy can thus be developed for both low power operation and on-chip $I_{DDQ}$ testing.

#### Elastic-$V_t$ CMOS Circuits

The elastic-$V_t$ circuit design reduces the subthreshold leakage currents in submicron CMOS circuits by controlling MOS transistor source (not substrate) voltages, so that no special fabrication steps are required. A power management unit is used to switch source supply voltages at two different value pairs between the active (normal) and the idle (sleep) mode. A pair of compact, high speed BICS circuits can be effectively and reliably embedded into the power management unit.

#### Dual Supply Circuits. The Universal $V_{CC}$ Concept

The universal $V_{CC}$ concept covers a wide operating voltage range for the I/O buffers (e.g., from 1.5 to 3.6V), for flexible logic interfacing capability of DSM CMOS cores.
It uses an on-chip two-way power supply unit to provide a constant internal operating voltage for the core. BICS circuits must operate reliably at variable supply voltages for on-chip $I_{DDQ}$ monitoring of the I/O buffers. They should operate with high sensing node capacitance and hence at lower speeds compared to core area BICS circuits in order to reduce the negative impact on their operating speed. However, the package pin interface circuits have simple fault mechanisms that may be easily detected with voltage-mode or delay tests. Moreover, typically high leakage currents in the I/O buffers may also limit the current monitor measurement accuracy and fault coverage. Dual-measurement algorithms using dynamic threshold limit selection may be used where necessary in order to achieve reliable $I_{DDQ}$ test operation at variable supply voltages.

# 2.5 Dual Measurement Techniques for $I_{DDQ}$ Testing

$I_{DDQ}$ test effectiveness can be increased by providing an accurate estimate of the fault-free background leakage current of the CUT as a test limit and eliminating or reducing its variability from wafer to wafer or device to device. The standard deviation of the background leakage in CMOS VLSI has increased sufficiently to make resolution of smaller defect currents impossible. Adjacent devices on the same wafer may exhibit smaller differences in background leakage, while large variations may occur between randomly selected devices on the same wafer or on different wafers.

$I_{DDQ}$ measurement might still prove particularly effective, provided that the test limit threshold is dynamically evaluated in strong correlation with the mean background leakage of closely located dies on the specific wafer area under test. The reference values for the background current in the CUT partition can be supplied by the measured $I_{DDQ}$ in an adjacent CUT partition.
Differential $I_{DDQ}$ measurements can be implemented on pairs of CUT partitions having matched size and activity level. Such an approach can be easily adopted in CMOS RAM arrays, due to their regular size and organization. A pair of adjacent memory columns or two memory subarrays may be simultaneously tested using a differential BICS, provided that fault mechanisms leading to correlated $I_{DDQ}$ increase in both partitions are improbable. Its effectiveness is greatly reduced in random logic partitions, where dummy capacitances may be added for accurate matching of the sensing node capacitances in both CUT partitions that are synchronously tested and compared.

An alternative solution to increase the capability to discriminate low-amplitude fault-induced currents in deep submicron CMOS consists in removing the dependence on knowing the magnitude or the variability of the background leakage current in detecting device defects [158]. This can be achieved by determining the characteristics of variation of $I_{DDQ}$ with voltage and temperature instead of screening good devices according to a direct comparison of a single measurement with a predefined limit. The key element of the method is the fact that, due to their different nature, fault-free currents and faulty currents have different laws of variation with specific stress factors such as supply voltage and temperature.

By taking two current measurements at different temperatures (or voltages), a defect current can be isolated from the background leakage current. If two current measurements are performed at different supply voltages and/or temperatures, the difference current obtained is associated with the defect current as a function of temperature and/or voltage. In other words, the current due to a defect can be resolved without determining the variability of the background leakage from wafer to wafer or device to device. This method is referred to in [158] as the Variability Method.
The method is based on the dependence of the background leakage current on temperature and/or voltage as illustrated by the Arrhenius relationship:

$$I_{Q}(T_{n}) = I \cdot e^{-\frac{E_{aT}}{kT_{n}}}, \qquad I_{Q}(V_{n}) = I \cdot e^{-\frac{E_{aV}}{kV_{n}}} \tag{2.27}$$

where $I_Q(T_n)$, $I_Q(V_n)$ are background currents at temperature $T_n$ and voltage $V_n$, respectively, $k$ is a constant, and $E_{aT}$ is the activation energy of the Arrhenius behavior. If a defect exists, then the two currents measured at temperatures $T_1$, $T_2$ may be expressed as:

$$I(T_1) = I \cdot e^{-\frac{E_{aT}}{kT_1}} + I_D(T_1) \tag{2.28}$$

$$I(T_2) = I \cdot e^{-\frac{E_{aT}}{kT_2}} + I_D(T_2) \tag{2.29}$$

If the magnitude of the defective current $I_D$ is invariant with the temperature, cross-multiplying the two equations and solving for $I_D$ we obtain [158]:

$$I_D = \alpha \cdot I(T_1) - \beta \cdot I(T_2), \tag{2.30}$$

a weighted current difference with two constant coefficients defined by the Arrhenius behavior as:

$$\alpha = \frac{e^{-\frac{E_{aT}}{kT_2}}}{e^{-\frac{E_{aT}}{kT_2}} - e^{-\frac{E_{aT}}{kT_1}}}, \qquad \beta = \frac{e^{-\frac{E_{aT}}{kT_1}}}{e^{-\frac{E_{aT}}{kT_2}} - e^{-\frac{E_{aT}}{kT_1}}}. \tag{2.31}$$

If the defective current also exhibits temperature and/or supply voltage variations, then such a variation

$$\Delta I_D = I_D(T_2) - I_D(T_1), \tag{2.32}$$

weighted by one of the two coefficients, is added to the calculated expression of an equivalent $I_D$:

$$I_D = \alpha \cdot I(T_1) - \beta \cdot I(T_2) = I_D(T_1) - \beta \cdot \Delta I_D \tag{2.33}$$

A defect-induced current may be estimated as having either significantly larger or significantly smaller variability compared to the cumulated leakage currents in a fault-free device. Thus, the variability method calculates an equivalent defect current which is independent of the background leakage variations, and compares it to a screen condition to determine if the device is good or defective.
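The weighted difference of (2.30) can be checked numerically with a short sketch. It assumes that $k$ is Boltzmann's constant expressed in eV/K and that the activation energy is given in eV; the function names and all numeric values are hypothetical and serve only to build a synthetic test case. The weights are obtained directly by cross-multiplying (2.28) and (2.29), so that the background term cancels and $\alpha - \beta = 1$.

```python
import math

# Assumption: k is Boltzmann's constant in eV/K, E_a is expressed in eV.
K_BOLTZMANN_EV = 8.617333262e-5


def arrhenius_weights(e_a: float, t1: float, t2: float,
                      k: float = K_BOLTZMANN_EV):
    """Return (alpha, beta) obtained by cross-multiplying (2.28), (2.29)."""
    e1 = math.exp(-e_a / (k * t1))
    e2 = math.exp(-e_a / (k * t2))
    denom = e2 - e1
    return e2 / denom, e1 / denom


def defect_current(i_t1: float, i_t2: float,
                   e_a: float, t1: float, t2: float) -> float:
    """Equivalent defect current: alpha * I(T1) - beta * I(T2)."""
    alpha, beta = arrhenius_weights(e_a, t1, t2)
    return alpha * i_t1 - beta * i_t2


# Synthetic example with hypothetical values: Arrhenius background leakage
# plus a temperature-invariant defect current of 5 uA.
E_A, T1, T2 = 0.6, 300.0, 350.0      # eV, K, K
I0, I_DEFECT = 1.0e4, 5.0e-6         # pre-exponential factor (A), defect (A)

i1 = I0 * math.exp(-E_A / (K_BOLTZMANN_EV * T1)) + I_DEFECT
i2 = I0 * math.exp(-E_A / (K_BOLTZMANN_EV * T2)) + I_DEFECT

recovered = defect_current(i1, i2, E_A, T1, T2)
print(f"recovered defect current: {recovered:.3e} A")
```

In this synthetic case the background leakage grows by more than an order of magnitude between the two temperatures, yet the weighted difference returns the defect current without any knowledge of the pre-exponential factor.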
A pair of $I_{DDQ}$ measurements performed at different device temperatures and/or supply voltages may provide significant increase of detection sensitivities for a wide range of typical $I_{DDQ}$ faults, as a faster, less expensive alternative to cooling. However, its cost remains high and setting an accurate $I_{DDQ}$ threshold limit relies on relatively complex algorithms.

Our adaptive $I_{DDQ}$ testing algorithm is based on "CUT behavior gradient" and uses multi-output BICS circuits or multi-sampling measurement sequences previously described in Chapters 2.3 and 2.4. This allows us to implement time-dependent differential $I_{DDQ}$ measurement strategies with reduced cost, increased accuracy and more reliable operation compared to both heating- and cooling-based measurement methods. A key element in our measurement methodology is the dynamic increase of the $I_{DDQ}$ detection sensitivity by progressive or controlled decrease of the secondary bypass current $I_{LIM}$, which represents an implicit test limit for $I_{DDQ}$.

#### 2.6 Conclusion

As VLSI technologies are shifting to the submicron domain, and CMOS defect behavior is becoming increasingly complex, $I_{DDQ}$ testing is becoming more and more important. At the same time, leakage current values in good devices are increasing dramatically, making $I_{DDQ}$ testing impractical. A possible solution to this dilemma is to use Built-In Current Sensors with increased measurement accuracy in smaller circuit partitions.

This chapter describes the development of a new generation of Built-In Current Sensors destined to overcome the effectiveness limitations of external, at-pin $I_{DDQ}$ testing of deep submicron CMOS circuits. Novel design techniques have been adopted in order to cope with specific design constraints for low voltage operation with high speed and detection sensitivity, low noise and negligible impact on CUT performance.
We show that performance degradation due to BICS insertion can become insignificant by using large MOS switches as bypass devices. This is done at a reasonable hardware cost, while test speed allows low test time for manufacturing testing and acceptable test time for periodic on-line testing. A dual bypass switch architecture and a gradual BICS activation circuit allows low noise operation and improved current measurement accuracy. Both synchronous and asynchronous BICS comparators have been developed, with fast and accurate operation at low supply voltage. Synchronous sample-mode BICS circuits using sense amplifier latches with strobed current-mode operation are designed and analyzed. They show reliable low-voltage operation, improved speed and accuracy and lower sensitivity to mismatch compared with conventional voltage-mode sense amplifiers previously reported in BICS designs. Source-controlled current-mode CMOS comparators described in this chapter may be effectively employed in both synchronous and asynchronous operating modes. They have fast and accurate operation at low supply voltage and power dissipation, where voltage-mode comparator performance is drastically limited. Innovative multistage comparators with multiple detection thresholds can be effectively used in current signature analysis and fault-grading applications. Simulation results show significant improvement in sensor accuracy while low current operation and reasonably low area overhead are conserved. Effective BICS calibration algorithms can be developed using multiple-output comparator configurations. They also allow accurate analysis of $I_{DDQ}$ behavior in current-monitored hard IP core partitions. We introduce a hierarchical power modeling strategy for CMOS CUT partitions and on-chip current monitors. 
BICS circuits are modeled as configurable power supply interconnects that allow the development of system-level constrained synthesis of current-monitored CMOS ASICs for low power and $I_{DDQ}$ testability. Our work lays solid bases for transparent, smooth and reliable embedding of high performance BICS circuits as structured building blocks in deep submicron CMOS ASIC designs. Dual measurement algorithms for $I_{DDQ}$ testing are devised to further improve the current measurement accuracy. Single-measurement techniques using multistage comparators with multiple detection thresholds and their wide range of potential utilization are also described. # Chapter 3 # **On-Line Current Monitoring Techniques** # 3.1 Introduction Embedded current monitors have initially been considered for use in complex CMOS ASIC only at development and production test stages. They allow reliable detection of hidden design errors and manufacturing defects and can be used to screen for high quality parts and improved product reliability [10,132]. $I_{DDQ}$ testing in smaller circuit partitions of complex VLSI ICs has improved fault coverage and higher test speed compared to external, at-pin $I_{DDQ}$ measurement. On-chip current sensors can be deactivated using bonded interconnects after the production test stage, in order to avoid any degradation of performance they may induce in application. In this chapter we describe a dual-use BICS implementation strategy to extend the current monitoring function for fault detection throughout product's operating life. This is particularly useful in application environments affected by transient and permanent faults. The dual-use principle relates to providing a performance monitoring function for good devices in addition to extending the error detection and correction capability in environment-critical applications. 
On-chip supply current monitoring makes it possible to detect system performance variations and to implement adaptive strategies for operating parameter optimization in performance-critical applications. When used in conjunction with specific test and reconfiguration circuits (e.g., logic BIST, parity checking and distributed power supply control), BICS circuits ensure concurrent detection and location of permanent and transient faults, error correction, fault isolation and system reconfiguration upon failure [47-49][124-125][136]. Architectural and design solutions for on-chip current monitoring and their experimental implementation are described in this chapter for typical regular arrays of combinational and sequential circuits. Two circuit prototypes, a parallel multiplier and a static RAM, have been designed, fabricated and tested in order to analyze the effectiveness of this approach. An important issue that we address in this chapter is the potential of current monitors for system performance improvement, in contrast to the general view throughout industry, which focuses on the speed and noise immunity degradation induced by integrated current sensors. BICS circuits can fulfill specific performance monitoring tasks during a CMOS product's operating life and thus increase the performance/cost ratio. Their operation can be based on a wide range of measurement algorithms: absolute measurement, limit comparison, time-related or stress-related differential measurement, multiple-limit parallel sensing and adaptive sensitivity control. In this chapter, our work is mainly focused on the design and experimental validation of synchronous and asynchronous current monitors with fixed or adaptively selected threshold limit comparison.

# 3.2 High Speed BICS Circuit Design

Embedded current monitors can be used on-line to detect either fault-induced $I_{DDQ}$ currents or dynamic noise and power supply transient currents in CMOS static circuits.
On-line current monitoring is performed concurrently with normal circuit operation, during the relatively short quiescent time periods. If the circuit's function and operating frequency allow it, on-line current monitoring can be performed during the quiescent-state part of an active clock cycle. Otherwise, it may be carried out at reduced clock speeds or scheduled during the inactive clock cycles of the monitored circuit. Synchronous current sensors operate in sample mode: they are clock-enabled and triggered at accurately defined time instants with respect to the system clock. Asynchronous current monitors are defined as event-driven current measurement circuits. Their operation is scheduled by the occurrence of particular events in the circuit, possibly unrelated to the system clock. Both synchronous and asynchronous BICS circuit operation is enabled during the system's quiescent time intervals $T_Q$ . In a synchronous CMOS circuit, these intervals are defined as the part of an integer number of clock periods $N \cdot T_{CK}$ that shows no logic activity, i.e., what remains after the worst-case propagation delay $t_{PDMAX}$ and the largest transient current decay time $t_{DMAX}$ of the current clock cycle operation:

$$T_{Q} = N \cdot T_{CK} - (t_{PDMAX} + t_{DMAX}),$$ (3.1)

Figure 3.1 presents the block diagram of a CMOS functional module using on-line current monitoring circuits for either synchronous or asynchronous operation. The BICS circuit that connects the functional module to the supply voltage bus is exercised by an ENABLE control and a CLOCK signal. An activity detector block AD is defined as a circuit that monitors the module's inputs, outputs and internal state in order to discriminate the active transition periods from the quiescent time intervals and thus to provide a corresponding BICS activating clock signal. A secondary current monitor circuit, with a significantly higher detection threshold than the BICS, may equally be used as a quiescent state detector AD.
It can be used to monitor the transient supply currents in a predefined circuit area that corresponds either to the latest switching nodes in the circuit or to the slowest decay currents for quiescent state settling. A similar circuit is described in [45], where it implements a current-sensing completion detection (CSCD) function for handshake operation in asynchronous circuits by detecting the switching transition decay currents. A detected event, i.e., the end of a switching transition, may be used to control BICS operation, typically by activating its function after a predefined delay. Depending on their activation conditions or criteria, BICS circuits can be controlled by any one or any combination of the following: worst-case signal node voltage, clock pulse edge and supply current transitions. BICS circuits activated by delayed system clock transitions are considered to have synchronous operation, since they are directly synchronized by the system clock. When activated, the BICS circuit detects any abnormal currents induced in the circuit by permanent or transient faults, and activates an error output signal.

Figure 3.1. Generic circuit architecture for on-line current monitoring

In conventional off-line $I_{DDQ}$ test processes, this error signal typically triggers a pass/fail decision. On-line current tests may also start a complementary action, i.e., diagnosis and reconfiguration algorithms to further test the circuit, isolate the fault and remove the error condition. System recovery procedures can also be scheduled if necessary after local error correction. The Error output is latched and must be reinitialized either through external control or by scheduling a new test execution after error removal. We have designed and implemented two typical architectures of current-monitored CMOS circuits to analyze both synchronous, sampled-mode and asynchronous, continuously operated BICS circuits with different measurement activation techniques.
The analysis performed on these two prototype circuit designs allowed us to assess the limits of attainable performance for BICS in sampled and continuous mode and to identify key design issues and measurement techniques for optimum BICS performance and reliability.

### 3.3 Current-Monitored Self-Checking Multiplier

High-speed combinational circuits with high delay stage counts and wide switching parallelism are the most suitable CMOS circuit architectures for assessing BICS performance and effectiveness in on-line current monitoring, as well as the impact on the system's operating speed, power dissipation and noise immunity. A parallel multiplier circuit has been used as a test vehicle due to its regular, iterative design with a large number of pattern-activated worst-case path delays. We have designed an 8x8 bit carry-save multiplier with a self-checking architecture using two-rail code operation in order to ensure concurrent detection of logic errors at its outputs. A self-checking two-rail checker circuit is connected at its 16 coded outputs and provides a two-rail output error code. Two current sensors have been inserted on the $V_{SS}$ supply path to separately monitor the $I_{DDQ}$ currents of the multiplier and the checker. The circuit has been designed and fabricated in a 1.2 $\mu$ m technology from ES2/ATMEL. The results obtained with the implemented current sensors have been compared with the self-checking design in terms of performance, effectiveness and implementation cost.

Fig. 3.2. Block diagram of the current-monitored self-checking multiplier.

We have used a standard-cell-oriented design methodology for the prototype circuit and developed a dedicated differential two-rail gate library. Input data coding is performed using logic inverters. Two $V_{SS}$ -referenced current sensors are implemented on-chip to monitor the isolated ground lines of the multiplier and the two-rail checker.
The two current sensors are used to monitor, either simultaneously or alternately, the isolated ground lines of the multiplier and the checker. The circuit emulates a generic current-monitored system having two combinational circuit partitions with sequenced delay operation, since the outputs of the first circuit (the multiplier) are the inputs of the second one (the two-rail checker). The microphotograph of the processed chip is presented in Figure 3.3.

Fig. 3.3 Microphotograph of the self-checking multiplier prototype using BICS.

The current sensor employs a fixed-bias MOS transistor as a detection element and a current-controlled sense amplifier as a strobed comparator, as described in detail in Section 2.2.4.3. Two parametric faults are simulated in the multiplier using a bridging mechanism between two internal adder outputs situated on the longest delay paths. These paths are subsequently exercised with the corresponding test patterns for worst-case performance during on-line testing. We added two NMOS transistors to simulate faults as resistive bridges that are undetected by logic tests but produce a small increase of pattern-dependent propagation delays. They are activated with two fault injection control signals FI1, FI2. Electrical tests performed on the prototype circuit using a Tektronix VT-500 design verification system showed reliable circuit operation at 25 MHz maximum operating frequency with an $I_{DDQ}$ threshold limit of $60\ \mu A$ for on-line defect detection. Without BICS, the self-checking multiplier operates at over 31 MHz, so BICS insertion degrades the operating speed by about 20%. The maximum propagation delay added by the TRC circuit represents 17.4%. The fault injection signals FI1, FI2 were exercised in analog mode for fault detection sensitivity control.
The current-controlled sense amplifier used as a fast comparator for BICS circuit operation showed a simulated threshold voltage mismatch of 1.6 mV for 10% stochastic variations of transistor current gain. Fully static, off-line $I_{DDQ}$ tests performed with relaxed timing for BICS control allowed the current sensing accuracy to be improved to $1\ \mu A$ . The 16-bit TRC circuit implemented on chip adds 9% area overhead, while the two current monitors represent a global area overhead of 21%. However, compared with the 48% area increase needed to implement the 8-bit multiplier with differential two-rail coding, the BICS implementation cost, though very high, appears less striking. It should be noted that the two on-line monitoring techniques are complementary, since the fault spaces they cover overlap only partially.

#### 3.4 Transient Fault Detection in CMOS static RAMs

As increasingly complex system architectures are integrated on VLSI chips, improved design for testability (DFT) techniques are required to ensure system-level testability. Structured scan-based and BIST techniques are used to detect permanent faults produced by physical defects, but they cannot reliably detect transient faults. Experimental studies [22][23] have shown that more than 80 percent of system failures are transient in nature. Transient faults (TF) are temporary logic state flips of data-sensitive nodes, generated by internal or external sources (e.g., supply voltage noise, signal coupling, parametric faults, electromagnetic fields, radiation effects). Suitable design techniques for noise filtering and shielding can suppress most causes of transient faults except for those caused by radiation. Most transient faults affect system functionality by inducing single-bit logic errors (upsets) in memory cells and latches.
The impact of high-energy ionizing particle radiation on memory systems generates a dense track of electron-hole pairs, and this ionization can cause transient upsets, also referred to as single-event upsets or soft errors [24,29,32]. Transient logic errors generated at sensitive signal nodes of combinational logic blocks may affect system operation, and generally they can also be traced to storage elements. In order to avoid system failures generated by memory upsets, information redundancy techniques using error detection and correction (EDAC) codes [33] are implemented in computer memory systems. EDAC processors perform periodic memory exploration in order to detect and correct cell upsets. For large system memory sizes and for memory-access-intensive applications, the time interval between two successive accesses of the same memory word (each of which corrects any upset present) may be very long. Large latencies increase the probability of multiple upsets in a single word, which renders the error detection and correction circuits ineffective. System activity studies [25] indicate a sharp increase in upset rate (on the order of 10 to 1) at high utilization. One possible cause of this phenomenon is the latent discovery of faults. An alternative solution to increase system availability is to use specific design and processing techniques to produce memory circuits intrinsically immune to the upsets generated by either electrical noise or radiation [99]. However, in this case several other conflicting requirements are hard to meet, e.g., high speed, low power and high integration density. This makes it very difficult to design high-performance memory architectures that are fully insensitive to upsets while offering high storage capacity and low cost. In the following we describe implementation, test and validation details of an effective technique to achieve SEU tolerance in CMOS static RAMs through concurrent, event-driven error detection and correction.
This technique uses on-chip current monitors to detect the transient current pulses generated by memory cell upsets.

# 3.4.1 On-Chip Current Testing in Static RAMs

The external, off-chip testing of the quiescent supply current $I_{DDQ}$ has proved to be very effective for detecting SRAM defects that escape traditional voltage monitoring techniques [26-28]. More recently, on-chip current testing techniques have been studied as a potential means to enhance fault coverage and reduce testing costs of static RAMs. In [30], memory decoder circuits are modified to select all the rows and columns, so that the whole memory array can be treated as if it were a single cell. Faults can be detected by observing the increased static supply current $I_{DDQ}$ externally, since most of the functional faults in static RAMs cause $I_{DDQ}$ to increase. In [41], a current-testable CMOS SRAM architecture is described that implements dynamic supply-current testing on-chip to detect pattern-sensitive coupling faults. Two transient current monitoring circuits are used to detect abnormal transients on the $V_{DD}$ line for the $N_C - 1$ unaddressed columns and on the $V_{SS}$ line for the $N_R - 1$ unaddressed rows. Power supply switching is used to connect the addressed row and column directly to the supply lines. The test is performed at reduced memory operating speed due to power switching and current monitoring circuit delays. The test method cannot be effectively used for on-line error detection, since it cannot work at the system's operating speed, nor can it identify the row or column location of the error in order to perform the correction. Further, this technique cannot be used to detect upsets that are asynchronous to the system clock. The principle of using current monitoring techniques to detect and correct memory upsets during normal operation has been introduced by Vargas and Nicolaidis in [31].
It employs continuously activated BICS circuits in each column of the memory cell array to detect and locate the transient currents generated by memory upsets. The SRAM cell array architecture tolerant to radiation-induced memory upsets combines transient current monitoring with single-bit parity coding to achieve upset correction.

# 3.4.2 Transient Fault Tolerant SRAM Design using Current Monitoring

Based on this approach, we have developed an improved asynchronous BICS design to achieve reliable transient-fault tolerance in high-performance, high-density CMOS static RAMs. The BICS circuit uses a compact version of the high-speed current comparator design described in Chapter 2. It detects transient current pulses generated on the column supply lines by memory cell upsets. A row of asynchronous latches stores the corresponding error indication at each current comparator output, flagging the memory column affected by the upset.

#### 3.4.3 Transient Currents in Static RAM Cells

CMOS static RAMs generally have fully dynamic power consumption. Static leakage currents are negligible and do not contribute to the circuit's power dissipation. Dynamic supply currents are generated when active logic transitions are applied to the control inputs. Most of the power is dissipated by the peripheral circuits (address decoders, precharge circuits, sense amplifiers, I/O buffers, etc.). Memory cell arrays take most of the SRAM chip area, but contribute very little to the chip's power dissipation. Read/write current pulses in memory cells have low amplitude and short duration within the memory access cycle. They occur in a single cell for each memory column that is activated during read/write operation. On the other hand, radiation-induced upsets in memory cells generate significant transient current pulses. Concurrent detection and reliable location of these current pulses for subsequent correction will ensure the upset-tolerance goal.
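The detect, locate and correct sequence can be illustrated with a minimal sketch. It assumes one stored parity bit per word, a one-to-one mapping between a flagged column and a bit position, and a full scan of the rows after a flag is raised; the names `parity` and `correct_flagged_column` are hypothetical, and the parity bit's own column is not modeled. The text does not specify the correction procedure at this level of detail, so this is only one plausible reading.

```python
def parity(word: int) -> int:
    """Return the XOR of all bits of `word` (0 when the number of ones is even)."""
    return bin(word).count("1") & 1


def correct_flagged_column(data, parity_bits, flagged_col):
    """Flip bit `flagged_col` in every word whose stored parity no longer matches.

    data        : list of integer data words, one per memory row (modified in place)
    parity_bits : stored parity bit of each word, computed at write time
    flagged_col : bit position reported by the column error latch (assumption:
                  one BICS-monitored column per bit position)
    Returns the list of row indices that were corrected.
    """
    corrected = []
    for row, word in enumerate(data):
        if parity(word) != parity_bits[row]:
            # Parity locates the row, the current monitor locates the column.
            data[row] = word ^ (1 << flagged_col)
            corrected.append(row)
    return corrected


# Illustrative 4-bit memory with three rows (values are arbitrary).
memory = [0b1011, 0b0110, 0b1111]
stored_parity = [parity(w) for w in memory]

memory[1] ^= 1 << 2                      # emulate an upset in row 1, column 2
fixed_rows = correct_flagged_column(memory, stored_parity, flagged_col=2)
```

In this sketch a column flag with no parity mismatch changes nothing, so a spurious flag costs a scan but cannot corrupt stored data.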
Hence, the supply bus of the memory cell array can be isolated from the periphery and monitored to detect the transient currents induced by upsets. A Single-Event Upset (SEU) is a transient fault that results in a logic error in the memory circuit at exactly one point in time and space. A typical mechanism of SEU injection is the impact of a high-energy particle on a sensitive node in a memory cell array. When such a particle strikes a sensitive node of a memory cell (the drain of an *off*-transistor), the ionized track it generates causes a charge $Q_d$ to be collected on that node, followed by the immediate conduction of the drain-to-bulk junction and the temporary reversal of the node's logic state. This generates two transient currents in the memory cell: a) a charge-removal short-circuit current on the complementary *on*-transistor connected to that node, and b) a cell transition current, generated by the process of cell state reversal. While the regenerative cell-switching current can be sensed on both Vcc and Gnd supply lines, the charge-removal current in the upset node, which has higher amplitude, will be sensed only on the supply line that sets the initial logic state of the incident node. Hence, transient current monitoring on the Vcc supply line more easily detects the logical 1-to-0 transitions generated at the incident node (1-to-0 upsets), while transient currents on the Gnd supply line reveal the opposite 0-to-1 transitions at the incident node (0-to-1 upsets). If a Built-In Current Sensor (BICS) is placed between the memory cell and the power bus of the memory, it can therefore detect the transient current in the power bus and thereby any cell upset. The sensitivity to upsets is higher for the 1-to-0 upsets, since the p-transistor size of the memory cell is considerably smaller. As previously stated, our approach to current monitoring consists in inserting BICS circuits on the supply lines of each memory column.
These BICS circuits must be insensitive to the transient currents induced during active read/write cycles. They must not detect the small transient currents induced by radiation which do not generate upsets. A proper calibration of the BICS detection threshold should avoid false alarms or, more exactly, should reduce their probability of occurrence to a negligible level. This requires an accurate estimation of the storage node sensitivity to upsets. Note that false alarms are not a serious problem if they occur at low rates. In fact, a false alarm will not lead to erroneous correction of a memory site, since a correction is performed only if an erroneous parity is discovered after reading the memory words. Of course, if false alarms occur at a high rate, system operation will be disturbed by frequent interruptions. We have performed a simulation-based analysis of SRAM cell designs using 1.2 $\mu$ m CMOS bulk-epi technology from AMS. The transient currents induced by upsets and the read/write currents in the memory cell have been obtained. From these we could determine the BICS circuit requirements for reliable detection of upsets and false alarm avoidance. Electrical circuit simulation results using SPICE have been compared to 2D and 3D device simulation of charge collection mechanisms in reverse-biased drain junctions of memory cell transistors.

#### 3.4.3.1 Upset-Induced Currents

The schematic of the SRAM cell is presented in Figure 3.4. The upset-sensitive nodes are the drains of the *off*-transistors, i.e., the drains of MP1 and MN2 for the initial state selected. The reverse-biased intrinsic p-n junctions are represented as diodes for illustration. BICS circuits are represented on the current measurement path. SPICE simulations have been used to determine the transient currents induced by upsets in the memory cell.

Fig. 3.4 Transient currents induced by (a) positive and (b) negative upset currents

The impact of an ionizing particle may be modeled by a time-varying double-exponential current pulse [28]:

$$I(t) = I_0 \left( e^{-t/\tau_{\alpha}} - e^{-t/\tau_{\beta}} \right)$$ (3.2)

where $\tau_{\alpha}$ is the collection time constant of the junction and $\tau_{\beta}$ is the time constant for initially establishing the ion track. In this work, the induced charge is simulated by a time-dependent current source of amplitude $I_u$ with a triangular shape approximating equation (3.2). The current pulse rise and fall times are $t_r = 5.0 \times 10^{-11}$ s and $t_f = 2 \times 10^{-10}$ s. The injected current pulse simulates the collection on the sensitive node of an equivalent charge $Q_d$ :

$$Q_d = I_u \cdot \frac{t_r + t_f}{2}$$ (3.3)

Either a positive current pulse on the drain of MP1 or a negative current pulse on the drain of MN2 will upset the cell. The node capacitance will be charged, temporarily reverting the logic state. A charge-removal current $I_f$ will be drawn through the drain-to-bulk junction. This current cannot be detected, since we cannot isolate the junction to insert a BICS circuit. Two additional transient currents are generated in the memory cell: a) a short-circuit current $I_{sc}$ on the complementary *on*-transistor connected to that node and b) a cell transition current $I_{tr}$ generated by the regenerative process of cell state reversal. While $I_{tr}$ currents can be sensed on both Vcc and Gnd supply lines, the short-circuit current in the upset node, which has a higher amplitude, will be sensed only on the supply line that sets the initial logic state of the incident node.
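As a quick numerical check of the triangular-pulse approximation (3.3), the sketch below evaluates the equivalent charge for the rise and fall times quoted above and for the critical pulse amplitudes reported for this cell in the simulation results of this section (4.3 mA and -2.7 mA). The function name `triangular_pulse_charge` is hypothetical; only the formula and the numerical values come from the text.

```python
def triangular_pulse_charge(i_peak: float, t_rise: float, t_fall: float) -> float:
    """Charge in coulombs under a triangular current pulse, Q = I * (t_r + t_f) / 2."""
    return i_peak * (t_rise + t_fall) / 2.0


T_RISE = 50e-12    # s, rise time of the injected pulse
T_FALL = 200e-12   # s, fall time of the injected pulse

# Critical amplitudes reported for positive and negative upsets.
q_pos = triangular_pulse_charge(4.3e-3, T_RISE, T_FALL)
q_neg = triangular_pulse_charge(-2.7e-3, T_RISE, T_FALL)

print(f"positive upset: {q_pos * 1e12:.4f} pC")
print(f"negative upset: {q_neg * 1e12:.4f} pC")
```

The two results, about 0.5375 pC and -0.3375 pC, agree with the critical charges of 0.54 pC and -0.34 pC quoted for this cell.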
The currents sensed on the supply lines for a positive upset (0-to-1 transition) and for a negative upset (1-to-0 transition) will be respectively:

$$I^{+}_{vdd} = I^{+}_{vddtr} , \quad I^{+}_{gnd} = I^{+}_{sc} + I^{+}_{gndtr}$$ (3.4)

$$I^{-}_{vdd} = I^{-}_{sc} + I^{-}_{vddtr} , \quad I^{-}_{gnd} = I^{-}_{gndtr}$$ (3.5)

Hence, positive upsets are more easily detected on the $V_{SS}$ supply line, while negative upsets are more easily detected on the $V_{DD}$ supply line. In order to ensure a high sensitivity for upset detection, a BICS circuit is placed on each of the two supply lines of every memory column. By simulation, we have found that the critical current (i.e., the minimal pulse amplitude that provokes a positive upset) is $I^+_u = 4.3$ mA (critical charge $Q_d = 0.54$ pC). For negative upsets, the critical current is $I^-_u = -2.7$ mA (critical charge $Q_d = -0.34$ pC). The memory cells are more sensitive to negative upsets, since the state-restoring p-transistor size is considerably smaller. Figures 3.5a and 3.5b present SPICE simulation results for critical positive and negative upsets. Transient currents and voltage waveforms at the cell nodes are presented. For easier comparison, the time scale width is restricted to 4 ns for all current waveforms. The transient supply current for the positive upset has a 3.3 mA peak amplitude. The negative upset current pulse has a $500\ \mu A$ amplitude and longer duration. We compared the equivalent charges of the current pulses in the two cases.

Fig. 3.5 Current and voltage waveforms for the critical positive and negative upsets

The charge $Q_r$ removed from the upset node and the charge $Q_t$ transited through the opposite node are respectively:

$$Q^{+}_{r}(I^{+}_{gnd}) = 0.83 \, pC , \quad Q^{+}_{t}(I^{+}_{vdd}) = 0.22 \, pC$$ (3.6)

$$Q^{-}_{r}(I^{-}_{vdd}) = 0.60 \, pC , \quad Q^{-}_{t}(I^{-}_{gnd}) = 0.24 \, pC$$ (3.7)

These charge values have been obtained by approximate area integration.
They give a quantitative measure of the minimum equivalent charge $Q_r$ of the upset current pulses to be detected. A similar equivalent charge will be calculated for read and write current pulses. These charge measures will then be used for proper calibration of BICS sensitivity and detection threshold.

#### 3.4.3.2 Read/Write Currents

In a RAM column, word lines select a single cell at a time during either write or read cycles. The transient supply currents during read/write operation of the memory cell were simulated with SPICE. The supply current waveforms are presented in Figure 3.6. Cell write currents last only for the short time interval required by the internal state transition. $V_{DD}$ -referenced write currents $I^w_{vdd}$ have lower amplitude than $V_{SS}$ -referenced write currents $I^w_{gnd}$ due to the smaller size of the p-channel transistors MP1, MP2. Read currents are generated only on the $V_{SS}$ supply line. The cell node in the low logic state discharges the corresponding bit line capacitance, enabling sense amplifier operation.

Fig. 3.6 Transient supply currents in a memory cell during (a) write and (b) read operation.

Both read and write currents have lower peak amplitude than the short-circuit currents induced by an upset. This is due to the additional *on*-resistance of the access transistors MN3, MN4. However, the cell read current has a significantly longer duration and consequently a greater equivalent charge. The area computation for the three current pulses gives the following charge values:

$$Q(I^{w}_{gnd}) = 0.37 \, pC , \quad Q(I^{w}_{vdd}) = 0.26 \, pC$$ (3.8)

$$Q(I^{r}_{gnd}) = 1.76 \, pC$$ (3.9)

The negative upset detection threshold on the $V_{DD}$ line must be lower than the charge $Q^-_r$ in (3.7), i.e., 0.60 pC. At the same time, it must be greater than the write-pulse-induced charge $Q(I^w_{vdd})$ in (3.8), i.e., 0.26 pC. This avoids false alarms during write operation.
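The threshold constraints can be checked with a small sketch that uses only the charge values of (3.6) to (3.9). The helper `detection_window` is a hypothetical name; it treats the equivalent charge as the threshold quantity, as the discussion above does, and returns no window when the largest normal-operation pulse already exceeds the smallest upset pulse on the same supply line.

```python
def detection_window(q_upset_min: float, q_normal_max: float):
    """Return (low, high) threshold bounds in pC, or None if no fixed threshold exists.

    q_upset_min  : smallest upset-induced charge that must be detected
    q_normal_max : largest charge produced by normal read/write activity
    """
    if q_normal_max >= q_upset_min:
        return None                      # normal activity would trigger false alarms
    return (q_normal_max, q_upset_min)


# V_DD line: negative upset charge (3.7) against write charge (3.8).
vdd_window = detection_window(q_upset_min=0.60, q_normal_max=0.26)

# V_SS line: positive upset charge (3.6) against write (3.8) and read (3.9) charges.
vss_write_window = detection_window(q_upset_min=0.83, q_normal_max=0.37)
vss_read_window = detection_window(q_upset_min=0.83, q_normal_max=1.76)

print(vdd_window, vss_write_window, vss_read_window)
```

With these numbers a fixed threshold exists on the $V_{DD}$ line and on the $V_{SS}$ line during write cycles, but not on the $V_{SS}$ line during an active read pulse, where the 1.76 pC read charge exceeds the 0.83 pC upset charge.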
The detection threshold for positive upsets in the $V_{SS}$ -referenced BICS has to be lower than the charge $Q^+_r$ in (3.6), i.e., 0.83 pC. However, the sensitivity of the $V_{SS}$ -referenced BICS must be reduced by more than a factor of two during the active read pulse to avoid false alarms induced by the 1.76 pC read pulse charge in (3.9). This may reduce the upset detection probability in the selected columns during read operation.

#### 3.4.4 Asynchronous BICS Design

Traditional BICS circuits [10,17-20] are used to monitor the static current dissipation ( $I_{DDQ}$ ) in CMOS static circuits, and thus they are synchronized by the system clock. An abnormal current value indicates either a permanent or a pattern-dependent physical fault. Transient faults (TF) are asynchronous events that generate, either randomly or intermittently, significant transient currents on the supply voltage lines. Their occurrence cannot be detected by a synchronous $I_{DDQ}$ test, which is activated by the system clock only during very short current sampling time intervals. Thus, unlike traditional BICS, our TF detection technique uses a sensing circuit operating as a high-speed asynchronous BICS.

Fig. 3.7 Dual BICS circuit block diagram

Its distinguishing feature is that it is driven by the upset-induced current signal (the abnormal transient current), which typically has relatively low amplitude (several mA) and very short duration (hundreds of picoseconds). It must switch very fast between the active and the inactive state in order to detect the upsets on-line without affecting the system's operating speed. Dual BICS insertion between the virtual and the real power supply lines should not affect the RAM's speed. The block diagram of the dual BICS circuit is presented in Figure 3.7.
The current sensing resistors $R_H$ , $R_L$ , which connect the functional circuit to the supply lines, set the detection sensitivity for upset currents to $V_{DD}$ and $V_{SS}$ , respectively. The parallel capacitances $C_H$ , $C_L$ represent the lumped equivalent of the internal distributed capacitances of the circuit's supply lines. They integrate the sharp transient current pulses, reducing their amplitude and increasing their duration. Thus, short current transients are detected using parallel RC current-sensing circuits, without generating excessive power supply noise. Note that fully resistive and fully capacitive current sensors are both infeasible and unreliable at full system operating speed in reasonably large circuit partitions, since the large parasitic capacitance cannot be reduced or accurately controlled. The two comparators generate logical output pulses which set an asynchronous error latch. A high-speed dual asynchronous voltage comparator circuit has been proposed in [31]. However, its implementation with standard CMOS processes leads to a strong dependence of its performance on process, voltage and temperature variations and generates high power supply noise. We propose a novel, high-performance comparator circuit that overcomes these limitations. A high-speed current-mode comparator is used to reliably detect both high and low current transients without significantly affecting the power supply noise margins.

Fig. 3.8 Schematic of the dual asynchronous BICS circuit

The proposed transient current detection circuit is presented in Figure 3.8. Its schematic is based on the source-controlled current comparator configuration described in Chapter 2. The monitored functional block is the cell array of a CMOS static RAM column. Transistors $T_{L1}$ - $T_{L7}$ form the $V_{SS}$ -referenced current comparator, and $T_{H1}$ - $T_{H7}$ form its symmetrical counterpart.
The current mirrors $T_{L3}$ - $T_{L6}$ , $T_{H3}$ - $T_{H6}$ supply the reference currents $I_{RL}$ , $I_{RH}$ to the comparators. The current comparison is performed on the common drain node of transistors $T_{L1}$ - $T_{L4}$ and $T_{H1}$ - $T_{H4}$ . The two current mirrors $T_{L1}$ - $T_{L2}$ , $T_{H1}$ - $T_{H2}$ , acting as current-controlled current switches, determine the high speed and low noise operation of the transient current detector. They allow the implementation of a reliable current comparator with a reduced sensitivity to process, voltage and temperature variations. The operating principle of the current-controlled current source is presented in Figure 3.9. The current mirror $T_1$ - $T_2$ uses equal source resistors $R_1 = R_2$ , which are the equivalent of the current-sensing resistors $R_L' = R_L$ in Figure 3.8. The differential voltage between the source nodes of the two transistors will represent a difference in the gate-to-source voltages, $V_{gs2}$ - $V_{gs1}$ , since their gates are connected to the same potential.

Fig. 3.9 Current mirror controlled by transient upset current $I_u$

This differential control voltage determines the ratio of the corresponding drain currents of transistors $T_1$ , $T_2$ . It can be generated by injecting a current $I_u$ on one of the source nodes. The gain of the current mirror is controlled by the upset current $I_u$ injected into the source node of transistor $T_2$ .
The current gain is a linear function of the upset current, given by the equation:

$$A_{i} = \frac{I_{out}}{I_{in}} = K_{1} - K_{2} \cdot \frac{I_{u}}{I_{in}} , \qquad (3.10)$$

where the two coefficients $K_1$ , $K_2$ have low nonlinearity and low sensitivity to process variations, and are respectively approximated by the equations:

$$K_{1} = \frac{g_{m1}}{g_{m2}} \cdot \frac{1 + g_{m1} \cdot R_{1}}{1 + g_{m2} \cdot R_{2}} \quad , \quad K_{2} = \frac{g_{m2} \cdot R_{2}}{1 + g_{m2} \cdot R_{2}} \qquad (3.11)$$

In the absence of the upset current, the gain of the current mirror (the current transfer ratio) is determined by the ratio of the gate widths of transistors $T_1$ , $T_2$ . The current gain becomes zero (i.e., transistor $T_2$ is switched *off*) during the occurrence of the upset-sensing current pulse $I_u$ , which raises the voltage drop in the source of $T_2$ . The current mirror must operate at low currents, in order to reduce its sensitivity to supply voltage, temperature and process variations [121].

The upset-sensing current mirror $T_{L1}$ - $T_{L2}$ in Figure 3.8 implements this principle. It has a current gain $n$ slightly greater than 1, so that the current comparison in the absence of upset-induced transients is performed between the current $n \cdot I_{RL}$ , flowing through $T_{L1}$ , and the current $I_{RL}$ flowing through $T_{L4}$ . During normal upset-free operation, $T_{L1}$ will drive the comparator output node to the logical 0 state. The current mirror uses equal source resistors $R_L' = R_L$ to control its current gain through the differential voltage drop on $R_L$ . The sensed current pulse is amplified and subtracted from the comparison node (the drain of transistor $T_{L1}$ ), switching the comparator's state. The active-load inverter $T_{L6}$ - $T_{L7}$ buffers the comparator output. It amplifies the voltage pulse generated by the current comparator stage to full-rail output logic levels.
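As a rough numerical illustration of Equations (3.10) and (3.11), the following Python sketch evaluates the mirror gain and the upset current at which the gain reaches zero. The transconductance, resistor and input current values are placeholders chosen for illustration only, not values taken from the prototype, and the function names are hypothetical. The small-signal expressions are applied here outside their strict range of validity, so the result should be read as an order-of-magnitude estimate.

```python
def mirror_coefficients(gm1, gm2, r1, r2):
    """Coefficients K1, K2 of Eq. (3.11) for the source-degenerated mirror."""
    k1 = (gm1 / gm2) * (1.0 + gm1 * r1) / (1.0 + gm2 * r2)
    k2 = (gm2 * r2) / (1.0 + gm2 * r2)
    return k1, k2


def current_gain(i_in, i_u, k1, k2):
    """Current gain A_i of Eq. (3.10), clipped at zero (T2 switched off)."""
    return max(0.0, k1 - k2 * i_u / i_in)


def cutoff_upset_current(i_in, k1, k2):
    """Upset current for which Eq. (3.10) predicts zero gain."""
    return k1 * i_in / k2


if __name__ == "__main__":
    # Assumed values, for illustration only (not from the thesis).
    gm = 100e-6   # transconductance of T1 and T2, in siemens
    r = 1e3       # source resistors R1 = R2, in ohms
    i_in = 5e-6   # mirror input current, in amperes

    k1, k2 = mirror_coefficients(gm, gm, r, r)
    i_cut = cutoff_upset_current(i_in, k1, k2)
    print(f"K1 = {k1:.3f}, K2 = {k2:.3f}")
    print(f"gain without upset = {current_gain(i_in, 0.0, k1, k2):.3f}")
    print(f"upset current for zero gain = {i_cut * 1e6:.1f} uA")
```

With matched transistors and equal source resistors, $K_1$ reduces to 1, so the zero-gain upset current is simply $I_{in} / K_2$ ; a larger product $g_{m2} \cdot R_2$ therefore lowers the upset current needed to switch the comparator.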
The output signal Err\_l controls the asynchronous error latch. A symmetric current comparator detects the transient currents on the virtual supply line $V_{DD}$ and has a similar behavior. The comparators operate at low reference currents (1-10 µA) and have propagation delays lower than 3 ns. The delays are determined by the virtual supply node capacitance. A 128-cell memory column designed with a 1.2 µm CMOS/epi process has extracted parasitic supply node capacitances of 1.16 pF on the virtual $V_{DD}$ line and 1.62 pF on the virtual $V_{SS}$ line. These values should be corrected by adding estimated interconnect capacitances of a similar order of magnitude. This capacitance forms a first-order filter cell with the parallel current sensing resistor. This RC filter cell integrates the sharp upset current transients, making them detectable during a reasonably large time interval. Supply noise is also reduced, at the expense of increasing the propagation delay of the current comparator.

Fig. 3.10 SPICE simulation of BICS operation for (a) positive and (b) negative upset currents

SPICE simulation results of BICS circuit operation are presented in Figure 3.10 for both positive and negative cell upset currents. The memory cell array is designed with 1.2 µm CMOS technology from AMS. A sharp current pulse of 4.5 mA amplitude with rise and fall times of 50 ps and 200 ps, respectively, is injected into the sensitive node. The switching characteristics of the cell nodes, the voltage pulse generated at the virtual supply line and the error detection latch outputs are presented. The integrative characteristic of the current sensing RC circuit determines a relatively low level of voltage noise at the virtual supply lines of the memory column (i.e., 0.3 V on $V_{DD}$ ' and 0.4 V on $V_{SS}$ '). The supply voltage noise and the BICS delay characteristics simulated with SPICE for different injected upset currents are presented in Figures 3.11 and 3.12.
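The integrating behavior of the parallel RC sensing cell described above can be sketched with a simple forward-Euler simulation. This is only an illustrative lumped model, not a substitute for the SPICE results: the pulse shape (4.5 mA peak, 50 ps rise, 200 ps fall) follows the text, but the sensing resistance of 1 kΩ and the total node capacitance of 2.3 pF (extracted value plus an interconnect contribution of similar size) are assumptions, and the function names are hypothetical.

```python
def triangular_pulse(t, peak, t_rise, t_fall):
    """Current of a triangular pulse starting at t = 0, in amperes."""
    if t <= 0.0 or t >= t_rise + t_fall:
        return 0.0
    if t <= t_rise:
        return peak * t / t_rise
    return max(0.0, peak * (1.0 - (t - t_rise) / t_fall))


def simulate_rc_sensor(peak, t_rise, t_fall, r, c, dt=1e-12, t_end=10e-9):
    """Forward-Euler integration of C dV/dt = I(t) - V/R.

    Returns (peak voltage, voltage at t_end) across the parallel RC sensor.
    """
    v = 0.0
    v_peak = 0.0
    for k in range(int(round(t_end / dt))):
        i = triangular_pulse(k * dt, peak, t_rise, t_fall)
        v += dt * (i - v / r) / c
        v_peak = max(v_peak, v)
    return v_peak, v


if __name__ == "__main__":
    # Pulse parameters from the text; R and C are assumed values.
    v_peak, v_end = simulate_rc_sensor(
        peak=4.5e-3, t_rise=50e-12, t_fall=200e-12, r=1e3, c=2.3e-12
    )
    print(f"peak perturbation = {v_peak:.3f} V")
    print(f"residual after 10 ns = {v_end * 1e3:.2f} mV")
```

In this model the peak perturbation is bounded by the injected charge divided by the node capacitance, while the RC time constant sets how long the perturbation stays above the comparator threshold.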
Figure 3.11 Supply noise variation with upset current amplitude in the current-monitored SRAM.

Figure 3.12 Simulated BICS delay characteristics for positive and negative upset detection

The behavior of the proposed asynchronous BICS is controlled by two signals, *Bypass* and *Reset*. The bypass signal inhibits BICS operation during the active circuit transitions of the memory column operation, by shorting the current sensing inputs to the supply voltage lines through low resistance switches $R_{BL}$ , $R_{BH}$ . The reset signal initializes the error latch after upset correction, before resuming circuit operation.

### 3.4.5 Design Optimization of the Current Monitored SEU-Tolerant SRAM

A 1K bit upset-tolerant CMOS SRAM using the described current monitoring technique has been designed, fabricated and successfully tested. The block diagram of the circuit is presented in Figure 3.13. The regular structure of the memory cell array is partitioned for power supply distribution. BICS circuits are inserted on the supply lines of each memory column. This makes it possible to locate the column affected by the upset. Separate power supply lines are used to bias the substrate and well contacts in the RAM cell array. The memory array is organized as 128 lines by 8 columns of single-bit words. Only the current

Fig. 3.13 Block diagram of the upset-tolerant CMOS SRAM

In this section we provide implementation insights and circuit performance data. The typical structure of a memory column with transient current detection circuits is presented in Figure 3.14. Isolated column-level power bus distribution for BICS insertion and the separated p-substrate connection to $V_{SS}$ for n-channel transistors, in order to reduce the latchup sensitivity, led to a cell area increase of 6.5% due exclusively to the increase in column pitch, as shown in the layout drawing of Figure 3.15.
Substrate isolation has proven essential for achieving high detection sensitivity and low false alarm rates, since the well-to-substrate junctions, and the currents induced on them, are no longer connected to the current sensor inputs.

Figure 3.14 Memory column with asynchronous BICS circuit.

It should be noted that for SRAM architectures using an internally generated substrate bias, the redesign of the cell array in order to isolate the supply lines for adjacent columns and to prevent latchup is no longer needed. Another optimization is obtained if the supply voltage lines shared by two adjacent columns are monitored with a single BICS circuit. Column upset detection is then performed using the same error latch to locate upsets in both memory columns. Error correction is achieved by checking the parity code on both columns. In this case the extra area is reduced to 3.5%.

A topological RAM architecture using a central area of column decoders and sense amplifiers and a central power distribution tree would have allowed us to use a single row of error latches for both $V_{DD}$ -referenced and $V_{SS}$ -referenced upsets. However, we chose to implement a cell matrix architecture with a peripheral, interdigitated power distribution tree. This led us to implement two topologically distinct rows of BICS circuits with two physically and logically distinct rows of upset error latches, located on the two sides of the memory array. The additional area occupied by each BICS circuit on a memory column is the equivalent of 7 additional memory cells, as shown in the microphotographs in Figure 3.16. Thus, for a 128-bit memory column size, the cell array area overhead of the dual BICS circuits is 11%. For an equivalent 512-row memory architecture, the cell array area overhead is only 2.7%.

Figure 3.15 Power supply isolation for current monitoring in CMOS SRAM column. The initial cell layout and its isolated-substrate counterpart are represented.
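The cell array overhead figures quoted above for the dual BICS circuits follow from simple arithmetic: two BICS circuits per column, each equivalent to 7 memory cells, divided by the number of cells in the column. The short Python sketch below reproduces this calculation; the function name and its defaults are hypothetical, and the 11% quoted in the text corresponds to the rounded value for 128 rows.

```python
def bics_area_overhead(rows, cells_per_bics=7, bics_per_column=2):
    """Fraction of cell-array area added by the BICS circuits of one column."""
    return bics_per_column * cells_per_bics / rows


if __name__ == "__main__":
    for rows in (128, 512):
        overhead = 100.0 * bics_area_overhead(rows)
        print(f"{rows}-row column: {overhead:.1f}% overhead")
```

Because the BICS area is fixed per column, the relative overhead falls in inverse proportion to the column length.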
Dual BICS circuits are implemented as previously described in Figure 3.8 using two high-speed current comparators for $V_{DD}$ -referenced and $V_{SS}$ -referenced upset currents $I_{UH}$ , respectively $I_{UL}$ . Internally generated reference currents $I_{RH}$ , $I_{RL}$ are used as threshold limits for current comparators. Resistive current sensors implemented as saturated channel MOS transistors connect the column supply bus lines to the global power bus $V_{DD}$ - $V_{SS}$ . An asynchronous error latch stores the column upset information $ERR_i$ for subsequent cell-level detection and correction. A global chip-level error line to be used as system-level interrupt request is driven by the outputs of the column error latches $ERR_i$ through n-channel transistor switches. The upset code in the error detection latches is read and written through the bit lines of the memory array. The row of error latches can be activated for R/W operation. This enables reading and resetting these latches through the memory data bus. a) b) Figure 3.16 Circuit area microphotograph showing the VDD-referenced (a) and GND-referenced (b) rows of asynchronous current sensors In order to discriminate the upset currents from normal cell read/write currents and thus to avoid the false alarms due to normal memory operation, a bypass technique is used. This reduces the BICS input sensitivity for the memory columns that are active during R/W operation. An activity decoder circuit generates the bypass control signals, as described in Section 3.1. During write operation in a RAM array, the $V_{\rm DD}$ -referenced and $V_{\rm SS}$ -referenced bypass transistors $R_{\rm BH}$ and respectively $R_{\rm BL}$ in Figure 3.8 are activated in the selected memory block and for the selected rows in this block. For the prototype circuit presented here these transistors can be activated during a time interval of 6 ns which is accurately timed with the row address decoder output. 
During a read cycle, only the $V_{SS}$ -referenced bypass transistor is activated in the selected columns of a memory block. In our prototype circuit these transistors can be activated for a period of 10 ns out of the 20 ns in the operating cycle. The upset detection sensitivity is thus reduced only for the duration of active bit line operation in the corresponding R/W cycles. The peak noise induced on the active column supply lines, simulated with SPICE, is presented in the waveforms of Figure 3.17.

Figure 3.17 Operating waveforms of the current-monitored SRAM cell with positive (a) and negative (b) upsets

The current comparators implemented in the asynchronous BICS add a static current consumption to the RAM's power budget which is typically equal to 2 $I_{REF}$ (10 µA) per memory column. This is equivalent to about 10% of the RAM's dynamic power dissipation. This dissipation can be reduced by appropriately selecting the active memory blocks to be monitored for upset detection and correction.

Effective BICS operation relies upon three key analog parameters: the reference current $I_{REF}$ and the sensitivity control voltages $V_{PH}$ , $V_{PL}$ . The reference current can be externally activated only for the memory blocks and for the time intervals in which the integrity of the data must be monitored concurrently. In the absence of the external reference current source, BICS operation is disabled and no static supply current is observed. Memory R/W operation can be performed to either monitored or unmonitored blocks, with no difference in functionality and speed. Static $I_{DDQ}$ tests can also be enabled during the quiescent memory state by adjusting the sensitivity control voltages correspondingly. The memory chip microphotograph is presented in Figure 3.18.
Figure 3.18 Chip photograph of the current-monitored SRAM prototype

### 3.4.6 Mixed Mode Simulation of SRAM Cell Upset

An important issue in the design of embedded current monitors concerns their accurate and reliable calibration. This issue is aggravated in the case of asynchronous BICS by a continuous distribution of the input transient currents generated by radiation-induced charge collection in the sensitive areas of the SRAM. Large safety margins may imply significant rates of false alarms that may severely reduce system performance.

SPICE simulations allowed us to electrically characterize the SRAM cell and BICS performance (i.e., speed and upset detection sensitivity, respectively). However, electrical simulators such as SPICE cannot accurately model the fast charge generation and collection processes at collapsed transistor drain junctions and derive a realistic transient behavior of the MOS transistor. This has led us to an optimistic estimation of BICS sensitivity to upsets. Mixed-mode, device and circuit level simulators are needed to replace conventional electrical simulation using SPICE in order to model the charge generation and collection processes with higher accuracy [137]. Three-dimensional mixed-mode simulation using DAVINCI, performed in cooperation with IXL Bordeaux, gave us a more realistic estimation of the upset-induced voltage and current perturbation in a current-monitored SRAM column. Thus, we were able to calibrate the BICS sensitivity in order to obtain reliable detection of all the upset occurrences with reduced false alarm rates.

The simulated SRAM cell schematic is presented in Figure 3.19. $V_{DD}$ and $V_{SS}$ -referenced transistors T7, T8 used as current sensing elements have been modeled by resistor values $R_H$ , $R_L$ . Parasitic virtual supply line capacitors $C_H$ , $C_L$ have been added.
The two access transistors T5, T6 and the current comparators BICS $_H$ , BICS $_L$ represented in the figure have not been used in the simulation. The process assumptions were based upon the 1.2 µm CMOS technology used for chip fabrication, with a heavily doped p+ substrate and a lightly doped 12 µm p-type epitaxial layer. The drain junction of p-channel transistor T3 has been simulated at device level with five values of ionization densities between 0.1 and 0.15 pC/mm. The second cell node voltage and the virtual supply line perturbation characteristics are shown in Figure 3.20.

Analyzing Figure 3.20b, we can make an interesting observation: the highest peak voltage at the virtual $V_{DD}$ node is obtained for the lowest charge density applied. This lower amount of deposited charge did not upset the cell and led to a slower charge removal process, as indicated by the rightmost position of its peak amplitude point. Longer conduction time intervals are thus obtained for the memory cell inverters, additionally charging the virtual supply node capacitance and increasing the supply node perturbation. Hence, collected charges lower than $Q_C$ induce larger transient current pulses. The time delay for the peak perturbation occurrence decreases progressively with the increase of the collected charge.

Compared with these results, SPICE simulation waveforms show perturbation amplitudes about two times larger and an added 0.8 ns delay. These characteristics have been subsequently used to adjust the BICS sensitivity for reliable detection of faster transient current pulses with significantly lower peak amplitudes. These challenging performance constraints require accurate analog circuit design techniques for BICS. In order to ensure reliable operation and performance reproducibility with temperature, voltage and transistor parameter variations, large memory columns may require multiple-BICS insertion, thus increasing the design complexity and area overhead.
Figure 3.19 SRAM cell with current sensing elements used for mixed mode simulation.

Figure 3.20 Mixed Mode 3D Simulation results for charge injection at T3 transistor drain: (a) opposite node voltage, (b) virtual $V_{DD}$ line perturbation.

### 3.4.7 Test and Characterization Techniques using Current Injection

The upset detection capability of BICS circuits has been characterized using on-chip implemented BIST resources: two programmable current pulse generators have been implemented on the prototype chip to mimic the SEU-induced current pulse at sensitive nodes of two predefined memory cells in the array. Either positive or negative current pulses are generated, with variable peak amplitudes up to 3 mA and 3 ns duration. By implementing such a mechanism inside the chip, we were able to "emulate" the SEU effects in two memory cells (one cell at a time or concurrently). However, this test is far from being very accurate, since the added parasitics modify the cell's capacitance, and the current pulse widths are significantly larger than those generated by heavy ion impact. Nevertheless, it helps us to verify the correct BICS behavior and to characterize its typical sensitivity and output delay prior to performing real upset tests under radiation exposure. A general description of current-mode fault injection techniques to validate fault-tolerant system architectures is presented in Chapter 5 of this thesis.

The simplified schematic of the current pulse generator is presented in Figure 3.21, and the simulated operating waveforms are presented in Figure 3.22. A typical delay of 16 ns has been measured between the occurrence of the simulated upset and the global error output, where the contribution of the pad I/O buffers is about 8 ns. All the upsets were detected, including those injected during read/write operation. The minimum current pulse amplitude that generates upsets has been estimated with SPICE simulation and qualitatively verified during testing.
Its value is 350 µA for 1-to-0 node upset ( $Q_C$ = 0.52 pC) and less than 1.7 mA for 0-to-1 node upset ( $Q_C$ = 2.55 pC).

Fig. 3.21 Schematic of the transient current pulse generator

The rising edge of the input control signal Upset, validated by the selection signal Sel=1, activates a sharp negative current pulse at the output, which will produce an upset-to-0 of the target cell's sensitive node. The falling edge of the Upset signal, validated by the Sel=0 condition, activates a positive current pulse at the same output. The width of the current pulse is determined by the delay of an inverter chain with capacitive loads. The shape of the current pulse is generated by switching currents with externally controlled amplitudes, which charge and discharge the capacitors $C_L$ , $C_H$ . The sharp rising and the slower falling edges of the current pulses are created using transistors $T_1$ , $T_4$ to quickly discharge the capacitors $C_L$ and $C_H$ , which are then recharged by the currents $I_{DL}$ and $I_{DH}$ . The analog input control voltages $V_l$ , $V_h$ control the amplitude of the generated current pulses. These pulses are deterministic events, since they are always injected in the same memory cell node, at the occurrence of the *Upset* activating signal. Their amplitude and shape are controlled, in order to test and characterize the upset detection capability of the current monitoring circuits.

Figure 3.22 Simulated operating waveforms for the built-in current pulse generator

Figure 3.23 presents measured waveforms of error detection delay with injected upsets and the access time characteristics of the current-monitored SRAM (a) with BICS and (b) without BICS, for different operating supply voltages. A small access time degradation is observed due to the BICS insertion. The measured access time ranges from 19.1 ns at $V_{DD} = 6$ V to 26.7 ns at $V_{DD} = 3.5$ V.
Figure 3.24 presents the access time characteristics measured on a digital oscilloscope.

Figure 3.23 Measured SRAM performance characteristics

Figure 3.24 SRAM access time output waveforms observed on a digital oscilloscope

Figure 3.25 presents the oscilloscope waveforms of a measurement sequence simulating the upset detection and correction algorithm, which has been tested successfully on the prototype circuits. The test sequence starts with a first group of 4 normal read/write cycles, W1/R1/W0/R0, into an arbitrarily selected cell, followed by an upset injected into a predefined memory cell location. An 8-step error detection and correction algorithm is then simulated. Each step of the algorithm is accomplished during a single clock period, marked 1 to 8 in Figure 3.25. The flowchart of the exercised test sequence is described in Figure 3.26.

The upset detection and correction routine is started by the activated ERR signal. During the first step, a sequential read routine is performed on the internal error latches, in order to detect the column containing an error latch flagged by the upset. The second step of the algorithm verifies whether the source of the upset in the flagged column is a permanent fault. Resetting the error latch and then testing its initialized status checks the existence of permanent faults in the circuit.

Figure 3.25 Operating waveforms for upset detection and correction

The subsequent routine (step 3) reads all the memory words in the column indicated by the latch and detects the memory word whose parity is affected by the upset. Steps 4, 5, 7 and 8 perform the error correction in the memory word and the column error latch, and check the word parity and the initialized status of the latch after the correction. Step 6 is an extension of the upset detection and correction algorithm that detects the occurrence of multiple errors.
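The core of the routine described above (locate the flagged column, find the word with a parity error, flip the bit, reset the latch) can be sketched in Python as follows. This is a simplified software model under stated assumptions, not the prototype's actual sequence: the memory is modeled as a hypothetical list of even-parity words whose bit positions map to monitored columns, the permanent-fault check of step 2 and the multiple-error extension of step 6 are reduced to early returns, and all names are illustrative.

```python
from typing import List, Optional, Tuple


def parity(word: List[int]) -> int:
    """XOR of all bits in the word; 0 means even parity holds."""
    p = 0
    for bit in word:
        p ^= bit
    return p


def correct_upset(
    memory: List[List[int]], error_latches: List[int]
) -> Optional[Tuple[int, int]]:
    """Locate and repair a single upset; return (row, column) or None.

    Assumed model: memory[row][col] holds one bit, each row is an
    even-parity word, and error_latches[col] is the BICS flag of a column.
    """
    # Read the error latches to find the flagged column.
    flagged = [c for c, flag in enumerate(error_latches) if flag]
    if len(flagged) != 1:
        return None  # nothing flagged, or several columns: not handled here
    col = flagged[0]

    # Read all words and look for the one with a parity violation.
    bad_rows = [r for r, word in enumerate(memory) if parity(word) != 0]

    # Reset the column error latch before resuming operation.
    error_latches[col] = 0

    if len(bad_rows) != 1:
        return None  # false alarm (no parity error) or multiple errors

    # Correct the bit at the intersection of the word and the flagged column.
    row = bad_rows[0]
    memory[row][col] ^= 1
    return (row, col)


if __name__ == "__main__":
    mem = [[1, 0, 1, 1, 1], [0, 0, 0, 0, 0], [1, 1, 0, 0, 0], [0, 1, 1, 1, 1]]
    latches = [0, 0, 0, 0, 0]
    mem[2][3] ^= 1   # emulate an upset in row 2, column 3
    latches[3] = 1   # the column BICS flags the transient
    print(correct_upset(mem, latches))
```

The column flag narrows the search to one bit position, so a single parity bit per word is enough to pinpoint and correct the upset cell in this model.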
Multiple logic errors are undetectable if they occur in different bits of the same memory word, or in different bits of different words within the same memory column. However, the column multiplexing architecture in a large memory array makes such an event impossible to occur due to a single particle hit. All the other multiple errors can be corrected using the proposed algorithm. Uncorrectable errors (i.e., permanent hard faults or soft faults) may require a different action (for example, signaling the system and invalidating the memory block for further operation).

Fig. 3.26. Flowchart of the error correction sequence

In the deterministic test routine, exercised on the prototype at 20 MHz on a TEK LV500 ASIC verifier, all the steps, including those requiring sequential read operation on memory column words (step 3) and the line of error latches (step 1), are reduced to a single clock period. The test frequency is not limited by the performance of the RAM, but rather by the limitations of the loop test sequence programming on the ATE. The access time observed on a digital oscilloscope is 19 ns, hence the maximum operating speed attainable is about 50 MHz.

The upset sensitivity of the memory cells has been characterized using the on-chip programmable upset current generators. Accurate and reliable upset detection operation and a low signal delay of 16 ns on the global error output have been observed (see Figure 3.23). The current pulse generators use an external voltage to control the current amplitude. The minimum current pulse of triangular shape and 3 ns duration that generates upsets has been determined through simulation and has been qualitatively verified during testing. Its value is 350 µA for 1-to-0 node upset and less than 1.7 mA for 0-to-1 upset. The corresponding critical charge deposited by the upset current on the memory cell node is 0.52 pC for 1-to-0 upset and 2.55 pC for 0-to-1 upset.
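The critical charge values quoted above are consistent with the area of a triangular current pulse, $Q = \frac{1}{2} \cdot I_{peak} \cdot t_{pulse}$ . The following Python sketch simply reproduces that arithmetic for the 3 ns pulses; the function name is hypothetical, and the triangular-area formula is the only modeling assumption.

```python
def triangular_pulse_charge(peak_current, duration):
    """Charge carried by a triangular current pulse, in coulombs."""
    return 0.5 * peak_current * duration


if __name__ == "__main__":
    duration = 3e-9  # pulse duration from the text, in seconds
    for label, peak in (("1-to-0", 350e-6), ("0-to-1", 1.7e-3)):
        q = triangular_pulse_charge(peak, duration)
        print(f"{label} upset: Q = {q * 1e12:.3f} pC")
```

The 350 µA pulse gives 0.525 pC, which the text reports as 0.52 pC, and the 1.7 mA pulse gives 2.55 pC.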
The upsets induced by these pulses are successfully detected, thus guaranteeing the detection of any other upsets induced by stronger pulses.

The memory has been successfully tested for different sensitivities of the current sensors by controlling their input resistance. No measurable access time degradation has been observed due to BICS insertion on the supply current path. This behavior is explained as follows. For read operations, the parasitic capacitances of the column power supply lines are large enough to supply the initial current pulse flowing through the cell and required to attain the voltage level detectable by the sense amplifiers. On the other hand, during the write operation, the state of the memory cell is changed by the currents supplied by the write amplifiers through the bit lines. The current flowing through the cell to the column power supplies opposes the state change. Thus the reduction of this current due to the insertion of the BICS is beneficial for the write operation speed. Noise margins on the cell array cannot be measured, but simulations showed a 500 mV worst case degradation of the supply voltage level on the memory columns, as previously shown in Figure 3.17. BICS operation has been successfully verified on all the tested prototypes for a wide range of supply voltages, between 3.5 and 6 V (see Figures 3.23 and 3.24).

### 3.4.8 Radiation Test Results

A set of irradiation tests has been performed at two radiation facilities: a <sup>252</sup>Cf radiation source available at ONERA/CERT/DERTS (Toulouse) and the Tandem Van de Graaff accelerator at the Institute of Nuclear Physics, IPN, Orsay. Californium-252 ( <sup>252</sup>Cf) is a man-made isotope that fissions spontaneously and gives off high-Z fission products [133,134]. The <sup>252</sup>Cf source delivers a flux of ~500 particles/sec/cm<sup>2</sup>, with a linear energy transfer (LET) on the order of 41-45 MeV cm<sup>2</sup>/mg.
The second irradiation test was performed using an accelerated particle beam of higher fluence (Bromine ions, 1600 part/sec/cm<sup>2</sup>, LET=36 MeV cm<sup>2</sup>/mg). Two versions of the designed prototype have been used, one with common substrate connection, the other one with isolated substrate. The results of the second test have confirmed full current mode detection of all the encountered upsets. The status of the error detection latches has been read periodically at 1 sec. time intervals. The existence of bit flips in the flagged upset columns has been verified during the test through exhaustive reads of the memory array at the same regular time intervals of 1 sec. However, the first test showed that inaccuracies in current sensor calibration may invalidate the result. None of the 26 SEUs occurring during the irradiation with <sup>252</sup>Cf (duration 1500 sec, flux=150 particles/sec) were detected. The upset detection sensitivity has been shown to rely heavily on critical BICS sensitivity calibration. We have identified narrow operating ranges of the resistive sensing element with the variations of bias currents. This may require adaptive on-line procedure implementation to compensate the effects of parameter variations with temperature, supply voltage and cumulated radiation dose effects. In a second experiment performed on the same prototype, lower external bias voltages $V_{PH}$ and $V_{PL}$ have been applied on the gates of the current sensing transistors. The values of their resistances have been increased to increase current detection sensitivity. The second irradiation test, performed using Bromine ions, has shown that the augmentation of BICS sensitivity in order to detect reliably the critical charge upsets also increases drastically the noise effects on BICS operation. Excessive false alarm rates have been detected, that accounted for more than 80% of the detected events in the common-substrate RAM architecture. 
Typically, false alarms are induced by upsets occurring in the error latches or in the BICS comparator itself. Since the sensitive junction areas for false alarms represent in our case less than 5% of the total sensitive area in the RAM cell array, a false alarm rate of this order of magnitude would be acceptable. The measured false alarm rates are ten times higher, in the range of 50%-85%, and two mechanisms have been found responsible.

In the common substrate prototype, the large reverse-biased well-to-substrate junctions in the memory columns are connected to the input of the $V_{DD}$ -referenced BICS circuit. Particle impacts that reach this sensitive region generate a significant charge collection, resulting in the large rate of false alarms. This physical limitation has been removed in the second prototype.

The second mechanism of false alarms is due to the variable distribution of collected charge at the upset sensitive nodes. Charge values lower than the critical charge correspond to slower diffusion mechanisms from impact areas of the energetic particles that are peripheral to the sensitive junctions. These perturbations lead to a relatively slow restoration of the cell's logic state compared to the fast flipping induced by critical charge values. The corresponding currents generated are generally higher than the upset-induced currents, and consequently are detected by BICS as false alarms.

The results of this test can be summarized as follows:

- 22 SEUs have been detected,
- 107 "false alarms" were observed,
- no undetected SEU occurred.

From these results we can conclude the full detection of the 22 encountered upsets (measured cross section = $2.5 \times 10^{-4}$ cm<sup>2</sup>/device). On the other hand, false alarms are about 5 times more frequent than detected upsets.
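The event counts summarized above can be cross-checked with a few lines of Python. The sketch only restates the reported counts (22 detected SEUs, 107 false alarms) and derives the two ratios mentioned in the text; the function name is hypothetical.

```python
def false_alarm_statistics(detected_seus, false_alarms):
    """Return (false alarms per detected SEU, false-alarm share of all events)."""
    total_events = detected_seus + false_alarms
    return false_alarms / detected_seus, false_alarms / total_events


if __name__ == "__main__":
    per_seu, share = false_alarm_statistics(detected_seus=22, false_alarms=107)
    print(f"false alarms per detected SEU: {per_seu:.2f}")
    print(f"false-alarm share of all events: {100.0 * share:.1f}%")
```

The share of false alarms among all flagged events falls inside the 50%-85% range quoted above, and the roughly five-to-one ratio of false alarms to detected upsets is what drives the error processing overhead.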
This restricts the use of the proposed current monitoring technique to radiation environments with relatively low event rates to allow for the significant increase in error processing overhead.

#### 3.5 Conclusions

In this chapter we have described current monitoring techniques to detect permanent faults in high speed CMOS combinational circuits and to correct transient faults in sequential CMOS circuits. Two circuit prototypes (a parallel multiplier and a static RAM) have been developed in order to analyze the effectiveness of this approach.

The first prototype is a self-checking 8x8 bit parallel multiplier with embedded current sensors which also employs conventional two-rail logic coding and an output checker. The circuit prototype operates at 25 MHz with reliable on-line detection of injected faulty currents. Self-checking circuit design approaches using current-mode checkers compare favorably with two-rail code checkers on area overhead, power and fault coverage, and fairly well on speed.

The second prototype is a 1Kx1 bit CMOS SRAM that combines on-line current monitoring to detect and locate upsets at column level with parity coding for error correction. The circuit uses an asynchronous current sensor for transient current detection based on fast current comparator architecture. It achieves detection delays lower than the minimum clock cycle and fast correction algorithms based on a single-column read sequence with parity check. Its operation exhibits low sensitivity to supply voltage variations and has no measurable degradation of RAM's operating speed. This technique achieves lower area overhead compared to Hamming SEC/DED coding [27].

Heavy ion tests have been performed to assess the hardness of the designed prototype for space radiation environment. They validated the concurrent detection and correction of the injected upsets.
Prototype evaluation shows that all the upsets induced during the memory quiescent state are detected, as well as most of the cell upsets induced during the active memory R/W cycles. The noise margin degradation is reduced to 500 mV compared to more than 1 V for the voltage glitch detector described in [31], ensuring reliable memory operation and no performance degradation.

An innovative technique to test and validate the circuit's operation is also described, using small on-chip circuitry to inject programmable current pulses in selected cell locations. This technique has been successfully used to evaluate BICS sensitivity and circuit performance.

A 1k bit SRAM prototype circuit implementing this technique has been designed and processed using standard commercial 1.2 µm bulk-epi CMOS technology. A built-in upset-test simulator circuit is also implemented in the designed SRAM. It generates current pulses of programmed amplitude on predefined memory cells, allowing test and characterization of the upset-tolerant RAM. Simulated upset tests performed on the prototype chips successfully validate the upset detection capability with no significant change in access time and operating frequency.

The described principle of $I_{DDQ}$ monitoring in storage element arrays can equally be used to detect permanent faults inducing high static supply currents and to detect upsets induced by coupling faults, electromagnetic noise or high energy particle radiation in terrestrial and space applications.

# **Chapter 4**

# **Fault-Tolerant CMOS Architectures Using Local Redundancy**

#### 4.1 Introduction

On-line fault detection using current monitoring, described in the previous chapter, is a global observability test technique typically applicable to regular, structured CMOS subsystem arrays with low switching activity levels. It has shown limited effectiveness for high performance system applications operating in harsh environments and subjected to high transient error rates.
Its performance in low voltage and low power system applications has reduced significance, due to the inability to discriminate fault-induced abnormal currents from harmless operating current noise. However, its key advantage resides in the ability to detect hidden faults that cannot be covered by voltage testing. Complex calibration strategies are required to ensure reliable operation of the current monitors and thus to achieve effective fault detection in application. Concurrent error detection at low subsystem level (i.e., memory block/column) allowed us to obtain an optimum cost/effectiveness trade-off. However, current-monitoring techniques cannot be applied to control-dominated sequential circuit architectures with high switching activity levels. Here they become prohibitively complex, costly and unreliable, and the error latency conflicts with circuit's safety constraints and high speeds operation. Relying on system resources to remove soft errors at singular nodes is not the best way to cope with random, transient internal faults. This may become critical in sequential CMOS control logic due to fast error propagation to vital system areas. Design hardening at storage element level is the only viable solution to this fault scenario for commercial, unhardened CMOS processes. The reduced effectiveness of existing hardening approaches with deep submicron CMOS processes, correlated with their increased sensitivity to upsets, put tremendous challenges on the projected use of advanced, high performance systems-on-chip in space applications [95][103]. In this chapter we analyze the main constraints imposed by space radiation environment on advanced CMOS system operation, and subsequently describe novel circuit design techniques that ensure transient fault tolerance of high-performance, time-critical CMOS ASICs. These techniques employ circuit redundancy at lowest granularity level (i.e., combinational logic gates and sequential storage elements). 
Prototype circuit implementation data and preliminary radiation test results are also provided.

The effects of transient failures whose duration is comparable to the system's clock period (such as high frequency noise, coupling faults or soft errors) are generally located in, or can be traced to, isolated internal storage elements. Transient faults that alter the stored information in a synchronous sequential circuit have the important property of being confined to single storage latch locations throughout the clock period of their occurrence. Based on this property, we have devised transient fault tolerant (TFT) sequential circuit architectures using redundant latches and perturbation-immune combinational circuit design strategies. They ensure the survival and uninterrupted error-free functioning of vital system areas in safety- and time-critical applications operating in harsh, high-rate upset environments.

# 4.2 Radiation-Induced Reliability Failures in Deep Submicron CMOS

There are two major classes of degradation mechanisms affecting electronic life-time and reliability [74-76]: the electrical reliability issues and the environmental reliability issues. Examples of the electrical degradation mechanisms are latchup, electrostatic discharge, hot carrier effects, thin dielectric breakdown and electromigration. Examples of the environmental reliability issues are radiation-induced effects, thermal and mechanical stress, and corrosion. Our analysis is focused on a specific transient fault mechanism with increasing impact on advanced submicron CMOS processes: the soft errors induced by radiation (i.e., $\alpha$-particles for commercial ICs and heavy ions, protons, neutrons, etc. for nuclear and space applications) [128-129][135].
The recent impetus to future development of satellite-based global communication systems, driven by the developments in multimedia networks and wireless communications, motivated our research towards satisfying the specific reliability constraints imposed by space radiation environments to submicron CMOS microelectronic systems. A short overview of the space radiation environment, its effects on microelectronic systems and the means to estimate, assess, measure and counteract these effects is first presented. Circuit and system level design techniques for deep submicron CMOS ICs are then described and analyzed, that provide high levels of immunity to radiation-induced upsets as well as noise and transient fault tolerance for terrestrial applications operating in harsh environments. #### 4.2.1 Space Radiation Environment for Microelectronics Three primary radiation components of the natural space environment affect CMOS devices [74, 78, 82, 93]. First, planetary magnetic fields trap belts of high-energy protons and electrons, thus subjecting satellites to large fluxes of these particles when they pass through the radiation belts. Second, galactic cosmic rays occur everywhere in space. These highly energetic particles, with a wide range of atomic numbers, exist in a very low flux compared to the number of particles in the radiation belts. However, a single galactic ray can deposit sufficient charge in a modern integrated circuit to change the state of internal storage elements and may also cause more complex internal behavior. Third, solar flares produce varying quantities of electrons, protons and lower energy charged particles. Solar flare activity varies widely at different times. During periods of high solar activity, very high fluxes of particles may occur over time periods of hours or days. Table 4.1 below summarizes the three components of the natural space environment along with their primary effects on CMOS devices. 

| Radiation Source        | Particle Types                 | Primary Effects in Devices                  |
|-------------------------|--------------------------------|---------------------------------------------|
| Galactic cosmic rays    | High-energy charged particles  | Single-event effects (SEE)                  |
| Trapped radiation belts | Electrons                      | Ionization damage                           |
|                         | Protons                        | Ionization damage; SEE in sensitive devices |
| Solar flares            | Lower energy charged particles | SEE                                         |
|                         | Electrons                      | Ionization damage                           |
|                         | Protons                        | Ionization damage; SEE in sensitive devices |

Table 4.1 Natural space radiation environment and its effects on semiconductors

#### 4.2.2 Radiation Effects on Advanced CMOS ICs

Interaction of silicon CMOS ICs with radiation involves three fundamental mechanisms [79,84,85,87]. One is the creation of displacement damage in the lattice structure, which can be caused by high-energy neutrons and charged particles. Displacement faults can occur when particles are absorbed or scattered by nuclei in the semiconductor material and dislodge atoms from the lattice, leaving vacancies and creating interstitial states. They generally lead to small but cumulative parameter degradation. Another is ionization (i.e., the creation of dense electron-hole populations through energy transfer at the impact areas), usually due to high-energy charged particles, X-rays and gamma rays. A third mechanism concerns nuclear interaction of the material with neutrons or charged particles. The byproducts of the reaction are usually energetic secondary particles that produce effects equivalent to those of the incoming particles (i.e., displacement damage or ionization).

Total radiation dose (TRD) in a semiconductor area can produce cumulative effects, particularly in oxides on the chip surface, resulting from trapped charge in lattice defects or isolated regions, and from interface state generation [86, 87]. Biased gate oxides in MOS transistors are the primary regions of radiation-induced cumulative device and circuit failures.
The results are shifts in device parameters such as threshold voltages, an increase of leakage currents and a degradation of carrier mobility. In the case of n-channel transistors, negative shifts of the threshold voltages occur that increase the subthreshold leakage in the "off" state and may lead to partially conducting states at zero volts gate bias. P-channel transistors exhibit smaller, positive shifts of the threshold voltages. Radiation-induced leakage currents are also caused by trapped charges in the isolation oxides surrounding the MOSFET that activate parasitic transistor areas. These leakage currents can become a dominant failure mechanism in complex CMOS VLSI circuits, especially in large semiconductor static and dynamic RAMs, where a large increase in standby leakage current may lead to information loss and major power drain on the system. Radiation dose rate and temperature can be important components of the cumulative effects. Annealing with time or temperature can partially reverse the effects by detrapping the charge or healing interface states [80]. Annealing effects, which may take hours to years, occur concurrently with rebound phenomena that can further degrade parts after radiation exposure has ceased, due to continuous charge migration in the oxide layers [81].

High-energy particle impact on sensitive silicon areas may also induce transient faults in CMOS ICs. Reverse-biased p-n junctions that isolate MOS transistor drains from bulk silicon areas are the regions most sensitive to radiation-induced rate or transient faults. The carrier plasma produced by ionization either recombines quickly or gives rise to charge transport phenomena, driven by local potential fields in the semiconductor, that generate fast drift or slower diffusion currents. Charge collection at the parasitic junction capacitors of internal circuit nodes may inadvertently change, for short time intervals, the internal voltage in the circuit.
These transients may subsequently change the electrical behavior of the MOS transistors in digital and analog circuits. As a result, they may induce loss of the information stored in memory cells, abnormal system operation and permanent circuit damage. The instantaneous rate of ionization charge transport through conductive device areas and the induced voltage transients may also produce single-event latch-up (SEL) faults, which consist of the activation of a parasitic thyristor mechanism in CMOS structures.

Most LSI circuits used in space applications are made using CMOS processes. This is due to their general performance: high integration density, low power dissipation, and high noise immunity. Cumulated radiation dose induces long term effects on CMOS ICs operating in a space radiation environment that may reduce the planned duration of the mission. Hardened processes use special processing techniques to reduce the gate and field oxide effects. On the other hand, transient radiation or single event effects (SEE) such as upset or latchup phenomena may have a drastic impact on mission performance and survivability [83]. Both latchup and total dose effects can be reduced to acceptable levels using some of the existing commercial CMOS technologies (e.g., bulk-epi processes, CMOS/SOI). Deep submicron CMOS processes exhibit significant improvements in both total dose and latchup hardness due to thinner gate oxides and improved device layout and isolation topologies [98].

Single event upsets (SEU) represent the main hazard affecting CMOS circuit operation in space applications. They have a significantly higher rate of occurrence and lower sensitivity thresholds in advanced submicron CMOS technologies. The minimum (i.e., critical) charge value $Q_c$ required to induce a soft error in a typical cross-coupled inverter latch with minimum size transistors decreases with the square of transistor feature size L [101] (Figure 4.1).
This dependence is similar for various technologies such as bipolar, CMOS/bulk, CMOS/SOI or GaAs.

Figure 4.1 Variation of critical charge with technology feature size and corrected curve for the submicron region (dotted line) [101].

However, recent experimental data reveal a flattening of the upset sensitivity curve for feature sizes below 1 $\mu$m. These results are also confirmed by our tests performed on a prototype circuit processed with a commercial 0.25 $\mu$m CMOS technology [94]. This can be explained by a detailed analysis of charge collection and geometrical effects in submicron devices. This analysis takes into account the increased effectiveness of charge removal and the decreased effectiveness of charge collection processes [93]. A practical way to examine scaling effects is to compare the switching charge obtained by circuit simulation for a logic transition in a minimum size CMOS inverter, as shown in Figure 4.2 [90,98]. As devices evolve, the charge required to switch the inverter decreases nonlinearly for different scaling scenarios.

Fig. 4.2 Effects of scaling on switching charge of a CMOS inverter with minimum feature size [90,98].

#### 4.2.3 SEU Modeling and Rate Prediction

SEE occur via stochastic processes driven by the random incidence of ions of various species, energies and angles of incidence in the space environment. The direct ionization process is characterized by two pairs of variables, one related to the environment and one related to the microelectronic system. The environment variables are the particle fluence and the linear energy transfer (LET), which is a measure of the energy deposited per unit track length. This measure is proportional to the square of the particle atomic number and inversely proportional to its energy. Particle LET also depends on the semiconductor material and is expressed in MeV cm²/mg units. Each 3.6 eV of deposited energy generates an electron-hole pair in silicon.
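As a rough illustration of the units involved, the sketch below converts a LET value into the charge generated per micron of track in silicon, using the 3.6 eV pair-creation energy quoted above. The silicon density (2.33 g/cm³), the function names and the example LET and collection depth are assumptions introduced here for illustration; they are not taken from the measurements reported in this work.

```python
# Illustrative unit conversion: LET (MeV cm^2/mg) -> generated charge in silicon.
E_PAIR_EV = 3.6               # energy per electron-hole pair in silicon (from the text)
SI_DENSITY_MG_CM3 = 2330.0    # assumed silicon density, 2.33 g/cm^3
Q_ELECTRON_C = 1.602e-19      # elementary charge in coulombs


def charge_per_micron_fc(let_mev_cm2_mg):
    """Charge generated per micron of track length, in fC/um."""
    mev_per_cm = let_mev_cm2_mg * SI_DENSITY_MG_CM3   # energy loss per cm of track
    ev_per_um = mev_per_cm * 1e6 * 1e-4               # MeV -> eV, cm -> um
    pairs_per_um = ev_per_um / E_PAIR_EV              # electron-hole pairs per um
    return pairs_per_um * Q_ELECTRON_C * 1e15         # coulombs -> femtocoulombs


def generated_charge_pc(let_mev_cm2_mg, depth_um):
    """Charge generated along a track of given length, assuming constant LET."""
    return charge_per_micron_fc(let_mev_cm2_mg) * depth_um * 1e-3


if __name__ == "__main__":
    # Hypothetical example: LET = 10 MeV cm^2/mg over an assumed 2 um collection depth.
    print(round(charge_per_micron_fc(1.0), 2), "fC/um per unit LET")
    print(round(generated_charge_pc(10.0, 2.0), 3), "pC")
```

Under these assumptions, a unit LET corresponds to roughly 10 fC of generated charge per micron of track, which gives a convenient scale for comparing particle LET with critical charge values.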
Figure 4.3 presents a sample LET spectral distribution of particle fluences for a typical 400 km low Earth Orbit (LEO). Various spectral energies and fluences of interest characterize the high altitude avionics environment (i.e., tens of km altitude) as well as elliptical and geostationary orbits [91,92]. The microelectronic system variables are (1) the minimum (critical) charge $Q_{\rm c}$ for a single-event occurrence (i.e., upset or latchup) and (2) the sensitive volume (SV) modeled as a rectangular parallelepiped defined by the sensitive junction area times the charge collection depth. Chord-length distributions are used to calculate the number of ion interactions that can collect the critical charge amount $Q_c$ for ion environments expressed in terms of a LET distribution [96]. A constant LET is assumed along the particle track. Typically there will be more than one SV per storage cell with different geometries and different thresholds for upsetting the latch. The predicted upset rate is the product of two factors: an effective area on the chip that will cause the effect, (i.e., cross section or CS), and the flux of ions in the environment whose LET is beyond a threshold value associated with the critical amount of collected charge $Q_c$ for upset (i.e., LET threshold or $L_t$ ). Fig. 4.3 Sample particle fluence distribution for a 400 km Low Earth Orbit (LEO) The sensitive CS of a device is the ratio of the upset count to the particle fluence: $$\sigma \left[ cm^2 \right] = \frac{N_{error}}{F_p} \tag{4.1}$$ For each individual SV we assume a sharp threshold level $L_t$ for upset and a well defined saturated upset cross-section, where a constant error count is obtained for $LET_i > L_t$ at constant particle fluences. 
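The sharp-threshold model described above can be sketched numerically as follows. This is a minimal illustration under simplifying assumptions: a single sensitive volume, a saturated cross section, and an environment given as a short list of LET bins with per-bin fluxes. The spectrum values, the error count and the function names are hypothetical, and the chord-length integration is deliberately omitted.

```python
def cross_section_cm2(n_errors, fluence_per_cm2):
    """Equation (4.1): sigma = N_error / F_p, in cm^2."""
    if fluence_per_cm2 <= 0:
        raise ValueError("fluence must be positive")
    return n_errors / fluence_per_cm2


def flux_above(let_threshold, spectrum):
    """Sum the flux of all bins whose LET exceeds the threshold L_t.

    spectrum: list of (LET in MeV cm^2/mg, flux in particles/(cm^2 day)).
    """
    return sum(flux for let, flux in spectrum if let > let_threshold)


def upset_rate_per_day(sigma_sat_cm2, let_threshold, spectrum):
    """Sharp-threshold estimate: rate = saturated cross section x flux(LET > L_t)."""
    return sigma_sat_cm2 * flux_above(let_threshold, spectrum)


if __name__ == "__main__":
    # Hypothetical environment and test data, for illustration only.
    spectrum = [(1.0, 1000.0), (5.0, 50.0), (20.0, 2.0), (40.0, 0.5)]
    sigma = cross_section_cm2(n_errors=120, fluence_per_cm2=1e7)
    print(sigma, "cm^2")
    print(upset_rate_per_day(sigma, let_threshold=10.0, spectrum=spectrum), "upsets/day")
```

The point of the sketch is only that the rate scales linearly with both factors: halving the sensitive area or raising the threshold above the populated LET bins reduces the predicted rate accordingly.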
However, as circuits become faster and more complex, the $Q_c$ concept loses its validity since 1) a wide statistical distribution of node sensitivities implies multiple, weighted thresholds for node vulnerability and 2) charge disturbance and restoration compete on the same time scale, making a perturbing current or voltage waveform the best descriptor of an event [93]. Moreover, high-speed synchronous circuits with high switching activity levels spend a significant fraction of the duty cycle in transition, further widening the distribution of upset effects.

#### 4.2.4 SEU Testing

A recent review of the available means to simulate the space environment for IC in-flight performance and reliability assessment through ground testing is provided in [96]. Upset sensitivities are determined by bombarding the chip with a unidirectional, monoenergetic, single species ion beam in an accelerator test. The ions randomly probe the SVs, giving an averaged chip response that can be measured. The measured cross section per chip is the ratio of detected upsets to the ion fluence, as defined by equation (4.1). A current practice to obtain wider LET ranges with a limited set of monoenergetic particles is to adjust the angle of incidence. The track length and the corresponding effective LET are multiplied by the secant of the angle of incidence:

$$LET_{eff} = LET_i \cdot \sec\theta = LET_i \cdot \frac{1}{\cos\theta} \tag{4.2}$$

The variation of CS with particle LET characterizes the circuit's sensitivity to radiation-induced single event effects. A typical LET variation curve of the cumulated CS area of a tested circuit, $\sigma(\text{LET})$, is presented in Figure 4.4. The measured data are analyzed and used to predict the upset rate when the chip is placed in the omnidirectional distribution of ions and energies in space.

Figure 4.4 Typical measured upset sensitivity characteristic

#### 4.2.5 Upset Tolerant Design vs. Transient Fault Tolerance

SEU-tolerant CMOS design may be considered as a special case of transient fault tolerance, where the induced perturbation has unique characteristics: *fast*, subnanosecond transient currents, with *high rate* of occurrence at *random time* instances and *arbitrary single node* locations in the circuit [21]. As a result, a typical SEU fault is initially confined to a single bistable circuit, such as a memory or register cell. Depending on system operation, an upset occurrence may exhibit an error latency property if it induces a faulty behavior only after a quantifiable latency delay.

In order to satisfy time-critical system safety and reliability constraints with minimum cost and reduced impact on system performance, we propose in this work a synergetic SEU-hardening strategy that selectively applies different design hardening methodologies to specific circuit functions and system areas. This strategy consists of three main fault-tolerance analysis and design procedures applicable at different levels of circuit and system representation: upset criticality analysis, upset immune design and optimization, and system cost/performance impact assessment. Our research results presented in this thesis concern the main, middle stage of this strategy.

Upset criticality assessment techniques [89] attempt to sort possible SEUs into three relevant categories by their relative impact on system safety, operation and performance. A first category consists of error-functional upsets for which large probabilities of occurrence are acceptable. They are confined to unused locations, to inherently self-correcting closed-loop systems and state machines, or to storage areas used by fault-tolerant software procedures that screen for bit errors without logical significance. A second category comprises error-vulnerable SEU failures, for which a low probability of occurrence is an acceptable risk.
They may reduce system accuracy or performance (e.g., erroneous scheduling of system procedures that do not impact system's reliability and subsequent availability) [40, 42]. Finally error-critical functions are those where SEU is unacceptable (e.g., vital processor registers). The upset criticality measure is dynamically analyzed to assign progressive levels of vulnerability to latent faults in stored data. Each stored information in a time and safety critical system has accurately defined processing steps and latency intervals, which are typically implemented through adequately structured system tasks. A reliable method to analyze and predict SEU effects on both hardware and software systems is by simulation. Static transient fault analysis and simulation engines [88] and dynamic mixed-mode simulation [102] are employed for this purpose. A lengthy series of tests simulating random upsets provides a statistical base from which a meaningful conclusion may be drawn. The upset sensitivity analysis procedures addressed and extensively employed in our work have been limited to conventional circuit and device level simulation. #### **4.3 SEU Hardening Techniques** Designs can be made less sensitive to upsets by reducing the sensitive area CS and by providing effective error detection and correction means. When upset-sensitivity reduction is the main target, components may be either specifically designed/processed for radiation hardness or adequately tested/screened from statistically favorable unhardened device populations [105]. Screening and selection procedures for commercial CMOS processes are based on wide statistical variations of their radiation hardness characteristics. SEU hardened processes use highly doped, lower depth (e.g. epitaxial) or isolated (e.g., SOI, SOS) substrates to reduce charge generation and charge collection effectiveness. 
Processing changes that affect material and junction properties generally involve costly and complex trimmed process steps that involve almost invariably lower yield and performance loss. Small process changes (or even unforeseen statistical variations) in existing high performance commercial processes may prove effective in providing significant increases in device hardness [88][100]. Design hardening against SEU can be accomplished at several levels: system, circuit or device. System-level upset hardening techniques are typically based on periodic or concurrent error detection and correction algorithms. Massive redundancy techniques may also be implemented. They are generally based on system-level triplication and majority voting, that add significant cost and complexity. Approaches applied at lower level provide lower error latency, higher system speed and improved system safety levels. Design, process and screening techniques to raise device tolerance to upsets may be used in conjunction, thus providing a synergy of their effects. #### 4.3.1 CMOS Circuit Design for SEU Hardness A comprehensive overview of the available design hardening techniques is provided in [99]. Two basic CMOS SEU hardening techniques are used at circuit level: storage node discrimination using latch feedback delays to filter out the upset pulse, and local redundancy techniques based on storage latch duplication with cross feedback for error recovery. The common approach to storage node discrimination for SEU hardening consists of adding internal resistor, diode, transistor and capacitor structures at the upset-sensitive nodes of a storage cell or on the propagation path of the upset pulse. This adds propagation delays that screen out the upset-induced subnanosecond pulses and dynamically isolate the two storage nodes. Larger signal pulses applied to the circuit nodes during write operation are delayed, thus inducing speed degradation to the system. 
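The filtering action of an added feedback delay can be illustrated with a first-order RC estimate. The sketch below is a simplification introduced here, not a model taken from the referenced designs: it assumes a rectangular full-swing upset pulse at one storage node, a single resistor R in the feedback path charging the capacitance C of the opposite node, and an upset when that node crosses an assumed switching threshold of half the supply. All component values and function names are hypothetical.

```python
import math


def feedback_swing_fraction(pulse_width_s, r_ohm, c_farad):
    """Fraction of the full swing reached at the opposite node after the pulse."""
    tau = r_ohm * c_farad                       # feedback RC time constant
    return 1.0 - math.exp(-pulse_width_s / tau)


def is_filtered(pulse_width_s, r_ohm, c_farad, threshold_fraction=0.5):
    """True if the pulse ends before the opposite node reaches the threshold."""
    return feedback_swing_fraction(pulse_width_s, r_ohm, c_farad) < threshold_fraction


def min_resistance_ohm(pulse_width_s, c_farad, threshold_fraction=0.5):
    """Smallest R for which a pulse of the given width just fails to flip the cell."""
    return pulse_width_s / (c_farad * -math.log(1.0 - threshold_fraction))


if __name__ == "__main__":
    # Hypothetical values: 0.5 ns upset pulse, 10 fF node capacitance.
    print(is_filtered(0.5e-9, 100e3, 10e-15))   # larger R, longer delay
    print(is_filtered(0.5e-9, 50e3, 10e-15))    # smaller R, shorter delay
    print(round(min_resistance_ohm(0.5e-9, 10e-15)), "ohm")
```

In this simplified picture the same RC constant also delays legitimate write pulses, which is the speed penalty mentioned above, and a smaller node capacitance requires a proportionally larger resistor for the same filtering.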
Figure 4.5 presents a 6-transistor SRAM cell schematic with added RC decoupling elements that may be employed to make its operation slower but harder against SEU. The RC elements are added either as individual device geometries or as parasitic components. Intra-cell coupling resistors R1-R2 implemented in polysilicon are used to slow the regenerative feedback response and are successfully used in many CMOS bulk-epi and SOI SRAM designs. They have low area overhead (0-20%) and allow implementing large capacity SRAMs (i.e., up to 1Mbit) in CMOS/SOI and epitaxial technologies and upset-hardened FPGAs invulnerable to cosmic ions. Costly, critically tuned CMOS processes are required to control the required high resistor values and to provide reproducible characteristics of high resistivity polysilicon, making this approach unattractive for many applications. T-resistive network approaches reduce the performance degradation by adding voltage divider resistors R3-R6 and lowering the values of resistors R1-R2 in the cross-coupled legs of the cell [104]. Resistive hardening may severely impact cell write operation at low temperatures. Several upset-hardened CMOS SRAM designs have been reported that replace the resistors R1-R6 in Fig. 4.5 with diodes or transistors [110]. Capacitive hardening techniques, though easier to implement than resistive hardening, have the major drawback of requiring significant area overhead.

Fig. 4.5 Upset hardened CMOS latch design using added resistive and capacitive elements

In hardening a circuit to single event effects, the designer must first establish which sections of the circuit are potentially vulnerable to upset. The regions of a MOS circuit that are sensitive to a single event are limited to the volumes within the substrate encompassing the depletion region of each strongly reverse-biased drain diffusion.
Each sensitive node should be examined under different operating conditions to determine whether a single event could disrupt normal circuit operation. If the affected node is a low $Q_{\rm c}$ data node in a bistable storage element, then the single event induced voltage transient may cause a bit-flip soft error and hence the loss of stored information. Upset-hardened circuits based on added delays for storage node discrimination ensure incremental hardness improvement. They typically increase the $Q_c$ values and exhibit reduced upset rates. If the worst-case critical charge is increased to at least 6.7 pC, i.e., the maximum charge that the largest galactic particle (150 MeV krypton) can deposit in silicon, then the circuit is upset-immune for space radiation environment conditions.

Basically, there are two ways to harden a circuit against SEUs at device level: by minimizing the amount of charge $Q_{coll}$ that can be collected by a sensitive node per event and/or by maximizing the critical charge $Q_{c}$ necessary to produce an upset. Adequate sizing of transistor layout topologies and drain junctions, and the use of structured substrate approaches, can minimize the total collected charge. Floating nodes or resistively isolated (high-impedance) nodes are susceptible to single-event upsets and should be avoided in the design of hardened circuits. Device layout topology optimization reduces the charge collection amplification effects. Enlarged design rules may ensure appropriate spacing and isolation of sensitive device areas.

Ideally, design hardening techniques should have minimal adverse effects on the circuit's performance. They should also not introduce excessive process complexities or severely reduce circuit density. Design techniques for radiation hardness are generally conservative.
They take into account severe worst-case conditions, increased design margins and derated parameter values such as fanout, slew-rate and propagation delays that account for the statistic variations of TRD effects with circuit's logic activity. An adequately weighted sum of these constraints may be used in the design, to assess more realistically the environment impact and the life-time behavior. #### 4.3.2 CMOS Logic Design for SEU Immunity Single event upsets represent the radiation-induced hazard that is most difficult to avoid in space-borne applications, particularly in high density submicron CMOS ICs. Recent experimental results have invalidated the square law increase of upset sensitivity for CMOS feature size reduction in the submicron range. However, the slower rise of CMOS storage cell vulnerability is complemented by a drastic reduction of SEU hardening effectiveness exhibited by most of the conventional approaches: CMOS/SOI processes, resistive hardened design and system-level coding for error detection and correction (EDAC). The charge collection enhancement effects due to parasitic bipolar transistor structures in thin CMOS/SOI devices determine their increased sensitivity to heavy ions [109, 110]. For resistive hardening, the reduced cell node capacitances push the intracell resistor values to the MOhm range and the speed degradation to unacceptable values. The use of error correcting codes implies prohibitively large latencies in fast memory arrays with high access rates. High frequency spaceborne applications operating at high data rates are required to implement the future global networks of data communication systems. Strongly correlated performance, safety and reliability constraints put tremendous challenges on the actually available SEU-hardened design technologies. 
Logic hardening techniques might offer a viable solution to this problem, since they rely on storage cell duplication and direct cross-feedback and ensure upset immunity through transparent, real-time error correction. Existing upset immune storage cells based on logic/circuit level hardening are designed to insure hardness against single node upsets and not just a relative improvement in SEU tolerance, as compared to other resistive or capacitive design hardening techniques. They have the main advantage of being fully compatible with standard CMOS technologies. Logic hardened redundant storage cells must satisfy the following design constraints: - (a) An SEU perturbation may affect a single latch section. - (b) Transients generated in the perturbed latch are not propagated to the second latch. - (c) The second latch restores the correct state in the first latch after the perturbation. These properties must not change with circuit topology and statistical or environmental variations of circuit parameters. The generic block diagram of a logic hardened storage cell is presented in Fig. 4.6. Two fundamental concepts were used to design SEU immune storage cells using conventional CMOS processes. First, redundancy is used in the memory circuit in order to maintain a source of uncorrupted data after an SEU. This is obtained by using two specifically designed latch sections, L1 and L2, that store the same data. Second, data in the uncorrupted section provides specific feedback to recover the corrupted data. Several upset immune cell structures have been previously developed and have been used mainly in hardening flip flops in high performance custom integrated circuits [61,62,63,71]. Figure 4.6. 
Generic block diagram of an upset-immune redundant storage cell

They basically use two main circuit design techniques for upset hardening: a) the insertion of an upset-immune redundant slave latch, made of NMOS or PMOS inverters, that stores the same data, and b) the use of state-dependent control feedback for upset recovery that employs critical, dose-sensitive ratioed inverter design. An analysis of their characteristics and performance reveals three main drawbacks: static power dissipation, degradation of performance and decreased SEU hardness with cumulated dose [20]. These limitations, which add to the inherently large area overhead (typically beyond 100%), are mainly due to the hardening principles employed. A strongly p-dominant or n-dominant inverter is adopted to conserve a preferred state in case of output driving conflict, even for nonlinear post-dose transistor parameter variations. However, it reduces the operating speed and increases both circuit area and junction leakage. PMOS-only or NMOS-only inverters are chosen to avoid the reverse-biased drain junctions in a preferred state, when the output drain and the common substrate voltages coincide. However, in the opposite state, the channel threshold barrier degrades their logic levels and induces partial conduction states with high static currents [120].

We analyze below the immunity to upsets of PMOS and NMOS latches composed of a cross feedback transistor pair and a control transistor pair of the same polarity (see Figure 4.7).

Figure 4.7 Upset-immune NMOS latch (a) and PMOS latch (b) schematics

The initial logic state X1=1, X2=0 of the NMOS latch in Fig. 4.7a is externally controlled by inputs IN1=0 and IN2=1 through transistors N2, N4. The common drain X1 of transistors N1 and N3 is driven at (V<sub>DD</sub>-V<sub>TN</sub>) voltage by the conduction state of transistor N3.
The two drains of transistors N1 and N3 are reverse-biased with respect to the V<sub>SS</sub>-biased p-substrate and, hence, they are both sensitive to critical charge collection. A negative voltage transient generated by collected charge at node X1 switches "off" transistor N2. Node X2 enters a high impedance state that capacitively conserves its voltage value (X2=0), thus providing immunity to upset. The external control transistor N3 restores the initial state of node X1. Correspondingly, the externally controlled PMOS latch in Fig. 4.7b with initial logic state X1=0, X2=1 (i.e., the control inputs IN1=1, IN2=0) is sensitive to positive transients induced at node X1 but is immune to logic state reversal. The NMOS latch exhibits degraded "1\*" logic levels at its outputs, and the PMOS latch presents degraded "0\*" logic levels. This induces significant static power consumption in the controlled circuits due to transistor bias in partial conduction states. External differential control is required for both NMOS and PMOS latch operation. Hardness analysis to input transient pulses shows that the NMOS latch is sensitive to positive input transients at the "0" logic state input and the PMOS latch is sensitive to negative input transients at the input having logic state "1". NMOS and PMOS latches can employ ratioed design consisting of strong cross-feedback transistors (N1, N2, P3, P4) and weak control transistors (P1, P2, N3, N4) to also ensure full immunity to input transients. In this case, the slave latch cannot be statically written through the weak control transistors. Dynamic write operation can be adopted by adding a common mode write control input as shown in Figure 4.8.

Fig. 4.8 Upset-immune NMOS latch (a) and PMOS latch (b) with ratioed design and common-mode write operation

An alternative solution that avoids the added write control signal is to add complementary control transistors (N5, N6, P5, P6), differentially controlled by the opposite inputs, in series with each control branch (Fig. 4.9). The latch configurations thus obtained have pseudo-CMOS operation. They still conserve their hardness properties, since the added reverse-biased drain junctions are isolated from the storage nodes X1, X2 by control transistors P1, P2 and N3, N4. Several logic/circuit design hardening techniques for upset immunity have been developed using the upset-immune latch structures of Figures 4.8 and 4.9. They use a dual, redundant latch circuit to store the binary information, and feedback connections to restore the logic state. The added circuits prevent the loss of stored data for all possible cases of upset occurrence.

Fig. 4.9 Upset-immune NMOS and PMOS latch configurations with pseudo-CMOS operation

The storage cell proposed by Rockett [62] (Figure 4.10a) adds a 6-transistor circuit to a standard 6-transistor CMOS memory cell. The added circuit uses a redundant PMOS ratioed slave latch with dynamic write control formed by p-transistors P5-P8. The slave latch can be written through the main cell during the write access cycles. The CK signal, when inactive, validates the operation of the slave latch for redundant data retention (i.e., when the memory cell is not accessed), by connecting to ground the common drain terminal of transistors P5 and P6. Transistors P3-P4 act as a state-restoring feedback circuit to the main storage latch. The use of a PMOS slave latch avoids the generation of negative upset pulses at its internal nodes. The feedback transistors P3-P4 reinforce the logic state of the main storage cell when it is subjected to upset transients.
The positive pulses generated on the slave latch are not propagated to the master latch, since they deactivate the p-channel feedback transistors, thus isolating the master cell and ensuring the immunity to upsets.

Fig. 4.10 SEU-immune cells using a redundant slave latch: (a) Rockett cell [62] and (b) HIT1 cell [61]

A positive upset pulse in the main latch (i.e., on the drain of either P1 or P2 transistors) will not propagate to the slave latch through transistors P5-P6, for the same reason. Negative upset pulses occurring on the drains of transistors N1...N4 will not propagate through P5-P6 transistors if they are weak compared to P7 and P8. The corresponding sizing constraints required in order to achieve the upset immunity can be written as

$$W(P7,P8) \gg W(P5,P6)$$

$$W(P3,P4) \gg W(N1,N2)$$

This must be ensured not only for worst case variations of supply voltage, temperature and statistical process parameters, but also for the effects of the total dose on the transistor parameters. The circuit in Fig. 4.10a provides accurate CMOS logic levels at master cell outputs. This avoids static power dissipation in the p-channel latch and the external driven circuits. However, it adds a high current loading on the word line, consequently also affecting the cell access time. A similar design, named HIT1, has been proposed by Bessot and Velazco [61][131] (Figure 4.10b). It employs an inverter-driven n-channel master latch with N1-N4 and P1-P2 transistors. This avoids the use of p-channel restoring feedback transistors that impose severe sizing constraints on the master latch of the Rockett cell. On the other hand, the HIT1 cell has degraded logic levels at the output and subsequent static currents during the active clock cycle. In another approach, described by Whitaker in [71] (Figure 4.11a), a cross-coupled pair of upset-immune p-channel and n-channel latches is employed, with parallel differential access for write operation using complementary transistors.
The n-channel transistor latch is immune to positive charge collection, hence to positive upset pulses, and the p-channel transistor latch is immune to negative upsets. Fig. 4.11 Upset-tolerant storage cells using PMOS-NMOS latch pairs: (a) Whitaker cell [71] and (b) Liu cell [63] The circuit has a prohibitively high static power dissipation (i.e., tens of µA per cell) due to the use of directly coupled n-channel and p-channel latches with degraded logic levels. In order to protect a latch against logic state flipping caused by an upset pulse occurring in the second latch, ratioed inverters are used. Transistors P1-P2 and N3-N4 can be activated by the feedback upset pulses and thus they should be weak compared to P3-P4 and N1-N2, respectively, in order to achieve upset immunity. Accurate transistor sizing requirements for upset immunity and performance optimization make the design process complex. Total dose effects on transistor parameters (high leakages, threshold voltage shifts, transconductance variations etc.) may significantly affect the operating characteristics of the ratioed inverters. This may invalidate the upset immunity in the case of long exposure to radiation. Liu proposed in [63] an improved version of Whitaker cell (see Fig. 4.11b). It employs two crosscoupled p-channel and n-channel latches with pseudo-CMOS configuration. The static power dissipation is reduced, at the expense of cell area increase. Dynamic power dissipation is increased through duplicated access circuit. The circuit has degraded output logic levels that may induce static power dissipation on the controlled circuit. The circuit also exhibits a high sensitivity to the effects of cumulated radiation dose. Various other redundant storage cell structures can be obtained by combining PMOS and NMOS latches and adequate transistor sizing techniques, with the added drawbacks this implies. 
Of particular interest as hardness, performance and complexity trade-offs are also HIT2 cell proposed by Bessot and Velazco [61] and a modified version of the Rockett cell (see Figure 4.12). Figure 4.12. Modified Rockett cell (a) and HIT2 cell (b) The modified Rockett cell conserves the CMOS output levels and avoids clock line loading. This is achieved by converting the dynamic write-controlled p-channel slave latch to a static-controlled pseudo-CMOS latch through the insertion of n-channel transistors N5-N6. HIT2 cell shown in Figure 4.12b consists of an n-channel latch (N1-N4, P1-P2) with pseudo- CMOS configuration and a p-channel latch with n-channel control transistors (N7-N8, P3-P4). Both latches are n-dominant, (i.e., n-channel transistors N1, N2 and N7, N8 are strong), thus allowing fast and reliable implementation as high performance memory and register arrays. The write access circuit (N5, N6) can be connected to either one of the two latches in the cell. If the access circuit controls the pseudo-CMOS latch, higher operating speed is achieved. HIT2 circuit configuration can be easily implemented by modifying existing RAM arrays through interconnect level reconfiguration of adjacent pairs of memory cells, since transistor sizes do not need to be changed. Re-routing of interconnects is particularly simple and effective when the access circuit controls the CMOS latch (N7-N8, P3-P4). Dual port operation can also be implemented with separated access nodes. It should be noted that in our analysis, the redundant storage cells have been divided into a *master* latch and a *slave* latch, that have essentially different functional meaning from the conventional denomination of master and slave latch sections in edge-triggered flip-flops. However, a direct analogy exists, since our master latch is the one that is directly written through the data input lines, and then the data is transferred to the slave latch. 
Recall that the main drawbacks of the upset-immune redundant storage elements we have analyzed stem from the two previously defined hardening principles: (a) ratioed design to avoid transient pulse propagation and (b) the exclusive use of n-channel or p-channel transistors in upset-immune latch configurations. These drawbacks (i.e., dose impact on hardness, high power dissipation and significant degradation of operating speed) are inherently amplified when high speed, low power submicron CMOS technologies are employed. In the remainder of this chapter we present novel upset immune storage cell designs based on radically different hardening principles, which help us reduce or avoid the aforementioned drawbacks. A new principle of dual-node control is developed and explained in detail below. This is essentially a logic design principle rather than a circuit selection principle like the previously mentioned ones. It allows us to design compact, ratioless redundant storage cells with fast and reliable upset-immune operation and reduced sensitivity of their performance to total dose effects. The new design strategies put no particular constraints on transistor sizes and thus do not exhibit the high sensitivity to total dose of the ratioed designs. They have lower area overhead and performance degradation compared to other logic design hardening techniques for both CMOS static RAM cells and sequential logic elements (latches, flip-flops, registers etc.).

#### 4.4 SEU Immune Redundant Latch Design using Dual Node Control

In this section we present a novel storage cell design called the Dual Interlocked storage Cell (DICE), which achieves upset immunity while avoiding the previously mentioned drawbacks. The proposed cell puts no particular constraints on transistor sizes and thus it does not exhibit the high sensitivity to total dose of the ratioed designs.
It has a lower area overhead compared to other logic design hardening techniques for both CMOS static RAM cells and sequential logic elements (latches, flip-flops, registers etc.). The new cell is suitable for replacing latches and flip-flops distributed within the logic blocks in CMOS ASICs, in order to make them tolerant to upsets. It may also be used to implement SEU-hardened static RAMs for applications where achieving reliable SEU immunity prevails over the cost of doubling the size of the memory cell, which halves the RAM storage capacity.

# 4.4.1 Dual Interlocked Storage Cell Design

The new upset immune storage cell design uses a symmetrical 4-node structure, as shown in Fig. 4.13. It basically uses a modified ring of four CMOS inverters, where each inverter has its n-channel transistor and p-channel transistor separately controlled by the two adjacent nodes on its left and right, which store the same logic state.

Fig. 4.13 Dual-node control interlocked storage cell

This approach differs from the previous logic hardened storage cells, which consisted of two latches with cross-coupled, state restoring feedback connections, in that here the internal latch cross feedback and the inter-latch cross feedback are essentially identical. The two latch entities and their corresponding storage functions are state-driven and dynamically assigned to different pairs of adjacent nodes in the ring. Each node in the ring is controlled by the two adjacent nodes. The right-adjacent node controls the conduction state of the n-transistor of the current node inverter, and the left-adjacent node controls the conduction state of the p-transistor. The four nodes of the DICE cell form a pair of latches in two alternate ways, depending on the stored logic value. In Figure 4.13, the adjacent node pairs X0-X1 and X2-X3 have active cross-feedback connections and form two-transistor, state-dependent latch structures.
The other two adjacent node pairs, X1-X2 and X0-X3, have inactive feedback connections (i.e., cross-coupled transistors in "off" state) which isolate the two latching pairs. Hence, two "non-adjacent" nodes are logically isolated, and store the same data. They must be simultaneously reverted in order to upset the cell. DICE cell operation is analogous in many respects to that of two equivalent cross-coupled inverter latches. In a conventional latch, the stored logic state is enforced by two of the four transistors (i.e., N1 and P2 in Figure 4.14a) that compose the latch. Fig. 4.14 Basic cross-coupled inverter latch schematic (a), an externally-controlled half-latch (b) and DICE cell schematic (c) The other two transistors (P1, N2) are in a non-conduction state. They form a second, inactive latching feedback loop that is activated by an upset-induced transient voltage pulse (or by a write operation with opposite data). The regenerative feedback process consists of progressive activation of P1-N2 loop simultaneously with the progressive deactivation of N1-P2 loop. Then, if we remove the inactive feedback loop by providing external gate control to transistors N2 and P1, the latch structure thus obtained becomes intrinsically immune to upsets. The currently active feedback loop is only temporarily and partially deactivated by an upset, (i.e., a single node is temporarily flipped), and there will be no regenerative feedback. We denominate this latching structure as a Half-Latch, since it has a single internally stable (i.e., feedback-enforced) logic state. The opposite logic condition can be either externally controlled or dynamically conserved as a high impedance state. This simple circuit, also known as a CMOS thyristor, has been successfully used as CMOS ESD protection element in high density submicron technologies and as a compact, low power delay element with low sensitivity to power supply voltage and temperature variations [111]. 
Previous applications of this circuit are based on its fast or controlled activation ("latch-up") properties. Our application concerns coupled pairs of active and inactive half-latches with transient fault immune operation, where the activation of an inactive half-latch structure is prevented by two adjacent, active half-latches. By linking two externally controlled half-latch structures (Fig. 4.14b) in a closed-loop feedback chain, we obtain a four-node symmetric circuit (Fig. 4.14c) consisting of four linked half-latches, two active and two idle (inactive). It has two stable logic states, 1010 and 0101, hence a bistable storage function. The analysis of circuit behavior for both single-node and dual-node perturbations is of interest for the circuit's operation, performance optimization and practical applications. Since the DICE cell has a full CMOS implementation, all the internal circuit nodes have both V<sub>DD</sub>-referenced and V<sub>SS</sub>-referenced drain-bulk isolation junctions, belonging to p-transistors and n-transistors, respectively. Hence, regardless of logic state, any internal node in a DICE cell may be flipped to the opposite logic level. Let us consider a positive transient voltage pulse at node X2 (Figure 4.13) induced by a heavy ion strike at the drain of transistor P2. This perturbation forces transistor N1 into conduction and transistor P3 into a non-conducting state. The second node, X3, of the active half-latch structure X2-X3 conserves its logic state dynamically in high impedance due to capacitive effects. Node X1 is temporarily connected to both V<sub>DD</sub> and V<sub>SS</sub> through the active conduction paths of transistors P1 and N1, and hence switches to an intermediate voltage level that induces the conduction of transistor P2 and reduces the conduction state of transistor N0. The second half-latch X0-X1 is only partially deactivated.
Logic states at nodes X0 and X3 are not altered, and drive transistors P1 and N2 that restore the initial logic states at the perturbed nodes X1 and X2. Owing to the circuit's symmetry, this analysis is valid for both logic states of any internal node of the cell. A write operation in a DICE cell requires forcing two non-adjacent cell nodes to the same logic state in order to reverse the logic state of the cell. This condition for cell write operation results from a dual-node perturbation analysis, whose results are listed in Table 4.2.

| X0 | X1 | X2 | X3 | Output transition (Q0-Q3) |
|----|----|----|----|---------------------------|
| 1  | ~  | ~  | 0  | ~                         |
| ~  | 0  | 1  | ~  | ~                         |
| ~  | ~  | 1  | 0  | 1010                      |
| 1  | 0  | ~  | ~  | 1010                      |
| 1  | ~  | 1  | ~  | 1010                      |
| ~  | 0  | ~  | 0  | 1010                      |

Table 4.2 Dual-node perturbation analysis for DICE cell write operation (columns X0-X3: dual node perturbation).

Differential mode write operation at two adjacent nodes performs a state-dependent write, i.e., only set or reset type functions may be implemented through differential cell access. Hence, the DICE cell allows either four-node differential or two non-adjacent node single-ended write operation. Figure 4.15 presents typical n-passgate write access circuits.

Figure 4.15 DICE storage cell with single-ended (a) and differential (b) write access

It should be noted that, if two simultaneously sensitive nodes of the cell which store the same logic state (i.e., either nodes $X_0$-$X_2$ or nodes $X_1$-$X_3$) could be flipped by a single particle impact, the immunity is lost and the cell is upset. The probability of this event can be made very low if the transistor drain areas occupied by the simultaneously sensitive node pairs are spaced apart in the cell layout, so that the critical charge cannot be collected simultaneously at both nodes to upset the cell [70].
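The single-node and dual-node perturbation analysis above can be illustrated with a small switch-level sketch. This is not the electrical (Spice) model used in this work; it is a simplified three-level abstraction that makes several assumptions of its own: node levels are 0, 1 or an intermediate value 0.5; a p-transistor conducts with strength 1 - (gate level) and an n-transistor with strength equal to its gate level; the stronger path wins and equal strengths give the intermediate level; a node with both transistors off holds its level. The cell is assumed to store 0101 initially (Table 4.2 does not state the initial state explicitly), and "~" is read as "node not forced" or "no transition". The function names `step`, `settle`, `perturb` and `as_bits` are hypothetical.

```python
HALF = 0.5  # assumed intermediate level for a pull-up/pull-down conflict


def step(state, forced=None):
    """Return the next state of the four nodes after one synchronous update."""
    forced = forced or {}
    nxt = []
    for i in range(4):
        if i in forced:                      # node held by an external source
            nxt.append(float(forced[i]))
            continue
        pull_up = 1.0 - state[(i - 1) % 4]   # p-transistor, gated by the left node
        pull_down = state[(i + 1) % 4]       # n-transistor, gated by the right node
        if pull_up == 0.0 and pull_down == 0.0:
            nxt.append(state[i])             # high impedance: node keeps its level
        elif pull_up > pull_down:
            nxt.append(1.0)
        elif pull_down > pull_up:
            nxt.append(0.0)
        else:
            nxt.append(HALF)                 # equal strengths: intermediate level
    return nxt


def settle(state, forced=None, max_steps=32):
    """Apply the forced levels, then iterate step() until nothing changes."""
    state = [float(v) for v in state]
    for i, v in (forced or {}).items():
        state[i] = float(v)
    for _ in range(max_steps):
        nxt = step(state, forced)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("state did not settle")


def perturb(stored, forced):
    """Hold the forced nodes until the cell settles, then release them."""
    return settle(settle(stored, forced))


def as_bits(state):
    """Format a state as a string, using 'x' for an intermediate level."""
    return "".join("x" if v == HALF else str(int(v)) for v in state)


if __name__ == "__main__":
    stored = [0, 1, 0, 1]
    # The six dual-node perturbations, in the order of Table 4.2.
    rows = ({0: 1, 3: 0}, {1: 0, 2: 1}, {2: 1, 3: 0},
            {0: 1, 1: 0}, {0: 1, 2: 1}, {1: 0, 3: 0})
    for forced in rows:
        print(forced, "->", as_bits(perturb(stored, forced)))
    # Single-node upsets: flip one node, release it, and let the cell recover.
    for node in range(4):
        print("upset at X%d ->" % node,
              as_bits(perturb(stored, {node: 1 - stored[node]})))
```

Under these assumptions the first two perturbations leave the cell in 0101 and the other four switch it to 1010, in agreement with the table, while every single-node perturbation returns to the stored state. The sketch says nothing about timing or about the charge needed to flip a node.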
DICE cell operation relies on the principle of *dual node feedback control* in order to achieve immunity to upsets. This means that the logic state of each of the four nodes of the cell is monitored by two adjacent nodes that separately control the p-channel and n-channel transistors in alternate conduction and blocking states. The previous analysis showed formally that the cell recovers its initial state whatever the electrical charge collected at the perturbed node. Electrical simulations are also used to illustrate this situation. They show that the recovery process is very fast (less than 1 ns). This is because the restoring feedback function is embedded in the latch structure, without requiring the addition of oversized feedback transistors. This feedback is active both during the storage cell's idle state and during read/write operation. The additional metal wiring of the dual node feedback interconnects adds only small increases to the cell node capacitances. Their effects on both the circuit's performance and the recovery time have been estimated on a prototype designed using a 1.2 µm, two-metal CMOS/epi process from AMS. The contribution of the added metal wiring to the delay is less than 3%. The Spice simulation results in Fig. 4.16 show the signal waveforms at the four cell nodes for both positive and negative upsets induced by triangular current pulses of 50 mA amplitude, 200 ps duration and 50 ps rise time. The equivalent charge injected at the perturbed node is 5 pC (the area under the triangular pulse, 0.5 × 50 mA × 200 ps).

Fig. 4.16 Spice simulation waveforms: (a) positive upset at node X1, (b) negative upset at node X2

# 4.4.2 Memory Array Configuration using DICE Cells

The 12-transistor DICE memory cell implementation previously described and presented in Fig. 4.13 has an area overhead close to 100%, compared to a standard 6-transistor static RAM cell.
It has no static power dissipation, but requires an increased word line driving capability and hence a small increase in the dynamic power dissipation. Additional design changes are required to adapt the word line routing, the write buffer drive capability and the column pass gate width to the increased load requirements. When implemented in a high interconnect density technology with three metal layers, local interconnect and stacked contacts (as usually employed for complex submicron designs), the added connectivity inside the cell will occupy a significantly lower area, and the overhead can be reduced to about 70%. This is obtained by keeping the external interconnects to a minimum, which reduces their area. This cost figure is quite acceptable in order to obtain full immunity against upsets. Significant cost savings can be obtained through fast, simple and reliable conversion of existing RAM cell arrays to DICE architecture. Two adjacent CMOS SRAM cells in a memory column of a standard, existing design can be directly converted to a DICE cell by simply rewiring the internal feedback interconnects, without changing the transistor sizes. The DICE conversion methodology for a typical CMOS static RAM cell is graphically described in Figures 4.17 and 4.18. Using this technique, we have converted an embedded SRAM block design to DICE cell structure in a very short time. The dynamic performance of the memory array obtained is not affected by the changes, provided that a DICE word line is driven with at least double the current capability. This is easily obtained using enhanced word line drivers fitted within the area constraints of two pre-existing word line drivers.

Fig. 4.17 Conversion of two standard 6-transistor SRAM cells to a DICE cell: (a) two adjacent standard SRAM cells, (b) their fusion into a DICE cell configuration

Fig. 4.18 Layout of a standard 6-transistor CMOS SRAM cell (a) and the equivalent DICE cell (b)

Additional changes are required in the decoder logic, in order to halve its addressing space and to enhance the driving capability of the word line drivers. A simple and reliable decoder conversion algorithm has been developed, based on LSB address input suppression. A 1Kx8 SRAM circuit prototype has been designed using 1.2 µm two-metal CMOS/epi technology. The area comparison with a single cell layout can be seen in Fig. 4.19. The DICE cell area is 980 µm² and the overhead involved is 91%. However, for practical reasons, we conserved the RAM column height and two word lines per DICE cell for addressing, in order to provide some interesting BIST capabilities and to address in parallel an unhardened block of RAM columns for performance comparison. The prototype has been processed, tested and characterized for hardness assessment with particles of different energies and with high energy pulsed laser excitation. A detailed analysis of the laser measurement techniques and the obtained test results is presented in Chapter 5.

# 4.4.3 DICE Latch Design

A compact passgate access DICE latch configuration is presented in Fig. 4.19a. It can be used as the master and/or slave section in edge-triggered flip-flop circuits, allowing the circuit's silicon area, operating speed and power dissipation to be optimized. The circuit uses weak feedback inverters N0-P0, N2-P2, and may be interconnected in application using input and output buffers. Clocked inverters (Fig. 4.19b) can be used for improved high speed operation.

Fig. 4.19 DICE passgate access latch circuit (a) and clocked inverter latch circuit (b)

# 4.4.4 Dual-Port Memory and Register Design using DICE Cells

The availability of two separate data inputs on a DICE latch structure and the conditional write operation it achieves allow us to implement dedicated dual-port latch architectures which incorporate logic functions.
Advanced low power and high speed CMOS design relies heavily on the use of conditionally enabled flip-flops, as they provide the ability to deactivate functional blocks that are not used. The basic property that can be exploited in a dual-port latch using DICE is the conditional write operation at data coincidence. A dual-port DICE storage element is presented in Figure 4.20. This circuit implements a synchronous C-element latch function when a single access clock is employed, and a coincidence-access dual-port register cell function when dual clock concurrent operation and synchronized memory access are of interest. Fig. 4.20 Dual-port register latch with coincidence access: Schematic, symbol and truth table A special dual-port latch circuit implementing data-dependent conditional write operation is presented in Figure 4.21. Fig. 4.21 Dual-port RS register latch with coincidence set operation: Schematic, symbol and truth table It has differential access operation on both access ports and performs unconditional reset and write 1 at coincidence. Single-clock, dual input implementation has subsequently a NAND gate function at the inputs without employing additional transistors and without adding propagation delays. # 4.5 Upset-Immune Flip-Flop Design using Timed Access Techniques High speed microprocessors extensively use pipelining techniques to increase throughput at low cost. Pipelining involves extensive use of sequential register stages by partitioning a process into n hardware stages separated by registers to hold the intermediate results. Adding transient fault tolerance to pipeline registers adds significant hardware cost, increased power dissipation and larger propagation delays. An effective method to reduce these drawbacks consists of implementing self-timed flip-flop operation with a single latch structure, instead of using the conventional master-slave flip-flop approach. 
# 4.5.1 Upset-Immune Flip-Flop Design with Sequential Access Control

A self-timed, sequential access operation may be employed in a redundant latch structure by providing separate access to the master latch and to the redundant latch, respectively, using self-timed delayed clock circuits. The basic single-latch sequential clocking strategy can be easily explained by referring to the flip-flop implementation of a HIT2 storage element presented in Figure 4.22.

Figure 4.22. Write-Transfer HIT2 (WT-HIT2) redundant flip-flop circuit immune to upsets

The two latch sections are separately controlled by two clock signals: a write access clock to the master latch, W\_CK, and a transfer clock T\_CK to the slave latch. Input data is transferred to the output on the rising edge of the transfer clock if the write access clock concurrently activates the master latch at the same instant. This write/transfer (WT) flip-flop circuit achieves high speed edge-triggered operation with two clock phases derived with delay inverters as shown in Figure 4.22. A somewhat different access method is required to implement edge-triggered access operation to a symmetric, DICE redundant latch with dual-node control. The solution we have adopted consists of partitioning the four nodes of the latch into two sets and providing sequential access to these two groups of nodes, controlled by two clocks, a master write/precharge clock and a slave transfer clock. In contrast to the HIT2 approach previously described, we have two node partitioning options: either using a single-node dynamic master partition or selecting a two-node master partition multiplexed with an equivalent two-node write access partition. The schematics of sequential-access (SA) DICE flip-flop circuits with single-node dynamic and dual-node multiplexed master section are presented in Figures 4.23 and 4.24.
Clocked inverter approaches applied to upset-immune latch design lead to series transistor current paths from the internal nodes to $V_{DD}$ and $V_{SS}$ that may affect circuit performance. On the other hand, dynamic storage operation of the circuit in Figure 4.23 sets the flip-flop output node to a high impedance state during the active write clock pulse. The impact of these changes on the dynamic upset hardness of the storage element has been experimentally analyzed on a prototype circuit designed using an advanced deep-submicron CMOS technology.

Figure 4.23 Sequential access DICE flip-flop with single-node (X1) master section

Figure 4.24 Sequential access DICE flip-flop with dual-node (X1-X3) master section

# 4.5.2 Prototype Chip Design and Radiation Test Results

A test chip termed DEEP1 has been designed and fabricated in a commercial $0.25\ \mu m$ CMOS technology in cooperation with CERN Geneva. The circuit includes three shift registers: a static 2048-bit register, a dynamic 1024-bit shift register and a 2048-bit static register using the SEU-hardened DICE flip-flop architecture of Figure 4.23, which employs a single latch section to achieve dynamic master-slave operation. The registers are used to measure the performance of the sequential-access DICE flip-flop and its SEU hardness, and to compare them to those of standard, non-hardened static and dynamic shift registers. A global two-phase clock generator has been used in the designed prototype to control the SA-DICE register operation. The flip-flop sizes were $18 \times 16\ \mu m$ for the dynamic one, $33 \times 16\ \mu m$ for the static one and $50 \times 16\ \mu m$ for the hardened one, implying an area overhead of only about 50% with respect to the static flip-flop. All the shift registers used radiation-tolerant layout practices: all the NMOS and most of the PMOS transistors were designed with enclosed geometry, and guard rings surrounded all the NMOS devices (see Annex D).
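As a quick check of the quoted overhead, the flip-flop footprints given above can be compared directly. Since all three cells share the same 16 µm dimension, only the other dimension matters. The symbols $A_{hard}$, $A_{static}$ and $A_{dyn}$ are notation introduced here for the three cell areas:

$$\frac{A_{hard} - A_{static}}{A_{static}} = \frac{50 - 33}{33} \approx 0.52$$

$$\frac{A_{hard} - A_{dyn}}{A_{dyn}} = \frac{50 - 18}{18} \approx 1.78$$

The first ratio corresponds to the overhead of about 50% relative to the unhardened static flip-flop; relative to the dynamic flip-flop, the hardened cell is close to three times larger.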
Altogether, the three shift registers contain some 150 000 transistors, and occupy an area of about $2.7\ mm^2$. SEE tests were performed at the 88" cyclotron of Lawrence Berkeley Laboratories, California, using a heavy ion beam at room temperature. The particle LET was changed by selecting the ion species (Nitrogen, Neon, Argon, Copper, Krypton and Xenon) and by tilting the device up to 55° relative to the beam line. In this way, LETs varying from 3.2 to 89 MeV cm²/mg were obtained. Total dose behavior of the employed technology was also tested through X-ray irradiation up to 30 Mrad (SiO₂), showing continuous register operation and low levels of parameter degradation [94]. This demonstrated that the use of radiation tolerant layout practices in deep submicron CMOS allows us to extend the tolerable total dose level well beyond the inherent technology limit [107]. No SEL was observed during the whole irradiation campaign. Similar results had previously been obtained even on standard structures (without guard rings) integrated in the same technology [108]. Static SEE measurements performed at a supply voltage of 2 V, without exercising the static registers during irradiation, showed a LET threshold of about 15 MeV cm²/mg for the standard, unhardened register. The experimental cross-section curve is presented in Annex D. The SA-DICE register began to experience upsets only at the highest available LET of 89 MeV cm²/mg. Even at that high LET, the cross-section was measured to be lower than 10⁻⁵ cm²/bit. This validates the hardness of the compact edge-triggered flip-flop architecture for static operation. Dynamic upset hardness is also essential for advanced synchronous sequential systems exhibiting high switching activity levels [113-115]. Dynamic SEE tests have been performed on the DEEP1 prototype chip by exercising the static registers at a constant frequency of 2.5 MHz.
The hardened register experienced a considerable number of errors when the LET was increased to 5.6 MeV cm²/mg, which represents about twice the LET threshold of the dynamic register. This negative result is confirmed by circuit analysis through electrical simulation of dynamic upsets during register access. The analysis showed that upsets may occur for half of the clock period, when the clock line is active and the pulse mark/space ratio is 1. During this interval the output node of the flip-flop is in a high-impedance state and takes the place of the suppressed slave latch. The parasitic capacitance at the output node is about twice that of the unhardened static flip-flop, which explains the factor of 2 between their LET thresholds.

Two alternative design solutions may be devised to eliminate or reduce the dynamic upset sensitivity while still using a single-latch edge-triggered upset-immune circuit that avoids full latch duplication. The first technique avoids the high-impedance output state during write by using a dual, multiplexed master section, as shown in Figure 4.24. This leads essentially to a 6-node flip-flop structure, with two redundant pairs of multiplexed master nodes. The two remaining slave nodes alternately form a DICE loop with one of the two master node pairs. The second alternative, which may also be combined with the first, consists in implementing self-timed flip-flop operation to control and reduce the active write access pulse duration to a minimum value, comparable to the metastability period.

# 4.5.3 Upset-Immune Sequential Cell Library Design

We have initiated a practical implementation of upset-hardened storage elements as dedicated standard cell libraries in total-dose and latchup-resistant commercial submicron CMOS processes.
A preliminary version of such a standard cell library has recently been designed and characterized using a 0.6 µm CMOS bulk/epi process from AMS, and is currently embedded in an ASIC implementation of a digital modem for the Spanish LEO communication satellite program NANOSAT. The library consists of basic flip-flop cells with and without asynchronous set/reset inputs, and scan flip-flops with input multiplexers. Sample preliminary data sheet specifications for library elements, based on worst-case simulations, are presented in Annex E.

Standard cell library design requires layout optimization within specific topology constraints [106,112]. A basic constraint is the cell height and the well/substrate area distribution inside the cell, imposed by tight cell abutment requirements. The cell height of the AMS 0.6 µm CMOS standard library leaves relatively large margins for increased transistor sizes. Our objective was to implement both performance-driven and reliability-driven cell layout optimization, by limiting the performance degradation at the expense of added transistor area and power dissipation, and by taking into account specific layout design rules that enhance dual-node upset protection and latchup avoidance.

Though the hardened SA-DICE and WT-HIT2 flip-flop schematics employ complex gate structures with a serial-parallel current path transistor topology, a linear matrix layout architecture [116] has been adopted, mainly due to the cell height constraint. Both n and p transistors are laid out in single rows. However, they are only partially aligned at the common gate connection, due to the crossed dual-node gate control. Subsequent alignment is performed at the common drain connection. The diffusion areas of the simultaneously sensitive drains and the pair of clock signal inputs are reliably spaced and separated by diffusion strips of other transistors in the cell, as shown in the sample layouts in Annex E.
This adds a small overhead of less than 5% in the interconnect area, which we estimated in our case to have no significant impact on circuit performance. However, it should be noted that a larger interconnect overhead may have a measurable impact on performance in deep submicron process implementations.

Post-layout transistor sizing iterations have been required for the correlated optimization of cell timing (i.e., propagation delay, hold time, minimum clock pulse width) and power dissipation. Due to the duplication of the cell nodes and the input clock signals, the parasitic capacitance and the switching activity of the cell (hence, the power dissipation) have doubled. To compensate for the corresponding increase in timing delays, the cell access buffers have been oversized, with a further negative impact on the area and power budget, in order to keep the added propagation delays within 10 to 30 percent. A comparison table of transistor sizes, gate count and power dissipation between the basic AMS standard cell D-flip-flop circuits and the designed SEU-hardened cells is also given in Annex E.

Transistor sizing and performance optimization have been performed for the basic D-flip-flop cell and then extended to the added functions (i.e., asynchronous preset and clear inputs and the scan input multiplexer). The algorithm employed optimizes power and area under timing constraints by selecting gate sizes with predefined p/n transistor ratios at the inflexion point of the cell delay vs. transistor width characteristic. Layout-level DFT rules have also been implemented in the cell design to reduce the probability of glitch occurrence [117]. We have defined a set of transistor placement algorithms, topology optimization transformations and enlarged spacing rules for SEU-sensitive and SEL-sensitive areas that could be further applied in an automatic fashion to guide and optimize the cell design process.
Ongoing research aims to provide a compact and consistent set of layout design rules for reliability that could be embedded into automatic layout synthesis tools [118,119].

#### 4.6 Conclusions

In this chapter we described CMOS design strategies for transient fault tolerance at circuit level, with process independence and unconstrained effectiveness for low-voltage, high-speed, deep submicron CMOS system applications. We put major emphasis on specific design techniques for CMOS storage element immunity to radiation-induced single event upsets.

We briefly analyzed the main constraints imposed by the space radiation environment on advanced CMOS system operation, as well as the limitations of the conventional design hardening techniques against SEU for advanced deep submicron CMOS ICs. We described novel design techniques for upset-tolerant storage elements that overcome these limitations, as well as the performance drawbacks of the system-level upset tolerance technique described in Chapter 3. These techniques employ local circuit redundancy at the lowest granularity level (i.e., combinational logic gates and sequential storage elements).

Two novel design hardening principles are defined and experimentally validated on fabricated and tested prototypes. First, a dual-node control principle is applied to the internal nodes of redundant CMOS latch structures with single-transistor supply current paths. The upset-hardened memory cells and latches implementing this principle achieve high static upset tolerance with no constraints on transistor sizing and negligible impact on circuit performance. Upset-tolerant circuit design techniques using dual-node control latches have been employed to develop a low-capacity memory macrocell and a basic library of sequential standard cells using commercial unhardened 1.2 µm and 0.8 µm CMOS processes.
These designs have been validated with radiation tests on memory and register prototypes, and can be quickly and reliably embedded in upset-hardened system applications. They have minimal dependence on transistor size ratios and parameter variations. The new design-hardened storage cell can also be implemented in other complementary submicron technologies such as CMOS-SOI and C-GaAs, thus ensuring low-cost SEU resistance and taking full advantage of other radiation hardness features (e.g., total dose and latchup resistance).

A second hardening principle addresses the dynamic upset tolerance constraints, i.e., achieving upset immunity in high-speed, high switching activity systems during memory access and register transfer cycles. For static RAMs, specific dual-port addressing modes are proposed that allow SEU-immunity assessment through electrical testing and avoid access-mode-induced upsets. Security coding applications employing dual-port data registers with upset-immune storage elements are also proposed.

Compact upset-tolerant edge-triggered flip-flop designs are described. They employ single-latch structures with self-timed, delayed write operation, and multiplexed dual feedback loops. We reported radiation test results showing an upset LET threshold of more than 80 MeV cm²/mg on a prototype chip fabricated in a commercial 0.25 $\mu$m CMOS process. An upset-hardened library of sequential cells has been designed using a 0.6 $\mu$m CMOS process.

# Chapter 5

# **Diagnosis and Qualification Testing of SEU-Hardened CMOS Architectures**

#### 5.1 Introduction

The fault-tolerance properties of CMOS circuit architectures based on local feedback are unverifiable at system level. They must be validated at the manufacturing stage. Conventional testing techniques can hardly be applied to fault-tolerant architectures, since they generally lead to unacceptable overhead.
The practical solution currently employed to assess the quality, reliability and fault tolerance properties of ICs in nuclear and space applications consists of product qualification in a severe environment. The drawbacks of this analysis are its excessive length, its high cost and its inability to identify the causes and cures of detected, or of other possible but undetected, weaknesses. Environment-induced faults occur statistically in time and space. Deterministic test means are required to accurately and reliably monitor the fault mechanisms for product qualification and reliability improvement.

Hardware and software fault injection techniques are typically employed to test redundant system architectures [50]. Dedicated fault injection tools and standardized test procedures have been developed to test CMOS VLSI ICs for space radiation environment applications. Two basic methods are used to physically inject upset faults in microelectronic systems: heavy ion tests and pulsed laser beam tests. They emulate the effects of high-energy galactic particles on sensitive semiconductor areas and are extensively used for radiation hardness assessment and characterization.

In this chapter we present the basic principles of using a laser pulse for accurately controlled charge injection in sensitive silicon areas, and its main advantages and limitations compared to accelerator-based heavy ion testing. We describe measurement and investigation algorithms for the topological analysis of upset-hardened IC prototype layouts, and analyze practical results obtained by applying these algorithms to the design-hardened storage cell architectures described in Chapter 4. The results obtained allowed us to assess the hardness variability induced by layout topology. This helped us identify, characterize and model dual-node upset mechanisms, and improve the fault tolerance characteristics of design-hardened CMOS latches using optimized topologies.
# 5.2 Pulsed Laser Testing

Accelerator testing represents the standard method used to characterize the sensitivity of modern device technology to single-event effects. However, it is expensive, not easily accessible, and has provided only limited spatial and temporal information. During the past few years, the pulsed (sub-nanosecond) laser has been successfully used as an alternative technique to characterize SEE behavior in both memory and logic circuits [56-58], and also as a tool to understand fundamental charge collection mechanisms in semiconductor devices [59].

A laser pulse excitation is applied to a sensitive node in a CMOS IC. The charge deposition process can be accurately controlled and synchronized with the circuit's operation, which makes it possible to characterize its electrical behavior. Laser-generated tracks in semiconductors are a good approximation to those generated by an ion strike. The laser technique provides information complementary to that obtained from accelerator testing, with some unique characteristics and capabilities that offer several distinct advantages over accelerator methods.

The laser equivalent LET for silicon based on one-photon absorption is approximately 5 MeV cm²/mg for an applied intensity of 1 GW/cm². The reference pulse energy considered is 100 pJ, with a pulse width of 10 ps, a laser spot of 1 $\mu$m², and a wavelength $\lambda = 1.06$ $\mu$m. At a wavelength $\lambda = 0.8$ $\mu$m in silicon, laser light would still have a 1/e penetration depth of approximately 10 $\mu$m. Such a penetration depth is sufficient for testing modern devices built on an epilayer (typically with a thickness on the order of 10 $\mu$m), since charge deposited in the highly doped substrate is not efficiently collected. A working wavelength between 0.80 and 0.85 $\mu$m (1.45-1.55 eV) is easily generated with conventional dye lasers, or with new titanium-sapphire laser technology. A relatively deep penetration distance (10-20 $\mu$m) is still preserved.
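The reference figures above can be cross-checked with a few lines of arithmetic. The sketch below is only an order-of-magnitude check: it assumes a rectangular pulse in time and a uniform intensity over the spot (a Gaussian pulse would give a somewhat different peak value), and the function names are illustrative.

```python
PLANCK_EV_S = 4.135667696e-15    # Planck constant in eV*s
LIGHT_SPEED_M_S = 2.99792458e8   # speed of light in m/s


def peak_intensity_w_per_cm2(pulse_energy_j, pulse_width_s, spot_area_cm2):
    """Average intensity during the pulse, assuming a top-hat pulse and spot."""
    return pulse_energy_j / pulse_width_s / spot_area_cm2


def photon_energy_ev(wavelength_um):
    """Photon energy E = h*c/lambda for a wavelength given in micrometres."""
    return PLANCK_EV_S * LIGHT_SPEED_M_S / (wavelength_um * 1e-6)


# Reference pulse from the text: 100 pJ, 10 ps, 1 um^2 spot (1 um^2 = 1e-8 cm^2).
intensity = peak_intensity_w_per_cm2(100e-12, 10e-12, 1e-8)
print(f"intensity ~ {intensity / 1e9:.2f} GW/cm^2")

for wavelength in (0.80, 0.85, 1.06):
    print(f"lambda = {wavelength:.2f} um -> {photon_energy_ev(wavelength):.2f} eV")
```

Under these assumptions the reference pulse corresponds to about 1 GW/cm², and the 0.80-0.85 $\mu$m window maps to roughly 1.46-1.55 eV, in line with the values quoted above.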
The amount of charge collected across a diode junction irradiated with a laser pulse depends on the total beam energy to the 4/3 power [56]. With the laser method we have control over both the laser wavelength and the laser pulse width (which adjusts the laser intensity).

The use of a pulsed laser beam as a test and hardness assurance tool to simulate single-event effects (SEE) such as upsets (SEU) or latch-up (SEL) in microelectronic devices has recently been analyzed by several researchers [56-57]. The published results show that a good correlation may be obtained between the sensitivity thresholds for equivalent phenomena induced by a laser pulse and by heavy ion impact. Pulsed laser measurements could then be used to replace costly and time-consuming heavy-ion ground tests. Laser measurement tools are very convenient: their setup time is very short, no vacuum is required for testing, the LET can be changed by changing the light intensity, and there is no ionizing radiation threat.

Laser beam measurements can assess the hardness level and may ensure a faster detection and diagnosis of radiation sensitivity weaknesses. The capability of positioning the laser beam with submicron accuracy improves the diagnosis and enhances the analysis results for individual transistors within the circuit. By adjusting the laser intensity, the laser pulse can be used to characterize sites on the same chip that have different sensitivities to radiation impact. Moreover, the laser pulse can be synchronized with the circuit clocks; it can thus provide temporal information and can be used to characterize the influence of dynamic circuit operation on its sensitivity to radiation.

Although laser pulses can damage microelectronic devices at high enough optical pulse energy or average intensity, most laser-induced SEE measurements are performed at a pulse energy or average optical intensity low enough that no laser-induced damage occurs.
This allows successive tests to be performed to simulate a wide range of particle energies without cumulative damage, such as the lattice displacement damage that occurs during heavy-ion testing. These capabilities make up a powerful diagnosis tool that we have used to validate and characterize radiation-hardened CMOS structures, to identify their weaknesses and to find their cures.

Nevertheless, laser testing also has a small number of limitations: it cannot provide an absolute measure of the SEE threshold, it gives no measure of the asymptotic cross-section, and it cannot access sensitive silicon areas covered by metal. With these limitations in mind, laser testing remains a powerful and effective technique to replace or complement heavy ion testing.

Simulation of heavy-ion induced SEP with a pulsed laser has proven of benefit in several recent studies, both on specially designed test chips and on commercially available components [58-59]. The results suggest that the pulsed laser can be used for hardness assurance measurements on devices with sensitive areas larger than the footprint of the laser beam [60].

# 5.2.1 Laser-Generated Charge Tracks in Silicon

A laser pulse used to perform SEE measurements must have a maximum intensity sufficiently small that linear optics adequately describes the light-matter interaction. In the linear regime, both laser beam propagation and loss are adequately described by the material's linear susceptibility. Laser spot size and material absorption are readily determined from the well-known linear refractive index, n (n=1.5 in the $SiO_2$ passivation layer, and n=3.65 in silicon), and the linear absorption coefficient, $\alpha$.

The most widely encountered laser beam is one where the radial intensity distribution is Gaussian. The propagation of Gaussian beams in linear media may be derived from the wave equation.
The beam radius $\omega(z)$ is determined by the point at which the laser intensity decays to 1/e of its on-axis value, and is given by the following equation [58]:

$$\omega^{2}(z) = \omega_{0}^{2} \left[ 1 + \left( \frac{\lambda z}{\pi \omega_{0}^{2} n} \right)^{2} \right] \tag{5.1}$$

where z is the propagation distance into the medium, $\omega_0$ is the 1/e radius of the laser spot at the surface (focus), and $\lambda$ is the wavelength.

Two parameters are important for SEE measurements. The first is the confocal length,

$$z_0 = \pi \omega_0^2 \frac{n}{\lambda} \tag{5.2}$$

representing the penetration length at which the beam diameter has expanded to $2^{1/2}$ times its value at the focus. The second is the laser spot diameter, $2\omega_0$, which is determined by the optical focusing apparatus, and is written as

$$2\omega_0 = 4f \frac{\lambda}{\pi D} \tag{5.3}$$

where f is the focal length of the lens used to focus the laser beam, and D is the diameter of the laser spot illuminated on the lens. The charge density (neglecting absorption) decreases by a factor of 2 at a penetration depth equal to the confocal length. For silicon, assuming a laser spot diameter at the surface of 1 $\mu$m and a wavelength of 0.8 $\mu$m (which are typical experimental values), the confocal length is about 3.6 $\mu$m (compared to $z_0 = 1$ $\mu$m in air).

In the linear regime, material absorption is described by Beer's law, which is expressed as

$$I(z) = I_0 \, e^{-\alpha z} \tag{5.4}$$

where $I_0$ is the intensity entering the absorbing material, and $\alpha$ is the linear absorption coefficient. For simplicity, both temporal and radial dependencies of $\alpha$ have been suppressed. Equations (5.2) and (5.4) indicate that both material absorption and spreading of the beam lead to a reduction in the charge density created by a laser pulse as it propagates into the device. In contrast to a laser pulse, an ion with a high penetration depth generates a relatively constant charge density.
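The numerical values quoted for the confocal length can be reproduced with a short sketch. It assumes the standard Gaussian-beam form $\omega^2(z) = \omega_0^2 [1 + (z/z_0)^2]$ with $z_0 = \pi \omega_0^2 n / \lambda$, takes $\omega_0 = 0.5$ $\mu$m for a 1 $\mu$m spot diameter, and uses the 10 $\mu$m penetration depth quoted earlier for $\lambda = 0.8$ $\mu$m as $1/\alpha$. The function names are illustrative.

```python
import math


def confocal_length_um(w0_um, wavelength_um, n):
    """Confocal length z0 = pi * w0^2 * n / lambda (all lengths in micrometres)."""
    return math.pi * w0_um ** 2 * n / wavelength_um


def beam_radius_um(z_um, w0_um, wavelength_um, n):
    """1/e beam radius at depth z for a Gaussian beam focused at the surface."""
    z0 = confocal_length_um(w0_um, wavelength_um, n)
    return w0_um * math.sqrt(1.0 + (z_um / z0) ** 2)


def relative_charge_density(z_um, w0_um, wavelength_um, n, absorption_depth_um):
    """On-axis charge density relative to the surface value.

    Combines beam spreading (intensity scales as 1/w^2) with Beer's law,
    where absorption_depth_um = 1/alpha.
    """
    spreading = (w0_um / beam_radius_um(z_um, w0_um, wavelength_um, n)) ** 2
    absorption = math.exp(-z_um / absorption_depth_um)
    return spreading * absorption


W0_UM, WAVELENGTH_UM = 0.5, 0.8  # 1 um spot diameter, 0.8 um wavelength
print(f"z0 in silicon: {confocal_length_um(W0_UM, WAVELENGTH_UM, 3.65):.2f} um")
print(f"z0 in air    : {confocal_length_um(W0_UM, WAVELENGTH_UM, 1.0):.2f} um")
for depth in (1.0, 3.6, 10.0):
    rho = relative_charge_density(depth, W0_UM, WAVELENGTH_UM, 3.65, 10.0)
    print(f"depth {depth:4.1f} um -> relative charge density {rho:.2f}")
```

With these inputs the confocal length comes out at about 3.6 $\mu$m in silicon and about 1 $\mu$m in air, and the on-axis density falls to half its surface value at $z = z_0$ when absorption is neglected.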
The ion produces a much more highly peaked radial charge distribution than the laser. The different radial charge distribution generated by the laser track can affect the total charge collected.

#### Surface Reflections

The amount of light entering the device depends on the nature of the device surface and on the presence of passivation layers. The transmission of light into the silicon substrate may be calculated by the formula [65]

$$I_T = I_0 \cdot \frac{T^2}{(1 - R)^2} \cdot \frac{1}{1 + F \sin^2(\Delta/2)} \tag{5.5}$$

where $I_T$ is the intensity transmitted to the silicon substrate, and $I_0$ is the intensity incident at the passivation layer. Also, $T = t_1 t_2$ and $R = r_1 r_2$ denote the transmittance and reflectance, respectively. Variations in passivation layer thickness cause the transmission to vary over a range of +/-16% about the median value. Typical +/-10% variations in oxide thickness induce a 7% variation in transmission, which is of no significant concern for SEE measurements.

#### Nonlinear absorption

At the high beam energies necessary for SEE experiments, the relationship between light intensity and carrier density is no longer linear. The phenomenon that appears is two-photon absorption (TPA). However, the majority of laser testing typically falls into a regime where the nonlinear contribution, $\beta I_0/\alpha$, is sufficiently small that TPA may be ignored.

A working wavelength of 1.06 $\mu$m for testing silicon devices has previously been considered an optimum choice because of the deep penetration depth (700 $\mu$m). In various recent works [58, 59, 66], a working wavelength around $\lambda = 0.8$ $\mu$m is argued to be a better choice.

The energy bandgap depends on the carrier density through two effects: bandgap renormalization and the Burstein-Moss shift [67]. High doping levels reduce the bandgap through its renormalization.
The Burstein-Moss shift increases the effective bandgap because carriers at the bottom of the conduction band in n-type material fill up states that are then not available for electron transitions from the valence band. The net effect is a bandgap reduction, which increases the absorption as the carrier density (doping) increases.

At very high carrier density, light may be absorbed efficiently by free carriers. Free carrier absorption (FCA) is important only when the plasma frequency is comparable to the frequency of light ($10^{15}$ s$^{-1}$). For silicon, it has been concluded [58] that FCA may be ignored. FCA may also be caused by carriers generated by the laser light itself: the leading edge of the laser pulse, if sufficiently intense, can generate carriers which then absorb an appreciable fraction of the trailing edge of the pulse.

#### 5.2.2 Laser vs. Ion Charge Collection

When a charge track with a density greater than the doping density of the semiconductor passes through a junction, the electric field associated with the junction is distorted, producing a funnel that gives rise to additional charge collection [55]. The track density determines the length of the funnel, which, in turn, determines the amount of charge collected in excess of that deposited in the depletion region. Funneling plays a role in charge collection for both ions and laser light [54]. In cases where funneling contributes a significant amount of charge to ion-induced upsets, larger equivalent LETs will be needed for the laser. This is one reason why the laser cannot replace the accelerator for determining *absolute* values of the SEE thresholds. At the high carrier density generated by an ion track, or by an intense laser pulse, Auger recombination is also expected to become significant.
# 5.3 SEE Hardness Analysis using a Pulsed Laser

The study and experiments described in this chapter concern the use of a pulsed laser to validate and compare SEU-hardened CMOS SRAM cell architectures on several CMOS circuit prototypes. In the past this analysis could not be completely performed using heavy ion tests, due both to the occurrence of latchup phenomena and to the architecture of the test vehicle. Indeed, the results given in [61] show that, until very recently, devices were more sensitive to SEL than to SEU for particles having a LET higher than 25 MeV cm²/mg. Owing to the small number of memory cells implemented in our prototypes (1 kbit RAM arrays and 64 to 2048 cells for each shift register bank), very high particle fluences were needed to have a chance of observing SEUs. In order to avoid total dose effects, we could exercise the SEU-hardened cell designs implemented on chip only partially. The measurements performed using the pulsed laser helped locate and explain the behavior of the latchup-sensitive areas. They allowed us to characterize the hardness performance of various storage cell designs for a wide range of impact energies.

#### Laser Measurement Setup

The operating principle of the equipment used to perform the laser measurements is shown in Figure 5.1. A Rhodamine 6G dye laser operated in a cavity-dumping mode is synchronously pumped by the output of an actively mode-locked, frequency-doubled Nd:YAG laser. The laser output parameters were: (a) 600 nm wavelength, (b) 180 mW average power, (c) 10 ps optical pulse duration and (d) 5 MHz pulse repetition rate out of the dye laser. An electro-optic shutter (EOS) was used to reduce the repetition rate of the pulses incident upon the device. The measurements were performed with the EOS operated in single-shot mode. After the EOS, a small portion of the beam is directed by a beam splitter (BS), placed in front of the periscope, to a fast silicon PIN photodiode (PD).
The transient electrical response of the PIN photodiode, whose amplitude is linearly proportional to the optical pulse energy, is monitored using a digital storage oscilloscope. The laser beam is then attenuated and passed through a periscope incorporating a microscope, so that the laser beam can be focused on the sample to a spot size of less than 2 $\mu$m [70,72].

Fig. 5.1 Block diagram of the laser system used for SEE analysis [59]. BS: beamsplitter, EOS: electro-optic shutter, M: mirror, PD: photodetector, SHG: optical second harmonic generating crystal, VA: variable optic attenuator.

A dye laser mirror is used in one portion of the periscope to allow incoherent light from a white light source to illuminate the sample collinearly with the laser beam. A second glass beam splitter in the periscope allows the image of the illuminated sample to be viewed through an eyepiece or imaged onto a CCD camera. Pictures of the illuminated portion of the device are obtained from the CCD output with the aid of a frame grabber. The position of the device is controlled using a two-axis computer-controlled micro-positioning system with a repeatability of $0.1~\mu m$.

# 5.3.1 Pulsed Laser Analysis of HIT Chip

Our first analysis concerned a previously designed prototype that includes several shift registers with upset-hardened CMOS storage cell designs [61]. The circuit failed the heavy ion tests due to an early latchup phenomenon. The analysis of the test circuits using laser pulses allowed the detection of topology-dependent upset mechanisms, specific to the layout topologies of several upset-hardened storage cell designs. These mechanisms involve the simultaneous excitation, with a single laser pulse, of two closely located sensitive nodes. When both nodes are flipped, the hardened cell is upset. A lateral bipolar conduction process collects residual charge in a closely located MOS transistor channel and inverts its state.
This dual-node upset mechanism has been detected using the 2 $\mu$m spot diameter of the laser pulse. Even though substantial areas of each cell were covered with metal, there were always areas close to the drains that could be pulsed with laser light. If the devices were sensitive to a single-node upset, we should be able to produce it by exciting the sample near the sensitive node. We did not observe such effects.

Figure 5.2. Layout of the HIT chip: The top area contains HIT1, HIT2 [61] and standard SRAM blocks, respectively. The bottom area contains the SRAM blocks using designs by Rockett [62] and Liu [63].

The prototype circuit used for pulsed laser testing has been designed for the characterization of two new hardened storage cell architectures, called HIT1 and HIT2 [61]. It has been fabricated using a 1.2 $\mu$m CMOS bulk-epi process from Thomson-TCS. The circuit (shown in Fig. 5.2) contains five register banks of 8x8 bits each. The first two banks are implemented using the two hardened storage cells, HIT1 and HIT2. The three additional sections are built, for comparative analysis, using conventional, unhardened memory cells (UMC section) and the design-hardened memory cells proposed in [62] (Rockett) and [63] (Liu). These designs are briefly described in Chapter 4 of the thesis.

Heavy ion tests performed on the HIT1 cell with low-energy 190 MeV argon ions (with a LET of 15 MeV cm²/mg) led to no upset when the cell was exposed to a total fluence of $3 \times 10^8$ particles/dice. For this LET, the implied SEU cross-section is less than $3 \times 10^{-9}$ cm². Higher energy particles could not be used, due to the circuit's sensitivity to latchup (cross-section of $2 \times 10^{-4}$ cm²/device for a LET of 25 MeV cm²/mg) [61]. Pulsed laser tests have been employed to evaluate the hardness features of the designed storage cells at higher energies.
They also helped identify the cause of the unexpected latchup at low particle energies in a commercial bulk-epi CMOS process that was supposed to be resistant to latchup.

#### **Experimental Conditions**

The dye laser system is used to generate pulses of 600 nm wavelength and 10 ps optical duration. It operates in single-shot mode for critical energy measurements at sensitive sites, or at a high repetition rate when the chip area is scanned for sensitive site detection. The absorption depth at which the intensity of the 600 nm light decreases to 1/e is 1.8 $\mu$m, which is satisfactory for the excitation of the buried well-to-substrate junctions. The focused laser spot is adjusted for each measured chip to a diameter of approximately 2 $\mu$m at the sample surface. The chip was connected to a test fixture that exercised all the registers. The device biases and clocks were controlled using a memory board interface tester.

Two measurement algorithms have been employed. A SEP detection algorithm was employed first, consisting of the exploration of the chip's surface with a scanning laser beam in the high repetition-rate mode. Large step increases in the pulse energy are used, to ensure quick detection and location of the single-event phenomena. A subsequent SEP characterization algorithm is then performed for each detected sensitive site. It consists in determining the laser pulse energy threshold that triggers the event, with the pulsed laser beam in single-shot mode. Upset and latchup thresholds are measured with small step increases in laser pulse energy, observing the minimum pulse energy ($E_0$) needed to produce the corresponding SEP. This measurement was performed for different locations in the sensitive site area. Also, by positioning the beam on the same location in several cells, only small variations in the pulse energy threshold (generally at most +/- 1 pJ) have been observed for the tests performed.
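The two-step procedure (coarse detection followed by fine characterization) can be sketched for a single beam position as follows. This is only an illustration under stated assumptions: `triggers_event` is a hypothetical callable standing in for "fire one pulse at this energy and check the registers", the step sizes are arbitrary, the 300 pJ ceiling is an assumed upper limit, and the spatial scan of the chip surface is not modelled. The sketch also assumes that an event occurs for every energy at or above the threshold.

```python
def first_triggering_energy(triggers_event, start_pj, stop_pj, step_pj):
    """Step the pulse energy upward and return the first level that triggers.

    Returns None if no tested level between start_pj and stop_pj triggers.
    """
    n_steps = int(round((stop_pj - start_pj) / step_pj))
    for i in range(n_steps + 1):
        energy_pj = start_pj + i * step_pj
        if triggers_event(energy_pj):
            return energy_pj
    return None


def characterize_site(triggers_event, max_pj=300.0, coarse_pj=5.0, fine_pj=0.5):
    """Estimate the threshold energy E0 of one site, in picojoules.

    Detection pass: large energy steps to find a triggering level quickly.
    Characterization pass: small steps between the last non-triggering
    coarse level and the first triggering one.
    """
    coarse_hit = first_triggering_energy(triggers_event, coarse_pj, max_pj, coarse_pj)
    if coarse_hit is None:
        return None  # site not sensitive up to max_pj
    return first_triggering_energy(
        triggers_event, coarse_hit - coarse_pj, coarse_hit, fine_pj
    )


# Simulated site with a hypothetical 11.5 pJ threshold.
e0 = characterize_site(lambda energy_pj: energy_pj >= 11.5)
print(f"estimated E0 = {e0} pJ")
```

The resolution of the estimate is set by the fine step; in a real measurement each call to `triggers_event` would also have to restore the cell state (or clear the latchup) before the next shot.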
The reflectivity of the sample was measured at the site of each sensitive node. The measured reflectivity was taken into account in reporting all laser pulse energy thresholds.

#### **5.3.2** Latchup Diagnostic

Our first experiment concerned the diagnosis of the sensitive locations responsible for the latchup at low incident energies. A single site, with a large sensitive area and a threshold energy of 11.5 pJ, has been identified in the peripheral read circuit of the storage cells, in all 5 register banks (Fig. 5.3). The analysis of the layout showed that this sensitive area is located between two lines of N and P transistors having a minimum spacing of 3.3 $\mu$m (not violating the design rules) between the well edge and the N-transistor drain junctions (Fig. 5.4). Two U-shaped contact diffusions encircle the MOS transistor area. The P-MOS transistors are located in the upper side, and the N-MOS transistors in the lower side. The two A markers in Fig. 5.4 indicate the horizontal separation edge between the well and the substrate. The latchup-sensitive site detected by the laser in the center of the A-A line is located 16 $\mu$m from the substrate contact diffusion and 22 $\mu$m from the well contact diffusion.

Figure 5.3. Laser-induced latchup site (arrow) in the peripheral read circuit of the HIT1 cell.

Figure 5.4. Layout drawing of the read circuit with the latchup site (X).

Figure 5.5. Rockett cell structure sensitive to SEL.

This large spacing, which is typically encountered in most existing commercial standard cell designs, represents a design weakness for radiation environment applications. Based upon previous measurements of laser-induced and energetic-particle-induced SEL in CMOS test structures [59], we have found that the energy threshold of the laser pulse correlates well with the LET of 25 MeV cm²/mg obtained during heavy ion tests for this single, dominant latchup site.
Figure 5.6 Layout drawing of the latchup sensitive area of the Rockett cell.

Several additional latchup sites with much less sensitive areas (i.e., higher energy thresholds) have also been identified in other parts of the circuit. As an example, the site whose threshold is closest to that of the previously described one, with an energy threshold of 14 pJ, has been found inside the Rockett cell (see Figures 5.5 and 5.6). Here, the A-A critical edge of the well area separates two vertical chains of N-MOS and P-MOS transistors. The analysis of the design revealed the same minimum 3.4 µm well spacing to the N+ diffusion, and maximum spacings of 13 µm and 18 µm to the substrate and well contact junctions, which are drawn as black rectangles inside the storage cell for easy visibility. By correlating the differences in well to N-drain spacing with the variations in the energy threshold for different laser target points, we obtain useful data for deriving typical design improvement rules that ensure latchup-free operation at the highest impact energies we could attain (i.e., about 300 pJ).

# **5.3.3 SEU Sensitivity Analysis**

Subsequent search algorithms have been applied to characterize the SEU hardness of the storage cells. A low-speed scan of each storage cell area has been combined with a "directed search" algorithm exploring the transistor drains detected as upset-sensitive and their surrounding areas. As expected, four SEU-sensitive sites were detected in the standard, non-hardened UMC cell, but SEU-sensitive sites were also detected in the HIT1 cell (Fig. 5.7), which was intended to be SEU immune [113].

Figure 5.7. Four laser-induced SEU sites in the HIT1 cell. Sites 1 and 4 correspond to SEU with the cell in the "high" state. Sites 2 and 3 correspond to SEU with the cell in the "low" state.

Figure 5.8. Microphotograph of SEU site 1 shown in Figure 5.7. Note that the image is vertically mirrored.
All the measurements have been performed in static mode (i.e., the laser pulse excitation is applied to the storage cell during its idle state). No significant difference has been observed in this case when performing the tests in the dynamic mode. The results obtained are presented in Table 5.1.

| Cell         | Provoked event | Site | Energy threshold |
|--------------|----------------|------|------------------|
| Read circuit | SEL            |      | 11.5 pJ          |
| Rockett      | SEL            |      | 14 pJ            |
| Std (UMC)    | SEU            | 1    | 5.5 pJ           |
| Std (UMC)    | SEU            | 2    | 15.5 pJ          |
| Std (UMC)    | SEU            | 3    | 11.5 pJ          |
| Std (UMC)    | SEU            | 4    | 16.0 pJ          |
| HIT1         | SEU            | 1    | 12.0 pJ          |
| HIT1         | SEU            | 2    | 16.5 pJ          |
| HIT1         | SEU            | 3    | 33.0 pJ          |
| HIT1         | SEU            | 4    | 21.0 pJ          |
| LIU          | SEU            | QP   | 11.5 pJ          |
| LIU          | SEU            | QN   | 35.7 pJ          |

Table 5.1. Laser-induced SEP test results.

The conventional cross-coupled inverter cell used as reference showed different upset thresholds at each of the four sensitive sites (i.e., two sites for each logic state). A relatively large difference in the SEU threshold of the two N-transistor drains has been observed. It is explained by the metal covering of the sensitive area and by the asymmetric spacing to the substrate contact diffusion.

The most important task has been to identify the mechanism of the upset that occurred at a relatively low incident energy in a storage cell supposed to be immune to upsets. Moreover, since the HIT1 cell architecture and topology are symmetrical, the detection of a single SEU-sensitive site also required further investigation. The schematic diagram of the HIT1 storage cell is presented in Figure 5.9.

Figure 5.9 Schematic diagram of the HIT1 cell. P3...P6 and N3-N4 provide the SEU immunity

In the layout drawing of the storage cell presented in Fig. 5.7, we have indicated as "1" the location of the initially detected SEU site.
The correlated analysis of both the layout and the schematic showed that the upset phenomenon occurred when the laser pulse excited the area neighbouring the P2 transistor drain (i.e., node Q\* on the schematic in Fig. 5.9). Simulation has shown this node to be insensitive to upset under any level of collected charge. However, an upset could occur if both nodes Q\* and A are inverted. This is possible if residual charge is collected at node A when a particle hit or an incident laser pulse traverses a sensitive region located between nodes Q\* and A, and enough charge is collected at the two nodes to invert their state. This turns on both the N1 and N3 transistors, inverting the logic state at node Q and flipping the cell.

Excitation of a region of the sample with either a laser pulse or an energetic particle produces a dense, highly non-equilibrium electron-hole plasma. This plasma, the source of SEP, evolves rapidly as carriers recombine, diffuse from regions of high density, and drift under the effects of built-in, applied, or SEP-induced fields. Thus, the ability of charge generated by either a laser pulse or an energetic particle to produce a dual-node upset is strongly affected by the topology of the circuit.

Electrical simulations with HSPICE using two charge injection sources at nodes Q\* and A also confirmed that a much lower critical charge is required at node A (i.e., 0.15 pC, compared to 0.55 pC at node Q\*) in order to raise its potential to the switching threshold of transistor N3 and flip the cell. Three-dimensional device simulations reported in [64] for charge collection at two adjacent node junctions spaced 4 $\mu m$ apart in a bulk CMOS silicon substrate showed a relatively high charge collection effectiveness at the secondary node (about 15%) when a particle hit occurs at a distance of 2.8 $\mu m$. In our case, the spacing between the two sensitive junctions is 7.4 $\mu m$.
This represents the length of two transistor channels and the width of the common P+ source junction connected to $V_{DD}$. A simplified cross-section of the sensitive area is presented in Figure 5.10. Surprisingly, the P+ source junction between the two sensitive nodes, instead of acting as a barrier to positive charge diffusion through the n-well to the P+ drain of node A, helps enhance the charge collection process at this node. The ionization currents driven by the vertical bipolar transistor between the common source junction and the substrate trigger a leakage current through the lateral bipolar channel transistor to node A. This node has been brought into a floating state by the high transition at node Q\* (connected to the gate of the P3 transistor, which links node A to the WR clock line). Additionally, the positive transition at node Q\* injects a positive transient pulse at node A through the gate-to-drain coupling capacitance.

The two symmetrical pairs of sensitive nodes identified in the layout drawing in Figure 5.7 have subsequently been exercised with tightly focused laser beam pulses. Upsets have been detected for different pulse energies applied at the sites marked 1-4 in Figure 5.7.

Figure 5.10 Cross-sectional view of the area sensitive to dual node upset mechanism.

The threshold energy levels are presented in Table 5.1. Sites 2 and 4 are better covered with metal, which explains their higher SEU threshold energies. It should be noted that although site 3 exhibits an open drain area (i.e., not covered by metal) of 2.2 µm x 2 µm, its sensitivity to dual-node SEU generation is significantly lower than that of site 1. This is explained by the fact that the higher charge collection effectiveness at the secondary node A when the laser target location is "1" represents the critical phenomenon that triggers the upset.
The node Q\* alone is easily flipped by pulse energies lower than the critical double-node upset pulse energy, similar to those needed to flip the unhardened storage cell.

The other three SEU-hardened cell designs have also been characterized using the pulsed laser beam, this time starting from the analysis of the schematic and the layout and the prior selection of the pairs of sensitive sites to exercise. Two of the other cell designs, HIT2 and Rockett, have been found insensitive to dual-node upset mechanisms both through layout analysis and experimentally, using the pulsed laser. The pairs of simultaneously sensitive nodes in these cells are widely spaced and cannot be upset by a single particle hit or by a laser pulse.

The other SEU-hardened cell (the Liu cell) has two pairs of simultaneously sensitive nodes that could be flipped with relatively low energy laser pulses. They are more widely spaced than in the HIT1 cell, since the channel length of one of the transistors is much larger than the minimum. The sensitive node pairs are similarly separated by the two channels and the common source, a topology that should be avoided. Pulsed laser tests performed on the primary sensitive node areas triggered the upset at higher impact energies, shown in Table 5.1, validating the dual-node upset mechanism experimentally. It should also be noted that previous ground tests performed on a latchup-free test chip implementation [63] found the Liu cell less hardened to upsets than the other cell designs, without identifying the cause. The proximity of the bulk contact diffusion both to the sensitive drains and to the impact point of the laser pulse has proven essential in determining the upset energy threshold.

We performed a similar analysis for the two SEU sites identified in the LIU cell (see Figure 5.11). The cell is composed of a pair of pseudo-CMOS inverter latches with PMOS and NMOS latching feedback operation, respectively, as shown in Figure 5.12.

Figure 5.11.
Two laser-induced SEU sites in the LIU cell. The two sites correspond to SEU with the cell in the "high" state.

The simultaneously sensitive node pairs are QP - Y and QN - X. They induce the upset at the complementary nodes QP and QN, which consist exclusively of p-channel MOS transistor drains and n-channel MOS transistor drains, respectively. To keep the layout compact, transistors P3-P7 and N1-N7 are connected in a common source configuration that favors dual-node charge collection from ionization tracks induced by a single particle. The intrinsic SEU hardness provided by the cell's logic design is thus severely limited.

Figure 5.12 Schematic diagram of LIU cell.

Figure 5.13 presents the critical collected charge characteristics for dual node upsets in the HIT1 and LIU cells. The points (x, y) on the graph to the right of each curve represent the combinations of collected charges at the two sensitive nodes that upset the cell. This area delimits the upset sensitivity ranges for the two nodes.

Figure 5.13 Critical charge collection distribution for dual node upsets at HIT1 and LIU cells.

It can easily be seen that the critical charge of the secondary node (i.e., the y value for the horizontal zone of the two curves) is lower than the critical charge of the primary node (i.e., the x value for the vertical zone of the curves in Figure 5.13). The critical charge value depends on transistor sizes, node capacitances, inverter type (i.e., P-MOS or N-MOS) and feedback interconnects. Each of the two lines (solid and dashed) in Fig. 5.13 separates the plane into two regions: points on the left side represent collected charges that do not induce upsets, and points on the right side represent collected charges that lead to upset. We note that they asymptotically approach an infinite slope along the vertical axis at a charge of Q1 = 0.274 pC for the Liu cell and at Q1 = 0.8 pC for the HIT1 cell.
Thus, the HSPICE simulation shows that regardless of the charge injected into the Q1 node, no upset will occur unless at least that much charge is injected simultaneously into the Q2 node. Consequently, the solid lines represent transitions in a phase space. For charge injection (Q1, Q2) to the right of and above each line, the cell will upset. For charge injection (Q1, Q2) to the left of and below each line, no upset will occur.

Further measurements can be performed to generate detailed sensitivity maps for the areas surrounding the sensitive drains. These measurements allow quantitative estimation of the effectiveness of laser tests targeted at areas surrounding the sensitive junctions and the transistor channel areas. They can further provide an effective means of using the pulsed laser for fast, low-cost hardness analysis and validation of ASIC designs using high density submicron CMOS processes.

The results show the effectiveness of pulsed laser testing in validating SEP-robust circuit architectures and in uncovering problems not detectable using particle-beam measurements. An SEU mechanism has been detected that occurs in upset-immune storage cell designs when two simultaneously sensitive nodes are closely located because of the layout topology. This explains the variations in the SEU cross-section among various SEU-hardened designs. Limitations in the effectiveness of these design hardening techniques have been identified. They occur if appropriate topological constraints are not enforced, especially in applications using high density submicron technologies.

Qualitative and quantitative results are also presented on laser test experiments that use MOS transistor channels and the surroundings of drain junctions as the targeted sensitive areas. Small variations have been measured for repetitive measurements on similar locations.
This suggests that the laser could also be used for hardness assurance testing in high density technologies or in designs with limited direct access to the sensitive locations, provided that reliable indirect access is available to stimulate the lateral edges of the sensitive areas.

# 5.4 Design Analysis and Optimization of SEU-Hardened CMOS Latches

When analyzing the SEU hardness of a new CMOS storage cell based on latch redundancy, laser beam simulation helped us detect and investigate topology-dependent upset mechanisms caused by charge collection at two sensitive nodes, using laser excitation between the nodes. This led us to devise compact upset-immune device topologies in order to achieve the high immunity levels required in upset-critical applications. Device-level simulation has been used to confirm the laser experiment results. We have also developed and characterized a diffusion-based dual-node charge collection model that helped us implement a set of design rules for topology-related hardness to dual-node upsets.

# **5.4.1 Upset Sensitivity Analysis**

Downscaling SEU-hardened CMOS technologies for increased complexity and improved performance also leads to increased sensitivity to upsets. The critical charge collected from a heavy ion strike that upsets a memory cell decreases with the square of the feature size [51]. Additional upset mechanisms such as bipolar amplification [52] and direct channel conduction [53] further increase the upset sensitivity of submicron CMOS circuits and reduce the effectiveness of both process and design hardening techniques.

Logic-design-hardened storage elements based on latch redundancy (LR cells), described in Chapter 4, have the potential to achieve high levels of immunity against upset in submicron CMOS technologies. They store the information in two latch structures, with cross-coupled state-restoring feedback interconnects.
However, as previously shown in Section 5.3, a single event upset in these intrinsically SEU-immune memory structures can be associated with a pair of simultaneously sensitive nodes that store the same logic value in each of the two latch sections. This 'dual-node' upset may occur for each of the two logic states of the cell, when the charge collected at both nodes ($D_1$, $D_2$) exceeds the critical values ($Q_{c1}$, $Q_{c2}$) required to invert their logic state. If the upset-sensitive node pairs are appropriately spaced and isolated (typically by other device junctions and/or by substrate contact diffusions), the probability of collecting the critical charge at both nodes from a single particle impact can be reduced to insignificant values. Then, whatever the electrical charge collected at a perturbed node in one latch, the cell recovers its initial state, which is preserved in the second latch.

The critical charge required to upset a node is a function of several design parameters: transistor sizes and ratios, drain areas and interconnects. These parameters define the node capacitance, the collection volume and the charge removal effectiveness. The design-hardened storage cells analyzed in Section 5.3 use ratioed designs and differently sized latch structures, leading to asymmetric critical charge characteristics (i.e., $Q_{c1} > Q_{c2}$). Their upset-sensitive areas are centered close to the node with the higher critical charge [70]. Topological changes must also focus on the least sensitive drain area in order to reduce the upset sensitivity.

In this chapter we present a new investigation performed on two prototypes using the DICE cell design [68,69] described in Chapter 4. The analysis shows the existence of a larger number of sensitive node pairs than in the previous asymmetric designs. Two distinct layout topologies have been investigated, with and without spacing and isolation constraints for the sensitive drains.
Test results obtained with pulsed laser excitation on the constraint-free cell design reveal the existence of an additional pair of simultaneously sensitive nodes, corresponding to the transistor drains of the write buffer circuit. Laser excitation of the layout-conscious LR cell design indicates a higher sensitivity of the transistor drains inside the well for laser hits on the well-to-substrate junction close to the two drains. Based on these results, we derived specific topology selection criteria and layout design methodologies for reliable upset immunity.

# **5.4.2 DICE Prototype Circuit Description**

The schematics of the DICE SRAM cell used in the memory prototype and of the DICE latch structure with clocked inverters used in the register array prototype are presented in Figure 5.14. The four nodes of the DICE cell form a pair of latches in two alternate ways, depending on the stored logic value.

Figure 5.14 DICE hardened storage elements: a) Memory cell b) Latch

In Figure 5.14a, the adjacent node pairs A-B and C-D have active cross-feedback connections and form two-transistor, state-dependent half-latch structures. The other adjacent node pairs, B-C and D-A, have inactive feedback connections (i.e., "off" transistors) which isolate the two latching pairs. Hence, two "non-adjacent" nodes are logically isolated and must both be inverted in order to upset the cell.

The two circuit prototypes have been designed and fabricated in a 0.8 µm CMOS bulk/epi process from AMS. The first prototype is a 2K bit CMOS SRAM circuit composed of two sections, one using standard 6-transistor non-hardened SRAM cells and the other using DICE hardened cells. The second prototype chip comprises three shift registers. One of the registers is built from standard, unhardened latches. The other two registers use two different DICE cell topologies, with and without transistor size and topology constraints, respectively. They are called "optimized" and "non-optimized" DICE cells throughout this chapter.
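The statement that a single perturbed node recovers while two non-adjacent nodes must both be inverted to upset the cell can be illustrated with a small logic-level model. The sketch below is a simplification under stated assumptions, not a circuit simulation: it assumes the commonly described DICE connectivity in which node i is pulled high by a P transistor gated by node i-1 and pulled low by an N transistor gated by node i+1 (nodes A, B, C, D mapped to indices 0 to 3), and it assumes that a floating node or a node in contention simply keeps its value. The names `settle` and `strike` are hypothetical.

```python
from typing import Dict, List, Sequence


def settle(state: Sequence[int], forced: Dict[int, int], max_iter: int = 16) -> List[int]:
    """Iterate the 4-node ring synchronously until no node changes.

    Assumed connectivity: node i has a PMOS pull-up gated by node i-1 and an
    NMOS pull-down gated by node i+1. Nodes listed in `forced` are held at the
    given value (they model drains clamped by the ionization transient).
    """
    nodes = list(state)
    for i, value in forced.items():
        nodes[i] = value
    for _ in range(max_iter):
        nxt = nodes[:]
        for i in range(4):
            if i in forced:
                continue
            p_on = nodes[(i - 1) % 4] == 0   # pull-up conducts when its gate is low
            n_on = nodes[(i + 1) % 4] == 1   # pull-down conducts when its gate is high
            if p_on and not n_on:
                nxt[i] = 1
            elif n_on and not p_on:
                nxt[i] = 0
            # both off (floating) or both on (contention): keep the stored value
            # (simplifying assumption of this sketch)
        if nxt == nodes:
            break
        nodes = nxt
    return nodes


def strike(state: Sequence[int], hit_nodes: Sequence[int]) -> List[int]:
    """Invert the hit nodes during the transient, then release them and let the cell settle."""
    forced = {i: 1 - state[i] for i in hit_nodes}
    during_transient = settle(state, forced)
    return settle(during_transient, {})


stored = [0, 1, 0, 1]                      # nodes A, B, C, D
for node in range(4):
    print("single hit on node", node, "->", strike(stored, [node]))
print("dual hit on A and C ->", strike(stored, [0, 2]))
print("dual hit on B and D ->", strike(stored, [1, 3]))
```

In this model every single-node hit returns to the stored state, whereas a simultaneous hit on either non-adjacent pair flips the cell, which is the dual-node mechanism examined in the rest of this section. Real contention levels depend on transistor ratios, so the sketch says nothing about critical charge.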
# 5.4.3 Heavy Ion and Pulsed Laser Test Results

The SEU immunity of the prototypes has been tested at the 88-inch cyclotron of Lawrence Berkeley Laboratory, Berkeley, CA. Under exposure at various particle energies (see Table 5.2), we obtained a LET threshold of 50 MeV cm²/mg for the DICE cells, compared to less than 10 MeV cm²/mg for the unhardened cell. It should be noted that a LET threshold of 30 MeV cm²/mg is considered the typical limit of SEU hardness for space applications. The sensitive cross section characteristic obtained experimentally with heavy ion tests is presented in Figure 5.15. After these preliminary heavy ion tests, we used pulsed laser excitation in potentially sensitive areas to investigate the upset mechanisms that may limit the immunity of the DICE cell.

Figure 5.15 Measured effective cross section characteristic of DICE SRAM prototype

The picosecond pulsed laser equipment described earlier in this chapter was used to investigate the SEU sensitivity at different impact sites for a wide range of equivalent energies. The laser beam simulates the effects of a particle impact with an angle of incidence spanning a lateral distance of up to 2 $\mu m$ inside the sensitive volume. The spot diameter can also be increased at will, in order to investigate wider ionization effects that simulate heavy ion impacts on silicon at grazing angles. The absorption depth of the 600 nm light, at which its intensity falls to 1/e of its surface value, is 1.8 $\mu m$, which is satisfactory for the excitation of the 3 $\mu m$ buried well-to-substrate junctions. A scanning method described in [70] has been used to identify the SEU-sensitive locations. An energy threshold of 16.1 pJ has been measured, which corresponds to a LET of 48.1 MeV cm²/mg, based on previously published correlation results [59]. This agrees well with the 50 MeV cm²/mg LET threshold found with heavy ion tests.
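The 50 MeV cm²/mg point of Table 5.2 was obtained with a tilted Kr beam. As a quick consistency check, the sketch below applies the usual cosine law for effective LET, $LET_{eff} = LET / \cos\theta$, to the normal-incidence Kr value. That this law was used to derive the tilted-beam entries is an assumption on our part (the text does not say so), and the function name `effective_let` is hypothetical.

```python
import math


def effective_let(let_normal: float, tilt_deg: float) -> float:
    """Effective LET for a beam tilted by tilt_deg from normal incidence (cosine law)."""
    return let_normal / math.cos(math.radians(tilt_deg))


KR_LET_NORMAL = 41.0  # MeV cm^2/mg, Kr at 0 degrees, taken from Table 5.2

for tilt in (0, 24, 35):
    print(f"Kr at {tilt:2d} deg: {effective_let(KR_LET_NORMAL, tilt):.1f} MeV cm^2/mg")
```

Rounded to the nearest integer, the 24° and 35° values reproduce the 45 and 50 MeV cm²/mg entries of Table 5.2. The cosine law presumes a thin sensitive volume, so it is only a first-order estimate for deep charge collection.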
Four SEU sites, two for each logic state, were detected for the non-optimized DICE cell topology. The mechanisms found responsible for the detected upsets are charge collection processes at two sensitive transistor drains of non-adjacent cell nodes, resulting from a single ion hit between the drain areas. In the optimized cell layout, two SEU sites, with higher energy thresholds, have been identified. They are located at the lateral edges of the well-to-substrate junction. The corresponding laser energy thresholds are given in Table 5.3.

| Ion species / angle   | Ar / 0°   | Cu / 0°   | Kr / 0°   | Kr / 24° | Kr / 35° | Xe / 0°  |
|-----------------------|-----------|-----------|-----------|----------|----------|----------|
| LET [MeV cm²/mg]      | 15        | 30        | 41        | 45       | 50       | 63       |
| Cross-section [cm²]   | <7.6 E-08 | <1.3 E-07 | <9.5 E-09 | <9 E-09  | 2.9 E-06 | 1.3 E-05 |

Table 5.2 Heavy ion test results for DICE SRAM

| SEU Site    | DICE-0a | DICE-1a | DICE-0b | DICE-1b | DICE_OPT-0 | DICE_OPT-1 |
|-------------|---------|---------|---------|---------|------------|------------|
| Energy [pJ] | 19.1    | 16.1    | 23      | 81.5    | 92.4       | 58         |

Table 5.3 Laser test results for DICE-REG chip

# 5.4.4 Upset-Sensitive Topologies in Non-Optimized DICE Cells

The topological analysis of the SEU-sensitive locations in the non-optimized DICE cell reveals two typical configurations for the two simultaneously sensitive drains. The first configuration (Fig. 5.16a) is made up of two transistors sharing a common source node. In this case, the dual-node charge collection can be significantly reduced using simple topology changes to avoid SEU occurrence. This is achieved by splitting the common source node in two and by isolating the two halves using a bulk-biasing diffusion (BBD), as shown in the upper drawing of Figure 5.16a. As a rule of thumb, transistor-level bulk biasing near the source junction avoids the bipolar amplification effects.
It significantly reduces the collected charge at the drain, mainly when the source node is directly connected to the power supply. BBD insertion inherently adds area overhead, which is reasonably small in most cases.

The second upset-sensitive topology configuration identified with our laser is shown in Fig. 5.16b. Two locations (DICE-0b and DICE-1b) have been identified, depending on the logic state of the cell. Here, one of the two transistors is located between the sensitive drains. The lowest critical charge area detected with the laser beam is located in the bulk, between drain D2 and source S1 of the opposite transistor. In this case, the source nodes of the two transistors (i.e., transistors N6-N7 in the write buffer of Figure 5.14b) are interconnected with metal, but they are not directly connected to the power supply. Hence, they cannot be effectively isolated by means of a substrate contact diffusion.

Figure 5.16: Upset-sensitive dual drain configurations: a) Common source and b) Isolated drain

#### **5.4.5 Topology Optimization for Upset Immunity**

The topological hardening solution (depicted in the upper layout drawing of Fig. 5.16) consists in turning the transistor channel to the opposite direction and isolating the two drains with a bulk bias diffusion. The accumulated cell-level area overhead for the two topological hardening solutions is about 9%.

The two layout configurations found sensitive to upset occur for both p-channel and n-channel transistor pairs. Other design-hardened cell implementations [61-63,71] have a single pair of sensitive drain areas for each logic state (i.e., either p-channel or n-channel transistor drains, but not both). Hence, they need fewer topology constraints to reach upset immunity. On the other hand, they have the drawbacks of critical charge asymmetry, larger area and higher complexity.
#### 5.4.6 Upset Mechanism in the Optimized DICE Cell

The proposed hardened topologies presented in Figure 5.16 have been implemented in the optimized DICE cell. For this cell we used a compact design with the minimal transistor sizes and critical layout spacings allowed by the technology. The simplified layout of the cell is presented in Fig. 5.17. The scanning algorithm using the laser beam revealed two upset-sensitive regions, DICE\_OPT-0 and DICE\_OPT-1, with high energy thresholds. These regions are both located at the lateral edge of the well-to-substrate junction. The p-transistor drain regions inside the n-well are widely spaced and isolated by n<sup>+</sup> BBD areas. The SEU mechanism consists of the delayed diffusion of majority carriers from the well-to-substrate ionization to the sensitive drain junctions.

Figure 5.17 Simplified layout of DICE\_OPT cell showing the upset-sensitive areas

An effective topological solution to avoid this upset mechanism consists of splitting the well area into two halves, each one housing one of the two sensitive drains. This 'split well' topology has little impact on area overhead in storage cell arrays, where cell rows are abutted at the well areas.

## 5.4.7 Topological Modeling of Dual Node Charge Collection

Dual-node upsets can be modeled and analyzed by device simulation and pulsed laser characterization on the typical dual-node circuit topology presented in Figure 5.18. The charge collection effectiveness of each of the two transistor drains, D1 and D2, is modeled as two functions $f_{c1}$, $f_{c2}$ of the spacing distances $X_{d1}$, $X_{d2}$ from the impact point to the drain junction. The influence of each drain area and perimeter size should also be considered.
Functions $f_{c1}$, $f_{c2}$ represent the ratio of the node-collected charge to the charge $Q_I$ injected at the impact site:

$$Q = Q_I \cdot f_c(X_d, \theta) \tag{5.6}$$

They depend on the ion angle of incidence $\theta$ and are assumed to be independent of the ion's linear energy transfer (LET). A bipolar amplification function (which depends significantly on the ion LET) is also modeled for transistor channels placed laterally to, or interposed between, the two drains. In Figure 5.18, only the channel D1-S1 of the first transistor, which is placed laterally, is considered for modeling the bipolar amplification effect.

Figure 5.18 Sample dual drain topology for charge collection modeling

The initial collection function at the drains (i.e., without taking into account the effect of adjacent device geometry) is assumed to be complete (i.e., $f_c = 1$) for ion hits in the drain area. It is assumed to decrease exponentially with the distance *x* between the ion impact site and the drain-body junction edge (i.e., the space charge area):

$$f_c = G \cdot e^{-x/D} , \tag{5.7}$$

where D is the diffusion length. As a first approximation, we consider the $X_d$ distances to be small enough that the effects of finite diffusion time may be neglected. The additional G function accounts for the effect of both charge collection at the second drain and the angle of incidence of the ion trajectory. It can also model the influence of process and device parameters such as drain area and geometry, doping level, junction depth, thickness of the space charge region, etc.

Figure 5.19 Charge collection distribution functions for asymmetric (a) and symmetric (b) storage cells

The charge sharing between the two nodes by diffusion is modeled by a function $f_s(X_{d1}, X_{d2})$ of the two distance vectors.
The impact on the charge collection function of other device geometries interposed between the two drains can also be modeled by multiplying coefficient functions. Function $f_a(X_a, X_s)$ models the bipolar charge amplification produced by a transistor channel connected to the collection node and located near the ion impact site. The charge removal effect of the bulk biasing diffusion (BBD) is also modeled as a function of two distance vectors, $f_r(X_d, X_b)$. The effect of other closely located transistors on charge collection effectiveness can also be modeled by charge collection sharing functions. The charge collected at the two drains in Figure 5.18 is then given by the equations:

$$Q_{d1} = Q_I \cdot f_{c1}(X_{d1}) \cdot f_s(X_{d1}, X_{d2}) \cdot f_a(X_a, X_s) \cdot f_r(X_{d1}, X_b) \tag{5.8}$$

$$Q_{d2} = Q_I \cdot f_{c2}(X_{d2}) \cdot f_s(X_{d2}, X_{d1}) \cdot f_r(X_{d2}, X_b) \tag{5.9}$$

The critical charge $Q_c$ of a node depends both on the collection function and on circuit parameters such as node capacitance, active transistor current drive at the perturbed drains, and the logic threshold and delay characteristics of the state-restoring feedback loop.

In a symmetrical implementation of the DICE cell, the critical charge value is identical at the two simultaneously sensitive drains ($Q_{c1} = Q_{c2} = Q_c$), and the most sensitive upset location is midway between the drains. We have confirmed this result with the laser beam test at the DICE-0a and DICE-1a locations, for n-channel and p-channel transistors respectively. For a given LET value, $Q_I(LET)$ is obtained from the equation:

$$Q_I = \frac{q\rho}{W_p} \cdot LET \cdot \frac{t_{Si}}{\cos \theta}, \tag{5.10}$$

where $\rho$ is the silicon density, $W_p = 3.6\ eV$ is the elementary energy absorbed by silicon to generate an electron-hole pair, and q is the elementary charge.
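Equation (5.10) can be evaluated numerically as in the minimal sketch below. The silicon density of 2.33 g/cm³ is a standard value that the text does not state explicitly, $t_{Si}$ is interpreted here as the thickness of the silicon layer traversed at normal incidence (an assumption, since the text does not define it), and the function name and the 2 µm example thickness are hypothetical.

```python
import math

Q_E = 1.602176634e-19     # elementary charge q [C]
RHO_SI_MG_CM3 = 2330.0    # silicon density rho [mg/cm^3], i.e. 2.33 g/cm^3 (assumed standard value)
W_PAIR_EV = 3.6           # energy W_p per electron-hole pair [eV], from the text


def injected_charge_pc(let_mev_cm2_mg: float, t_si_um: float, theta_deg: float = 0.0) -> float:
    """Injected charge Q_I [pC] from Eq. (5.10): Q_I = (q*rho/W_p) * LET * t_Si / cos(theta)."""
    ev_per_cm = let_mev_cm2_mg * 1e6 * RHO_SI_MG_CM3      # deposited energy per unit length [eV/cm]
    pairs_per_um = ev_per_cm / W_PAIR_EV * 1e-4           # electron-hole pairs per micron (1 um = 1e-4 cm)
    path_um = t_si_um / math.cos(math.radians(theta_deg))  # path length through the layer
    return pairs_per_um * path_um * Q_E * 1e12            # coulomb -> picocoulomb


# Hypothetical example: 2 um of silicon at the three LET values discussed in Section 5.4.3.
for let in (10.0, 30.0, 50.0):
    print(f"LET = {let:4.0f} MeV cm^2/mg -> Q_I = {injected_charge_pc(let, 2.0):.3f} pC")
```

With these constants, an LET of 1 MeV cm²/mg deposits roughly 0.01 pC per micron of track, which is a convenient figure for comparing injected charge with the sub-picocoulomb critical charges quoted in Section 5.3.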
The collected charge at the two drains is:

$$Q_c = Q_{d1} = Q_{d2} = Q_I G e^{-\frac{d}{2D}}, \tag{5.11}$$

where d is the distance between the two drains, $D_1$ and $D_2$. From this equation, we can obtain the minimum distance between the two drains that avoids the upset for the given LET and $Q_I(LET)$:

$$d_{\min\_sym} = 2D \ln \frac{Q_I G}{Q_C} \tag{5.12}$$

For design-hardened cells having asymmetric characteristics, $Q_{c1} > Q_{c2}$ and the minimum distance between the two drains is:

$$d_{\min\_asym} = 2D \ln \frac{Q_I G}{\sqrt{Q_{C1} Q_{C2}}} \tag{5.13}$$

The most sensitive site is shifted towards the drain with the higher critical charge, to a point dividing the distance d into two segments having $\sqrt{Q_{c1}/Q_{c2}}$ ratio. The equivalent critical charge is reduced by the same ratio, compared to the symmetrical cell. A graphical representation of the charge collection distribution functions for the symmetric and asymmetric storage element characteristics is presented in Figure 5.19.

Figure 5.20 Dual-node charge collection mechanisms: (a) Drift/funneling + diffusion, (b) Diffusion, (c) Diffusion + Bipolar amplification, (d) Diffusion + Direct channel conduction

Detailed models of the topology-related charge collection functions $f_c$, $f_s$, $f_a$, $f_r$ are to be defined from both extensive device simulation and laser test characterization on a future prototype implementing various differently spaced topologies. A graphical representation of the four basic mechanisms of dual-node charge collection to be analyzed is presented in Figure 5.20.

Based on qualitative and quantitative results of both laser tests and device simulation, a set of design rules and optimized algorithms for automatic placement and reliable device spacing can then be defined. This will lead to fast and reliable development of first-silicon design-hardened ASICs with high immunity to upsets for critical applications.
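The spacing rules of Eqs. (5.11) to (5.13) lend themselves to a short numerical sketch. The code below is illustrative only: the diffusion length, the gain G and the injected charge are hypothetical placeholders (the model parameters are still to be extracted, as noted above), while the two asymmetric critical charges reuse the 0.55 pC and 0.15 pC values obtained by HSPICE for the HIT1 cell in Section 5.3.3. Clamping negative results to zero is our own addition, meaning that no spacing constraint is needed when the injected charge cannot reach the critical charge.

```python
import math


def collected_charge(q_inj: float, gain: float, x: float, diff_len: float) -> float:
    """Charge collected at a drain located at distance x from the impact, Eqs. (5.6)-(5.7)."""
    return q_inj * gain * math.exp(-x / diff_len)


def d_min_sym(q_inj: float, gain: float, q_crit: float, diff_len: float) -> float:
    """Minimum drain spacing for a symmetric cell, Eq. (5.12)."""
    return max(0.0, 2.0 * diff_len * math.log(q_inj * gain / q_crit))


def d_min_asym(q_inj: float, gain: float, q_c1: float, q_c2: float, diff_len: float) -> float:
    """Minimum drain spacing for an asymmetric cell, Eq. (5.13)."""
    return max(0.0, 2.0 * diff_len * math.log(q_inj * gain / math.sqrt(q_c1 * q_c2)))


# Hypothetical parameters: charges in pC, lengths in um.
Q_INJ, GAIN, DIFF_LEN = 1.0, 1.0, 5.0

d_sym = d_min_sym(Q_INJ, GAIN, 0.3, DIFF_LEN)
d_asym = d_min_asym(Q_INJ, GAIN, 0.55, 0.15, DIFF_LEN)
print(f"symmetric cell,  Qc = 0.3 pC:            d_min = {d_sym:.2f} um")
print(f"asymmetric cell, Qc1/Qc2 = 0.55/0.15 pC: d_min = {d_asym:.2f} um")

# Consistency check: at spacing d_min, a mid-point hit collects exactly Qc at each drain.
print(collected_charge(Q_INJ, GAIN, d_sym / 2.0, DIFF_LEN))
```

In this form the required spacing grows only logarithmically with the injected charge but linearly with the diffusion length, which is why isolation structures that shorten the effective diffusion path are the more efficient layout measure.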
#### 5.5 Conclusion

In this chapter we have presented fault tolerance analysis, diagnosis and optimization techniques for a representative class of redundant circuits designed to operate in a space radiation environment, based on laser beam and heavy ion test results. Effective analysis strategies based on pulsed laser testing are described, to validate SEP-robust circuit architectures and to uncover problems not detectable using particle-beam measurements.

SEU mechanisms have been detected that occur in upset-immune storage cell designs with topology-dependent, closely located, simultaneously sensitive dual nodes. This explains the variations in the SEU cross-section among various SEU-hardened designs. Limitations in the effectiveness of design hardening techniques have been identified. They occur if appropriate topological constraints are not enforced, especially in applications using high-density submicron technologies. These topological constraints have been modeled and formalized in order to define a set of specific design rules for upset-hardened storage elements, to be embedded in a dedicated CAD framework.

Device-level simulation of dual-node charge collection phenomena has been used to validate a charge diffusion collection model for dual-node upsets. Qualitative and quantitative results are also presented on laser test experiments using MOS transistor channels and drain junction surroundings as the target sensitized areas. Only small variations have been observed between repeated measurements at similar locations. This suggests that the laser could also be used for hardness assurance testing in high-density technologies or in designs with limited direct access to the sensitive locations, provided that a reliable indirect access is available to stimulate the lateral edges of the sensitive areas. Design optimization solutions are suggested, which should have little impact on both circuit size and performance.
A topological model is proposed for charge collection phenomena in general dual-node structures, based on device-level charge diffusion modeling and simulation. Extensive device simulations and laser pulse characterization on test structures are further required to accurately determine the model parameters. Design rules and algorithms can then be developed for automatic topology selection and device placement in design-hardened CMOS storage cells for upset-critical applications.

Future work will focus on refining and formalizing the topology models and the design rules for inclusion in standard CAD and simulation tools. Further test experiments are also scheduled for dynamic upset sensitivity characterization of both the DICE prototypes and the DEEP1 prototype chip described in Chapter 4, implemented in a 0.25 μm CMOS process. Another intended analysis concerns the sensitivity characterization and calibration of the embedded current monitors of the SEU-tolerant static RAM described in Chapter 3. This ongoing analysis has not yet led to significant results because of measurement difficulties encountered with the available resources (i.e., either inadequate laser equipment or the complex measurement setup required). Future laser tests are scheduled at the IXL Laboratory in Bordeaux on newly installed picosecond-range pulsed laser test equipment with a 1 μm spot diameter.

Both heavy ion tests and upset simulation using a pulsed laser beam fail to provide a reliable test of all the redundant storage elements in a complex memory or ASIC. Permanent faults located in the memory elements may invalidate their SEU hardness without affecting normal circuit operation, thus reducing application reliability.

# Chapter 6

# **Concluding Remarks and Future Directions**

New approaches to CMOS circuit and system design for on-chip testing, performance, reliability and transient fault tolerance have been described in this thesis.
Optimized design techniques are adopted for built-in current sensors (BICS) to cope with the drastic limitations of speed and accuracy in deep submicron CMOS. Their functionality is extended throughout the ASIC product life cycle to on-line testing, transient fault monitoring and adaptive performance control. The increase in BICS complexity and transistor count is compensated by improved performance and functionality, and by simplified control and calibration requirements. The area overhead incurred by the current measurement part is negligible compared to the dominant area contribution of the bypass transistor switch, except for the specific case of low-activity CUT partitions such as memory cell arrays.

Synergy effects of the correlated implementation of controlled power supply switching for low power operation and supply current monitoring for $I_{DDQ}$ testing are described in Chapter 2. A hierarchical $I_{DDQ}$ test modeling approach has been adopted for $I_{DDQ}$ test synthesis and optimization. This modeling strategy will be further developed in close correlation with the IP power models to be selected for adoption by the VSIA standardization bodies. It will also be made compatible with the emerging IEEE P1481 delay and power calculation standard. BICS circuits devised as configurable, ready-to-use macrocells and modeled as controlled power supply interconnects are proposed for inclusion in the design automation steps for power simulation, floorplanning and power supply distribution optimization. This will allow subsequent development of $I_{DDQ}$ test synthesis tools and system-level constrained optimization strategies for low power control, $I_{DDQ}$ monitoring, adaptive performance control, safety protection and reconfiguration functions.

The upset-tolerant static RAM design based on current monitoring described in Chapter 3 achieves good detection sensitivity with negligible impact on memory access time.
However, the significant rate of false alarms observed experimentally and explained by mixed-mode 3D simulation is unavoidable even assuming tight BICS calibration, and thus reduces the effectiveness of this approach.

Upset hardening techniques based on redundant logic design, presented in Chapter 4, provide a reliable and cost-effective alternative to the process-related hardened designs currently applied to commercial products. Simple and flexible algorithms for memory array conversion have been developed and experimentally validated on two circuit prototype designs. They can be effectively used to adapt and convert existing memory generators for upset-immune SRAM synthesis according to the widest range of specifications for memory organization, functionality and interface. Such a memory generator tool, added to the developed library of upset-hardened sequential cells, can subsequently be used to synthesize fast and reliable SEU-immune ASIC designs. An experimental validation of this approach is planned on a space communication modem ASIC for the Spanish NANOSAT experiment, scheduled for launch at the end of 2000. This task will be performed in cooperation with CNM in Barcelona and the Spanish Space Agency in Madrid.

Validation of the upset-hardened cell library on a prototype chip designed in the 0.6 µm CMOS-epi process from AMS is also scheduled for early 2000. Laser test and characterization of topology spacing and isolation techniques for dual-node upset immunity are planned on this prototype, in cooperation with IXL Bordeaux. This will allow us to accurately quantify the topological model parameters for the dual-node charge collection functions presented in Chapter 5 and to validate a set of design rules for dual-node upset immunity to be embedded in a dedicated design kit.
Upset-tolerant memory architectures based on cell-level redundancy are shown to offer significant potential for extended functionality for specific multiport operation and security access control. Their extension to transient fault-containment in combinational logic can also provide robust tolerance to multiple transient faults without performance loss in random sequential logic.
Antoniadis, "Transistor Sizing Issues and Tool for Multi-Threshold CMOS Technology," Proceedings 1997 IEEE/ACM Design Automation Conference, pp. 409-414.
- [160] S. Mutoh, S. Shigematsu, Y. Gotoh and S. Konaka, "Design Method for MTCMOS Power Switch for Low Voltage High-Speed LSIs," 1998 IEEE ASPDAC Proceedings, pp. 113-116.
- [161] T. Kawahara et al., "Subthreshold Current Reduction for Decoded-Driver by Reverse Biasing," IEEE Journal of Solid State Circuits, Vol. 28, No. 11, Nov. 1993, pp. 1136-1143.

# ANNEX A

# SEU-TOLERANT SRAM PROTOTYPE CIRCUIT USING ON-LINE CURRENT MONITORING

## Circuit Specifications

- Process: AMS CMOS/epi 1.2 µm
- CAD Tool: Cadence Edge
- Core area: $890 \times 3420\ \mu\text{m}^2$
- Cell area: $49.6 \times 19.8\ \mu\text{m}^2$
- BICS area: $49.6 \times 135.6\ \mu\text{m}^2$

## Circuit Description

- Memory organization: 1k bit, 128 lines x 8 columns
- Access time: 21 ns (18 ns on 0.8 µm CMOS)
- Supply voltage range: 3.5 ... 6 V
- Error detection delay: 16 ns
- BICS sensitivity: 130 fC ($V_{DD}$-referenced BICS), 80 fC ($V_{SS}$-referenced BICS)

Figures:

- Chip microphotograph. On the lower left side of the RAM array: two current pulse injection circuits for current-mode BIST operation.
- Detailed microphotograph of $V_{DD}$-referenced BICS rows (upper photo) and $V_{SS}$-referenced BICS rows (lower photo) embedded into the RAM cell array.
- Layout drawing of a 4-cell module and detailed view of the substrate isolation technique for supply current monitoring.
- Layout drawings of the $V_{DD}$-referenced and $V_{SS}$-referenced BICS circuits.
- Measured access time waveforms.
- Current pulse injection circuit: layout drawing and detail of chip microphotograph.

# ANNEX B

# "ST-RAM" IC PROTOTYPE: SEU-TOLERANT SRAM USING "DICE" REDUNDANT CELL

## Circuit Specifications

- Process: AMS 0.8 µm CMOS/epi
- CAD Tool: Cadence OPUS 4.3.3
- Chip area: $1529 \times 3031\ \mu\text{m}^2$
- Core area: $974 \times 2340\ \mu\text{m}^2$
- Memory
Organization: 1k x 1 bit (Std.) + 512 x 1 bit (DICE)
- Std. cell area: $36.2 \times 14.9\ \mu\text{m}^2$
- DICE cell area: $36.2 \times 29.8\ \mu\text{m}^2$
- Access time: 18 ns ($V_{DD}$ = 5 V, T = 25 °C)
- Supply voltage range: 3 ... 6 V

## Circuit Description

The ST-RAM chip contains two CMOS static RAM arrays, STD and DICE, organized as 8 columns by 128 rows (STD) and 64 rows (DICE). The two arrays share common control logic and address decoders, but have isolated data I/O circuits. The DICE array has redundant cells of double size that are addressed in a special decoding mode that activates two separate, adjacent word lines. This dual word line selection architecture allows the same row decoder to select memory cells in both the STD and DICE memory blocks. A read operation from the DICE cells can be performed by activating either one of the two word lines. A write operation is performed by activating both word lines simultaneously. A special addressing mode, described below, has been implemented in the row decoder to allow the parallel activation of two adjacent word lines.

Two control signals, memory enable (ME) and write enable (WEB), are used to perform the read/write operations. A memory access starts on the rising ("high") transition of the ME pulse. The logic state of the write enable signal selects the type of operation (read or write) performed during the current memory access. When WEB is "low" (WEB = 0), a write operation is selected: the data available at the two input lines (DI\_STD and DI\_DICE) during the ME pulse are written to the memory cells selected by the address inputs. The address is latched internally on the ME rising edge. A read cycle is started on the rising edge of ME if the write enable signal is "high" (WEB = 1). The data read from the memory are available at the DO\_STD and DO\_DICE outputs after the access time delay (typically 18 ns).
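The access rules described above can be illustrated with a minimal behavioral sketch. It is not the chip's logic: the class and method names (`DiceCell`, `AccessPort`, `drive`) are hypothetical, address decoding and timing are left out, and the behavior of a write with only one word line active is a simplifying assumption (the sketch simply keeps the old value).

```python
class DiceCell:
    """One redundant DICE cell served by two adjacent word lines (sketch)."""

    def __init__(self):
        self.value = 0

    def read(self, wl_a, wl_b):
        # Per the description, a read works with either word line active.
        if not (wl_a or wl_b):
            raise ValueError("no word line active")
        return self.value

    def write(self, data, wl_a, wl_b):
        # Per the description, a write activates both word lines together.
        if wl_a and wl_b:
            self.value = data & 1
        # Assumption: with a single word line the sketch keeps the old value;
        # the real single-word-line write behavior is not modeled here.


class AccessPort:
    """Applies ME/WEB to a single cell; address decoding is left out."""

    def __init__(self, cell):
        self.cell = cell
        self.data_out = None   # models a DO_* output, undefined before a read
        self._me_prev = 0

    def drive(self, me, web, data_in=0, wl_a=True, wl_b=True):
        # An access starts only on the rising edge of ME.
        rising = me == 1 and self._me_prev == 0
        self._me_prev = me
        if rising:
            if web == 0:       # WEB low selects a write
                self.cell.write(data_in, wl_a, wl_b)
            else:              # WEB high selects a read
                self.data_out = self.cell.read(wl_a, wl_b)
        return self.data_out


port = AccessPort(DiceCell())
port.drive(me=1, web=0, data_in=1)           # write 1 with both word lines
port.drive(me=0, web=1)                      # ME returns low, no access
value = port.drive(me=1, web=1, wl_b=False)  # read through one word line
```

In this sketch the final read returns the bit stored by the dual-word-line write even though only one word line is active, which mirrors the read/write asymmetry stated in the circuit description.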
A mode select control signal, MDS, selects either the normal addressing mode (MDS = 1) or a special DICE addressing mode (MDS = 0). The DICE mode makes it possible to write into the DICE cells by simultaneously activating two topologically adjacent decoder outputs.

A read operation from the DICE cell array in normal addressing mode selects each DICE cell twice, once for the even and once for the odd row address in the same column. A0-A2 are the column address lines, and the A3 input is the LSB of the row decoder address, so the two addresses that select the same DICE cell share A0 to A2 and A4 to A9. Write operations in the DICE cell array in normal mode (MDS = 1) make it possible to implement BIST algorithms for both permanent and transient failure detection in the redundant storage element array.

In DICE mode, the row decoder selects two topologically adjacent rows ($W_i$, $W_{i+1}$), allowing simultaneous access to the 4 nodes of the DICE structure. In the STD block, the DICE addressing mode selects two adjacent cell locations in parallel. A write operation to the STD block in DICE mode loads two consecutive locations with the same data. A read operation from the STD block in DICE mode is valid only if the two cells addressed in parallel store the same data; otherwise the contents of one of the two cells are corrupted.

A test algorithm for this hybrid memory architecture consists of typical march test sequences in both addressing modes, e.g.:

1) Write operation in DICE addressing mode. The DICE array is loaded with a test pattern. The STD array stores each data vector at two consecutive memory locations.
2) Read operation in DICE mode. The test pattern is read in sequence from the DICE array. Only the even addresses in the STD array are checked.
3) Write operation in normal addressing mode. The STD block is fully loaded with a test pattern.
The contents of the DICE memory locations are modified depending on their previous contents and the written data (i.e., only the write-1 function is active in normal mode).
4) Read operation in normal mode. The memory locations in the DICE block are read twice, once for each address of a pair that differs only in the value of A3. Various test patterns applied in normal mode cover a wide range of permanent and transient faults in the DICE array. The test pattern stored in the STD block is fully checked.

Using successive march sequences of read/write cycles with adequately selected patterns in both normal and DICE mode, we can implement high-coverage tests for transient and permanent faults in the redundant DICE memory array.

#### Static SEU testing vs. dynamic SEU testing

Conventional SEU tests are performed while the memory array is not accessed (i.e., in static mode). For large memory arrays with hierarchical block-level architectures, the addressed cells represent a small percentage of the memory capacity, and their sensitivity to upsets is not significantly affected during either write or read cycles. The control and decoding circuits at the periphery generally use larger transistors and are less sensitive to SEU. However, the upset mechanisms during memory accesses might be significant for high-density, low-power submicron SRAMs in applications with a high "activity level" (i.e., with very frequent memory accesses). For DICE storage cell applications in high-speed, pipelined sequential control logic, the analysis of dynamic behavior is mandatory in order to assess the hardness against SEU, since the hardness of the DICE cell is lost while the cell is being accessed. Dynamic upsets induced in SEU-hardened memory arrays during active access cycles have a much lower probability of occurrence, but an increased probability of low latency.
Read operations performed in normal mode avoid dynamic upsets and significantly increase the immunity to low-latency upsets in high-activity redundant SRAM arrays.

The STD block in the test chip uses a different design hardening technique to cope with SEU: each column of the memory array has current monitoring circuits (built-in current sensors, BICS) on the two supply lines. These current sensors detect the transient current pulses induced when a heavy particle hits a cell in that column and inverts its logic state. Radiation tests have shown that the BICS circuits also detect weaker transient currents that do not flip the cell (false alarms).

Figures:

- Chip microphotograph and layout drawing.
- ST-RAM timing diagram for read/write operation.
- Read waveforms measured at 50 MHz clock frequency. A four-cycle loop read test sequence (R1/R0/R1/R0) is applied.

# ANNEX C

# "ST-REG" PROTOTYPE IC: SEU-TOLERANT SHIFT REGISTER ARRAY USING 1.2 µm CMOS/EPI PROCESS

## Circuit Specifications

- Process: AMS BiCMOS 1.2 µm Bulk/epi
- CAD Tool: Cadence DFII V.4.4.3
- Chip area: $2191 \times 1331\ \mu\text{m}^2$
- Core area: $1583 \times 788\ \mu\text{m}^2$
- Cell areas:
  - STD cell: $86.4 \times 66\ \mu\text{m}^2$
  - DICE cell: $159.4 \times 66\ \mu\text{m}^2$
  - Optimized DICE F/F cell: $137.4 \times 39.6\ \mu\text{m}^2$

## Circuit Description

Three shift registers with separate input, output and clock terminals:

- Reg. 1: 64 bit, using SEU-hardened DICE standard cell D flip-flops
- Reg. 2: 56 bit, using standard, unhardened D flip-flops
- Reg. 3: 16 bit, using DICE flip-flops with optimized layout

The shift operation is activated on the trailing edge of the clock signal.

Output delay:

- Reg. 1: 8.3 ns
- Reg. 2: 7.3 ns
- Reg. 3: 10.9 ns

Figures:

- Chip layout and microphotograph.
- Layout of standard CMOS DFF, DICE standard cell and optimized DICE latch.

# ANNEX D

# "DEEP1" TEST CHIP: SEU IMMUNE REGISTER CELL LAYOUT WITH ENCLOSED TRANSISTOR GEOMETRIES IN 0.25 µm CMOS PROCESS

Standard D Flip-Flop Cell Layout.
Layout of Redundant D-Flip-Flop Cell using a Single DICE Latch.

# ANNEX E

# SEU-HARDENED 0.6 µm CMOS FLIP FLOP LIBRARY CELLS: PRELIMINARY SPECIFICATIONS

#### Comparison Tables for Std. and SEU-Hardened Flip-Flop Cells

Table E.1 Area and Power Dissipation Comparison (load capacitance L = 0.1 pF)

| Cell name | Area Std [mils²] | Area RH [mils²] | Power Std [µW/MHz] | Power RH [µW/MHz] |
|---|---|---|---|---|
| DFC | 1.35 | 4.8 | 11.6 | 19.1 |
| DFE | 2.03 | 4.8 | 13.2 | 21.7 |
| DFSA | 2.0 | 5.3 | 4.6 | 18.4 |
| DFS9 | 1.76 | 5.1 | 4.26 | 18.0 |

Table E.2 CK-Q Output Delay Comparison ($t_{pd}$ in ns)

| Cell name | Std, $C_L$ = 0.1 pF | RH, $C_L$ = 0.1 pF | Std, $C_L$ = 0.7 pF | RH, $C_L$ = 0.7 pF | Std, $C_L$ = 1.0 pF | RH, $C_L$ = 1.0 pF |
|---|---|---|---|---|---|---|
| DFC | 0.55 | 0.64 | 1.26 | 1.11 | 1.60 | 1.33 |
| DFE | 0.58 | 0.66 | 1.26 | 1.13 | 1.58 | 1.35 |
| DFSA | 1.57 | 0.67 | 2.98 | 1.15 | 3.68 | 1.37 |
| DFS9 | 1.44 | 0.62 | 2.85 | 1.12 | 3.50 | 1.35 |

Table E.3 Worst-Case vs. Typical Power Dissipation in RH-Cells (Pd in µW/MHz, L = 0.1 pF)

| Cell name | typ. | worst-case |
|---|---|---|
| DFC-RH | 19.1 | 33.7 |
| DFE-RH | 21.7 | 38.3 |
| DFSA-RH | 18.4 | 32.6 |
| DFS9-RH | 20.8 | 34.0 |

Table E.4 Worst-Case vs. Typical Propagation Delays in RH-Cells (delay C to Q, $t_{pd}$ in ns)

| Cell name | typ., $C_L$ = 0.1 pF | worst-case, $C_L$ = 0.1 pF | typ., $C_L$ = 0.7 pF | worst-case, $C_L$ = 0.7 pF | typ., $C_L$ = 1.0 pF | worst-case, $C_L$ = 1.0 pF |
|---|---|---|---|---|---|---|
| DFC-RH | 0.64 | 1.42 | 1.11 | 2.35 | 1.33 | 2.80 |
| DFE-RH | 0.66 | 1.46 | 1.13 | 2.41 | 1.35 | 2.84 |
| DFSA-RH | 0.67 | 1.45 | 1.15 | 2.50 | 1.37 | 2.98 |
| DFS9-RH | 0.62 | 1.41 | 1.12 | 2.37 | 1.35 | 2.88 |

### DFC-RHT+ 0.6 µm CMOS

DFC-RH is a fast static, Radiation Hardened to SEU, master-slave D flip-flop with 1x drive strength.

Truth table:

| D | C | Q | QN |
|---|---|---|---|
| H | ↑ | H | L |
| L | ↑ | L | H |
| X | ↓ | no change | no change |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.023 |
| C1 | 0.028 |
| C2 | 0.029 |

Power: 15 µW/MHz. Area: 4.75 mils².

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 27 °C, VDD = 5.0 V, TYPICAL PROCESS)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C1 to Q | tpdcqr | 0.64 | 1.11 | 1.33 | 0.69 | 1.17 | 1.38 |
| | tpdcqf | 0.66 | 1.09 | 1.27 | 0.66 | 1.10 | 1.28 |
| Delay C1 to QN | tpdcqnr | 0.76 | 1.70 | 2.16 | 0.80 | 1.75 | 2.20 |
| | tpdcqnf | 0.84 | 1.86 | 2.34 | 0.83 | 1.87 | 2.35 |
| Output Slope C1 to Q | op\_slcqr | 0.42 | 1.37 | 1.83 | 0.42 | 1.36 | 1.85 |
| | op\_slcqf | 0.41 | 1.00 | 1.30 | 0.41 | 1.01 | 1.31 |
| Output Slope C1 to QN | op\_slcqnr | 0.20 | 1.05 | 1.46 | 0.20 | 1.04 | 1.45 |
| | op\_slcqnf | 0.26 | 1.31 | 1.82 | 0.26 | 1.33 | 1.84 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | 0.4 |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 0.65 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.16 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 0.4 |

### DFC-RHT+ 0.6 µm CMOS

DFC-RH is a fast static, Radiation Hardened to SEU, master-slave D flip-flop with 1x drive strength.

Truth table:

| D | C | Q | QN |
|---|---|---|---|
| H | ↑ | H | L |
| L | ↑ | L | H |
| X | ↓ | no change | no change |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.029 |
| C1 | 0.035 |
| C2 | 0.037 |

Area: 4.75 mils². Power: 15 µW/MHz.

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 125 °C, VDD = 4.5 V, TYPICAL PROCESS)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C1 to Q | tpdcqr | 1.42 | 2.35 | 2.80 | 1.50 | 2.43 | 2.88 |
| | tpdcqf | 1.42 | 2.14 | 2.44 | 1.48 | 2.22 | 2.48 |
| Delay C1 to QN | tpdcqnr | 1.60 | 3.26 | 4.06 | 1.68 | 3.34 | 4.15 |
| | tpdcqnf | 1.81 | 3.74 | 4.63 | 1.87 | 3.82 | 4.70 |
| Output Slope C1 to Q | op\_slcqr | 0.82 | 1.75 | 3.29 | 0.82 | 2.43 | 3.26 |
| | op\_slcqf | 0.77 | 1.83 | 2.33 | 0.77 | 1.81 | 2.33 |
| Output Slope C1 to QN | op\_slcqnr | 0.36 | 2.37 | 2.53 | 0.36 | 1.80 | 2.54 |
| | op\_slcqnf | 0.46 | 2.26 | 3.13 | 0.47 | 2.24 | 3.15 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | 1.00 |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 1.80 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.29 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 1.00 |

### DFE-RHT+ 0.6 µm CMOS

DFE-RH is a fast static, Radiation Hardened to SEU, master-slave D flip-flop with 1x drive strength. RESET is asynchronous and active low.

Truth table:

| RN | D | C | Q | QN |
|---|---|---|---|---|
| L | X | X | L | H |
| H | H | ↑ | H | L |
| H | L | ↑ | L | H |
| H | X | ↓ | no change | no change |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.024 |
| C1 | 0.028 |
| C2 | 0.030 |
| RN | 0.063 |

Area: 4.77 mils². Power: 20 µW/MHz.

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 27 °C, VDD = 5.0 V, Typical Process)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C to Q | tpdcqr | 0.66 | 1.13 | 1.35 | 0.72 | 1.18 | 1.40 |
| | tpdcqf | 0.68 | 1.11 | 1.29 | 0.69 | 1.12 | 1.29 |
| Delay C to QN | tpdcqnr | 0.78 | 1.72 | 2.18 | 0.83 | 1.76 | 2.24 |
| | tpdcqnf | 0.85 | 1.88 | 2.36 | 0.86 | 1.89 | 2.38 |
| Delay RN to Q | tpdrnq | 0.52 | 0.91 | 1.07 | 0.88 | 1.26 | 1.42 |
| Delay RN to QN | tpdrnqn | 0.68 | 1.66 | 2.13 | 1.05 | 2.02 | 2.49 |
| Output Slope C to Q | op\_slcqr | 0.42 | 1.36 | 1.85 | 0.42 | 1.38 | 1.87 |
| | op\_slcqf | 0.41 | 1.01 | 1.33 | 0.42 | 1.02 | 1.33 |
| Output Slope C to QN | op\_slcqnr | 0.20 | 1.03 | 1.46 | 0.20 | 1.03 | 1.47 |
| | op\_slcnqf | 0.26 | 1.32 | 1.85 | 0.26 | 1.32 | 1.84 |
| Output Slope RN to Q | op\_slrnq | 0.36 | 0.94 | 1.24 | 0.42 | 0.97 | 1.28 |
| Output Slope RN to QN | op\_slrnqn | 0.25 | 1.31 | 1.85 | 0.26 | 1.31 | 1.86 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | 0.50 |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 0.60 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.16 |
| Min RN Setup Time to C1 | Low | tsurnc | 0.00 |
| Min RN Hold Time to C1 | Low | thrnc | 0.26 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 0.50 / 0.50 |
| Min RN Width | Low | twrn | 0.64 |

### DFE-RHT+ 0.6 µm CMOS

DFE-RH is a fast static, Radiation Hardened to SEU, master-slave D flip-flop with 1x drive strength. RESET is asynchronous and active low.
Truth table:

| RN | D | C | Q | QN |
|---|---|---|---|---|
| L | X | X | L | H |
| H | H | ↑ | H | L |
| H | L | ↑ | L | H |
| H | X | ↓ | no change | no change |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.029 |
| C1 | 0.035 |
| C2 | 0.037 |
| RN | 0.074 |

Area: 4.77 mils². Power: 20 µW/MHz.

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 125 °C, VDD = 4.5 V, Typical Process)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C to Q | tpdcqr | 1.46 | 2.41 | 2.84 | 1.54 | 2.48 | 2.93 |
| | tpdcqf | 1.45 | 2.17 | 2.47 | 1.52 | 2.24 | 2.53 |
| Delay C to QN | tpdcqnr | 1.84 | 3.32 | 4.09 | 1.72 | 3.40 | 4.21 |
| | tpdcqnf | 1.84 | 3.79 | 4.70 | 1.91 | 3.85 | 4.76 |
| Delay RN to Q | tpdrnq | 1.05 | 1.71 | 1.96 | 1.54 | 2.19 | 2.45 |
| Delay RN to QN | tpdrnqn | 1.43 | 3.30 | 4.14 | 1.92 | 3.77 | 4.65 |
| Output Slope C to Q | op\_slcqr | 0.84 | 2.39 | 3.16 | 0.82 | 2.43 | 3.18 |
| | op\_slcqf | 0.79 | 1.84 | 2.38 | 0.79 | 1.83 | 2.35 |
| Output Slope C to QN | op\_slcqnr | 0.37 | 1.79 | 2.46 | 0.36 | 1.83 | 2.51 |
| | op\_slcnqf | 0.47 | 2.29 | 3.20 | 0.48 | 2.26 | 3.15 |
| Output Slope RN to Q | op\_slrnq | 0.71 | 1.70 | 2.22 | 0.72 | 1.71 | 2.24 |
| Output Slope RN to QN | op\_slrnqn | 0.46 | 2.24 | 3.12 | 0.45 | 2.22 | 3.10 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | 1.20 |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 1.60 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.49 |
| Min RN Setup Time to C1 | Low | tsurnc | 0.00 |
| Min RN Hold Time to C1 | Low | thrnc | 0.90 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 1.20 / 1.20 |
| Min RN Width | Low | twrn | 1.11 |

### DFSA-RHT+ 0.6 µm CMOS

DFSA-RH is a static, Radiation Hardened to SEU, master-slave Scan D flip-flop with 1x drive strength. SCAN ENABLE switches between normal DATA input and SCAN DATA input. RESET is asynchronous and active low.

Truth table:

| RN | SE | SD | D | C | Q | QN |
|---|---|---|---|---|---|---|
| H | L | X | H | ↑ | H | L |
| H | L | X | L | ↑ | L | H |
| H | H | H | X | ↑ | H | L |
| H | H | L | X | ↑ | L | H |
| H | X | X | X | ↓ | no change | no change |
| L | X | X | X | X | L | H |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.17 |
| C1 | 0.028 |
| C2 | 0.030 |
| RN | 0.063 |
| SD | 0.012 |
| SE | 0.028 |

Area: 5.32 mils². Power: 22 µW/MHz.

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 27 °C, VDD = 5.0 V, Typical Process)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C to Q | tpdcqr | 0.67 | 1.15 | 1.37 | 0.84 | 1.20 | 1.43 |
| | tpdcqf | 0.68 | 1.11 | 1.29 | 0.70 | 1.10 | 1.31 |
| Delay C to QN | tpdcqnr | 0.78 | 1.73 | 2.19 | 0.84 | 1.77 | 2.25 |
| | tpdcqnf | 0.85 | 1.89 | 2.37 | 0.87 | 1.88 | 2.40 |
| Delay RN to Q | tpdrnq | 0.55 | 0.99 | 1.16 | 0.80 | 1.25 | 1.42 |
| Delay RN to QN | tpdrnqn | 0.73 | 1.76 | 2.25 | 0.98 | 2.02 | 2.52 |
| Output Slope C to Q | op\_slcqr | 0.41 | 1.37 | 1.90 | 0.20 | 1.34 | 1.91 |
| | op\_slcqf | 0.41 | 1.01 | 1.31 | 0.41 | 1.01 | 1.32 |
| Output Slope C to QN | op\_slcqnr | 0.21 | 1.03 | 1.45 | 0.44 | 1.06 | 1.48 |
| | op\_slcnqf | 0.26 | 1.31 | 1.86 | 0.27 | 1.32 | 1.85 |
| Output Slope RN to Q | op\_slrnq | 0.41 | 1.05 | 1.32 | 0.53 | 1.09 | 1.36 |
| Output Slope RN to QN | op\_slrnqn | 0.27 | 1.31 | 1.82 | 0.27 | 1.31 | 1.84 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | 0.50 |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 1.00 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.00 |
| Min SD Setup Time to C1 | High / Low | tsusdch / tsusdcl | 0.80 |
| Min SD Hold Time to C1 | High / Low | thsdch / thsdcl | 0.00 |
| Min SE Setup Time to C1 | High / Low | tsusech / tsusecl | 0.60 |
| Min SE Hold Time to C1 | High / Low | thsech / thsecl | 0.00 |
| Min RN Setup Time to C1 | Low | tsurnc | 0.00 |
| Min RN Hold Time to C1 | Low | thrnc | 0.20 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 0.50 |
| Min RN Width | Low | twrn | 0.40 |

### DFSA-RHT+ 0.6 µm CMOS

DFSA-RH is a static, Radiation Hardened to SEU, master-slave Scan D flip-flop with 1x drive strength. SCAN ENABLE switches between normal DATA input and SCAN DATA input. RESET is asynchronous and active low.

Truth table:

| RN | SE | SD | D | C | Q | QN |
|---|---|---|---|---|---|---|
| H | L | X | H | ↑ | H | L |
| H | L | X | L | ↑ | L | H |
| H | H | H | X | ↑ | H | L |
| H | H | L | X | ↑ | L | H |
| H | X | X | X | ↓ | no change | no change |
| L | X | X | X | X | L | H |

Capacitance:

| Pin | Ci (pF) |
|---|---|
| D | 0.020 |
| C1 | 0.035 |
| C2 | 0.037 |
| RN | 0.074 |
| SD | 0.015 |
| SE | 0.034 |

Area: 5.32 mils². Power: 22 µW/MHz.

Delay [ns] = tpd.. = f(SL, L) and output slope [ns] = op\_sl.. = f(L), with SL = input slope [ns] and L = output load [pF].

#### AC CHARACTERISTICS (Tj = 125 °C, VDD = 4.5 V, Typical Process)

| Characteristics | Symbol | SL = 0.1, L = 0.1 | SL = 0.1, L = 0.7 | SL = 0.1, L = 1.0 | SL = 2.0, L = 0.1 | SL = 2.0, L = 0.7 | SL = 2.0, L = 1.0 |
|---|---|---|---|---|---|---|---|
| Delay C to Q | tpdcqr | 1.45 | 2.50 | 2.98 | 1.51 | 2.59 | 3.07 |
| | tpdcqf | 1.45 | 2.16 | 2.45 | 1.53 | 2.23 | 2.49 |
| Delay C to QN | tpdcqnr | 1.64 | 3.42 | 4.25 | 1.69 | 3.47 | 4.33 |
| | tpdcqnf | 1.85 | 3.78 | 4.66 | 1.92 | 3.82 | 4.71 |
| Delay RN to Q | tpdrnq | 1.13 | 1.88 | 2.15 | 1.80 | 2.33 | 2.64 |
| Delay RN to QN | tpdrnqn | 1.53 | 3.49 | 4.39 | 1.99 | 3.96 | 4.87 |
| Output Slope C to Q | op\_slcqr | 0.82 | 2.53 | 3.41 | 0.82 | 2.63 | 3.42 |
| | op\_slcqf | 0.80 | 1.82 | 2.36 | 0.80 | 1.78 | 2.36 |
| Output Slope C to QN | op\_slcqnr | 0.36 | 1.86 | 2.54 | 0.36 | 1.86 | 2.55 |
| | op\_slcnqf | 0.46 | 2.25 | 3.15 | 0.48 | 2.25 | 3.17 |
| Output Slope RN to Q | op\_slrnq | 0.80 | 1.88 | 2.42 | 0.78 | 1.89 | 2.40 |
| Output Slope RN to QN | op\_slrnqn | 0.48 | 2.23 | 3.14 | 0.47 | 2.30 | 3.13 |

| Characteristics | Level | Symbol | [ns] |
|---|---|---|---|
| Min Delay C1 to C2 | | tdc1c2 | |
| Min D Setup Time to C1 | High / Low | tsudch / tsudcl | 2.40 / 2.41 |
| Min D Hold Time to C1 | High / Low | thdch / thdcl | 0.00 |
| Min SD Setup Time to C1 | High / Low | tsusdch / tsusdcl | 1.35 |
| Min SD Hold Time to C1 | High / Low | thsdch / thsdcl | 0.00 |
| Min SE Setup Time to C1 | High / Low | tsusech / tsusecl | 1.50 |
| Min SE Hold Time to C1 | High / Low | thsech / thsecl | 0.00 |
| Min RN Setup Time to C1 | Low | tsurnc | 0.00 |
| Min RN Hold Time to C1 | Low | thrnc | 0.80 |
| Min C1 Width = Min Delay C1 to C2 (C1 Width = C2 Width) | High / Low | twch / twcl | 1.10 |
| Min RN Width | Low | twrn | 1.92 |

#### SUMMARY

This thesis proposes new design and test methods for integrated CMOS systems that increase reliability and fault tolerance in deep submicron technologies, in response to the growing number of defects that escape manufacturing test and to the increased sensitivity to upsets caused by cosmic rays.

To improve fault detection in complex CMOS circuits, high-speed, high-sensitivity built-in current sensors operating at low supply voltage are proposed. $I_{DDQ}$ current measurement algorithms, developed in parallel, are analyzed and optimized in synergy with low-power design techniques. The use of current sensors has been extended to on-line testing, which detects permanent faults in critical applications and corrects errors in SRAM memories by means of parity coding. This approach has been validated by radiation tests on prototype circuits.

A technology-independent design strategy for upset-immune CMOS circuits, based on local redundancy techniques, was then developed. It was validated experimentally by radiation tests on prototype circuits fabricated in commercial 1.2, 0.8 and 0.25 micron CMOS technologies. The implemented hardening techniques were analyzed using built-in test methods and pulsed laser equipment. Topology-related error mechanisms and upset sensitivity were identified and characterized. In response, specific design rules were developed, leading to topological hardening against upsets. A library of hardened sequential cells was developed for use in an ASIC modem intended for an experimental satellite to be placed in orbit in 2001.

**Keywords:** submicron CMOS design, IDDQ testing, built-in current sensors, redundancy and fault tolerance, on-line testing, upset-hardened design, radiation testing, upset simulation with laser pulses.

#### **ABSTRACT**

High performance ICs manufactured in deep submicron CMOS show reduced operating margins for timing, power and noise, and increased device sensitivity to contamination, size variations and cosmic ray effects. As a consequence, radiation-induced soft errors and small manufacturing defects that escape voltage-mode testing represent a chief concern in deep submicron CMOS. This thesis describes design and test techniques for high reliability and fault tolerance to cope with soft failures and soft errors in both commercial and safety-critical system applications.

To improve the $I_{DDQ}$ test effectiveness in detecting soft failures, we developed highly sensitive Built-In Current (BIC) sensor designs operating at high speed and low supply voltage. Optimized $I_{DDQ}$ test algorithms with embedded current monitors are proposed, and synergetic effects with low power design techniques are explored. On-chip $I_{DDQ}$ monitoring techniques are subsequently extended to on-line testing in safety-critical CMOS system applications. An upset-tolerant static RAM design is described that uses current monitoring and parity coding for error detection and correction. Radiation test results on prototype circuits validate this approach.

In order to avoid soft error occurrence in deep submicron CMOS applications, upset-immune design techniques using technology-independent local redundancy are described and analyzed. They are validated on memory and register array prototypes using commercial 1.2, 0.8 and 0.25 µm CMOS processes.
On-chip test techniques are implemented for redundancy assessment of fault-tolerant CMOS architectures. Upset mechanisms in SEU-hardened CMOS storage elements are detected and analyzed using focused pulsed-laser equipment, and specific design rules are devised for topology-related hardening. An upset-hardened sequential cell library has been designed in 0.6 µm CMOS to be employed in an ASIC modem chip for an onboard satellite experiment scheduled for 2001.

**Keywords:** deep submicron CMOS design, IDDQ testing, Built-In Current Sensors (BICS), fault tolerance and redundancy, on-line testing, upset-hardened design, radiation testing, upset simulation using a pulsed laser beam.

Techniques de l'Informatique et de la Microélectronique pour l'Architecture d'Ordinateurs

TIMA - 46, Av. Félix Viallet - 38031 Grenoble Cedex

ISBN 2-913329-06-3 (paperback)

ISBN 2-913329-07-1 (electronic version)