A high-performance fault-tolerant cache memory for multiprocessors




Luo, Xiao


In multiprocessor systems, cache memories serve two purposes: they reduce the average access time to the shared memory, and they minimize the interconnection-network bandwidth required by each processor. In a cache, however, interference between operations from the processor and coherence operations from other caches degrades performance. We propose a cache with a single dual-port directory that serves processor accesses and coherence operations simultaneously. The cache achieves high performance at low cost, and its structure is independent of the data-coherence protocol. To evaluate the cache in a multiprocessor environment, we create two simulation models and simulate system performance extensively. The results show that a system of single dual-port directory caches outperforms a system of single one-port directory caches. The effects of other design parameters, such as cache size, line size, and associativity, on system performance are also discussed. Furthermore, the simulations indicate that using multiple buses significantly increases system performance.

To improve the reliability of the proposed cache, we design a low-cost tag self-purge mechanism and comparator checker in the cache management unit. We also propose a new design of combinational totally self-checking checkers for 1/n codes in CMOS technology, which can be used to build such a checker for the 1/3 code. Moreover, the total hardware overhead is less than 42%, compared to a traditional single-directory cache management unit.

The dissertation also includes a new optimal test algorithm with linear test-time complexity, which can be used to test the cache management unit by either the associated processor or external test equipment. An efficient built-in self-test variant of the proposed algorithm is also discussed.
The hardware overhead of such a scheme is much less than that of the traditional approach.