The VLSI design of a general purpose FFT processing node

Date

1985

Authors

McKinney, Brian Clifford

Abstract

This thesis presents the design of a general-purpose floating-point FFT processing node. The node is currently configured to operate in the Pipelined Cascade implementation of a radix-2 decimation-in-frequency algorithm. The implementation is capable of performing a 1024-point FFT in approximately 0.9 ms, or in 1.2 ms if on-chip twiddle factor updating is employed. The output signal-to-noise ratio for a 1024-point FFT is calculated to be better than 60 dB.
The general-purpose processing node is designed in a three-micron single-metal ISO-CMOS technology. The node design occupies approximately 6400 × 6400 µm² and contains approximately 18,000 transistors. Power dissipation is estimated at 1.5 watts per processing node. The node is currently designed to handle a modified IEEE standard 32-bit floating-point format in which the 23-bit mantissa is truncated to 14 bits. Analysis of the effects of finite register length on FFT calculation indicates that a mantissa length of only 14 bits is required to achieve a signal-to-noise ratio greater than 60 dB for a 1024-point FFT.
Central to the processing node operation is a fast floating-point arithmetic processing unit, or APU. The APU design is based on two multiple access pipeline (MAP) structures operating concurrently. This architecture combines the flexibility of bus structures with a highly concurrent pipeline processor to realize a fast, highly flexible processing unit. The APU is microprogram controlled and is capable of performing the decimation-in-frequency butterfly operation in under 700 ns. Processor node control is achieved by means of a master-slave control configuration in which the microprogrammed APU controller acts as slave to the node-level master controller. The node-level master controller is a sequential logic unit which is responsible for on-chip data routing and for overseeing the proper sequencing of the APU operation.
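
The radix-2 decimation-in-frequency butterfly and the effect of a shortened mantissa described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the thesis's hardware or its error analysis: the function names are hypothetical, the 14 bits are assumed to be stored fraction bits with an implicit leading one, values are truncated toward zero, and quantization is applied once per complex result rather than after every real multiply and add.

```python
import cmath
import math
import random


def truncate_mantissa(x, bits=14):
    """Truncate a real value to `bits` stored mantissa bits (assumed implicit leading one)."""
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)            # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** (bits + 1)       # leading one plus `bits` fraction bits
    return math.ldexp(math.trunc(m * scale) / scale, e)


def quantize(z, bits=14):
    """Apply mantissa truncation to the real and imaginary parts separately."""
    return complex(truncate_mantissa(z.real, bits), truncate_mantissa(z.imag, bits))


def dif_butterfly(a, b, w, q):
    """Radix-2 DIF butterfly: upper = a + b, lower = (a - b) * w."""
    return q(a + b), q(q(a - b) * w)


def fft_dif(x, q=lambda z: z):
    """Radix-2 DIF FFT; len(x) must be a power of two. `q` models register quantization."""
    n = len(x)
    data = list(x)
    span = n // 2
    while span >= 1:
        for start in range(0, n, 2 * span):
            for k in range(span):
                w = q(cmath.exp(-2j * cmath.pi * k / (2 * span)))
                i, j = start + k, start + k + span
                data[i], data[j] = dif_butterfly(data[i], data[j], w, q)
        span //= 2
    # DIF leaves results in bit-reversed order; undo that here.
    width = n.bit_length() - 1
    return [data[int(format(i, f"0{width}b")[::-1], 2)] for i in range(n)]


def snr_db(reference, test):
    """Signal-to-noise ratio of `test` against `reference`, in dB."""
    signal = sum(abs(r) ** 2 for r in reference)
    noise = sum(abs(r - t) ** 2 for r, t in zip(reference, test))
    return float("inf") if noise == 0 else 10 * math.log10(signal / noise)


def demo_snr(n=1024, bits=14, seed=0):
    """Compare a quantized FFT of random data with a double-precision reference."""
    rng = random.Random(seed)
    q = lambda z: quantize(z, bits)
    x = [q(complex(rng.uniform(-1, 1), rng.uniform(-1, 1))) for _ in range(n)]
    return snr_db(fft_dif(x), fft_dif(x, q))


if __name__ == "__main__":
    print(f"SNR with a 14-bit mantissa: {demo_snr(1024, 14):.1f} dB")
```

The figure this model prints depends on the input data and on the simplified quantization points, so it should not be read as a reproduction of the 60 dB result reported in the thesis.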
The Pipelined Cascade implementation is a highly flexible architecture. This thesis shows that, with slight modification of the present design, fast calculation of inverse and multidimensional FFTs is possible. The speed and flexibility of the processing node design and proposed network implementation are ideal for many task-specific problems requiring fast spectrum analysis.
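
The abstract does not say which modifications the thesis proposes for inverse and multidimensional transforms. As general background only, two standard identities show why a forward FFT engine can serve both purposes: the inverse transform equals the conjugate of the forward transform of the conjugated input, divided by N, and a two-dimensional transform can be computed as forward transforms over rows followed by columns. The sketch below uses hypothetical function names and a plain recursive forward FFT. It is not the thesis's method.

```python
import cmath


def fft(x):
    """Recursive radix-2 decimation-in-frequency FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    top = [x[k] + x[k + half] for k in range(half)]
    bottom = [(x[k] - x[k + half]) * cmath.exp(-2j * cmath.pi * k / n)
              for k in range(half)]
    out = [0j] * n
    out[0::2] = fft(top)      # even-indexed outputs
    out[1::2] = fft(bottom)   # odd-indexed outputs
    return out


def ifft(spectrum):
    """Inverse FFT built from the forward FFT: conj(fft(conj(X))) / N."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in spectrum])]


def fft2(rows):
    """2-D FFT by the row-column method: transform each row, then each column."""
    row_pass = [fft(r) for r in rows]
    col_pass = [fft(list(col)) for col in zip(*row_pass)]
    return [list(r) for r in zip(*col_pass)]   # transpose back to row-major
```

For example, `ifft(fft(x))` recovers `x` up to floating-point rounding error.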
