Local Descriptor Image Matching Acceleration and its Hardware Implementation




Soleimani, Parastoo



Computer vision algorithms have been used in an increasing number of applications over the past decades. One of the foremost challenges in using them in practical applications is their computational intensity, which can limit performance. This dissertation focuses on improving speed by proposing novel algorithmic and hardware design techniques. Contributions are described for feature extraction and image matching. Histogram of Oriented Gradients (HOG) is one of the most commonly used algorithms for feature extraction. To increase its computation speed, a hardware-software co-design is presented. The proposed design makes four contributions: a new task allocation method that reduces resource utilization, logarithm-based bin assignment that reduces latency, parallel histogram generation that also reduces latency, and a simplified block normalization technique that reduces resource utilization. The proposed HOG design attains frame rates comparable to existing work in the literature and is shown to use fewer hardware resources. Further contributions of this dissertation relate to the steps of image matching algorithms, including scale-space generation, descriptor computation, and descriptor matching. For scale-space generation, a real-time FPGA-based implementation of the AKAZE algorithm with non-linear scale-space generation is proposed. The proposed implementation makes two main contributions: (1) mapping the two passes of the AKAZE algorithm onto a hardware architecture for parallel processing of multiple image sections, and (2) designing multi-scale line buffers to reduce resource utilization. A frame rate of 304 frames per second is achieved at an image resolution of 1280×768, which is shown to be faster than other published work.
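To make the idea of logarithm-based bin assignment for HOG more concrete, the following Python sketch shows one way a logarithm can replace the division and arctangent normally needed to find a gradient's orientation bin. It is an illustrative assumption, not the dissertation's design: the 9 unsigned bins of 20 degrees, the function names `bin_log` and `bin_reference`, and the threshold table `LOG_TAN` are all hypothetical. The sketch relies on the fact that comparing |gy|/|gx| against tan(θ) is equivalent to comparing log2|gy| - log2|gx| against log2(tan θ), so only a subtraction and a few comparisons remain (in hardware, the logarithm itself could come from a lookup table).

```python
import math

NUM_BINS = 9                   # assumed: unsigned orientation, 0..180 degrees
BIN_WIDTH = 180 / NUM_BINS     # 20 degrees per bin

# Precomputed thresholds log2(tan(20)), log2(tan(40)), log2(tan(60)), log2(tan(80)).
LOG_TAN = [math.log2(math.tan(math.radians(BIN_WIDTH * k))) for k in range(1, 5)]


def bin_log(gx: int, gy: int) -> int:
    """Orientation bin found with a log-domain subtraction and comparisons."""
    ax, ay = abs(gx), abs(gy)
    if ay == 0:
        return 0               # horizontal (or zero) gradient: angle 0
    if ax == 0:
        return 4               # vertical gradient: 90 degrees lies in [80, 100)
    diff = math.log2(ay) - math.log2(ax)       # equals log2(|gy| / |gx|)
    sector = sum(diff >= t for t in LOG_TAN)   # 20-degree sector of the folded angle
    if (gx > 0) == (gy > 0):
        return sector                          # angle lies in (0, 90)
    return NUM_BINS - 1 - sector               # angle lies in (90, 180): mirrored


def bin_reference(gx: int, gy: int) -> int:
    """Reference bin computed directly with atan2, for comparison only."""
    angle = math.degrees(math.atan2(gy, gx)) % 180
    return int(angle // BIN_WIDTH)
```

For small integer gradients the two functions agree, for example `bin_log(3, 4) == bin_reference(3, 4)`; the log-domain version simply avoids the division and the arctangent.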
For feature description, a novel circular shifting binary descriptor is proposed, which enables efficient rotation-invariant image matching. This new method eliminates complex operations such as multiplication and division from the orientation estimation step and thus significantly lowers the number of operations needed for descriptor computation. For descriptor matching, a novel content-addressable memory (CAM) architecture is proposed, which significantly accelerates the matching step of the image matching pipeline. The time complexity of the proposed modified CAM approach to binary descriptor matching is O(n), whereas typically used matching methods have a time complexity of O(n^2). Resource utilization and timing metrics for several experiments are reported to demonstrate the efficacy of the proposed design. Finally, the circular shifting binary descriptor and the novel CAM matching design are applied to an experimental real-world application in aerial image matching to demonstrate the capabilities of the proposed methods.
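The two matching ideas above can be illustrated in software with a minimal Python sketch. It rests on several assumptions that are not taken from the dissertation: descriptors are modeled as fixed-width integers, rotation of the image patch is modeled as a circular shift of the descriptor bits, and the CAM is stood in for by a dictionary that supports exact-match lookup only. All function names are hypothetical. The sketch contrasts brute-force matching, which compares every query with every stored descriptor, against a content-addressed lookup, which performs a single lookup per query.

```python
def rotate_bits(value: int, shift: int, width: int) -> int:
    """Circularly shift a width-bit integer left by `shift` positions."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors."""
    return bin(a ^ b).count("1")


def rotation_invariant_distance(a: int, b: int, width: int, step: int = 1) -> int:
    """Smallest Hamming distance over circular shifts of b in multiples of `step`."""
    return min(hamming(a, rotate_bits(b, s, width)) for s in range(0, width, step))


def brute_force_match(queries, references):
    """Nearest reference index per query: len(queries) * len(references) comparisons."""
    return [
        min(range(len(references)), key=lambda j: hamming(q, references[j]))
        for q in queries
    ]


def build_cam(references):
    """Dictionary standing in for a CAM: descriptor content -> stored index."""
    return {descriptor: index for index, descriptor in enumerate(references)}


def cam_match(queries, cam):
    """One content-addressed lookup per query; None when there is no exact match."""
    return [cam.get(q) for q in queries]
```

As a usage example, `rotation_invariant_distance(d, rotate_bits(d, 3, 8), 8)` is 0 for any 8-bit descriptor `d`, because one of the tried shifts undoes the rotation. A hardware CAM compares a query against all stored entries in parallel, which is what the dictionary lookup imitates here; tolerance to bit errors is outside the scope of this sketch.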



Content-addressable memory, Image matching, Hardware acceleration, Binary descriptor