Optimized hardware accelerators for data mining applications

Kanan, Awos

dc.contributor.author	Kanan, Awos
dc.contributor.supervisor	Gebali, Fayez
dc.contributor.supervisor	Ibrahim, Atef
dc.date.accessioned	2018-02-19T20:27:12Z
dc.date.available	2018-02-19T20:27:12Z
dc.date.copyright	2018	en_US
dc.date.issued	2018-02-19
dc.degree.department	Department of Electrical and Computer Engineering
dc.degree.level	Doctor of Philosophy Ph.D.	en_US
dc.description.abstract	Data mining plays an important role in a variety of fields, including bioinformatics, multimedia, business intelligence, marketing, and medical diagnosis. Analysis of today’s huge and complex data involves several data mining algorithms, including clustering and classification. The computational complexity of the machine learning and data mining algorithms that are frequently used in today’s applications, such as embedded systems, makes the design of efficient hardware architectures for these algorithms a challenging issue in the development of such systems. The aim of this work is to optimize the performance of hardware acceleration for data mining applications in terms of speed and area. Most of the accelerator architectures previously proposed in the literature were obtained using ad hoc techniques that do not allow for design space exploration, and some did not consider the size (number of samples) and dimensionality (number of features in each sample) of the datasets. To obtain practical architectures that are amenable to hardware implementation, the size and dimensionality of input datasets are taken into consideration in this work. For one-dimensional data, algorithm-level optimizations are investigated to design a fast and area-efficient hardware accelerator for clustering using the well-known K-Means algorithm. Experimental results show that the optimizations adopted in the proposed architecture result in faster convergence of the algorithm using fewer hardware resources while maintaining the quality of clustering results. The computation of similarity distance matrices is one of the computational kernels generally required by machine learning and data mining algorithms to measure the degree of similarity between data samples. For these algorithms, distance calculation is a computationally intensive task that accounts for a significant portion of the processing time.
A systematic methodology is presented to explore the design space of 2-D and 1-D processor array architectures for the similarity distance computation involved in processing datasets of different sizes and dimensions. Six 2-D and six 1-D processor array architectures are developed systematically using linear scheduling and projection operations. The resulting architectures are classified based on the size and dimensionality of input datasets, analyzed in terms of speed and area, and compared with previous architectures in the literature. Motivated by the need to accommodate large-scale and high-dimensional data, nonlinear scheduling and projection operations are then introduced to design a scalable processor array architecture for the computation of similarity distance matrices. Implementation results for the proposed architecture show an improved trade-off between area and speed. Moreover, the architecture scales better to large and high-dimensional datasets because it is fully parameterized and only has to deal with one data dimension in each time step.	en_US
dc.description.embargo	2019-12-31	en_US
dc.description.scholarlevel	Graduate	en_US
dc.identifier.uri	http://hdl.handle.net/1828/9079
dc.language	English	eng
dc.language.iso	en	en_US
dc.rights	Available to the World Wide Web	en_US
dc.subject	Data Mining	en_US
dc.subject	Parallel Algorithms	en_US
dc.subject	Hardware Acceleration	en_US
dc.subject	Systolic Arrays	en_US
dc.subject	Design Methodology	en_US
dc.title	Optimized hardware accelerators for data mining applications	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Kanan_Awos_PhD_2018.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission

Collections

Electronic Theses and Dissertations (ETD)