Author: Gnanaratnam, Nirmala
Date: 2022-01-17
Year: 2022
Handle: http://hdl.handle.net/1828/13712
Abstract: Although deep neural networks (DNNs) were first proposed around the 1960s, related research progressed rapidly only from about 2012 onward. This was due to the availability of large public datasets, cheap compute that could run these data-driven algorithms efficiently, the rise of open-source ML platforms, and the resulting spread of open-source code and models. In addition, DNN research has attracted substantial funding and is of high commercial interest. All of these factors have contributed to a high volume of research papers; in sparsity/pruning of DNNs, for example, roughly one paper is published on arXiv every couple of days, and the rate is growing exponentially. Pruning means training a network that is larger than necessary and then removing parts that are not needed during inference, so that fewer resources are required to store the trained network and less compute to execute it. Even in the early days, researchers observed that large neural networks converge more easily during training and used this as an experimental heuristic. The published literature on pruning describes many ways to identify such unneeded parts and to remove them before, during, or after training. It also turns out that not all kinds of pruning actually accelerate neural networks, which is supposed to be the whole point of pruning. Moreover, because these research areas are new and developing rapidly, based mostly on experimental methods, there is some concern in the research community about the quality of published research. The purpose of this report is to examine research in deep learning in general, and sparsity/pruning of neural networks in particular, from the viewpoint of diverse stakeholders in the research community, with respect to the status of published research, empirical rigor and the reporting of results, and some technical issues related to efficient deployment.
Language: en
Rights: Available to the World Wide Web
Subjects: pruning; sparsity; neural networks
Title: The status of research in sparsity/pruning of deep neural networks (DNNs)
Type: project
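The abstract describes pruning only conceptually (train a larger-than-necessary network, then remove parts not needed at inference). As a rough illustration of that idea, the sketch below shows one common variant, unstructured magnitude pruning applied after training, in Python/NumPy. This is not the specific method surveyed in the report; the function name, sparsity parameter, and example values are illustrative assumptions.

    import numpy as np

    def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
        """Zero out the smallest-magnitude entries so that roughly `sparsity`
        (a fraction in [0, 1]) of the weights become zero.

        Generic illustration of unstructured magnitude pruning; by itself it
        saves storage only if the zeros are exploited by a sparse format, and
        it does not automatically speed up inference (a point the report raises).
        """
        if not 0.0 <= sparsity <= 1.0:
            raise ValueError("sparsity must be in [0, 1]")
        k = int(sparsity * weights.size)
        if k == 0:
            return weights.copy()
        # Threshold = k-th smallest absolute value; entries at or below it are removed.
        threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
        pruned = weights.copy()
        pruned[np.abs(pruned) <= threshold] = 0.0
        return pruned

    # Example: prune 90% of a random weight matrix and report the zero fraction.
    w = np.random.randn(256, 256)
    w_pruned = magnitude_prune(w, sparsity=0.9)
    print(f"fraction zeroed: {np.mean(w_pruned == 0):.2f}")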