Library usage analysis in the C++ codebase of Fedora Linux 37
Date
2024
Authors
Deng, Jiachao
Abstract
C++ source code analysis is conducted at scale. A framework is proposed for analyzing the C++ codebase of operating systems that employ the dnf package manager, such as Fedora Linux and Red Hat Enterprise Linux. The framework can run an arbitrary static analysis tool over the software packages of a compatible operating system that contain C++ code. To evaluate the effectiveness of the framework and to better understand how the C++ language is used in practice, a C++ analysis tool is developed to study library usage at a fine level of granularity, considering individual uses of types, type aliases, member and non-member functions, variables, and enumerators.
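As a rough illustration of the granularity described above, the following sketch annotates a small C++17 fragment with the kind of usage instance each standard library entity might contribute. The functions `count_long_words` and `parses_as_int` are hypothetical and do not come from the analyzed packages, and the per-line classification is an assumption about how such a tool could categorize entities, not a description of the tool's actual output.

```cpp
#include <algorithm>
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical example code; the comments name the assumed usage category.
std::size_t count_long_words(
    const std::vector<std::string>& words,  // type: std::vector; type alias: std::string
    std::size_t min_length) {               // type alias: std::size_t
    if (min_length == std::string::npos) {  // variable: npos (a static data member)
        return 0;
    }
    // Non-member function: std::count_if. Member functions: begin, end, size.
    auto n = std::count_if(words.begin(), words.end(),
                           [min_length](const std::string& w) {
                               return w.size() >= min_length;
                           });
    return static_cast<std::size_t>(n);
}

bool parses_as_int(const std::string& text) {
    int value = 0;
    const char* first = text.data();         // member function: data
    const char* last = first + text.size();  // member function: size
    // Non-member function: std::from_chars. Type: std::from_chars_result.
    std::from_chars_result result = std::from_chars(first, last, value);
    // Enumerators: std::errc::invalid_argument, std::errc::result_out_of_range.
    return result.ec != std::errc::invalid_argument &&
           result.ec != std::errc::result_out_of_range &&
           result.ptr == last;
}
```

Even this short fragment yields more than a dozen distinct usage instances, which suggests why counting at the level of individual entities gives a more detailed picture than counting included headers.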
Our framework, combined with the C++ library usage analysis tool, is used to analyze 2,379 software packages from the codebase of Fedora Linux 37, which together contain nearly 400 million lines of C++ code. The number of packages analyzed is two to three orders of magnitude larger than in previous C++ research. By leveraging the Clang compiler front-end libraries, our tool extracts information from correctly parsed C++ code, an improvement over the approaches taken in many existing studies. As a result, the tool provides an accurate collection of library usage instances from C++ software.
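To suggest why working from correctly parsed code matters, the sketch below shows two calls that look alike textually but resolve to different entities. The namespace `app` and all function names are hypothetical, and the example illustrates name resolution in general rather than the behavior of the tool itself.

```cpp
#include <algorithm>
#include <vector>

namespace app {
// Project-local function that shares its name with std::find.
bool find(const std::vector<int>& values, int target) {
    for (int v : values) {
        if (v == target) return true;
    }
    return false;
}
}  // namespace app

bool via_library(const std::vector<int>& values) {
    using namespace std;
    // Resolves to std::find even though the text "std::" never appears here.
    return find(values.begin(), values.end(), 7) != values.end();
}

bool via_project(const std::vector<int>& values) {
    using namespace app;
    // Resolves to app::find: no std::find overload accepts these two arguments.
    return find(values, 7);
}
```

A purely textual search for `find(` would report both calls, and a search for `std::find` would report neither, whereas a tool that works on the parsed program can attribute only the first call to the standard library.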
Numerous observations are made regarding various aspects of library usage that can facilitate improved teaching of C++, aid in the refinement of C++ libraries, and help guide the future evolution of the C++ standard. For example, our analysis reveals that C++ programmers rarely use some C++ standard library algorithms designed for specialized purposes or combined operations. These algorithms often appear in fewer than 1% of all C++ software packages investigated. We suggest caution when adopting infrequently needed algorithms into the standard library, so that its interface remains streamlined. Such observations summarize current trends in C++ library usage and provide recommendations for improving the C++ language and its libraries.
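The per-package share quoted above can be read as the percentage of packages in which an entity is used at least once. A minimal C++17 sketch of that calculation follows; the `PackageUsage` type, the function name, and the input shape are assumptions made for illustration and are not taken from the framework.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical input shape: package name -> library entities used at least once.
using PackageUsage = std::map<std::string, std::set<std::string>>;

// Returns the percentage of packages in which `entity` is used at least once.
double package_share_percent(const PackageUsage& usage, const std::string& entity) {
    if (usage.empty()) {
        return 0.0;  // avoid dividing by zero when no packages were analyzed
    }
    std::size_t hits = 0;
    for (const auto& entry : usage) {
        if (entry.second.count(entity) > 0) {
            ++hits;
        }
    }
    return 100.0 * static_cast<double>(hits) / static_cast<double>(usage.size());
}
```

Counting packages rather than call sites keeps a single package with thousands of calls from dominating the statistic, which fits a question about how widely an algorithm is needed.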
Keywords
C++, Library usage, Source-code analysis, Fedora Linux, DNF package manager, Clang