The JAFARDD processor: a Java architecture based on a Folding Algorithm, with reservation stations, dynamic translation, and dual processing

Date

2018-11-07

Authors

El-Kharashi, Mohamed Watheq Ali Kamel

Abstract

Java’s cross-platform virtual machine arrangement and the special features that make it ideal for writing network applications also have a tremendous negative impact on its performance. Despite this relatively weak performance, Java’s success has motivated the search for techniques to enhance its execution. This work presents the JAFARDD (a Java Architecture based on a Folding Algorithm, with Reservation stations, Dynamic translation, and Dual processing) processor, designed to accelerate Java processing. JAFARDD dynamically translates Java bytecodes to RISC instructions to facilitate the use of a typical general-purpose RISC core. This enables the exploitation of instruction-level parallelism among the translated instructions using well-established techniques, and facilitates the migration to Java-enabled hardware. Designing hardware for Java requires extensive knowledge and understanding of its instruction set architecture, which were acquired through a comprehensive behavioral analysis by benchmarking. Many aspects of Java workload behavior were measured and the resulting statistics were analyzed. This helped identify performance-critical aspects that are candidates for hardware support. Our analysis covers more aspects, and makes broader recommendations, than similar studies. Next, a global analysis of the design space of Java processors was carried out. Different hardware design options suitable for Java were explored and their trade-offs were examined. We focused especially on the design methodology, execution engine organization, parallelism exploitation, and support for high-level language features. This analysis helped identify innovative design ideas such as the use of a modified Tomasulo’s algorithm. This, in turn, motivated the development of a bytecode folding algorithm that integrates with the reservation station concept in JAFARDD.
As the behavioral analysis and the design space exploration were examined, a list of global architectural design principles emerged. These principles ensure that JAFARDD can execute Java efficiently and were taken into consideration when the various instruction pipeline modules were designed. Results from the behavioral analysis also confirmed that Java’s stack architecture creates virtual data dependencies that limit performance and prohibit instruction-level parallelism. To overcome this drawback, stack operation folding has been suggested in the literature to enhance performance by grouping contiguous instructions that have true data dependencies into a compound instruction. We have developed a folding algorithm that, unlike existing ones, does not require the folded instructions to be consecutive. To the best of our knowledge, our folding algorithm is the only one that permits nested pattern folding, tolerates variations in folding groups, and detects and resolves folding hazards completely. By incorporating this algorithm into a Java processor, the need for, and therefore the limitations of, a stack are eliminated. In addition to an efficient dual processing configuration (i.e., Java and RISC), JAFARDD is empowered with a number of innovative design features, including: an adaptive feedback fetch policy that copes with the variation in Java instruction size, a smart bytecode queue that compensates for the lack of a stack, an on-chip local variable file to facilitate operand access, early tag assignment to dispatched instructions to reduce processing delay, and a specialized load/store unit that preprocesses object-oriented instructions. The functionality of JAFARDD has been successfully demonstrated through VHDL modeling and simulation. Furthermore, benchmarking using SPECjvm98 showed that the introduced techniques indeed speed up Java execution.
Our bytecode folding algorithm speeds up execution by an average factor of about 1.29, eliminating an average of 97% of the stack instructions and 50% of the overall instructions. Compared to other proposals, JAFARDD combines Java bytecode folding with dynamic hardware translation while maintaining the RISC nature of the processor, making it a much more flexible and general approach.
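
To make the idea of stack operation folding concrete, the following is a minimal sketch of the simple, consecutive form of folding that the abstract attributes to the earlier literature. It is not the JAFARDD algorithm. The textual bytecode form ("iload 1"), the three-operand output syntax ("add v3, v1, v2"), and the names FoldingSketch, fold, aluMnemonic, and operand are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FoldingSketch {
    // Map a stack ALU bytecode to a hypothetical RISC-style mnemonic.
    // Returns null for bytecodes this sketch does not fold.
    static String aluMnemonic(String bytecode) {
        switch (bytecode) {
            case "iadd": return "add";
            case "isub": return "sub";
            case "imul": return "mul";
            default: return null;
        }
    }

    // Extract the local variable index from a simplified form such as "iload 1".
    static String operand(String bytecode) {
        return bytecode.substring(bytecode.indexOf(' ') + 1);
    }

    // Fold each consecutive "iload a; iload b; <alu>; istore c" group into one
    // three-operand instruction "<op> vc, va, vb". Everything else passes through.
    static List<String> fold(List<String> code) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < code.size()) {
            boolean fits = i + 3 < code.size()
                    && code.get(i).startsWith("iload ")
                    && code.get(i + 1).startsWith("iload ")
                    && aluMnemonic(code.get(i + 2)) != null
                    && code.get(i + 3).startsWith("istore ");
            if (fits) {
                out.add(aluMnemonic(code.get(i + 2))
                        + " v" + operand(code.get(i + 3))
                        + ", v" + operand(code.get(i))
                        + ", v" + operand(code.get(i + 1)));
                i += 4; // the four stack bytecodes became one instruction
            } else {
                out.add(code.get(i));
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = Arrays.asList(
                "iload 1", "iload 2", "iadd", "istore 3", "iload 3", "ireturn");
        System.out.println(fold(code)); // prints [add v3, v1, v2, iload 3, ireturn]
    }
}
```

This sketch folds only one fixed, consecutive pattern. The algorithm described in the abstract goes further: it does not require the folded instructions to be consecutive, permits nested patterns, and resolves folding hazards, none of which is attempted here.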

Keywords

Java (Computer program language), Reservation stations, Applied sciences
