The instruction-level parallelism available in Java bytecode (Java-ILP) is not readily exploitable because of dependencies involving stack operands. The sequentialization caused by stack dependency can be overcome by identifying bytecode-traces: sequences of bytecode instructions that, when executed, leave the operand stack in the same state as it was at the beginning of the sequence. Instructions from different bytecode-traces have no stack-operand dependency and hence can be executed in parallel on multiple operand stacks. We propose a simultaneous multi-trace instruction-issue (SMTI) architecture for a processor that can issue instructions from multiple bytecode-traces to exploit Java-ILP. Extraction of bytecode-traces and nested bytecode folding are done in software during the method verification stage. SMTI combined with nested folding resulted in an average ILP speedup of 54% over the base in-order single-issue Java processor when evaluated on the SPECjvm98, SciMark and Linpack benchmarks.
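The notion of a bytecode-trace can be illustrated with a minimal sketch. It is not the extraction algorithm of the paper, which the abstract does not describe. It assumes straight-line code with no branches, and a small hand-written table of stack effects. The names `STACK_EFFECT` and `split_into_traces` are hypothetical. A trace is closed whenever the operand-stack depth returns to its value at the start of the trace.

```python
# Illustrative subset of JVM opcodes mapped to (values popped, values pushed).
# A real extractor would need the full instruction set and control flow.
STACK_EFFECT = {
    "iload": (0, 1),   # push an int local variable
    "bipush": (0, 1),  # push a small int constant
    "iadd": (2, 1),    # pop two ints, push their sum
    "imul": (2, 1),    # pop two ints, push their product
    "istore": (1, 0),  # pop an int into a local variable
}


def split_into_traces(instructions):
    """Split straight-line bytecode into stack-balanced traces.

    Each returned trace leaves the operand stack at the depth it had
    when the trace began and never reads below that depth.
    """
    traces = []
    current = []
    depth = 0  # stack depth relative to the start of the current trace
    for ins in instructions:
        opcode = ins.split()[0]
        pops, pushes = STACK_EFFECT[opcode]
        if pops > depth:
            raise ValueError(f"{ins!r} reads below the start of its trace")
        depth += pushes - pops
        current.append(ins)
        if depth == 0:
            # Stack is back to its starting state, so the trace ends here.
            traces.append(current)
            current = []
    if current:
        raise ValueError("trailing instructions do not form a complete trace")
    return traces


# Bytecode for: v0 = v1 + v2; v3 = v4 * 2
example = [
    "iload 1", "iload 2", "iadd", "istore 0",
    "iload 4", "bipush 2", "imul", "istore 3",
]
example_traces = split_into_traces(example)
```

Here the two statements form two separate traces of four instructions each. Neither consumes a stack value produced by the other, so in principle they could be issued on separate operand stacks. Dependencies through local variables are a separate matter that this sketch does not check.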
Citation:
R. Achutharaman, R. Govindarajan, G. Hariprakash, and Amos R. Omondi, "Exploiting Java-ILP on a Simultaneous Multi-Trace Instruction Issue (SMTI) Processor," International Parallel and Distributed Processing Symposium (IPDPS'03), p. 76a, 2003.