HotpathVM: An Effective jit compiler for Resource-constrained Devices

Yüklə 217,56 Kb.

Pdf görüntüsü

səhifə	3/8
tarix	07.11.2018
ölçüsü	217,56 Kb.
	#78936

1 2 3 4 5 6 7 8

2.2

Recording Traces

Traces are recorded by manipulating the threaded code branch tar-

get stored in the intermediate representation of each instruction as it

is executed by JamVM’s direct threaded interpreter [4]. The entire

interpreter of JamVM is located in a single C method. Each Java

Virtual Machine Language (JVML) instruction is implemented as

a code block that is preceded by a label. The address of this la-

bel is stored in the intermediate representation, and JamVM uses

GCC’s [12] computed goto statement to jump to the corresponding

label of each instruction as it is executed.

To record a trace, the monitor manipulates the threaded code

branch target in the instruction immediately following the ﬁrst in-

struction of the trace to redirect execution to a recording code

block. This special code block can be understood as a meta instruc-

tion that is (virtually) inserted after each executed instruction. The

recording block records the current value of the program counter,

the opcode of the executed instruction, and the value on top of the

stack in a trace structure. Every time the recording block is invoked

it reverses the change to the previously executed instruction that

caused the recording block to be invoked and patches the next in-

struction label instead. Using this technique the recording block

essentially chases the execution, always making sure that it is one

step ahead of the program counter.

While our approach triples the number of hard to predict (and

thus costly) indirect dispatches during trace recording, the overhead

is essentially zero when the trace recorder is disconnected and is

not recording traces. The only noteworthy overhead is incurred

for JVML branch instructions, because we have to increment the

invocation frequency counter and compare it to a threshold value.

To eliminate this overhead we redirect branch instructions to code

blocks that do no longer perform proﬁling once a second stop-loss

threshold is exceeded. In this case we assume that no suitable traces

could be extracted from the bytecode in question.

While recording traces, the direction (taken or not taken) of

every conditional branch is recorded. These decision points in

the original program become guard instructions in the recorded

trace. At runtime, these guard instructions ensure that the control

ﬂow still follows the previously encountered path. Depending on

whether the original branch was taken or not, a guard instruction

is emitted that either checks for the branch condition itself or its

complement.

A taken branch instruction that branches if the value is equal

zero (jeq), for example, becomes a guard if not equal zero in-

struction (gne), because the compiled trace has to be aborted if the

original assumption (value is equal zero) does not hold at the time

the trace is executed. The same branch instruction (jeq) would be

converted into a guard if equal zero instruction (geq) if it was not

taken in the initial code, because now the compiled trace has to be

guarded against the opposite condition.

Lookup table dispatches (JVML instructions lookupswitch

and tableswitch) are used in Java to efﬁciently implement the

switch

/case construct of the Java high-level language. When

encountering a lookup table dispatch instruction, the virtual ma-

chine determines the bytecode address where to continue execution

based on a table of address and value pairs. We treat lookup table

dispatches as if they were simple conditional branches that com-

pare a value with a constant. When the trace is executed, we expect

the same value to be encountered by the guard instruction, or the

trace is terminated through a side exit.

Optimizing virtual method invocations is a non-trivial task for

traditional compilers. Since our compiler always follows the ac-

tually executed instruction stream, we only have to insert a guard

We will discuss in the next section how to merge several traces so we can

compile lookup table dispatches that have several hot paths.

instruction that checks that the predicted target method is still valid.

For this, the guard instruction compares the actual type of the object

the virtual method is invoked on with the type that was encountered

when the trace was recorded. If the two types still match, execution

of the trace continues with the already recorded and compiled in-

struction stream.

If any guard instruction fails at runtime, control is transfered

back to the VM to resume execution at the bytecode level. For this,

the mapping between machine code registers and Java stack and

local variables has to be preserved. Our intermediate representation

embeds this information in every guard instruction and the code

generator creates side exit stubs that write back the values into the

appropriate local variable locations.

Our initial implementation maintained mapping information for

both, the Java operand stack as well as Java local variables. Pro-

ﬁling of our ﬁrst prototype showed that the code generator never

actually had to generate write-back code for stack locations. While

in principle legal in JVML, the Java compiler does not generate ba-

sic blocks that have more than one predecessor and a non-empty

stack at the entry point. Instead, the stack is always empty after

branch instructions, except for some special constructs such as the

“? :” operator. However, the latter can never be a loop header and

thus will never trigger the recording of a trace and is thus irrelevant

in this context. Based on this observation we removed support for

traces that start with a non-empty stack or contain side-exits with a

non-empty stack.

The last map is recorded at the very end of the trace where

execution returns to the loop header (trace entry point). This map is

special as it indicates which Java local variables are altered during

each loop iteration. To ensure proper semantics, all values altered

during a loop iteration have to be written back at all side exists,

because the value could still be pending from a previous loop

iteration (Figure 2).

2.3

Stop Conditions

Recording ends in two cases—either we decide to abort the record-

ing due to certain circumstances, or we successfully ﬁnish the

recording.

As we are only interested in the loop body itself, we deﬁne

a number of abort conditions which terminate the recording of a

trace. If an exception is thrown, for example, we immediately ter-

minate recording, because exceptions are by deﬁnition rare events

and by its very nature JIT compilation only focuses on frequently

executed code areas. Similarly, we also abort tracing when we en-

counter a native method invocation. The rationale of this design

decision is that native method invocations are already fairly expen-

sive, no matter if compiled or interpreted, and we expect them to

happen infrequently. Also, by not having to generate code for native

code invocation, our backend compiler is simpliﬁed signiﬁcantly.

When execution returns to the original loop header where a trace

started, the recording of the trace ends and the newly recorded trace

is sent to the JIT compiler for translation to optimized machine

code.

Compiling Traces

Once a trace has been recorded, our trace compiler translates it to

directly executable machine code. As mentioned previously, our

compiler is specialized in compiling traces only and expects its

input to be free of branch instructions except for a single, ﬁnal,

unconditional branch at the end of the trace, which jumps back

to the loop header. The code generated for the trace is optimized

with the assumption that the trace will be executed repeatedly

(loop), until it is left through a side exit (guard instructions). In

our system, traces do not have a generic exit point, as they always

Yüklə 217,56 Kb.

Dostları ilə paylaş:

1 2 3 4 5 6 7 8