CS232, Spring 2008

JVM Programming assignments JVM1-1 & JVM1-2

due Monday, April 14

You are to work with your assigned partner on these assignments.

The Java Virtual Machine

The Java Virtual Machine (JVM) is a stack-based machine, in that all data is stored and all computations are done on the stack.  For example, the "iadd" (integer add) instruction pops two integers off the top of the stack, adds them, and then pushes the sum.  Unlike the Wombat machines, there are no machine instructions for the JVM that allow you to indicate which registers to use for your computations.  The reason is that the JVM was designed to be simulated by interpreters on many different hardware platforms, some of which have more registers than others, and so the JVM does not assume there are any general-purpose registers on your machine.  As a result, any use of registers is handled at the microinstruction level in hardware or at the interpreter level in software and is not visible to the machine-language programmer.

In implementing the JVM using CPU Sim, we will try to imitate the IJVM instruction set (the subset of the JVM discussed in our text in Chapter 4) as closely as possible.  However, due to limitations in the size and scope of CPU Sim and the programs we will be writing, we will have to ignore some IJVM instructions and we will have to add a few new instructions for the IJVM.  Note that we will not deal with any of the object-oriented features of Java, such as the creation of objects, message passing, and inheritance.

The JVM1

The JVM1 is our first attempt at simulating the IJVM.  The JVM1 is available here (right-click on this link and save the target to a file with the name "JVM1.cpu"). You can also view the details of the JVM1 machine in HTML here. The JVM1 machine-language programs (called "bytecode"), like the real JVM, use opcodes that are 8-bits (one-byte) long.  The data used by the opcodes for their calculations are found either in the next few bytes of the program (as operands in the machine instructions) or in the top few locations on the stack or both.  In our implementation of the JVM using CPU Sim, we will use a separate RAM for the stack, just as we did for the Wombats, and the stack will grow upward from address 0.  The word-size for the stack is 32 bits (or 4 bytes).

             Opcode
  Mnemonic (dec)(hex)  Semantics
  nop         0   0    does nothing
  exit      252  FC    stop execution of the computer
  input     254  FE    get input from the user and push it on the stack
  output    255  FF    pop the top value and output it to the user
  iload      21  15    push on the stack the local variable whose  
                       index is found in the next byte of the program
  istore     54  36    pop the top value and store it in the local 
                       variable whose index is found in the next byte 
                       of the program
  iinc      132  84    increment the local variable whose index is 
                       found in the next byte of the program by the 
                       8-bit 2's complement value found in the  
                       following byte of the program
  iadd       96  60    pop the top two integer values from the stack, 
                       add them, and push the result on the stack
  isub      100  64    similar to iadd (the top value is subtracted 
                       from the second-to-top value)
  imul      104  68    similar to iadd
  idiv      108  6C    similar to isub
  ior       128  80    pop the top two integer values from the stack, 
                       compute the bitwise OR, and push 
                       the result on the stack
  iand      126  7E    pop the top two integer values from the stack, 
                       compute the bitwise AND, and 
                       push the result on the stack
  dup        89  59    duplicate the top value on the stack and push 
                       the duplicate on top
  dup_x1     90  5A    duplicate the top word and put it 2 words down 
                       on the stack
  dup_x2     91  5B    duplicate the top word and put it 3 words down 
                       on the stack
  pop        87  57    pop off and throw away the top value on the 
                       stack
  swap       95  5F    swap the top two values on the stack
  iconst_-1   2   2    push -1 on top of the stack
  iconst_0    3   3    push 0 on top of the stack
  iconst_1    4   4    push 1 on top of the stack
  ldc_w      19  13    push onto the stack the nth 32-bit integer 
                       constant in the CP, where n is the value in 
                       the next 2 bytes in the program
  goto      167  A7    jump unconditionally to the new location whose
                       relative address is in the next 2 bytes in the program
  ifeq      153  99    pop the top value off the stack; if that value 
                       is 0, jump to the new location whose relative address 
                       is in the next 2 bytes of the program
  iflt      155  9B    pop the top value off the stack;  if that value 
                       is less than 0, jump to the new location whose 
                       relative address is in the next 2 bytes in the program
  if_icmpeq 159  9F    pop the top two values off the stack; if they 
                       are equal, jump to the new location whose 
                       relative address is in the next 2 bytes in the program

It should be pointed out that all of these JVM1 instructions are real JVM machine instructions except that the exit, input, and output instructions are new.  Also, the ldc_w instruction in the real JVM can refer to data in the CP (the constant pool) other than 32-bit integers.

Note that the goto, ifeq, iflt, and if_icmpeq instructions find the pc-relative address to jump to in the next 2 bytes of memory and hence a JVM program can jump forward or backward up to 32768 bytes.

Sample Program

Consider the following JVM1 assembly language program (this program is available on the CS232 web page as "JVM1-0.a") that reads in an integer x and writes a 1 if x >= 0 and writes a -1 if x < 0:

        input           ; read x and push it on the stack
        iflt less       ; pop x and if x < 0, then jump to label less
        iconst_1        ; push a 1 on the stack
        goto end        ; jump to the label end
less:   iconst_-1       ; push a -1 on the stack
end:    output          ; pop the 1 or -1 and write it out
        exit            ; halt    

This program compiles into the following eleven bytes (given in unsigned decimal format):  254  155  0  8  4  167  0  9  2  255  252.

Note that each line of JVM1 assembly language in this program is either one byte or three bytes long.

JVM1 Implementation

The JVM1 is implemented with 11 registers:

All registers are 32 bits wide except for the status register, which is 1 bit wide, and the mbr, which is 8 bits wide.

Note that the JVM1 actually contains 3 RAMs:  Main, Stack, and Constant Pool (or "CP").  In a real computer, there is only one RAM and so there needs to be a pointer to the constant pool CP section of RAM.  That is what the cpp ("Constant Pool Pointer") register is for.  With CPU Sim, we will be using a separate RAM for the CP and so the cpp register is unnecessary.  However, we will leave it in our JVM1 for compatability with the IJVM in our text.

As with the Wombat, the pc contains the address of the next instruction to be executed.  In the fetch sequence, the next instruction is read from memory and copied into the mbr (in the Wombat computers, this register was called the ir), where the instruction is decoded. Arithmetic computations are done using the mdr and the h registers. The stack consists of 4-byte words, and so, for example, every time a word is pushed on the stack, the value of sp is actually incremented by 4.

Note that the sp in the JVM1 (like the IJVM) always points to the topmost value on the stack.  Therefore, theoretically, sp should have the value -4 when the stack is empty, but we will allow sp to be 0 in that case, which means in practical terms that there will always be a 0 automatically pushed on the stack for you at address 0 that you can ignore or use if you wish and the first value you push on the stack will appear at address 4.

Using iload, istore, iinc

The iload, istore, and iinc instructions deserve special explanation.  All three instructions refer to local variables that are specified by a non-negative index, which appears in the next byte of the program (after the iload, istore, or iinc opcode) as an unsigned integer.  For example, the instruction "iload 2" means to load (that is, push on the operand stack) the value of the local variable at index 2.  Since the index is stored in one byte and is not stored in two's complement notation, its value can be any integer in the range 0 to 255.

The local variables reside in the stack and the lv register contains the (absolute) address of the bottom-most local variable (the variable with index 0).  Therefore to determine the address on the stack of the local variable whose value is to be pushed when executing the iload instruction, the iload execute sequence multiplies the index of the local variable by 4 (since each variable occupies 4 bytes) and then adds it to the lv register's value.  For example, if the index is 2, then the local variable whose value is to be pushed onto the stack can be found 2 words (or 8 bytes) up (toward the top of the stack) from the location pointed to by the lv register.  Therefore, if the lv register has the value 12, then the value of local variable 2 is found at address 20 on the stack.

The iinc instruction takes 2 operands, the first of which is a local variable index, exactly like the operand to the iload and istore instructions.  The second operand is an 8-bit 2's complement number (in the range -128 to 127).  The iinc instruction increments the value of the local variable with the given index by the amount in the second operand.  For example, the instruction "iinc 2 -1" increments by -1 (that is, it decrements) the value of the local variable at index 2.  Nothing is pushed on or popped off the stack by the iinc instruction.

When you write a program for the JVM1 using local variables and the iload, istore, and iinc instructions, you will need to make space on the stack for those local variables.  So, for example, if you have 3 local variables, then you will need to set aside three memory locations (12 bytes) on the stack before beginning execution.  There are several ways to do this.  You can just manually set sp to 8 before beginning execution or you can initialize sp to 0 (which indicates that there is one local variable at address 0 on the stack) and then start your program with two instructions, such as the iconst_0 instruction, that push initial values on the stack for the other two local variables. The latter way is probably easiest. You will want the lv register to be initially 0.

You are expected to use local EQUs to name your local variables.  For example, the local variable with index 0 could be given the name "x".  Then you can write assembly instructions in the form "iinc x 1" or "iload x", which are much more readable than "iinc 0 1" or "iload 0".

JVM1 Exercises

Exercise JVM1-1

Add a new instruction to the JVM1 (to form the "JVM1.1") called "bipush", with opcode 10 (hex).  The bipush instruction takes the next byte in the program (the byte following the bipush opcode) and pushes it on the stack.  It treats that byte as an 8-bit 2's complement integer (in the range -128 to 127).  In this way, you have something more general than the iconst_-1, iconst_0, and iconst_1 instructions; that is, the bipush instruction lets you push any of 256 values on the stack instead of just the three values -1, 0, and 1.  When implementing the bipush instruction, don't forget to copy into the TOS register the value being pushed onto the stack, since the TOS is always supposed to contain a copy of the top value on the stack.

Note:  As mentioned above, the value in the byte after the bipush opcode contains an 8-bit 2's complement number and so be sure that it retains its sign when you load it into a 32-bit register or stack location.  All the microinstructions you will need to implement the bipush instruction are already present in the JVM1 (although you are free to add more microinstructions if you wish).

Now write a program called "JVM1-1.a" for the JVM1.1 that reads in two integers a and b and computes and outputs the value of (a + 25)*(b - 18). 

On a separate sheet or as comments at the bottom of your program, outline what you would have to do to write this program if you didn't have the bipush instruction and if you couldn't ask the user to input the integers 25 and 18.

Exercise JVM1-2

To practice using loops, conditionals, and local variables, write a JVM1 or JVM1.1 program that reads in an arbitrary number of positive integers and computes and outputs the average of those integers. It stops reading integers when a negative integer is read. The negative integer is just a sentinel value and is not included in the average. Use at least one local variable in your program.

What to hand in

As usual, hand in a hard copy of your assembly language programs JVM1-1.a and JVM1-2.a and drag a copy of both assembly language program files and your JVM1.1 machine file "JVM1.1.cpu" into a folder called "JVM1.your-last-name(s)" and zip up the folder and send it to me as an email attachment.