This project is dedicated to the memory of William Morris (aka Frags), who was the main contributor to the bounty but was unable to see the final result.

Sunday, May 19, 2013

One small step for mankind, a giant leap for the project

I have no idea how did I manage to achieve this much in this update, but it is certainly a confident step forward. For this time the list is long and diversified:
  • Implementation of Bcc.x addr, BCHG.B Dx,mem, BCHG.L Dx,Dy, BCLR.B Dx,mem, BCLR.L Dx,Dy, BRA.x abs, BSET.B Dx,mem, BSET.L Dx,Dy, BSR.x abs, BTST.B #imm,mem, BTST.B Dx,#imm, BTST.B Dx,mem, BTST.L Dx,Dy, CMP.x #imm,mem, CMP.x mem,Dy, CMP.x reg,Dy, CMPA.L reg,Ax, CMPA.W reg,Ax, CMPA.x mem,Ay, DBF.W Dx,addr, EOR.x #imm,mem, JMP.L abs, JMP.L mem, JSR.L abs, JSR.L mem, NEG.x mem, NOT.x mem, RTS, TAS.B Dx, TAS.B mem instructions.
  • Cache invalidation fix for OSX 10.3.9 and below. (Thanks to Mike Blackburn again.)
  • Fixed mask handling in BCHG.B Dx,mem instruction.
  • Fixed missing register mapping in ASL.x #imm,Dy implementation.
  • Fixed input dependency overwriting in certain memory-related allocation functions.
  • Fixed dependency for destination memory pointer register in special memory reading.
  • Fixed post address handler for condition code addressing modes, previously it might crash or call some random handler from the other addressing modes.
  • Fixed instructions where temporary registers are allocated but not free'ed.
  • Optimized masking for register to register bit instruction.
  • Optimized the temporary register usage in helper_test_bit_register_register function.
  • Optimized flag extraction in several shifting operation.
  • Branch scheduling is more flexible: adding multiple interleaved branches is possible.
  • Comment on missing implementation for an exception on loading odd address into PC.

A few highlights

First of all, let me brag around a little bit about the number of freshly implemented instructions. Right now 237 instructions are implemented out of 388, a solid 61% is done. (Previously the ratio was ~46%.)

More MacOSX versions are supported now, Mike fixed up the cache flushing a little bit and added the pre-10.4 versions too. Please read the included README file regarding the compiling instructions.

While I was working on the instructions I discovered a few bugs and glitches, which are now fixed in this release thus improving the overall stability.

I have also managed to optimize the compiled code for some instructions. Together with the implementation of some yet missing instructions the results for the Mandelbrot test (mandel_though_hw.kick.gz among the test kick files) improved a bit compared to the previous results:

Interpretive: 108 seconds (no change there...);
JIT compiled without optimization: 44 seconds (previously it was 52 seconds);
JIT compiled with optimization: 27 seconds (previously it was 32 seconds).

That was the time for the self-polishing and now back to work...