## Multiplication and Division Routines for Coldfire

Here are a few routines that I've put together to make up for Coldfire's
lack of 32 x 32 -> 64 and 64 / 32 -> 32 instructions.

mulu64.S
This is the basic sum of partial products, using four MULU.W instructions.
The basic problem here is to get the 16-bit pieces in the proper places
so they can be summed.
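
To show what "getting the 16-bit pieces in the proper places" means, here is a rough C model of the partial-product sum. It is only a sketch of the general technique, not a transcription of mulu64.S; the function name `mulu64_sketch` and its argument layout are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical C model: 32 x 32 -> 64 unsigned multiply built from four
 * 16 x 16 -> 32 products (the operation MULU.W provides). */
void mulu64_sketch(uint32_t a, uint32_t b, uint32_t *hi, uint32_t *lo)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    uint32_t ll = al * bl;   /* weight 2^0  */
    uint32_t lh = al * bh;   /* weight 2^16 */
    uint32_t hl = ah * bl;   /* weight 2^16 */
    uint32_t hh = ah * bh;   /* weight 2^32 */

    /* Middle column: hl plus two 16-bit pieces. The largest possible
     * value is 0xFFFE0001 + 0xFFFF + 0xFFFF, so it cannot overflow. */
    uint32_t mid = hl + (ll >> 16) + (lh & 0xFFFFu);

    *lo = (mid << 16) | (ll & 0xFFFFu);
    *hi = hh + (lh >> 16) + (mid >> 16);
}
```

Splitting one of the cross products (here `lh`) into halves is one way to keep every intermediate sum inside 32 bits without needing a carry flag; the assembly version may well order things differently.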

muls64_m.S
This one uses the new fractional mode of the MAC to get the upper 32 bits
of the signed product.
(Requires "J" mask 5307 or 5407.)
0x80000000 squared is a special case for the MAC.
This is treated as -1 * -1, but as the MAC fractional mode does
not have a representation for 1, the MAC V bit is set.

mulu64_m.S
This is the unsigned version of muls64_m.S.
It applies a correction to convert the signed multiply into an
unsigned one.
(The same correction may be applied to mulu64.S also.)
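
The signed-to-unsigned correction can be modelled in C as follows. This is a sketch of the standard identity, assuming two's complement 32-bit operands; the signed product here comes from plain 64-bit C arithmetic rather than the MAC, and the name `mulhu_from_signed` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model: upper 32 bits of the unsigned product a * b,
 * derived from the upper 32 bits of the signed product.
 * If a's top bit is set, its signed value is a - 2^32, so the signed
 * product is short by b * 2^32; likewise for b. Adding the other
 * operand to the high word (mod 2^32) repairs this. */
uint32_t mulhu_from_signed(uint32_t a, uint32_t b)
{
    /* Stand-in for the signed multiply (assumes two's complement). */
    int64_t sp = (int64_t)(int32_t)a * (int32_t)b;
    uint32_t hi = (uint32_t)((uint64_t)sp >> 32);

    if (a & 0x80000000u) hi += b;   /* a was treated as negative */
    if (b & 0x80000000u) hi += a;   /* b was treated as negative */
    return hi;
}
```

The lower 32 bits of the product are the same for the signed and unsigned cases, so only the high word needs fixing.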

divu64_1.S
A basic, bit-by-bit, unsigned divide routine.
It uses a few tricks to speed things up a bit.
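
For reference, a bit-by-bit (shift and subtract) 64 / 32 -> 32 divide looks roughly like this in C. It is a plain sketch of the textbook method and does not include whatever speed-up tricks divu64_1.S uses; the name `divu64_bitwise`, the argument order, and the error return are assumptions.

```c
#include <stdint.h>

/* Hypothetical model: divide the 64-bit value hi:lo by d.
 * Returns 0 on success, -1 if d is zero or the quotient would not
 * fit in 32 bits (both cases show up as hi >= d). */
int divu64_bitwise(uint32_t hi, uint32_t lo, uint32_t d,
                   uint32_t *q, uint32_t *r)
{
    if (hi >= d)
        return -1;

    uint32_t rem = hi;   /* running remainder */
    uint32_t quo = lo;   /* dividend bits shift out, quotient bits shift in */

    for (int i = 0; i < 32; i++) {
        uint32_t carry = rem >> 31;          /* bit lost by the shift below */
        rem = (rem << 1) | (quo >> 31);
        quo <<= 1;
        if (carry || rem >= d) {
            rem -= d;                        /* wraps correctly if carry was set */
            quo |= 1u;
        }
    }
    *q = quo;
    *r = rem;
    return 0;
}
```

In this sketch the quotient is built in the same variable that holds the low half of the dividend, so each iteration needs only one pair of shifts.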

divu64_2.S
Pretty much the same as divu64_1.S, but uses Coldfire's one conditional
data instruction (Scc) to avoid doing some conditional branches.
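
The idea of trading a branch for a condition-derived value can be sketched in C with a mask that is all ones or all zeros, loosely similar to what Scc writes into a byte. This is an assumption about the general approach, not a description of how divu64_2.S is actually written, and a C compiler is free to emit branches anyway; `divu64_masked` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical model: same shift-and-subtract loop as a plain
 * bit-by-bit divide, but the conditional subtract is done with a mask.
 * Returns 0 on success, -1 if d is zero or the quotient overflows. */
int divu64_masked(uint32_t hi, uint32_t lo, uint32_t d,
                  uint32_t *q, uint32_t *r)
{
    if (hi >= d)
        return -1;

    uint32_t rem = hi, quo = lo;

    for (int i = 0; i < 32; i++) {
        uint32_t carry = rem >> 31;
        rem = (rem << 1) | (quo >> 31);
        quo <<= 1;

        /* take is 0xFFFFFFFF when d should be subtracted, else 0 */
        uint32_t take = (uint32_t)0 - (uint32_t)(carry | (uint32_t)(rem >= d));
        rem -= d & take;
        quo |= take & 1u;
    }
    *q = quo;
    *r = rem;
    return 0;
}
```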

divs64_1.S
Basic signed divide.
Calculates abs(a)/abs(b) and remainder, then fixes up the signs.
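
The abs-then-fix-up approach can be sketched in C like this. The sign rules used here (quotient negative when the operand signs differ, remainder taking the sign of the dividend) are the usual truncating-division convention and are an assumption, since the description above does not spell them out. C's own 64-bit division stands in for the unsigned routine, and `divs64_sketch` is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical model: signed 64 / 32 -> 32 divide.
 * Returns 0 on success, -1 on divide by zero or quotient overflow. */
int divs64_sketch(int64_t n, int32_t d, int32_t *q, int32_t *r)
{
    if (d == 0)
        return -1;

    /* Magnitudes, computed in unsigned arithmetic so that the most
     * negative values do not overflow. */
    uint64_t un = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;
    uint32_t ud = d < 0 ? 0 - (uint32_t)d : (uint32_t)d;

    uint64_t uq = un / ud;               /* stand-in for the unsigned divide */
    uint32_t ur = (uint32_t)(un % ud);

    int neg_q = (n < 0) != (d < 0);

    /* A negative quotient may reach magnitude 2^31, a positive one 2^31 - 1. */
    if (uq > (neg_q ? 0x80000000u : 0x7FFFFFFFu))
        return -1;

    /* Conversions below assume two's complement. */
    *q = (int32_t)(neg_q ? 0 - (uint32_t)uq : (uint32_t)uq);
    *r = (int32_t)(n < 0 ? 0 - ur : ur);
    return 0;
}
```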

divs64_2.S
Another signed divide routine.
This one is based on divu64_1.S, but builds up the quotient in a separate
register.

divu64_c.S
Unsigned "long" division routine.
Uses the DIVU.W instruction to speed things up a bit. This runs in about
6 us on an SBC5206elite board with the cache enabled,
vs about 9 us for the bit-by-bit versions.
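
For comparison, here is a C sketch of schoolbook "long" division in base 2^16, where each quotient digit is estimated by dividing a 32-bit value by the top 16 bits of the divisor. It follows the well-known normalise, estimate, and correct scheme and is not taken from divu64_c.S. In particular, the estimate here is computed with a full 32-bit C division, and how the original fits that step into DIVU.W's 16-bit quotient is not described above. All names are hypothetical.

```c
#include <stdint.h>

/* Count leading zeros of a nonzero 32-bit value. */
static unsigned nlz32(uint32_t x)
{
    unsigned n = 0;
    while (!(x & 0x80000000u)) { x <<= 1; n++; }
    return n;
}

/* Produce one base-2^16 quotient digit of (*num * 2^16 + next) / v,
 * where v has its top bit set and *num < v. On return *num holds the
 * partial remainder. */
static uint32_t digit16(uint32_t *num, uint32_t next, uint32_t v)
{
    uint32_t vn1 = v >> 16, vn0 = v & 0xFFFFu;
    uint32_t qhat = *num / vn1;              /* digit estimate */
    uint32_t rhat = *num - qhat * vn1;

    /* Correct the estimate downward using the low half of v. */
    while (qhat >= 0x10000u || qhat * vn0 > (rhat << 16) + next) {
        qhat--;
        rhat += vn1;
        if (rhat >= 0x10000u)
            break;
    }
    *num = (*num << 16) + next - qhat * v;   /* exact modulo 2^32 */
    return qhat;
}

/* Divide hi:lo by d. Returns 0 on success, -1 if d is zero or the
 * quotient would not fit in 32 bits. */
int divu64_long(uint32_t hi, uint32_t lo, uint32_t d,
                uint32_t *q, uint32_t *r)
{
    if (hi >= d)
        return -1;

    unsigned s = nlz32(d);                   /* normalise the divisor */
    uint32_t v = d << s;
    uint32_t num = s ? (hi << s) | (lo >> (32 - s)) : hi;
    uint32_t low = lo << s;

    uint32_t q1 = digit16(&num, low >> 16, v);
    uint32_t q0 = digit16(&num, low & 0xFFFFu, v);

    *q = (q1 << 16) | q0;
    *r = num >> s;                           /* undo the normalisation */
    return 0;
}
```

Two digit steps replace the 32 iterations of the bit-by-bit loop, which is the sort of saving the timings above suggest.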

Last modified: April 4, 2003

Wayne Deeter -
wrd@deetour.net