Path: utzoo!utgpu!news-server.csri.toronto.edu!mailrus!wuarchive!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!ark1!oasys!mimsy!chris
From: chris@mimsy.umd.edu (Chris Torek)
Newsgroups: comp.arch
Subject: hardware multiply/divide (was F.P. vs. arbitrary-precision)
Message-ID: <26453@mimsy.umd.edu>
Date: 10 Sep 90 10:10:39 GMT
References: <3755@osc.COM> <4513@taux01.nsc.com> <119244@linus.mitre.org> <2530@l.cc.purdue.edu>
Organization: U of Maryland, Dept. of Computer Science, Coll. Pk., MD 20742
Lines: 61
In article <2530@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
[much deleted]
>If we only have 32x32 -> 32, we are effectively limited to 16-bit units. ...
[argues for 32 x 32 => 64 bit multiply, and 64 / 32 => 32 bit quotient +
32 bit remainder]
>As to why these, and other things, are not in C, I presume that the
>founders of C did not think of them.
Perhaps, but more likely it is the same as the reason old C did all
arithmetic in double precision: this was easier on the PDP-11.
Remember, C was originally designed as a nice way to clean up the Unix
system (some time around Unix V4 or so) so as not to have to maintain
all that assembly. Portability arguments now help sustain the `int
times int yields int' system: C is intended to match the machines
fairly closely, and a sufficiently large fraction of machines stick one
with N-bits-times-N-bits-gives-only-N-bits instructions that it would
probably be a mistake to put stricter requirements on C compilers.
Personally, I feel that in this particular case Herman Rubin is correct:
whenever hardware multiply and divide are justified (and as we have
recently heard, they seem to be justified on most general-purpose chips),
they should be N-times-N-gives-2N and 2N-over-N-gives-N-remainder-N
instructions.
In another article (sorry, no reference this time) someone else portrays a
somewhat confused idea of what ANSI C requires from multiply expressions.
The Standard says that, given two `int' values v1 and v2, the result of
`v1*v2' is one of the following:
  a) the proper mathematical v1*v2, if v1*v2 `fits' in an `int';
  b) otherwise, anything at all, including `overflow trap, programmer
     shot; please call the morgue'.
Because of (b), code of the form
        long res = v1 * v2;
is allowed to set `res' to the `proper' value that would be obtained by
an N-by-N-yields-2N multiply. It is *not* required to do so, and on
many machines (those with 16-bit `int's, for instance) it will produce,
e.g., -2 when v1=32767 and v2=2. The proper way to compute v1*v2, if
`long's are known to be twice as long as `int's, is
        long res = (long)v1 * v2;
or      long res = v1 * (long)v2;
or      long res = (long)v1 * (long)v2;
Any compiler worthy of being labelled `optimizing' should be able to
compile any of the latter three expressions into one N-by-N-yields-2N
instruction.
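To make the 32767-times-2 case concrete, here is a minimal sketch using the fixed-width types from <stdint.h> (which postdate this discussion) so that the N = 16 sizes do not depend on the host's `int'. The names mul16_wide and wrap_to_16 are hypothetical helpers invented for illustration; wrap_to_16 only models what a two's-complement N-bits-only multiply would leave behind, since a real overflowing `int' multiply is undefined and cannot be shown portably.

```c
#include <stdint.h>

/* Hypothetical helper: 16 x 16 -> 32 signed multiply.
 * Casting ONE operand before the multiply is what makes the
 * product wide; the other operand is converted to match. */
static int32_t mul16_wide(int16_t v1, int16_t v2)
{
    return (int32_t)v1 * v2;
}

/* Hypothetical helper: keep only the low 16 bits of x and read them
 * as a two's-complement value. This models an N-bits-only multiply
 * without relying on undefined or implementation-defined behaviour. */
static int16_t wrap_to_16(int32_t x)
{
    uint16_t low = (uint16_t)x;            /* well-defined: x mod 65536 */
    if (low < 32768u)
        return (int16_t)low;
    return (int16_t)((int32_t)low - 65536); /* map 32768..65535 to negatives */
}
```

With these, mul16_wide(32767, 2) is 65534, while wrap_to_16 applied to that product gives the -2 mentioned above. Whether a given compiler turns the cast-then-multiply form into a single widening instruction is up to its optimizer.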
(The optimization required to correctly handle
        long sixtyfourbitdividend;
        int thirtytwobitdivisor, thirtytwobitquot, thirtytwobitrem;
        thirtytwobitquot = sixtyfourbitdividend / thirtytwobitdivisor;
        thirtytwobitrem = sixtyfourbitdividend % thirtytwobitdivisor;
is considerably more tricky, but doable.)
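One plausible reason the divide case is trickier: the quotient of a 2N-bit dividend by an N-bit divisor need not fit in N bits, and hardware 2N-over-N divides typically trap when it does not. The sketch below spells out that condition in source form, assuming N = 32 and the <stdint.h> types. The function div_64by32 and its return convention are hypothetical, not anything the Standard or a particular compiler provides.

```c
#include <stdint.h>

/* Hypothetical helper: 64 / 32 -> 32-bit quotient and 32-bit remainder.
 * Returns 0 on success, -1 if the divisor is zero or the quotient
 * does not fit in 32 bits. Division truncates toward zero (C99). */
static int div_64by32(int64_t dividend, int32_t divisor,
                      int32_t *quot, int32_t *rem)
{
    int64_t q;

    if (divisor == 0)
        return -1;
    /* INT64_MIN / -1 overflows even in 64 bits, so reject it first. */
    if (divisor == -1 && dividend == INT64_MIN)
        return -1;

    q = dividend / divisor;
    if (q < INT32_MIN || q > INT32_MAX)
        return -1;              /* quotient needs more than 32 bits */

    *quot = (int32_t)q;
    /* |remainder| < |divisor|, so it always fits in 32 bits. */
    *rem = (int32_t)(dividend % divisor);
    return 0;
}
```

For example, dividing 10000000000 by 7 yields quotient 1428571428 and remainder 4, while dividing it by 1 is rejected because the quotient exceeds 32 bits. A compiler that wants to use one divide instruction for both results has to either prove that the range check cannot fail or preserve the language's behaviour when it does.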
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris