Path: utzoo!attcan!uunet!lll-winken!sun-barr!apple!agate!linus!linus!bs
From: bs@linus.mitre.org (Robert D. Silverman)
Newsgroups: comp.arch
Subject: Re: benchmark for evaluating extended precision
Keywords: extended precision,multiply,benchmark,arithmetic
Message-ID: <119933@linus.mitre.org>
Date: 12 Sep 90 22:28:29 GMT
References: <3989@bingvaxu.cc.binghamton.edu> <119858@linus.mitre.org> <3994@bingvaxu.cc.binghamton.edu>
Reply-To: bs@gauss.UUCP (Robert D. Silverman)
Organization: The MITRE Corporation, Bedford MA
Lines: 88
Xref: dummy dummy:1
X-OldUsenet-Modified: added Xref
In article <3994@bingvaxu.cc.binghamton.edu> kym@bingvaxu.cc.binghamton.edu.cc.binghamton.edu (R. Kym Horsell) writes:
:In article <119858@linus.mitre.org> bs@linus.UUCP (Robert D. Silverman) writes:
:\\\
:>You are totally confused. Nowhere in your benchmark did I see any assembler
:>code to support 32 x 32 -- > 64. You are STILL working in radix 2^15
:>[or maybe radix 2^16; I didn't check closely], whereas allowing 32 x 32 --> 64
:>would allow you to work in radix 2^30 or 2^31 [depending on how careful
:>you want to be about sign bit and 1's complement/2's complement portability
:>problems]. Change of radix ALONE would account for a factor of 4 FEWER
:>multiplies. Secondly, you are working with tiny examples. Working in a
:>larger radix in a REAL WORLD PROGRAM means less storage required for each
:>multi-precise variable, less loop overhead, fewer subroutine calls, fewer
:>cache misses, etc.
:\\\
:
:Funny, I don't _fell_ confused. Some of the points you make _are_
:valid in some sense, but we're diverging from the topic.
:
:My whole point is that 32x32->64 is not _required_ in order to support
It is if you want any speed out of your code.
:extended precision arithmetic. There are various definitions of
:_required_ here: ``logically required'', ``convenience'', etc.
Executing in 1 day instead of 10 isn't a convenience. It's a necessity.
:Obviously the operation isn't _logically_ required, and as I've
One can implement a ALU that has only 8 or so instructions. Logically,
nothing else is required. So why bother at all with the rest that
get implemented in practice? Logically, they are not required at all.
They get put in for reasons of SPEED. Those doing large scale computations
want computers to be FAST. This is hardly a matter of convenience.
:The fact that I'm doing multiply naively is in _favour_ of
:your argument that
: ``[without 32x32->64 high prec arithmetic will] run 10 times slower''.
:
:If I had written this code using _better algorithms_ the difference
:between using 16x16->16 and 32x32->32 would be _even less_,
But one can apply these same 'better algorithms' with 32 x 32 --> 64 just
as readily.
:thereby weakening your claim further.
:
:Consider, we all know of fairly simply divide-and-conquer
:high-prec multiply with complexity of O(n^lg(3)). (Actually we
:_can_ do them in almost linear time, but the cost isn't worth
Have you ever implemented Karatsuba's algorithm? Tried it in practice?
What are the constants in your big O estimates? In practice, one does
not recusively bisect all the way down to the word level. One stops
the bisection at about 8 machine words. It is NOT n^lg 3 until one reaches
many hundreds (or thousands depending on implementation). Furthermore, the
method is only good when multiply numbers of the same size.
Karatsuba does NOTHING to change the basic argument. One can apply it in
radix 2^30 as easily as in 2^15.
:Instead of using 32x32->64 arithmetic we _could_ use
:32x32->32 arithmetic and only use the low-order bits
:for multiply.
:
:If we do this we must perform 4 times more ``small''
:multiplications to perform the ``big'' one; but each of these
:``small'' multiplies is performed on, effectively, 16-bit
:operands and should therefore be several times faster.
Wrong. with proper HARDWARE there is no reason why 32 x 32 --> 64 can't
be done in as few cycles as 16 x 16 --> 32. It just takes more gates.
I'm through arguing with you. It's obvious that you've never actually done
any large scale multi-precise calculations and in absence of that, your
arguments have little weight behind them. Your mind seems to be made up
and you refuse to accept the arguments of others who have VASTLY more
experience in doing this kind of arithmetic.
--
Bob Silverman
#include
Mitre Corporation, Bedford, MA 01730
"You can lead a horse's ass to knowledge, but you can't make him think"