Series Approximation Methods for Divide and Square Root in the Power3(TM) Processor
Adelaide, Australia April 14-April 16
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ARITH.1999.762836
14th IEEE Symposium on Computer Arith ...
Martin S. Schmookler, IBM Corporation
Ramesh C. Agarwal, IBM Corporation
Fred G. Gustavson, IBM Corporation
The Power3 processor is a 64-bit implementation of the PowerPC(TM) architecture and is the successor to the Power2(TM) processor for workstations and servers that require high-performance floating-point capability. The previous processors used Newton-Raphson algorithms for their implementations of divide and square root. The Power3 processor has a longer pipeline latency, which would substantially increase the latency of these instructions if the same algorithms were used. Instead, new algorithms based on power series approximations were developed, which provide significantly better performance than the Newton-Raphson algorithm on this processor. This paper describes the algorithms and then shows how both the series-based algorithms and the Newton-Raphson algorithms are affected by pipeline length. For the Power3, the power series algorithms reduce the divide latency by over 20% and the square root latency by 35%.
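As a rough illustration of the contrast drawn in the abstract, the sketch below computes a reciprocal 1/b two ways from the same starting estimate x0: a textbook Newton-Raphson iteration and a truncated power series in the residual e = 1 - b*x0. It is not the Power3 algorithm, which the abstract does not spell out; the function names, the starting estimate, and the term counts are assumptions chosen for illustration only.

```python
def reciprocal_newton(b, x0, iterations):
    """Textbook Newton-Raphson for 1/b: x <- x * (2 - b*x).

    Each step needs the result of the previous one, so the
    operations form a serial dependency chain.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x


def reciprocal_series(b, x0, terms):
    """Truncated power series for 1/b around the estimate x0.

    With e = 1 - b*x0, 1/b = x0 / (1 - e) = x0 * (1 + e + e^2 + ...).
    The residual e is computed once; the series is then a polynomial
    in e (illustrative only, not the published Power3 method).
    """
    e = 1.0 - b * x0
    total, power = 0.0, 1.0
    for _ in range(terms):
        total += power   # add e^k
        power *= e       # advance to e^(k+1)
    return x0 * total


if __name__ == "__main__":
    b, x0 = 3.0, 0.3     # hypothetical operand and starting estimate
    print(reciprocal_newton(b, x0, 2))
    print(reciprocal_series(b, x0, 4))
```

In exact arithmetic, k Newton-Raphson iterations give (1/b)(1 - e^(2^k)), the same value as the series truncated after 2^k terms. Any latency difference on a pipelined unit therefore comes from how the multiply-add operations are scheduled, not from the mathematics.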
Citation:
Martin S. Schmookler, Ramesh C. Agarwal, Fred G. Gustavson, "Series Approximation Methods for Divide and Square Root in the Power3(TM) Processor," arith, pp. 116, 14th IEEE Symposium on Computer Arithmetic (ARITH-14 '99), 1999