Jerome,
I find your discussion of multiword precision math very interesting, although I
don't have any applications that need that sort of thing. The idea of extending the
instruction set to do high-speed external math routines reminds me of the folks
using GPUs to do high-speed math operations.
I recently visited the University of Illinois NCSA facility where the new "Blue
Waters" system went into operation late last summer. (See
https://bluewaters.ncsa.illinois.edu/hardware-summary)
It was designed by Cray and has 22,640 Cray XE6 nodes and 4,228 Cray XK7 nodes that
include NVIDIA graphics processor acceleration. It can achieve 13+ petaflops, but the
power bill is a killer: it draws 24 megawatts, although that includes the water chillers
and the 9,000 gallons per minute of cooling water flowing through it.
It is very neat that Ersatz-11 has that kind of extensibility in it. I would love to play
with its multiprocessor capabilities sometime with RSX11M+ like Johnny B. has done.
Good luck with your project.
Mark
On Jun 23, 2015, at 8:05 AM, Jerome H. Fine wrote:
About 10 years ago, I was using an algorithm which required more than
15 digits of precision. I wrote some PDP-11 assembler code which
could handle unsigned values up to 2^512 (about 1.3 x 10^154) plus
fractional numbers with 1024 bits, split evenly into 512 bits for the integer
portion and 512 bits to the right of the binary point.
Actually, there were three levels of precision: 128 bits, 256 bits and the
maximum at 512 bits. The FORTRAN 77 integer symbols were LU...,
MU... and NU... while the corresponding integer / fractional symbols
were LX..., MX... and NX..., all allocated as CHARACTER *n variables.
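As an illustration of the integer / fractional layout described above, here is
a minimal Python sketch (it is not the original PDP-11 code). It assumes that
the LX, MX and NX variables carry 128, 256 and 512 bits on each side of the
binary point and that bytes are stored least significant first; the message
confirms only the 512 + 512 case, and the names pack_fixed and unpack_fixed
are hypothetical.

```python
from fractions import Fraction

# Assumed bits on EACH side of the binary point for the three levels.
LEVEL_BITS = {"L": 128, "M": 256, "N": 512}


def pack_fixed(value, level):
    """Pack a non-negative Fraction into an unsigned fixed-point byte string.

    The result plays the role of a CHARACTER*n variable; the byte order
    (least significant byte first) is an assumption.
    """
    bits = LEVEL_BITS[level]
    # Scale by 2**bits and truncate any fraction bits that do not fit.
    scaled = (value.numerator << bits) // value.denominator
    if not 0 <= scaled < (1 << (2 * bits)):
        raise OverflowError("value does not fit in this precision level")
    return scaled.to_bytes(2 * bits // 8, "little")


def unpack_fixed(raw, level):
    """Recover the exact rational value stored by pack_fixed."""
    bits = LEVEL_BITS[level]
    return Fraction(int.from_bytes(raw, "little"), 1 << bits)
```

Under these assumptions an NX value occupies 128 bytes and an LX value 32
bytes, and a value such as 5/4 survives a pack / unpack round trip exactly
because it needs only two fraction bits.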
These subroutines are designed to be used under FORTRAN 77, so any
PDP-11 operating system (such as RT-11 and RSX-11) can easily make
use of them. While these routines include ADD, SUBTRACT and
MULTIPLY, DIVISION is not available, although that is easily remedied
via a FORTRAN 77 subroutine which arrives at the result via the standard
approximation algorithm. Also available are ENCODE and DECODE
routines to convert between internal binary and external decimal values.
In addition, there are routines to convert back and forth between all six
sizes of variables and DOUBLE PRECISION floating point or REAL *8
variables.
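The message does not say which approximation algorithm is meant for DIVISION.
One plausible reading, and it is only an assumption, is a Newton-Raphson
reciprocal, which needs nothing beyond MULTIPLY, SUBTRACT and shifts. The
Python sketch below uses built-in integers as stand-ins for the multi-word
variables; newton_floor_div and fixed_divide are hypothetical names.

```python
def newton_floor_div(n, d):
    """Return floor(n / d) using only multiply, subtract, shift and compare."""
    if n < 0 or d <= 0:
        raise ValueError("expects n >= 0 and d > 0")
    # Work with a reciprocal scaled by 2**k, where 2**k exceeds both operands.
    k = max(n.bit_length(), d.bit_length()) + 1
    # Initial guess r0 = 2**(k - bits(d)), so that d * r0 / 2**k lies in [1/2, 1).
    r = 1 << (k - d.bit_length())
    # Newton step r <- r * (2 - d * r), in scaled form. The error roughly
    # squares on each pass, so about log2(k) passes are enough.
    for _ in range(k.bit_length() + 1):
        r = (r * ((1 << (k + 1)) - d * r)) >> k
    q = (n * r) >> k
    # Truncation can leave q off by a small amount; correct it exactly.
    while q * d > n:
        q -= 1
    while (q + 1) * d <= n:
        q += 1
    return q


def fixed_divide(a, b, frac_bits=512):
    """Divide two fixed-point values that are both scaled by 2**frac_bits."""
    return newton_floor_div(a << frac_bits, b)
```

For example, newton_floor_div(7, 2) returns 3. In a PDP-11 version every
multiplication and subtraction here would be a call to the corresponding
multi-precision routine.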
Of late, I realized that signed variables are also required, so I have begun
to consider what is needed. Also, because I so often run the PDP-11
code under the Ersatz-11 emulator, I will consider supporting the use of
six additional PDP-11 instructions (each will accept ONLY one combination of
registers as operands; Ersatz-11 supports loading a DLL):
UMUL16 - unsigned multiply of two 16-bit variables
MUL32 - signed multiply of two 32-bit variables
UMUL32 - unsigned multiply of two 32-bit variables
UDIV16 - unsigned divide of a 32-bit variable by a 16-bit variable
DIV32 - signed divide of a 64-bit variable by a 32-bit variable
UDIV32 - unsigned divide of a 64-bit variable by a 32-bit variable
The UMUL16 and UMUL32 instructions are especially important for
multi-precision MULTIPLY. I will also consider the possibility
of a single PDP-11 instruction to perform multi-precision arithmetic on
values contained in memory, using the ability of the Ersatz-11 emulator
to LOAD a user-written DLL. The idea is to convert many of the PDP-11
multi-precision assembler subroutines to a single PDP-11 instruction
which would then be executed using x86 instructions at a much higher
speed, sort of like a CIS for multi-precision variables. In that case,
much larger variables could easily be supported due to the much
higher speed of execution. In addition, the (approximately) 16KB
of subroutine instruction / data memory within the emulated PDP-11
could be substantially reduced.
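To show why an unsigned 16 x 16 multiply matters for multi-precision MULTIPLY,
here is a minimal Python sketch of the schoolbook method built on that one
primitive. The function umul16 only models the arithmetic of the proposed
UMUL16 instruction (its register usage is not specified in the message), and
to_words, from_words and multiply_words are hypothetical helper names.

```python
MASK16 = 0xFFFF


def umul16(a, b):
    """Model of the proposed UMUL16: unsigned 16 x 16 -> 32-bit product."""
    assert 0 <= a <= MASK16 and 0 <= b <= MASK16
    return a * b


def to_words(value, count):
    """Split a non-negative integer into `count` 16-bit words, low word first."""
    return [(value >> (16 * i)) & MASK16 for i in range(count)]


def from_words(words):
    """Reassemble an integer from 16-bit words, low word first."""
    return sum(w << (16 * i) for i, w in enumerate(words))


def multiply_words(x, y):
    """Schoolbook product of two word lists; returns len(x) + len(y) words."""
    result = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        carry = 0
        for j, yj in enumerate(y):
            # Product + previous partial word + carry never exceeds 2**32 - 1.
            t = umul16(xi, yj) + result[i + j] + carry
            result[i + j] = t & MASK16
            carry = t >> 16
        # This slot has not been written by any earlier row.
        result[i + len(y)] = carry
    return result
```

A 512-bit operand is 32 such words, so one 512 x 512 product takes 1024 word
multiplies. With only the signed MUL of the standard PDP-11 instruction set,
each of those word products would need extra correction steps to be treated
as unsigned.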
If there is sufficient interest and support, complete algorithms might be
implemented which could directly make use of the x86 host's gigabytes of
memory to solve particular problems - sort of like a SLAR auxiliary
processor CPU (which, for example, performed an FFT on a KB-sized
array in virtual memory) implemented in software rather than
hardware.
I hope that some interest is expressed. Commercial inquiries for a
specific algorithm would obviously receive priority, but hobby users
are expressly encouraged to express all of their needs as well.
Jerome Fine