Newsgroups: comp.lang.apl
Path: watmath!watserv1!utgpu!news-server.csri.toronto.edu!rpi!zaphod.mps.ohio-state.edu!mips!darwin.sura.net!haven.umd.edu!socrates!socrates!rockwell
From: rockwell@socrates.umd.edu (Raul Deluth Miller-Rockwell)
Subject: Re: APL execution efficiency revisited
In-Reply-To: andrew@rentec.com's message of 6 Apr 92 22:36:54 GMT
Message-ID: <ROCKWELL.92Apr7091904@socrates.umd.edu>
Sender: rockwell@socrates.umd.edu (Raul Deluth Miller-Rockwell)
Organization: Traveller
References: <1992Mar30.203818.15221@yrloc.ipsa.reuter.COM> <783@kepler1.rentec.com>
	<1992Apr5.180150.41@yrloc.ipsa.reuter.COM> <794@kepler1.rentec.com>
Date: Tue, 7 Apr 1992 14:19:04 GMT

Robert Bernecky:
   >The point I am STILL trying to make (and will give up if this
   >iteration fails...) is that APL already HAS all the information at
   >hand to optimally (from the standpoint of ordering the matrix
   >products to minimize ops) determine the best ordering.

Andrew Mullhaupt:
   Yes. If you are doing +.x / you can optimize this. It is much
   harder to say something about 'on-line' matrix chain product, where
   you are supposed to compute the partial product as you go along, as
   in the case of a loop.

This is a good argument against using loops.  The significant question
is _why_ are you using a loop?  If the reason is that you must deal
with time (say, interactive input, where you must display partial
results), there isn't a whole lot you should do about this.  If the
reason is that you couldn't think of anything better, well, that's a
personal problem.

If the reason is that the whole problem couldn't fit into memory, you
can take an APL solution and expand it -- once you've got a decent
algorithm.  It's tempting to recommend that the language support this
transparently, but virtual memory might be a way around this problem.
[Also, it is harder to write fast code that's general when there are
many orders of magnitude of difference between different kinds of
storage.]
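[For reference, the +.x / ordering optimization Bernecky refers to is
the classic matrix chain problem, solvable by dynamic programming.  A
minimal sketch in Python -- the function name and interface are mine,
not from any APL or J implementation:

```python
# Matrix chain ordering by dynamic programming: given the dimensions
# of a chain of conforming matrices, find the parenthesization that
# minimizes scalar multiplications.  Illustrative sketch only.

def chain_order(dims):
    """dims[i] x dims[i+1] is the shape of matrix i.
    Returns (minimum multiply count, split table)."""
    n = len(dims) - 1                    # number of matrices
    cost = [[0] * n for _ in range(n)]   # cost[i][j]: best for chain i..j
    split = [[0] * n for _ in range(n)]  # split[i][j]: where to cut
    for length in range(2, n + 1):       # chain lengths 2 .. n
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float('inf')
            for k in range(i, j):        # try each split point
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k
    return cost[0][n - 1], split
```

For a (10x100)(100x5)(5x50) chain, ((A B) C) costs 7500 multiplies
against 75000 for (A (B C)) -- exactly the kind of win an interpreter
holding all the shapes could get for free.]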

   >A. A posting by a single programmer is not necessarily optimal for any
   >   language.

   Yes, but that posting was claimed to be optimal for APL2. I have
   reason to believe that it is very fast in APL2, and nobody seems to
   have posted better, so maybe I'll just claim "best known", although
   you might want to be conservative about it and say that it's hard
   enough to do better that nobody is motivated to post better...

Well, there are a number of reasons I favor J over APL2.  One is that it
seems much easier to optimize.  At present, this is barely even a
theoretical advantage, but... we'll see.

           1 + max / B x T jot.- T <- iota rho B <- 0 unequal M

Um... shouldn't you ravel that before doing the max reduce?  Also, are
you sure you don't want to also take the absolute value at the same
time?
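[For concreteness, a NumPy transliteration of that expression with
both fixes folded in -- ravel before the max reduce, absolute value on
the differences -- might look like the following.  This assumes a
square M and a simple index vector T; the function name is mine.

```python
import numpy as np

def bandwidth(M):
    """1 + max |i - j| over the nonzero entries of M."""
    M = np.asarray(M)
    B = (M != 0)                       # B <- 0 unequal M
    t = np.arange(B.shape[0])          # T <- iota (one axis of) rho B
    D = t[:, None] - t[None, :]        # T jot.- T: outer difference i-j
    # abs and ravel before the max reduce, per the two fixes above:
    return 1 + int(np.abs(B * D).ravel().max())

# A 4x4 tridiagonal matrix has bandwidth 2 under this definition:
A = np.eye(4) + np.eye(4, k=1) + np.eye(4, k=-1)
```

Without the ravel, an APL max/ on the matrix reduces only along the
last axis and yields a vector, not the scalar bandwidth.]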

   which computes the bandwidth of a matrix M. I have not seen interpreted

   You're right. I should have said something other than most.
   Although almost all APL interpreters I used have call-out, very few
   supported call-in.  I used APL2 (which now has it) and STSC and
Sharp APL. One of our resident APL programmers here doesn't think
   you can do it with Dyalog. Maybe we're not talking about the same
   thing.

I understand that STSC now has it as well (quadNA).  J has it [of
course].  I don't know about Sharp, but J is (to my mind) its
successor.  I don't know about Dyalog either -- but if you want it,
why not contact the people who produce it?

-- 
Raul Deluth Miller-Rockwell                   <rockwell@socrates.umd.edu>
