                                   Chapter 16

                                DEVELOPMENT OF J


      Some General Questions about J
      J 3.0 words BNF
      Parallelizability of APL?
      Strength Reduction for J
      bracket indexing .vs. merge (and space wars)
      implementing @:
      implementing J
      mixing it up: The role of fractional axis brackets
      Statistical Functions in J
      J 3.2 - Early vs. Late Verb Bindings
      what is j?
      brackets
      J3.2


*========================================
# Some General Questions about J

+------------------
| Christopher Browne
| <1991Feb23.234505.5960@csi.uottawa.ca>

I'm in the process of trying to assimilate some sort of understanding of
the J language.  It looks like it has distinct advantages over APL,
notably in that it doesn't use "funny characters."  That makes it:

a) Easier to port to multiple machines
  i.e. - it's MUCH LESS hardware dependent.  One APL solution is to have
  a standard "character set generator" for each machine - but you still have
  to put together SOMETHING that's massively machine dependent when you use
  classical APL.

b) Nicer to use with foreign file systems.
  It uses "standard ASCII" (and EBCDIC support wouldn't be too hard to add...)
  so that files will be far easier to move between different machines &
  versions of the language.  I don't know how "classical APL" put together
  workspaces, but I'm pretty sure that the ONLY sensible way to edit it was
  by using APL (or some customized version of XEDIT...).  This adds Yet
  Another Improvement:

c) You can use ANY editor to edit programs/data.
  Old APL workspaces were ugly in MANY ways...  Being able to load vars/
  functions from files is a NICE improvement.  I can use Emacs, you can use
  Brief, someone else can use WordPerfect (ick!)...  Others may use vi...

BUT
---

There seem to be a few downsides.  Related to one of them, I have a public
question...

a) Doesn't use the "funny characters"
  That special character set WAS good in that it was easy to tell the various
  operations apart.  Now, instead of looking like strange-character gibberish,
  the code looks like punctuation gibberish...

  It'll take some work with the system to be able to figure out the operator
  set, especially since MANY have characters in common.

  This is a lose-lose situation.  There are both advantages and disadvantages
  to using either a) Special APL characters or b) Combinations of standard
  punctuation characters.

  I think that the benefits of moving to ASCII outweigh the costs, but there
  IS certainly a cost...

b) Documentation at FTP archives

  This isn't a language issue - just a distribution issue.  All I could find
  on watserv1 was a document entitled "J Implementation Status," showing the
  various verbs, adverbs, conjunctions, pronouns, and copulae,
  with (in most cases) not much explanation of what they're about.

  'Twould be nice to have some sort of brief document describing:

   I)  What is J?  How did APL work lead to J?
   II) What are the major differences?
   III)What are the operators?  What do they do? (e.g. - giving a little more
         than just the operators' names.)

  I've been able to figure out many of my confusions by looking at the
  sundry tutorial files, but it would be nice to have some sort of overview
  of the system.

  I realize that there's a $24 J Reference Manual available, but I'm a
  poor grad. student who can't really afford every manual that comes
  available...   If I start using J really extensively, then maybe...

c) Matrix Referencing (the "" operation)
  This shouldn't be TOO hard...

  In original APL, one could grab part of a matrix by doing something like
  the following:

  (I'll add a little J in to substitute for those operators that are not
  in ASCII...)

  EXPENDITURES=. 2 3 6 $ 161 161 161 164 164 164 9 3 11 6 4 4 10 17 22 9 11 20
  130 132 184 197 163 0 1 7 9 10 5 6 21 24 8 11 17 161

  EXPENDITURES
161 161 161 164 164 164
9 3 11 6 4 4
10 17 22 9 11 20

130 132 184 197 163 0
1 7 9 10 5 6
21 24 8 11 17 161

  I'd then like to add things up with respect to the second index.

  I'd use the operation:
     +/[2]EXPENDITURES
getting:
180 181 194 179 179 188
152 163 201 218 185 167

In J, it's something annoying like
+/ {Ugliesomitted{MoreUglies{EXPENDITURES

Question 1)
  What should those two sets of uglies be?

Question 2)
  There must be some general J idiom for doing this sort of thing.
  What might it be?

Question 3)
  Why was this operation changed?  I can see that the new way is far more
  explicit (and may be able to do things the old way couldn't).  The square
  braces aren't used for anything in J now - they could have been used for
  the old operation.  Why not?

  (I'm sure there are good reasons for these things.  I'd just like to know
  why...)


+------------------
| Michael J. A. Berry
| <MJAB.91Feb25105438@godot.think.com>


   From: cbbrowne@csi.uottawa.ca (Christopher Browne (055908))
   Date: Sat, 23 Feb 91 23:45:05 GMT

   I'm in the process of trying to assimilate some sort of understanding of
   the J language.

    .
    .
    .

   c) Matrix Referencing (the "" operation)
     This shouldn't be TOO hard...

I'm not too sure what you mean by the "" operation.

     In original APL, one could grab part of a matrix by doing something like
     the following:

     (I'll add a little J in to substitute for those operators that are not
     in ASCII...)

     EXPENDITURES=. 2 3 6 $ 161 161 161 164 164 164 9 3 11 6 4 4 10 17 22 9 11 20
     130 132 184 197 163 0 1 7 9 10 5 6 21 24 8 11 17 161

     EXPENDITURES
   161 161 161 164 164 164
   9 3 11 6 4 4
   10 17 22 9 11 20

   130 132 184 197 163 0
   1 7 9 10 5 6
   21 24 8 11 17 161

     I'd then like to add things up with respect to the second index.

     I'd use the operation:
        +/[2]EXPENDITURES
   getting:
   180 181 194 179 179 188
   152 163 201 218 185 167

Actually, assuming an index origin of 0 as in J, that expression would be
+/[1]EXPENDITURES

   In J, it's something annoying like
   +/ {Ugliesomitted{MoreUglies{EXPENDITURES

   Question 1)
     What should those two sets of uglies be?

These are not the "uglies" you want.  In both traditional APL and J what
you are doing is not a selection of elements or "grabbing part of a matrix"
(what { does) but rather, specifying how to carve up EXPENDITURES for +/.
The confusion probably comes from the fact that square brackets were used
for both selection and axis specification in APL.  The two uses are
unrelated.

   Question 2)
     There must be some general J idiom for doing this sort of thing.
     What might it be?

In J, one would write +/"2 EXPENDITURES.  This is read "sum rank two of
EXPENDITURES"  or "apply summation to the rank-2 cells of EXPENDITURES."

The rank operator is not new with J.  It was in Dictionary APL and Sharp
APL (and maybe others as well).  The [] notation was used for several
different things in APL.  In all cases, the syntax was anomalous since a
pair of symbols was used where other APL functions and operators use a
single symbol.  Even the $24 documentation of J may not tell you all you
want to know.  If you want to read more about the rank operator, I suggest:

Bernecky, Robert "An Introduction to Function Rank" in the Proceedings of
APL 88 (ACM).

To understand more of the roots of J, I suggest:

Iverson, Kenneth E. "A Dictionary of APL" APL Quote-Quad Vol. 18, Number 1,
September 1987.

The J distribution includes a file "references" with pointers to articles
on J itself.

Here is a sample J session illustrating the use of the rank operator to
solve your problem:

$ /public/j/sun4/j
J Version 2.9   Copyright (c) 1990 1991, Iverson Software Inc.

   EXPENDITURES=. 2 3 6 $ 161 161 161 164 164 164 9 3 11 6 4 4 10 17 22 9 11 20
130 132 184 197 163 0 1 7 9 10 5 6 21 24 8 11 17 161

   EXPENDITURES
161 161 161 164 164 164
  9   3  11   6   4   4
 10  17  22   9  11  20

130 132 184 197 163   0
  1   7   9  10   5   6
 21  24   8  11  17 161

   +/"2 EXPENDITURES
180 181 194 179 179 188
152 163 201 218 185 167


   Question 3)
     Why was this operation changed?  I can see that the new way is far more
     explicit (and may be able to do things the old way couldn't).  The square
     braces aren't used for anything in J now - they could have been used for
     the old operation.  Why not?

All the various things which [] used to do are done in J in different ways.
The primary use of [] in APL was subscripting.  Unlike {, however, it was
impossible to write a rank-independent selection expression with the []
notation because you had to know how many semicolons to put in.

     (I'm sure there are good reasons for these things.  I'd just like to know
     why...)


+------------------
| Robert Bernecky
| <1991Feb26.210503.2798@yrloc.ipsa.reuter.COM>


I won't try to deal with all your concerns about J vs apl, but
here are two comments:

a. character set: The apl character set is an ONGOING problem:
   Every new editor, printer, keyboard, etc., would need special
   tinkering to support it. Ergo, we give up and go ascii.

b. hard-to-read squiggles: You get used to them, just as you got
   used to apl squiggles. For example, x#y is replicate or compression.
   $ is reshape (with slightly different semantics due to the "item"
   orientation of J).

c. (ok, so I can't count- that's why I use computers).
   how do I do reductions along specific axes, etc? Like this.
   Assume t is your rank 3 array:
     +/t   gives you first axis summation
     +/"3 t would be the same.
     +/"2 t  gives you second axis summation. Basically, it says:
            "apply plus reduction to the rank-2 objects of t".
     +/"1 t gives last axis reduction (sum each row of t).

There are two reasons for abandoning the +/[k] approach of apl:
a. brackets are syntactically anomalous, and are not functions, operators,
   or anything else that is generally applicable to apl.

b. brackets have special and unique meanings for EACH primitive.
   You can't tell someone: foo[k] means thus and such.

   In J, the rank adverb ("k) has a specific and CONSISTENT meaning
  for all verbs, including user defined ones: Apply the verb/function
  to the rank k objects of the argument.

Simple, straightforward, and consistent.
See my paper on "An Introduction to Function Rank" in the acm sigapl
apl88 conference proceedings (apl quote quad vol 18, no 2) for
discussion of function rank in an apl environment.
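
[[NB. ed. A minimal Python sketch of the rank idea described in the posts
above, using nested lists in place of J arrays. The names shape, add,
sum_leading and sum_rank are illustrative only and are not part of J; the
sketch assumes rectangular arrays of rank 1 or more. It models +/ as "add up
the items along the leading axis" and +/"k as "do that to each rank-k cell". ]]

```python
from functools import reduce

def shape(a):
    """Shape of a rectangular nested list, e.g. [2, 3, 6]."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0]
    return dims

def add(x, y):
    """Element-wise sum of two equally shaped nested lists (or two numbers)."""
    if isinstance(x, list):
        return [add(p, q) for p, q in zip(x, y)]
    return x + y

def sum_leading(a):
    """Model of +/ : add together the items along the leading axis."""
    return reduce(add, a)

def sum_rank(k, a):
    """Model of +/"k : apply sum_leading to every rank-k cell of a."""
    if len(shape(a)) <= k:
        return sum_leading(a)
    return [sum_rank(k, item) for item in a]

EXPENDITURES = [
    [[161, 161, 161, 164, 164, 164],
     [9, 3, 11, 6, 4, 4],
     [10, 17, 22, 9, 11, 20]],
    [[130, 132, 184, 197, 163, 0],
     [1, 7, 9, 10, 5, 6],
     [21, 24, 8, 11, 17, 161]],
]

for row in sum_rank(2, EXPENDITURES):
    print(row)
```

With k=2 this prints the same two rows as the J session quoted earlier
(180 181 194 179 179 188 and 152 163 201 218 185 167); k=3 sums the two
planes, and k=1 sums each row.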


+------------------
| Roger Hui
| <1991Feb27.045522.5924@yrloc.ipsa.reuter.COM>

a. The special symbols of APL are elegant, but they are very expensive.
You yourself outlined some of the costs, by pointing out things made
possible by eschewing the special symbols.

As to the possible disadvantages of the ASCII spelling, anything new
will look strange at first.  The Language Summary in the dictionary is
worthy of study. You may come to appreciate that the ASCII spelling
is no less mnemonic than the special symbols.

b. Documentation. "status.doc" is not meant to be sufficient for
learning J. It just tells you what's been implemented and what has not.
$24 buys you a printed manual called "Tangible Math and the ISI Dictionary
of J" (author: K.E. Iverson).  If you feel that you can not afford the
cost, there are alternatives:

- Hui/Iverson/McDonnell/Whitney, APL\?, APL90 Conference Proceedings.
[Introduced J, "philosophic" discussions, reasons behind design].
- K.E. Iverson, Dictionary of J, Vector (Journal of the British
APL Association), October 1990. [This version of dictionary now
slightly out-of-date.]
- K.E. Iverson, J, Vector, July 1990.
- The tutorial frames in the distribution package.

c. The bracket axis "operator" has been superseded by the rank (")
conjunction. See the msg from Bob Bernecky filed 1991 2 26 21 05 GMT.


+------------------
| Michael J. A. Berry
| <MJAB.91Feb28132704@nanna.think.com>

In article <1991Feb26.162714.28837@csrd.uiuc.edu> jaxon@sp27.csrd.uiuc.edu
(Greg P. Jaxon) writes:
    .
    .
    .

   I was sure I understood the rank operator until I read Mike Berry's reply.
   What I can't understand now is how the summation chose the `middle' axis.
   Am I right that there are 2 `rank-2 cells' in EXPENDITURES, and they are
   3x6 arrays?  If so shouldn't summing them yield a 3x6 result?
   Or is J's rank operator asking sum to produce a rank-2 result?, in
   which case I want to know how it picked the axis to reduce.

   I'm just guessing here, but has +/ been changed to work by default on the
   leading axis of its argument?  Applying +/[#IO] to the rank-2 cells of
   EXPENDITURES would give 2 6-element results.

Yes, that is what happened.  In J, *everything* favors the first axis.  /
does what slash-bar used to do, comma does what comma-bar used to do.  This
(I assume) is to make the rank operator more convenient to use.

   Regards,
   Greg Jaxon  Univ. of Ill. Center for Supercomputing R&D
   jaxon@csrd.uiuc.edu


+------------------
| Robert Bernecky
| <1991Mar2.050935.4427@yrloc.ipsa.reuter.COM>

<... first axis operations were done... to make use of the rank operator
    easier...>

There's more to it than that.

Consider that traditional APL had 3 (count 'em, three!) flavors of
many operations. For reduction, there was reduction along the first
axis, written as slash overstruck with minus. There was reduction along
the last axis, written as slash. And, there was reduction along axis k
written as slash leftbracket k rightbracket.

J unifies all those into a single reduction, written with slash, in which
you achieve the other axes via the rank adverb.

Because the rank adverb has consistent semantics, you only have to learn how
it works once. Because reduction has consistent semantics, ditto.
But you end up with a whole family of reductions, and REDUCE the number
of symbols required to express the notation.

So, it's not just to make rank easier. Function rank is the key to simplifying
APL, and the first axis is its prophet. Oopsie. Apologies to Asimov.
... the first axis enables those simplifications.


+------------------
| Edward M. Cherlin
| <39727@cup.portal.com> 2 Mar 91 17:34:48 GMT


cbbrowne asked for help in understanding J and its rationale. He can find some
of the answers in the article "The Principles of J" in APL News, 22,3 (1990).
The article explains most J features with examples for the more confusing
new features, and shows the development from Sharp (or Reuter:file) APL to
J, starting from before nested arrays and including Dictionary APL. Back
issues are available from the publisher, Springer-Verlag New York, at 800-
SPRINGER or 212/460-1500.

To answer one particular question: instead of using the axis operator to
do reductions in a particular direction, J uses the rank operator. By default,
reduction in J applies along the first axis rather than the last. The rank
operator partitions an array by selecting a specified number of dimensions
from the end. So +/"n applied to an m-dimensional array reduces the (m-n)th
dimension, counting in origin 0.


*========================================
# J 3.0 words BNF

+------------------
| Raul Rockwell
| <26MAR91.22244391@uc780.umd.edu>

Experimenting with J to find out what its definition is is
educational, but at times I wish I had a more thorough definition of
what it's doing.

Of course, J changes so fast such things might be considered silly 8-)

Anyways, here's what I've come up with for what J is doing with the
words function (;:):

This is what you apply ;: to.  Result is a list of the words (in the
          same order they originally appeared).  As is usual for this
          sort of thing, rules apply to the largest sequence possible.

sentence =:: (BEGINING_OF_LINE *space *(word *space) END_OF_LINE)


word =:: [(alpha *[alpha numeric] *dot)
          (glyph *dot)
          (numeric *[alpha numeric] +dot)
          (numeric *[alpha (decimal numeric) numeric (space numeric)])
          (quote *[quotable (quote quote)] quote)]

---------------------------- character sets ----------------------------

alpha    =:: [a-zA-Z]
decimal  =:: [.]
dot      =:: [.:]
glyph    =:: [!-&(-,---/<-@[-`{-~]
numeric  =:: [_0-9]
quote    =:: [']
quotable =:: [^']
space    =:: [ \t]

Notes:  except for decimal, and quotable, all character sets are unique.
        I haven't tried control characters and ascii characters beyond
            ~, but I suspect they should be considered "glyphs"
        \t represents a tab character
        - between two characters represents a range of ascii characters
        () indicates serial ordering
        [] indicates any of the options
        ^ indicates negation (all characters except following)
        * indicates 0 or more repetitions of next thing (yeah, I
            reversed it... seems easier to read though)
        + indicates 1 or more repetitions of next thing

Has anyone with 3.0 seen anything different from this?

Anybody from ISI care to comment on how stable this is likely to be?
(Yeah, I know, it's stable till it changes...)
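
[[NB. ed. A rough Python transcription of the grammar above, offered as a
sketch rather than as a description of what the interpreter does. Each regular
expression mirrors one alternative of the word rule. Two things are
assumptions: the glyph class is taken to be "any other visible character"
instead of the exact ranges listed, and "largest sequence possible" is
modelled by trying every alternative and keeping the longest match. The
function name words is illustrative. ]]

```python
import re

# One pattern per alternative of the `word` rule.
NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]*[.:]*")
GLYPH = re.compile(r"[^A-Za-z0-9_'\s][.:]*")  # assumption: any other visible character
NUMDOT = re.compile(r"[0-9_][A-Za-z0-9_]*[.:]+")
NUMLIST = re.compile(r"[0-9_](?:[A-Za-z]|\.[0-9_]|[0-9_]|[ \t][0-9_])*")
STRING = re.compile(r"'(?:[^']|'')*'")
PATTERNS = (NAME, GLYPH, NUMDOT, NUMLIST, STRING)

def words(sentence):
    """Split a sentence into words, taking the longest match at each position."""
    out, pos = [], 0
    while pos < len(sentence):
        if sentence[pos] in " \t":
            pos += 1  # spaces between words are dropped
            continue
        matches = [m for m in (p.match(sentence, pos) for p in PATTERNS) if m]
        if not matches:
            raise ValueError("no word starts at position %d" % pos)
        best = max(matches, key=lambda m: m.end())
        out.append(best.group())
        pos = best.end()
    return out

print(words("a =. 1 2 3 + b"))
```

Under these rules a run of space-separated numbers comes out as a single
word, and a doubled quote inside a string stays part of that string.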


*========================================
# Re: Parallelizability of APL?

+------------------
| Raul Rockwell
| <1991Aug2.185356.22113@hubcap.clemson.edu>

Doug Merritt:
   APL was (is?) often popular for programs that have lots of obvious
   vector math in their inner loops because of its powerful set of
   vector operators, but is often considered a nuisance for other
   types of programs because of the need to force algorithms into the
   Procrustean Bed of those vector operations.

Ouch.  That's a loaded statement if I've ever seen one.

First off, it is true that APL has SIMD (vector) instructions, but
that's not all it has.  There are also SISD instructions, and newer
variants of APL are beginning to have MIMD instructions.

On the other hand, it appears that 'APL' is being frozen into the form
of IBM's APL2 dialect.  Most of the real neat MIMD stuff is present
only in the recent J dialect.

   My question: is there a body of tricks and techniques for writing
   elegant APL versions of nominally non-vector-oriented algorithms,
   that could be or have been borrowed for parallelization in other
   languages?  Or would that be a wild goose chase?

Well, in my opinion, APL2 is something of an evolutionary dead end.
But that's not really up to me to decide.

However, the one "significant" technique that all variants of APL
encourage is the pre-computation of information that would normally be
computed on the fly (e.g. branch decisions).  Newer dialects of APL
only introduce more elegant ways of using/generating this sort of
information.

And, hmm... J also encourages the idea that control structures
belong inside expressions rather than the other way around.  In other
words, instead of saying
  if (predicate) then x := a + b  else x := a - b fi
or
  x := if (predicate) then a + b  else  a - b  fi
you would say the equivalent of
  x := a if (predicate) then + else - fi b

And, just for the record, the default J syntax for this statement
would be
  x =.  a  -`+ @.  predicate  b
though you could easily say something like
  x =.  a  - but + iff (predicate)  b
or, instead
  x =.  a (predicate) then + else -  b
or, if you don't mind that 'if' and 'fi' are effectively no ops,
  x =.  a  if (predicate) then + else - fi b
if that suits your preference.  (All these variants assume that the
words are defined appropriately.  Contact me if you wish to know what
these definitions are, but don't have time to figure them out for
yourself.)

It should probably go without saying, but (a) you may use arbitrary
functions in place of + and - in the above.  And, (b) this is one of
the features of J that lends itself well to MIMD situations.  And, (c)
functions are first class objects in J.
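
[[NB. ed. A small Python sketch of "the control structure lives inside the
expression": the predicate selects which function to apply, and the selected
function is then applied to a and b. The helper name choose is illustrative
and is not a J word; index 0 picks the first function and index 1 the second,
matching the order - then + in the J line above. ]]

```python
import operator

def choose(verbs, predicate):
    """Pick verbs[0] when predicate is false, verbs[1] when it is true."""
    return verbs[1 if predicate else 0]

a, b = 7, 3
for predicate in (False, True):
    x = choose((operator.sub, operator.add), predicate)(a, b)
    print(predicate, x)
```

Any two-argument functions can stand in for operator.sub and operator.add.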



*========================================
# Strength Reduction for J

+------------------
| Raul Rockwell
| <26MAR91.21460528@uc780.umd.edu>

J is slower than I think it should be (Obviously a world shaking
problem 8-)

There are a couple techniques for speeding up code which I think might
be applicable here.  (1) moving code which computes "constants"
outside of loops.  (2) using simpler functions where applicable.

If I were working on the language itself, I might try implementing (1)
by designing all of the primitives to handle data of rank
(1+advertised rank).  (Thus duplicating some of the code for " a
number of times).  I think this would allow you to cut a lot of the
type checking overhead.  And, I'd try implementing (2) by choosing 1
case which I felt common for each function (say, multiplying by a
boolean, or modulo by a power of 2) and try and optimize for that
specific case.

But I'm not working on the interpreter...

So, (1) can still be implemented by writing so as to minimize the number
of function calls.  (For each application of a function, you seem to
get 1 fn call for each array cell.  Calls to derived functions imply m
calls for each application of the derived function.)  In other words,
do not modularize at a low level.

In my example of uuencode, cutting out the "uul" function, and writing
the whole thing as a single transformation (then sticking on
delimiters, and so on...) seems to speed things up a bit.

And (2), I guess, simply requires a fair amount of experimentation.
(For example, for uuencode, I ought to try using a table lookup to
convert from ascii numerics to binary).

Any other thoughts on this subject?
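
[[NB. ed. A Python sketch of the two ideas in this post as they might apply
to the byte-to-bits step of something like uuencode: the table is a
"constant" computed once outside the loop, and the per-byte work is reduced
to a lookup. The names BITS and bits_of are illustrative; this is not the
uuencode code referred to above. ]]

```python
# Built once, outside any loop: the 8-bit expansion of every byte value.
BITS = [tuple((value >> shift) & 1 for shift in range(7, -1, -1))
        for value in range(256)]

def bits_of(data):
    """Expand a bytes object into a flat list of bits using table lookups only."""
    out = []
    for byte in data:
        out.extend(BITS[byte])
    return out

print(bits_of(b"A"))
```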



*========================================
#  bracket indexing .vs. merge (and space wars)

+------------------
| Bob Bernecky
| <1991Jan17.194702.21887@yrloc.ipsa.reuter.COM>



A major advantage of merge is that it IS functional, allowing compilers
to do swell things without side effects getting in the way.

As an implementor of APL interpreters and compilers (appeal to
authority, doncha know...), I claim it's not very hard to map
many merge expressions into expressions which will not generate
copies of arrays at run time.

Note that APL interpreters already make checks of this sort. For example,

a assign b assign  iota 5

(I'm using words, so that the net won't eat apl symbols)

will create only one copy of iota 5, and point BOTH a and b to it,
at least in SHARP APL. indexed assign into b ...    b[3] assign 55
has to check the reference count for iota 5 before doing the assign.
In this case, it would start by copying the array, and giving b its
own copy of the array.

Similar problems arise when type changes force coercions: b[3] assign 5.678.

These are fairly trivial to check for, and a compiler should not have much
trouble with copy avoidance.
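
[[NB. ed. A minimal Python sketch of the reference-count check described
above: two names share one array until an indexed assignment through one of
them forces a private copy. The classes Storage and Name are illustrative
and do not describe SHARP APL internals. ]]

```python
class Storage:
    """Array contents plus a count of the names currently pointing at them."""
    def __init__(self, items):
        self.items = list(items)
        self.refcount = 0

class Name:
    """A variable name bound to some storage."""
    def __init__(self, storage):
        self.storage = storage
        storage.refcount += 1

    def indexed_assign(self, index, value):
        if self.storage.refcount > 1:
            # Another name still sees this array: detach onto a private copy.
            self.storage.refcount -= 1
            self.storage = Storage(self.storage.items)
            self.storage.refcount = 1
        self.storage.items[index] = value

shared = Storage(range(5))   # iota 5, origin 0
a = Name(shared)
b = Name(shared)             # a assign b assign iota 5: one array, two names
b.indexed_assign(3, 55)      # b[3] assign 55: b gets its own copy first
print(a.storage.items, b.storage.items)
```

When the count is 1 the assignment happens in place and no copy is made.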

On a related topic, that of functional languages creating and destroying
lots of temps, I recommend looking at some of the work in SISAL and
that area. (I have a reference to a paper which talks about that work,
but it is buried somewhere in the Mt. Fuji paper pile on my desk.
When it surfaces, I'll forward the reference.) Basically,
a bit of compile time analysis (hard to do this with an interpreter)
allows functional languages to match or beat Fortran on supercomputers.

Bob
PS: I agree that a much more accessible introduction to the language
is required if J is to ever become more than a curiosity. Also required
are more formal definitions of the primitives -- As an implementor of
a compiler, I am left in the dark even after reading the documentation.
I guess I feel it's fine to make puzzle books, and it's fine to make
computer manuals, but I derive little pleasure from "tinkering" my
way through a definition, when a few words would make things a LOT
more obvious to the untutored.



*========================================
# implementing @:

+------------------
| Raul Rockwell
| <19APR91.00433381@uc780.umd.edu>

There are various parts of J which are rather slow.  I've been looking
at implementing J myself, and I think I can see a lot about the
implementation from what's slow and what's fast.

Essentially, it looks like J is written in a highly self-referent
fashion.  This is excellent for generality, and is even better for
making the language consistent with itself.  There is a speed price
though.

One way to improve speed would be to selectively substitute faster
expressions for overly general expressions.  I haven't explored this
idea much, yet.

Another way to improve speed is to re-do the underlying
implementation.  For example, if you consider J to be a generalized
array language, then one step back would be a generalized vector
language.  Maybe implementing J in terms of only scalar and vector
routines would be a win.  (Presumably, one would use J instructions
limited to rank 1...)

Finally, you could go whole-hog and implement every J function as a
"leaf" function in the implementation language (or, call system
routines, but nothing else).  This might increase code-size quite a
bit, but might run really fast.

With that out of the way, I'd like to bring up a tiny algorithm.
Consider  @: v
where v is a numeric vector (not boxed)

[[NB. sws.  this is now   A. v  ]]

I think that what J is doing for this case is building a table of all
permutations of i.#v, and using a transitive i. to find the row which
matches v.  This is nice and concise, but seems slow in the current
incarnation of J.

It's occurred to me that on a typical (scalar) machine, there is a
simple O(n squared) algorithm that should run pretty fast:

(1) find length of v (if 12 or less, assume an integer result, if 170
or less, assume a floating point result -- quite possibly will
overflow if greater than 170)

(2) build a vector of i.#v (call it N), and another vector of 0#~#v
(call it M).

(3) loop over elements of v (left to right, call loop index j)
       if j{N is positive, make j{M equal to j{N
                           then make j{N be _1
                           then do N =. N - j < i.#N
       if j{N is negative, abort -- result of @: is !#v

(4) result of @: is +/M*!].i.#M
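
[[NB. ed. A Python sketch of one way to read steps (1)-(4). It rests on two
assumptions that go beyond the post: N is indexed by the current element of
v (not by the loop counter), and the final weights are the factorials
(n-1)!, ..., 1!, 0!. Python integers do not overflow, so step (1) is
skipped. The name permutation_index is illustrative. ]]

```python
from math import factorial

def permutation_index(v):
    """Index of permutation v among all permutations of range(len(v))."""
    n = len(v)
    ranks = list(range(n))   # N: rank of each value among values not yet seen
    digits = []              # M: one digit per element of v
    for value in v:
        if not 0 <= value < n or ranks[value] < 0:
            return factorial(n)          # the post's "abort" result
        digits.append(ranks[value])
        ranks[value] = -1                # mark this value as used
        for larger in range(value + 1, n):
            ranks[larger] -= 1           # every larger value moves down one
    return sum(d * factorial(n - 1 - k) for k, d in enumerate(digits))

print(permutation_index([2, 0, 1]))
```

The inner loop makes this O(n squared), as the post suggests. The result is
the position of v when all permutations are listed in lexicographic order.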

--------------------

Note that this algorithm would be pretty fast if M could be
constructed in one pass (rather than having to re-build it at every
iteration through loop(3)).

I suppose that some attention should be paid to "intelligently"
pre-allocating memory for results.  Perhaps if catenate could somehow
be told to allocate extra memory for future catenates?  (this would
have to be fairly transparent to the user)

Has anyone thought about this sort of condition?

I think J is losing a lot of time, currently, in catenation.

(Of course, I may be wrong)


+------------------
| Robert Bernecky
| <1991Apr21.161301.9172@yrloc.ipsa.reuter.COM>

In article <19APR91.00433381@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>There are various parts of J which are rather slow.  I've been looking
>
>One way to improve speed would be to selectively substitute faster
>expressions for overly general expressions.  I haven't explored this
>idea much, yet.
>
>Another way to improve speed is to re-do the underlying

Of course, special casing and embedding support for idiomatic expressions
in functional languages is one way to get lots of speed. In SHARP APL,
we obtained factors of 5-500 speedups for primitives such as
catenate-with-rank, base-value-with-rank, and so on. Too bad we never
got time to finish that work...

My J compiler is going to have some capabilities in this area -- it
has to do so, if it's going to be successful (fast, that is).

I think Roger's design as it stands is excellent for a prototyping platform:
Radical design changes can be propagated through the entire interpreter
with minimal effort. Once you start bolting special case code into an
interpreter, the design tends to become frozen to a large extent.

Also, special cases DO take time, and given the limited budget (temporal
and otherwise) which J is being developed under, it is understandable that
correct operation takes precedence over speed. (I am told that some
other language designers do not share this view...   8^} )


+------------------
| Roger Hui
| <1991Apr21.182401.10670@yrloc.ipsa.reuter.COM>

Raul Rockwell writes:

> With that out of the way, I'd like to bring up a tiny algorithm.
> Consider  @: v
> where v is a numeric vector (not boxed)
>
> I think that what J is doing for this case is building a table of all
> permutations of i.#v, and using a transitive i. to find the row which
> matches v.  This is nice and concise, but seems slow in the current
> incarnation of J.

[[ NB. sws. Replace all @: with A. for the current version]]



There is **no way** I would implement @:y with an order !n=.#y algorithm.
FYI this is a model of the current implementation of @: , computing the
atomic representation of a permutation p (possibly in nonstandard form):

   ord  =. >:@(>./)
   std  =. ((i.@ord -. ]) , ]) @ (ord ] ])
   rfd  =. +/@({.>}.)\."1
   base =. (- i.)@ord
   ar   =. base #. rfd@std

ord computes the order of a permutation, possibly in nonstandard form.

std converts a permutation into standard form.  std is equivalent to
'((i.n)-.n]y.),n]y. ] n=.1+>/y.' : ''

rfd computes the reduced representation of a permutation from its
direct (standard) representation.  rfd and its inverse dfr were
written by E.E. McDonnell, Iverson Software PARC:
   dfr =. /:^:2@,/"1
   rfd =. +/@({.>}.)\."1

base computes the value n-i.n .

ar p computes @:p, the value of the reduced representation of p in the
base (n-i.n) numbering system.
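
[[NB. ed. A Python rendering of rfd, base and ar from the model above, as a
sketch only. It assumes the permutation is already in standard form, so std
is left out, and it evaluates the mixed-radix number by Horner's rule in
place of #. ; the function names simply reuse those in the post. ]]

```python
def rfd(p):
    """Reduced from direct: for each item, count the later items that are smaller."""
    return [sum(head > later for later in p[i + 1:]) for i, head in enumerate(p)]

def base(p):
    """The radix vector n - i.n, that is n, n-1, ..., 1, with n = 1 + max(p)."""
    n = 1 + max(p)
    return [n - i for i in range(n)]

def ar(p):
    """Value of the reduced representation of p in the mixed radix base(p)."""
    total = 0
    for radix, digit in zip(base(p), rfd(p)):
        total = total * radix + digit
    return total

print(rfd([2, 0, 1]), base([2, 0, 1]), ar([2, 0, 1]))
```

Counting smaller later items costs O(n squared) comparisons, with no table
of all permutations involved.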


*========================================
# implementing J

+------------------
| Raul Rockwell
| <19APR91.08155946@uc780.umd.edu>

This is a continuation of my last post:  vague ruminations about how
to implement J efficiently.
[[NB. apd. reference: <19APR91.00433381@uc780.umd.edu> above]]

Problem:  building a list of length n from n scalar operations is O(n
squared), rather than O(n)

LISP people would laugh, and say you need cons cells.  That might
actually be a way to go:  have catenate build an internal structure
specially optimized for catenate, then the first time something
besides catenate operates on that object, a post processing stage
kicks in and produces a flat list for use by whatever this new
function is.

There would be a tradeoff there--for some things this could really
waste time.  Also note that this is similar to using APL's indexed
assignment to build arrays, although indexed assignment
takes advantage of being able to estimate storage requirements ahead
of time.
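
[[NB. ed. A Python sketch of the deferred-catenate idea: catenate only
records the new piece, and the flat list is built once, the first time
something other than catenate needs it. The class name Pending and its
methods are illustrative; Python's own lists already grow in amortized
constant time, so this models the idea and is not a speed-up for Python. ]]

```python
class Pending:
    """A list under construction: catenate is cheap, flattening happens once."""
    def __init__(self):
        self._pieces = []    # pieces recorded so far, not yet copied together
        self._flat = None    # flat list, built on the first non-catenate use

    def catenate(self, items):
        if self._flat is not None:
            # Already flattened once: treat the flat list as the first piece.
            self._pieces = [self._flat]
            self._flat = None
        self._pieces.append(list(items))
        return self

    def flat(self):
        """The post-processing stage: copy every piece exactly once."""
        if self._flat is None:
            self._flat = [x for piece in self._pieces for x in piece]
            self._pieces = []
        return self._flat

p = Pending()
for i in range(5):
    p.catenate([i])
print(p.flat())
```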

Most likely, the people at ISI and IPS have already gone over this
issue a number of times, and already have some favored solutions.  Any
comments from over there?


+------------------
| Robert Bernecky
| <1991Apr21.162659.9338@yrloc.ipsa.reuter.COM>

In article <19APR91.08155946@uc780.umd.edu> cs450a03@uc780.umd.edu writes:
>
>Problem:  building a list of length n from n scalar operations is O(n
>squared), rather than O(n)
>
>There would be a tradeoff there--for some things this could really
>waste time.  Also note that this is similar to using APL's indexed
>assignment to build arrays, although indexed assignment
>takes advantage of being able to estimate storage requirements ahead
>of time.

The way that I originally implemented function rank at I.P. Sharp, way
back in the dark ages around 1980 or so, was as follows:

Build an array of pointers the size of the frame. Call the function to which
rank applied once per element in that frame, building an array of the cell
contents for each call. This went through all the primitive function
conformability checking, storage management, etc. for each call. Pricey,
but general, it worked, and was easier than modifying 300 thousand lines
of assembler code. When all the cells had been processed, COPY all those
cell results (which had been attached to the above array, ala BOX), to a NEW
array, which held the logically flattened result.

The special casing for catenate and other verbs I mentioned in my previous
message consisted largely of removing the above sort of silliness, and
placing a new outer loop around the primitive's loops, after conformability
checks had been done.

Both of these techniques were chosen because of the huge amount of code
which not only made assumptions about data structures, but also made
assumptions about space "near" data structures. For example: "I know
I can set the rank and type of the result at the same time, because
I KNOW they are adjacent cells in the same word", and "I KNOW I can
index off the end of this array when storing into it, because storage
management ensures that I have about 32 bytes of slop off the end of ANY
just-allocated array".

The last one was a killer. Still is. These sorts of assumptions are nasty
because they do NOT exist visibly in the code -- they just lurk there,
waiting to strike at the unwary. These are the sorts of coding tricks which
can kill.

But I digress. The way we got around it allocated storage once, and
built the cell results in situ. (Situ is located near Cleveland, somewhere).
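
[[NB. ed.  A rough Python sketch contrasting the two schemes described
 in this post.  It assumes every cell result has the same, known
 length; the function names are hypothetical and nothing here is taken
 from the SHARP APL sources.

```python
def apply_boxed(f, cells):
    """First scheme: keep one result per cell, then COPY them into a flat list."""
    boxes = [f(cell) for cell in cells]      # one separately allocated result per call
    flat = []
    for box in boxes:                        # final copy into the flattened result
        flat.extend(box)
    return flat


def apply_in_situ(f_into, cells, cell_len):
    """Second scheme: allocate the result once; each call fills its own slice."""
    out = [None] * (len(cells) * cell_len)   # single allocation
    for i, cell in enumerate(cells):
        f_into(cell, out, i * cell_len)      # writes cell_len items at the offset
    return out


def reverse_cell(cell):
    """Example cell function that returns a fresh result."""
    return cell[::-1]


def reverse_cell_into(cell, out, offset):
    """The same function, writing its result directly into shared storage."""
    for j, item in enumerate(reversed(cell)):
        out[offset + j] = item
```

 Both give the same answer; the second avoids the per-cell
 allocations and the final copy.
]]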


+------------------
| Roger Hui
| <1991Apr21.173221.10254@yrloc.ipsa.reuter.COM>

Bob Bernecky writes:

>But I digress. The way we got around it allocated storage once, and
>built the cell results in situ. (Situ is located near Cleveland, somewhere).

Bob, I gather that by "located near Cleveland", you mean it borders on
the eerie ...


*========================================
# mixing it up: The role of fractional axis brackets

+------------------
| Robert Bernecky
| <1991Apr29.142754.17015@yrloc.ipsa.reuter.COM>

A recent posting suggests that the rank adverb and the  mix[.1] (etc)
sort of axis brackets are equivalent.

It is certainly true that APL2 expressions with axis brackets
(or ISO APL with axis brackets) can perform many of the same operations
as those expressed with the rank adverb. For a few examples of this,
you can read my APL88 paper (ACM SIGAPL Quote Quad, vol.18, no. 2).

However, there are two problems with axis brackets which make them inferior
to the rank adverb:

a. The syntax is anomalous: The brackets do not conform to the syntax
   of functions (verbs), nor of operators (adverbs), nor any other
   APL object except brackets. This complicates syntax analysis, and
   makes good fodder for religious jihads.

b. The SEMANTICS of bracket axis notation is NOT defined. That is,
   there is NO place you can go and read a definition of what they
   mean. Like the caterpillar,  axis brackets mean exactly what the
   implementor meant and nothing more. \cite{alice}

   The meaning of brackets is different for EACH function to which
   they may apply, and the only way to determine that meaning is to
   have a reference manual by your side. For example (I am doing
   this from memory, so please bear with me if I screw up), the
   APL2 definition of ravel with axis allows specification of a set
   of axes in brackets, but requires that the set of axes be contiguous
   numbers. Other functions (perhaps mix or disclose?) allow any
   set, or fractional sets, etc.

   This may not sound important at first glance, but consider the
   difference between the expressions:
      apl2:   X f[k] Y
      J:      X f"k  Y

   If you do not know the definition of f (it might be user-defined!),
   you cannot tell anything about the action of brackets in APL2.
   In J, by contrast, it has a CLEAR and CONSISTENT definition, which
   applies to user-defined verbs in exactly the same way as it does
   to primitive verbs.
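
[[NB. ed.  To illustrate the point that rank has one definition for
 every verb, here is a minimal Python sketch of the monadic case on
 nested lists.  It assumes non-empty rectangular lists; the names
 depth and rank are hypothetical, and the dyadic case is omitted.

```python
def depth(a):
    """Rank of a non-empty rectangular nested list; a non-list has rank 0."""
    d = 0
    while isinstance(a, list):
        d += 1
        a = a[0]
    return d


def rank(f, k):
    """Return a function that applies f to each rank-k cell of its argument."""
    def apply(a):
        if depth(a) <= k:
            return f(a)                      # a is itself a cell: apply f
        return [apply(item) for item in a]   # otherwise descend one axis
    return apply


def spread(v):
    """A user-defined function: the range of a vector."""
    return max(v) - min(v)
```

 For example, rank(spread, 1) applied to [[1, 2, 3], [4, 8, 6]] works
 on each row, and rank(lambda s: s * 10, 0) works on each scalar.
 Nothing in rank needs to know what f does.
]]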

Why is this difference important?

a. It makes the language easier to learn and use. You don't have to
   keep running to a reference manual to learn when you can use brackets,
   and when you have to split/apply/mix with brackets instead.

b. There is LESS Language to learn: This makes it easier to teach, and
   easier to implement.

c. By removing the special cases which axis brackets represent, we end
   up with a language with simpler syntax, fewer chances to introduce
   bugs, and thereby create more reliable programs.


+------------------
| Edward M. Cherlin
| <42021@cup.portal.com> 5 May 91 20:35:22 GMT


To me, the real problem with anomalous APL constructs is that they are
not functional. This means that they cannot be used in operator
expressions, in particular, and suffer from all of the deficiencies that
functional programming is intended to remedy, in general. See my
forthcoming paper, "Pure Functions in APL and J", APL91 conference,
Stanford, Aug. 1991, and my articles in recent issues of APL News on
J.

I include brackets (axis, subscript), diamond, and QuadFX as anomalies,
following Iverson. These are the features he felt most needed fixing, and
he has fixed them in J.

+------------------
| Gfeller, Martin
| <1991May6.204056.26134@yrloc.ipsa.reuter.COM>


Branch and assignment arrows are not functional either. J maps them
both to one, and calls it copula. I don't think copulas can be
modified by adverbs, so they're not strictly functional in J either.
But this is a hard area. /Martin


+------------------
| L.J.Dickey
| <1991May7.135226.25211@watmath.waterloo.edu>

In article <1991May6.204056.26134@yrloc.ipsa.reuter.COM>
        mgf@ipsaint.ipsa.reuter.COM (Gfeller, Martin) writes:
>no. 5166696 filed 18.58.03  mon  6 may 1991
>from mgf
>subj Re: mixing it up: The role of fractional axis brackets
>
>Branch and assignment arrows are not functional either. J maps them
>both to one, and calls it copula. I don't think copulas can be
>modified by adverbs, so they're not strictly functional in J either.
>But this is a hard area. /Martin

I don't understand this, on two counts.

    (1) There are two copulas, not one:  "=."   and  "=:"  .
    (2) Order of execution of lines in a function is controlled
        by "suite", ($.), and the suite may be modified within a
        function by doing an assignment.  Is this what
        Martin Gfeller means?

It is not quite right to say that branch is mapped to assignment.



*========================================
# Statistical Functions in J

+------------------
| Roger Hui
| <1991May12.145907.19563@yrloc.ipsa.reuter.COM>

The definitions of mean (m.), normalize (n.), and spread (s.) have
changed in response to a detailed critique by Professor Fraser Jackson
of the Victoria University of Wellington (uunet!matai.vuw.ac.nz!jackson).
Professor Jackson's comments are as follows (minor editing by me):

--------------------------------------------------------------------

The concept of expectation is central to both probability and
statistics.  Indeed it is quite possible to axiomatise
probability theory in terms of expectations rather than
probabilities.  One didactic advantage of doing so is that the
whole apparatus of measure theory essential for an
axiomatisation based on probabilities can be introduced much
later.  From the classical viewpoint, P. Whittle, Probability,
Penguin, 1970?  (To appear in a revised edition by Springer
Verlag) gives an illustration of this alternative approach in
a text designed to cover the subject from first principles.
He illustrates numerous results from quantum theory to
investment analysis where the expectation is the natural
function to use.  From the subjective probability perspective,
De Finetti's classic published by Wiley also provides an
axiomatisation based on expectation.

m.

I believe therefore that the function  m.  should be the
expectation.   This would give

     m.  y        as the usual mean
     x  m.  y     as the mean formed with weights given
                  by the vector  x%+/x

Hence the monadic case would just be  1 m. y

This definition permits a very simple treatment of something
which appears at the beginning of virtually all statistical
texts - frequency distributions.

For a frequency distribution with frequencies f and values y
this gives  (in expressions which I am sure can be written
more simply)

     the mean as   f m. y
     deviations from the mean   y  -  f m. y
     the mean squared deviation   f m. *: y - f m. y
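
[[NB. ed.  A small Python sketch of the proposed weighted mean applied
 to a frequency distribution.  The name wmean is hypothetical and
 stands in for dyadic  m.  as described above; this is not the J 3.1
 implementation.

```python
def wmean(x, y):
    """Weighted mean of y with weights x; weights are normalised to sum to one."""
    total = sum(x)
    return sum(w * v for w, v in zip(x, y)) / total


def mean(y):
    """The monadic case: every value has weight 1."""
    return wmean([1] * len(y), y)


f = [1, 2, 1]            # frequencies
y = [1.0, 2.0, 3.0]      # values

mu = wmean(f, y)                          # f m. y
dev = [v - mu for v in y]                 # y - f m. y
msd = wmean(f, [d * d for d in dev])      # f m. *: y - f m. y
```

 Here mu is (1 + 4 + 3) divided by 4, and msd is (1 + 0 + 1) divided
 by 4.
]]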

In doing calculations with density functions the weighted mean
is always required to find statistics of the distribution, and
if there is any transformation of the variable then that also
requires use of a weighted mean calculation to find parameters of
the transformed distribution.  Since variable transformation is
often required the function proves very useful in numerical studies.

The expectation of a function g of y under a density  p y
becomes just  (p y) m. g y  or better  (p m. g) y

Using an obvious notation, the two commonest price index
numbers become

     laspeyres      (p0*q0) m. p1%p0
     paasche        (p0*q1) m. p1%p0
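
[[NB. ed.  The two index numbers can be checked against their usual
 ratio-of-sums forms.  A minimal Python sketch, with hypothetical
 names and made-up prices and quantities:

```python
def wmean(x, y):
    """Weighted mean of y with weights x (weights normalised to sum to one)."""
    return sum(w * v for w, v in zip(x, y)) / sum(x)


def laspeyres(p0, p1, q0):
    """(p0*q0) m. p1%p0 : price relatives weighted by base-period expenditure."""
    weights = [a * b for a, b in zip(p0, q0)]
    relatives = [b / a for a, b in zip(p0, p1)]
    return wmean(weights, relatives)


def paasche(p0, p1, q1):
    """(p0*q1) m. p1%p0 : price relatives weighted by p0 times current quantities."""
    weights = [a * b for a, b in zip(p0, q1)]
    relatives = [b / a for a, b in zip(p0, p1)]
    return wmean(weights, relatives)


def ratio_of_sums(p_num, p_den, q):
    """The textbook form: sum(p_num*q) divided by sum(p_den*q)."""
    return (sum(a * b for a, b in zip(p_num, q))
            / sum(a * b for a, b in zip(p_den, q)))
```

 Because each weight contains p0, it cancels the p0 in the price
 relative, which is why the weighted-mean form agrees with the
 ratio of sums.
]]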

In the Dictionary, the concept of using the left argument for
the degrees of freedom is introduced.  The degrees of freedom
are generally only important in second moment calculations
with unweighted data.  It seems to me that is giving a special
case unwarranted importance.  The degrees of freedom are
simply corrected for using a multiplier.  In the simplest case
it is just  (#%(_1&+&#)) y  , but a whole range of other cases
are equally simply dealt with.

I note that Table 1 uses  moment  as the description of the
dyadic case of  m.  If that represents an earlier (or later)
idea it again seems to me inappropriate.  The moments are
highly specialised means and best dealt with using the mean of
some transformation of the variable, e.g. in the monadic case

     m. (y - m. y)^4

with the dyadic case

     f  m. ( y - f m. y)^4

n.

I would define  n.  to be consistent with  m.  Hence the
monadic case would be with divisor  #y, and the dyadic case
would give normalised deviations from a weighted mean.  Again
if there is a need to adjust degrees of freedom for a
particular case I would prefer to see a multiplier used since
in many cases items have non-integer weights.  Alternatively,
a similar procedure could be adopted as with the fit
conjunction, this time however using a degrees of freedom
conjunction.  That solution could give I believe the features
you want with the added flexibility and power permitted by
generalising to weighted means.  The DF conjunction would then
only be needed in cases where the data were individual
observations and the adjustment was required for a particular
statistic or calculation.

The dictionary definition of the dyadic case seems to me to
give an expression almost never used since apart from squared
deviations we seldom use a mean with divisor less than #y.
In the squared deviations case when it is used it is not then
common to normalise the squared values in any calculations I
have had occasion to use.

s.

I believe a case can be made here for two functions.  The
first would give a measure of spread for the variables
considered individually, and be defined across arbitrary arrays.
For that I would prefer either the sum of squared deviations or
the mean squared deviation on the grounds noted below.

The second is a function which generates information about the
covariation of the variables in each item of the data object.
If there is to be just one function, I would favour this, in
view of the importance of many areas of multi-variate analysis.

The standard deviation is a useful measure but very
specialised to the univariate case.  Further the standard
deviation is not an additive measure like the variance (using
it now with divisor #y) and the sum of squares.  Variances
can be simply combined, using the m. function above.  (For a
population the variance is just the weighted mean of the
variances if the population consists of subsets with
different variances.)  The standard deviation needs to be
squared before it can be combined.  As soon as you move to
multivariate statistics it is the covariance matrix which is
the basic tool, not the correlation matrix or the variable
standard deviations.  In multivariate analyses it has been my
experience that the speed of formation of a cross product
matrix  X'X  is crucial.  This is so important that it deserves
to be a special function and programmed for speed of execution.
So my primitive function would be to form  (|:y) +/ . * y  in the
monadic case.  A case can be made for using the mean value of
the outer products of the items of y for consistency with the
previous definitions though I do not find that compelling.
The avoidance of centering within the function is deliberate.
Nearly all of the multivariate texts and the works on regression
express the theory in terms of the original observation units.
The case where the data are deviations from the mean then
becomes a special case.  With the  m.  primitive function it is
easily treated as just that.

The conventional covariance matrix then is

     (#y)%~(|:yd) +/ . * yd =. y - m. y

but for the occasions where that is required, if  s.  is the
cross product verb it is simple enough to enter

     (#y)%~ s. y - m. y

Using the normalised values we obtain the correlation matrix as

     (#y)%~ s. n. y
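
[[NB. ed.  A plain-Python sketch of the three expressions above, for a
 matrix y whose rows are observations.  The function names are
 hypothetical, the centring is done column by column, and no column
 is assumed to be constant (that would divide by zero).

```python
import math


def transpose(m):
    return [list(col) for col in zip(*m)]


def cross_product(y):
    """(|:y) +/ . * y : the cross product matrix X'X."""
    cols = transpose(y)
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def centre(y):
    """y - m. y : subtract each column's mean."""
    n = len(y)
    means = [sum(col) / n for col in transpose(y)]
    return [[v - m for v, m in zip(row, means)] for row in y]


def normalise(y):
    """Centre each column and scale it to mean squared deviation 1 (divisor #y)."""
    n = len(y)
    yd = centre(y)
    rms = [math.sqrt(sum(v * v for v in col) / n) for col in transpose(yd)]
    return [[v / s for v, s in zip(row, rms)] for row in yd]


def covariance(y):
    """(#y)%~ s. y - m. y"""
    n = len(y)
    return [[v / n for v in row] for row in cross_product(centre(y))]


def correlation(y):
    """(#y)%~ s. n. y"""
    n = len(y)
    return [[v / n for v in row] for row in cross_product(normalise(y))]
```

 With two perfectly correlated columns, such as
 [[1, 2], [2, 4], [3, 6]], every entry of the correlation matrix
 comes out as 1 up to rounding.
]]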

The choice of dyadic form is more difficult.  I believe that
consistency with the previous definitions would lead to choice
of the dyadic form  x s. y  as weighting the items in y by
%:x  where x is a vector of weights.  In this way each item in
y is given a weight in the crossproduct matrix equal to the
corresponding item in x.  The weights are not normalised to sum to 1
since in the applications a range of normalisations will be used.
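
[[NB. ed.  Scaling each item of y by the square root of its weight
 gives that item's outer product exactly the weight x.  A minimal
 Python sketch, with hypothetical names, comparing the two forms:

```python
import math


def cross_product(y):
    """(|:y) +/ . * y for a matrix y whose rows are the items."""
    cols = [list(col) for col in zip(*y)]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def weighted_cross_product(x, y):
    """Proposed x s. y : cross product of the items of y scaled by %:x."""
    scaled = [[math.sqrt(w) * v for v in row] for w, row in zip(x, y)]
    return cross_product(scaled)


def weighted_outer_sum(x, y):
    """Direct form: sum over items of weight times the item's outer product."""
    k = len(y[0])
    return [[sum(w * row[i] * row[j] for w, row in zip(x, y))
             for j in range(k)] for i in range(k)]
```

 The weights are used as given, without normalising them to sum
 to 1, matching the proposal.
]]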

This provides a very powerful primitive function, encompassing
all of the weighted regression models, and the powerful tools of
generalised least squares as outlined by McCullagh and Nelder in
'Generalised Linear Models' (2nd ed) Chapman and Hall, 1989.
It would also provide a primitive function for many methods based
on weighted observations in Econometrics, in particular methods
due to H. White of constructing consistent estimators in the
common case when there are heteroscedastic errors.  It would also
be a core function in those methods of non-linear optimisation
designed to find the extreme of a likelihood function.

An alternative would be to define it as forming the expression
(|:x)+/ . * y  since this is also often required.  This would
be of value if the code for it was much faster than using
the general expression above.

I prefer the definition which preserves consistency with
the concept of weighted values.  It is a more natural
extension consistent with the usage in the definitions of m.
and n. proposed.  By introducing a weighting function it is
significantly generalising the monadic operation.  Second, and
less important from a language design perspective (but
perhaps as important for potential users), it is rather messy
to program otherwise, and exploiting the special structure of
the function could lead to important speed gains in execution.

I recognise that speed of execution has not been a primary
factor in your design, but it is an important consideration,
especially with the data intensive calculations likely in
multi-variate studies.  If the functions selected contribute
both to ease of statement of the problem, and fast execution
then they seem to me to have a strong case for inclusion.

Looking at these three functions now as a group, and
endeavouring to put them into Dictionary style --

m.

MEAN (_)  m. is the mean value, and  m. y  is  1 m. y
MEAN (_ _)  x m. y  is the weighted mean of the values of y
with weights given by x.  The weights will be normalised to
sum to one in forming the weighted mean.

(It may be desirable to restrict the dimensionality of the
arguments to (1 _) though the general case is often useful.)

n.

NORMALIZE (_)  The result of  n. y  is y translated to have a
mean of zero and a mean squared deviation of 1.  It is defined
as  1 n. y  .

NORMALIZE (_ _)  x n. y  provides normalised values of y
which when applied with weights x sum to zero and have mean
squared deviation equal to 1.  The weights x will be
normalised to sum to 1.  The result of  x n. y  is therefore
(y-x m. y)% %:x m.*:y-x m. y   Thus  x m. x n. y  is 0
and  x m. *: x n. y  is 1.  In the case where the weights are
frequencies and it is desired to standardise so that variance
is 1 the values must be multiplied by  %:(+/x)%(+/x)-1

(Again a restriction on the arguments to (1 _) could be
considered.)
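
[[NB. ed.  The two identities stated for  x n. y  can be checked
 numerically.  A minimal Python sketch with hypothetical names wmean
 and wnormalise; it assumes y is a vector that is not constant.

```python
import math


def wmean(x, y):
    """x m. y : weighted mean, weights normalised to sum to one."""
    return sum(w * v for w, v in zip(x, y)) / sum(x)


def wnormalise(x, y):
    """x n. y : (y - x m. y) % %: x m. *: y - x m. y"""
    mu = wmean(x, y)
    dev = [v - mu for v in y]
    rms = math.sqrt(wmean(x, [d * d for d in dev]))
    return [d / rms for d in dev]


x = [1, 2, 1]
y = [1.0, 2.0, 3.0]
z = wnormalise(x, y)
check_mean = wmean(x, z)                       # x m. x n. y
check_msd = wmean(x, [v * v for v in z])       # x m. *: x n. y
```

 Up to rounding, check_mean is 0 and check_msd is 1.
]]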

s.

SPREAD (2)  s. y  is the crossproduct matrix of the values in
y and equal to  (|:y)+/ . * y  .  Consistent with the usage of
the other statistical functions,  s. y  is  1 s. y  .  The
covariance matrix is just  (#y)%~ s. y - m. y   and the
correlation matrix is  (#y)%~ s. n. y

SPREAD (1 2)  x s. y  is the crossproduct matrix
s. (%:x)*"0 1 y .  Note that the weights are not normalised
to sum to one in this case.  The dyadic form weights the outer
product of each item of y by the corresponding item in x.

----------------------------------------------------------

The new definitions have been implemented in J 3.1.  They are
modelled thus:

rx  =. ($@$@[ <@}. $@]) $&> [
ry  =. ($@$@] <@}. $@[) $&> ]
m2  =. (rx * ry) %&(+/) rx
m1  =. +/ % #

dev =. ry - m.
rms =. [ %:@m. *:@dev
n2  =. dev % rms
n1  =. 1&n2

s1  =. |: +/ . * ]
s2  =. s1@(%:@[ *"0 1 ])

m.  is   m1 : m2
n.  is   n1 : n2
s.  is   s1 : s2 " 2 1 2

[[NB. sws.  m., n.,  and s. have been removed these days, however you
 can use:

rx  =. ($@$@[ <@}. $@]) $&> [
ry  =. ($@$@] <@}. $@[) $&> ]
m2  =. (rx * ry) %&(+/) rx
m1  =. +/ % #
m=.   m1 : m2  f.

dev =. ry - m
rms =. [ %:@m *:@dev
n2  =. dev % rms
n1  =. 1&n2
n=.   n1 : n2 f.

s1  =. |: +/ . * ]
s2  =. s1@(%:@[ *"0 1 ])
s=.   s1 : s2 " 2 1 2 f.

]]


+------------------
| Lou Kates
| <356@tslwat.UUCP> 19 May 91 15:29:42 GMT


In article <1991May12.145907.19563@yrloc.ipsa.reuter.COM>
        hui@yrloc.ipsa.reuter.COM (Roger Hui) writes:
>The definitions of mean (m.), normalize (n.), and spread (s.) have
>changed in response to a detailed critique by Professor Fraser Jackson
>of the Victoria University of Wellington (uunet!matai.vuw.ac.nz!jackson).
>Professor Jackson's comments are as follows (minor editing by me):
>
>--------------------------------------------------------------------
>
>The concept of expectation is central to both probability and
>statistics.  Indeed it is quite possible to axiomatise
>probability theory in terms of expectations rather than
>probabilities.  One didactic advantage of doing so is that the
>whole apparatus of measure theory essential for an
>axiomatisation based on probabilities can be introduced much
>later.

Nevertheless, I believe the key  concept here is not expectation,
probability or  measure  but regression and projection. From this
viewpoint the old APL's domino  operator (or regression operator)
had it correct and the above suggestions are a step backwards.

The mean    of  a  vector  V   is the  regression coefficient  of
projecting  V onto a vector  of all ones. The space orthogonal to
this vector of ones  is the deviation space and the length of the
projection of  V onto  this  deviation  space   divided   by  the
dimensionality of the deviation  space (which is  the length of V
minus one) is the standard deviation.

This fits    in  with  the   geometric  interpretation of  linear
statistical  methods that is commonly taught to statisticians and
is  the standard  way   of visualizing and unifying a variety  of
statistical techniques including regression, analysis of variance
and time series analysis.
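
[[NB. ed.  The first claim, that the mean is the coefficient obtained
 by regressing V on a vector of ones, can be sketched in a few lines
 of Python.  The names are hypothetical, and the sketch makes no
 statement about the standard deviation formula given above.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def regress_on_ones(v):
    """Least squares fit of v by c times a vector of ones.

    Returns the coefficient c and the residual v - c, which lies in
    the space orthogonal to the vector of ones.
    """
    ones = [1.0] * len(v)
    coeff = dot(ones, v) / dot(ones, ones)     # sum(v) divided by len(v)
    residual = [x - coeff for x in v]
    return coeff, residual


coeff, residual = regress_on_ones([2.0, 4.0, 9.0])
```

 The coefficient equals the ordinary mean, and the residual sums to
 zero, which is the orthogonality condition.
]]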


+------------------
| Roger Hui
| <1991May21.042804.21102@yrloc.ipsa.reuter.COM>

Lou Kates writes:

> Nevertheless, I believe the key  concept here is not expectation,
> probability or  measure  but regression and projection. From this
> viewpoint the old APL's domino  operator (or regression operator)
> had it correct and the above suggestions are a step backwards.

> The mean    of  a  vector  V   is the  regression coefficient  of
> projecting  V onto a vector  of all ones. The space orthogonal to
> this vector of ones  is the deviation space and the length of the
> projection of  V onto  this  deviation  space   divided   by  the
> dimensionality of the deviation  space (which is  the length of V
> minus one) is the standard deviation.

I am puzzled.  How would a belief in regression and projection
instead of expectation as the key concept materially affect
the design of the primitives m., n., and s.?

There are many statistical functions worthy of inclusion in the
language.  We are adding the new functions mean, normalization,
and cross product matrix; and assigning the cognate weighted mean,
weighted normalization, and weighted cross product to the dyads
of these functions seems both consistent and useful.  These new
functions do not preclude adding other statistical functions in
the future, nor do they affect the domino function or any other
existing function.  Therefore, I would not characterize them
as a backward step.

We would welcome suggestions for other statistical functions.


+------------------
| Sam Sirlin
| <1991May21.162624.18312@jato.jpl.nasa.gov>

In article <1991May21.042804.21102@yrloc.ipsa.reuter.COM>,
        hui@yrloc.ipsa.reuter.COM (Roger Hui) writes:

]> We would welcome suggestions for other statistical functions.
]>

I can calculate means and standard deviations pretty easily in APL or J.
On the other hand, what about FFT's or PSD's? These would have applicability
beyond just statistics. Of course once linkc (sp?) is available we can all
build our own. It seems to me that APL was a pioneer in including solutions
to linear equations (domino) in the language, but that now the standard is
somewhat higher (Matlab, Mathematica, ...) and includes EISPACK, LINPACK,
and FFT's. Of course these make the code size grow...


+------------------
| Eythan Weg
| <WEG.91May21104318@convx1.convx1.ccit.arizona.edu>

In article <1991May21.042804.21102@yrloc.ipsa.reuter.COM>
        hui@yrloc.ipsa.reuter.COM (Roger Hui) writes:


   Lou Kates writes:

   > Nevertheless, I believe the key  concept here is not expectation,
   > probability or  measure  but regression and projection. From this
   > viewpoint the old APL's domino  operator (or regression operator)
   > had it correct and the above suggestions are a step backwards.

   > The mean    of  a  vector  V   is the  regression coefficient  of
   > projecting  V onto a vector  of all ones. The space orthogonal to
   > this vector of ones  is the deviation space and the length of the
   > projection of  V onto  this  deviation  space   divided   by  the
   > dimensionality of the deviation  space (which is  the length of V
   > minus one) is the standard deviation.

   I am puzzled.  How would a belief in regression and projection
   instead of expectation as the key concept materially affect
   the design of the primitives m., n., and s.?

   There are many statistical functions worthy of inclusion in the
   language.  We are adding the new functions mean, normalization,
   and cross product matrix; and assigning the cognate weighted mean,
   weighted normalization, and weighted cross product to the dyads
   of these functions seems both consistent and useful.  These new
   functions do not preclude adding other statistical functions in
   the future, nor do they affect the domino function or any other
   existing function.  Therefore, I would not characterize them
   as a backward step.

   We would welcome suggestions for other statistical functions.

I do not want to see J as a statistical package.  But I would agree
that the mean and its cousins are basic enough to be included in
some form.   Regression is after all some derived concept for which
expectation is needed to set up the appropriate model.

But since we are invited to suggest I will voice my desire to have
generalized means etc that are some robust versions of these, where you
obtain the regular concepts as defaults.  A generalization of this to
general robust regression is welcome but I am not sure it can be done
in the spirit of J.


+------------------
| Roger Hui
| <1991May22.130848.1892@yrloc.ipsa.reuter.COM>

Sam Sirlin writes:

>I can calculate means and standard deviations pretty easily in APL or J.
>On the other hand, what about FFT's or PSD's? These would have applicability
>beyond just statistics. Of course once linkc (sp?) is available we can all
>build our own. It seems to me that APL was a pioneer in including solutions
>to linear equations (domino) in the language, but that now the standard is
>somewhat higher (Matlab, Mathematica, ...) and includes EISPACK, LINPACK,
>and FFT's. Of course these make the code size grow...

Yes, LinkJ is the preferred means for access to FFT, LP, and other more
specialized routines.  The advantages are twofold:  it avoids duplicating
the excellent work of others, and it avoids having to support the facility
forever more.


+------------------
| Lou Kates
| <414@tslwat.UUCP>

In article <1991May21.042804.21102@yrloc.ipsa.reuter.COM>
        hui@yrloc.ipsa.reuter.COM (Roger Hui) writes:
>Lou Kates writes:
>
>> Nevertheless, I believe the key  concept here is not expectation,
>> probability or  measure  but regression and projection. From this
>> viewpoint the old APL's domino  operator (or regression operator)
>> had it correct and the above suggestions are a step backwards.
>
>I am puzzled.  How would a belief in regression and projection
>instead of expectation as the key concept materially affect
>the design of the primitives m., n., and s.?


                   ^^^^^^^^^^
When everything you can think of gets put into the language it's hard
to see how they all qualify as primitive.

>
>There are many statistical functions worthy of inclusion in the
>language.  We are adding the new functions ...

I guess it's whether you believe in parsimony or not.

Personally I   would   rather have   a wider  family of  powerful
operations at my disposal (such as regression, constrained linear
optimization a la simplex  or Karmarkar, eigenvalue calculations,
etc.) that are not easily derivable from each other rather than a
large  set  of functions which are all readily derivable from the
ideas of regression  and   projection.  My own preference, and  I
suspect that of many others too, would be  that if   you feel the
need to  have zillions of functions at  your  disposal,  define a
standard library so that you can take them out of the language so
as to keep  the   language  smaller  and more    manageable.   If
performance is the  issue   then   there  is  nothing  to  stop a
particular  implementation  from implementing  certain   standard
library functions in the kernel.


+------------------
| Dave Weintraub
| <1991May29.191441.1279@aplcen.apl.jhu.edu>

In article <414@tslwat.UUCP>, louk@tslwat.UUCP (Lou Kates) writes:
]> ...
]>
]> Personally I   would   rather have   a wider  family of  powerful
]> operations at my disposal (such as regression, constrained linear
]> optimization a la simplex  or Karmarkar, eigenvalue calculations,
]> etc.) that are not easily derivable from each other rather than a
]> large  set  of functions which are all readily derivable from the
]> ideas of regression  and   projection.  My own preference, and  I
]> suspect that of many others too, would be  that if   you feel the
]> need to  have zillions of functions at  your  disposal,  define a
]> standard library so that you can take them out of the language so
]> as to keep  the   language  smaller  and more    manageable.   If
]> performance is the  issue   then   there  is  nothing  to  stop a
]> particular  implementation  from implementing  certain   standard
]> library functions in the kernel.
]>
]> Lou Kates, Teleride Sage Ltd., louk%tslwat@watmath.waterloo.edu
]> 519-725-0646

I fully agree.  This is the path taken by IBM with APL2: write external
functions (in Assembler, FORTRAN, PL/I,  C, ...) and make these available
using QuadNA.


+------------------
| Sam Sirlin
| <1991May29.204803.26927@jato.jpl.nasa.gov>

In article <1991May29.191441.1279@aplcen.apl.jhu.edu>, dave@visual1.jhuapl.edu
(Dave Weintraub) writes:

]> I fully agree.  This is the path taken by IBM with APL2: write external
]> functions (in Assembler, FORTRAN, PL/I,  C, ...) and make these available
]> using QuadNA.

An interesting path in this vein is the path taken by ProMatlab. It uses the
dynamic linking capabilities of modern machines. Hence a variety of compiled
routines are available (and more can be written by the user) that are easily
linked in while running the Matlab interpreter, simply by invoking the program
by name (Matlab then searches for the right sort of file and then does the
dynamic link). Using this approach, the kitchen sink doesn't have to be in
the code for everyone, but standard compiled routines are available for those
who need them at practically no overhead.


+------------------
| Roger Hui
| <1991May31.171148.12753@yrloc.ipsa.reuter.COM>

Dave Weintraub writes:

> I fully agree.  This is the path taken by IBM with APL2: write external
> functions (in Assembler, FORTRAN, PL/I,  C, ...) and make these available
> using QuadNA.

Dave, this is in fact possible with LinkJ.  LinkJ permits calling
J from C and C from J.  C functions so introduced behave like primitive
verbs, in the sense that they may be assigned names, and (independently)
may serve as arguments to adverbs and conjunctions.  For example,

   f =. 10!:57     NB. f is an external function encoded by 57
   f"r y           NB. Apply f to rank r cells of y
   x f"r y         NB. Apply f to rank r cells of x and y
   f/y             NB. f insertion ("reduction")
   x f/y           NB. f table ("outer product")
   10!:362"r y     NB. Apply the external fn encoded by 362 to rank r cells


Sam Sirlin writes:

> An interesting path in this vein is the path taken by ProMatlab. It uses the
> dynamic linking capabilities of modern machines. Hence a variety of compiled
> routines are available (and more can be written by the user) that are easily
> linked in while running the Matlab interpreter,  ...

Sam, this is possible with LinkJ on systems supporting dynamic linking.


Lou Kates writes:

> >I am puzzled.  How would a belief in regression and projection
> >instead of expectation as the key concept materially affect
> >the design of the primitives m., n., and s.?
>                    ^^^^^^^^^^
> When everything you can think of gets put into the language it's hard
> to see how they all qualify as primitive.
>   ...
> I guess it's whether you believe in parsimony or not.
>
> Personally I   would   rather have   a wider  family of  powerful
> operations at my disposal (such as regression, constrained linear
> optimization a la simplex  or Karmarkar, eigenvalue calculations,
> etc.) that are not easily derivable from each other rather than a
> large  set  of functions which are all readily derivable from the
> ideas of regression  and   projection.  My own preference, and  I
> suspect that of many others too, would be  that if   you feel the
> need to  have zillions of functions at  your  disposal,  define a
> standard library so that you can take them out of the language so
> as to keep  the   language  smaller  and more    manageable.
>   ...

It may be helpful to consult your copy of the Dictionary.  In it,
you'd find that (a) not everything you can think of is in the language;
(b) we do not in fact have zillions of functions in the language;
(c) the ISO APL %. (matrix inverse and matrix divide, what you called
regression) is in the language; (d) characteristic values and vectors
are in the language.

Many primitives can be derived readily from each other.  One could
argue that *: (nand) makes the other boolean functions redundant, as do
- (minus) and % (divide) for + (plus) and * (times).  Similarly, one can
derive |. (reverse), |: (transpose), u;.n (cut), u . v (generalized
determinant), %. (domino), and so forth, from { (from).
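
[[NB. ed.  A few of these derivations, sketched in Python with
 hypothetical names.  Booleans are the integers 0 and 1, from_ stands
 in for { (from), and times assumes a nonzero right argument.

```python
def nand(a, b):
    return 1 - (a & b)


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def plus(a, b):
    return a - (0 - b)          # + from -


def times(a, b):
    return a / (1 / b)          # * from %, for nonzero b


def from_(indices, items):
    """Select the items at the given indices."""
    return [items[i] for i in indices]


def reverse(items):
    return from_(range(len(items) - 1, -1, -1), items)


def transpose(matrix):
    width = len(matrix[0])
    return [[from_([j], row)[0] for row in matrix] for j in range(width)]
```

 Each derived function is noticeably clumsier than a primitive would
 be, which is the point being made here.
]]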

There is more to parsimony than reducing the number of primitives.


*========================================
# J 3.2 - Early vs. Late Verb Bindings

+------------------
| Robert J Frey
| <610@kepler1.kepler.com> 10 Sep 91 15:57:28 GMT


I recently received a copy of J 3.2, Tangible Math (TM), ISI Dictionary of J
(DOJ) and Programming in J (PIJ). One important change is that verb references
are dynamic -- they use late binding; i.e., verb references in 3.2 behave more
like verb evocations in earlier versions.

In earlier versions one used evocation if a late binding was needed. In 3.2
one uses the fix adverb, 'f.' to force early bindings. I don't have a problem
with late binding per se, but introducing it ONLY for verbs is a big
mistake.

Using fix, 'f.', to force early binding is not quite the same thing because
nouns, adverbs and conjunctions still use early binding.

For example, in PIJ the definition of each is now no longer

        each =. 'x.&.>' : 1

but is given as

        each =. 'x.f.&.>' : 1

So if we define the 'mean' as

        mean =. +/%#

then

        mean each

+-----------+--+-+
|+-----+-+-+|&.|>|
||+-+-+|%|#||  | |
|||+|/|| | ||  | |
||+-+-+| | ||  | |
|+-----+-+-+|  | |
+-----------+--+-+

That is, in order to get the 'each' adverb to behave sensibly one has to
force the binding of the verb 'x.' to be early with a 'f.'.

If one wants to define a verb using 'mean' and 'each' with late binding,
then one must define a new adverb using an a-train:

        each_late =. &.>

        mean each_late

+-+-+----------+
|<|@|+----+-+-+|
| | ||mean|&|>||
| | |+----+-+-+|
+-+-+----------+

[[NB. sws.  This now is
   mean each_late
+----+--+-+
|mean|&.|>|
+----+--+-+
]]
This adverb does, by the way, behave a little better:

        mean f. each_late

+-+-+-----------------+
|<|@|+-----------+-+-+|
| | ||+-----+-+-+|&|>||
| | |||+-+-+|%|#|| | ||
| | ||||+|/|| | || | ||
| | |||+-+-+| | || | ||
| | ||+-----+-+-+| | ||
| | |+-----------+-+-+|
+-+-+-----------------+

[[NB. sws.  This now is
+-----------+--+-+
|+-----+-+-+|&.|>|
||+-+-+|%|#||  | |
|||+|/|| | ||  | |
||+-+-+| | ||  | |
|+-----+-+-+|  | |
+-----------+--+-+
]]
But, what is one to do if the adverb or conjunction either is not defined
tacitly ( in which case one has to recode) or can't be defined tacitly (in
which case one is screwed)?

Now, I'm not a big fan of consistency for its own sake, but this is one case
where it's needed: The default for binding should be either all early or all
late; not late for verbs while it's early for nouns, adverbs and conjunctions.

If many users find late binding better then so be it, but then let's do it!
For example, assuming late binding for ALL parts of speech:

        each =. 'x.&.>' : 1

        mean =. +/%#

Note that we no longer need the onerous 'f.' in 'each' because the bindings
are ALL late. The use of fix, 'f.', would now be to force early binding at
any arbitrary point in a sentence:

        u1 =. mean each

        u1

+----+----+
|mean|each|
+----+----+

        u2 =. mean f. each

        u2

+-----------+----+
|+-----+-+-+|    |
||+-+-+|%|#||    |
|||+|/|| | ||each|
||+-+-+| | ||    |
|+-----+-+-+|    |
+-----------+----+

        u3 =. mean each f.

        u3

+-+-+-----------------+
|<|@|+-----------+-+-+|
| | ||+-----+-+-+|&|>||
| | |||+-+-+|%|#|| | ||
| | ||||+|/|| | || | ||
| | |||+-+-+| | || | ||
| | ||+-----+-+-+| | ||
| | |+-----------+-+-+|
+-+-+-----------------+

but also note

        u4 =. mean (each f.)

        u4

+-+-+----------+
|<|@|+----+-+-+|
| | ||mean|&|>||
| | |+----+-+-+|
+-+-+----------+


At some point the interpreter has to know when to evaluate, so we would want

        m =. (i.10);i.6

        mean each m

+---+---+
|4.5|2.5|
+---+---+

and not

+----+----+-+
|mean|each|m|
+----+----+-+

In other words the interpreter must have enough sense to stick an implicit 'f.'
after every line input at the session. If the tacit sentences of u1, u2 and
u4 were entered at the session level as imperative sentences rather than
declarative ones they would all be identical to u3; i.e., 'commands' are
subject to immediate evaluation.

Note that for explicit definition one always has de facto late binding because
the character table representing the verb, adverb or conjunction is not
'evaluated' until it is run.

I hope the good people at ISI will consider these changes.

Thanks.
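
The early/late distinction argued over above can be sketched outside J. The following is a Python analogy only, not J semantics: a name used inside a function body is looked up when the function runs (late binding), while a default argument captures the value when the function is defined (roughly the effect the post attributes to fix, 'f.'). The names each_late and each_early are hypothetical, and the redefinition of mean is deliberately different so the effect is visible.

```python
# Python analogy (NOT J): late vs. early binding of a "verb" name.

def mean(xs):
    return sum(xs) / len(xs)

def each_late(boxes):
    # 'mean' is resolved each time each_late runs (late binding)
    return [mean(xs) for xs in boxes]

def each_early(boxes, _mean=mean):
    # '_mean' captured the function object at definition time (early binding)
    return [_mean(xs) for xs in boxes]

m = [list(range(10)), list(range(6))]   # plays the role of (i.10);i.6
before = (each_late(m), each_early(m))

def mean(xs):                            # redefine 'mean' after the fact
    return max(xs)

after = (each_late(m), each_early(m))
print(before)   # both use the original mean
print(after)    # only each_late sees the redefinition
```

Under these assumptions, redefining mean changes the behaviour of each_late but not of each_early, which is the trade-off the thread is about.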

+------------------
| Greg P. Jaxon
| <1991Sep11.174738.5151@csrd.uiuc.edu>

Universal late binding seems like an especially powerful
linguistic idea... The return of call-by-name!  J.C. Reynolds,
now at Carnegie Mellon but also known from his long
stay at Syracuse U., is currently designing 'Forsythe',
a language like J in many of its reductionist design
principles.  It may be significant that this respected
language theorist also has chosen the universal use of
call-by-name.  It seems to give greater power over evaluation
order as one writes program transformation functions.


+------------------
| Robert J Frey
| <611@kepler1.kepler.com> 16 Sep 91 19:07:53 GMT

In article <1991Sep11.174738.5151@csrd.uiuc.edu> jaxon@sp27.csrd.uiuc.edu (Greg
 P. Jaxon) writes:
>Universal late binding seems like an especially powerful
>linguistic idea... The return of call-by-name!  J.C. Reynolds,
>now at Carnegie Mellon but also known from his long
>stay at Syracuse U., is currently designing 'Forsythe',
>a language like J in many of its reductionist design
>principles.  It may be significant that this respected
>language theorist also has chosen the universal use of
>call-by-name.  It seems to give greater power over evaluation
>order as one writes program transformation functions.
>
>
>Greg Jaxon - Univ. of Illinois Urbana/Champaign CSRD

As I said in my original posting, I was less concerned about early vs. late
binding than I was about the inconsistency of making it late for verbs and
early for nouns, adverbs and conjunctions. What you're adding here is that
there may be a good reason for preferring late binding per se.

Does Reynolds give specific reasons for his choice of late binding? What I
would like to see in J are bindings that are under the CONTROL of the
programmer.

Giving 'f.' the role of forcing early evaluation and making late binding
the default seems to me to be the best of both worlds. It does mean that,
syntactically, 'f.' would be slightly anomalous -- a 'particle' rather than
an 'adverb' -- but it would be worth it.


+------------------
| Raul Rockwell
| <ROCKWELL.91Sep19011243@socrates.umd.edu>

[the original article seems to have expired on my machine, so I may
have missed an idea - Raul]

Robert J Frey:
   As I said in my original posting, I was less concerned about early
   vs. late binding than I was about the inconsistency of making it
   late for verbs and early for nouns, adverbs and conjunctions.

Um... maybe I'm missing something here, but J does have late binding
for adverbs, conjunctions and nouns, as well as for verbs.  (Though it
is a bit restricted for adverbs and conjunctions.)  Both Do (".) and
Define (:) use late binding for names.

   What I would like to see in J are bindings that are under the
   CONTROL of the programmer.

Well they are.  Not to the extent as they'd be if we had a J compiler
written in J, but we don't have one of those yet.

   Giving 'f.' the role of forcing early evaluation and making late
   binding the default seems to me to be the best of both worlds. It
   does mean that, syntactically, 'f.' would be slightly anomalous
   -- a 'particle' rather than an 'adverb' -- but it would be worth
   it.

So why not go with the non-anomalous forms?  ". yields a noun, with
"call by name" access to values, and : yields a verb, adverb or
conjunction, here with "call by name" access to the global symbol
table.

Maybe I've got some big blind spot here, but I don't see what the
missing feature is.


+------------------
| Robert J Frey
| <618@kepler1.kepler.com> 23 Sep 91 20:35:25 GMT

In article <ROCKWELL.91Sep19011243@socrates.umd.edu> rockwell@socrates.umd.edu
 (Raul Rockwell) writes:
>[the original article seems to have expired on my machine, so I may
>have missed an idea - Raul]
>
>Robert J Frey:
>   As I said in my original posting, I was less concerned about early
>   vs. late binding than I was about the inconsistency of making it
>   late for verbs and early for nouns, adverbs and conjunctions.
>
>Um... maybe I'm missing something here, but J does have late binding
>for adverbs, conjunctions and nouns, as well as for verbs.  (Though it
>is a bit restricted for adverbs and conjunctions.)  Both Do (".) and
>Define (:) use late binding for names.
>

Note: J code is delineated by double quotes (") in the text below.
I thought I did mention at some point (although perhaps I did not) that the
explicit definition of operators does force late binding, i.e., call by name.
I was discussing the inconsistency of late-binding for proverbs and early-
binding for pro-nouns, -adverbs and -conjunctions in tacit definitions and
at point of execution, e.g., at the prompt or as an explicit form is being
executed.

The issue is not just binding at point of definition but at point of execution
as well.

For example, let "mean =. +/%#" and "each =. &.>".

If I write a statement such as "mean_over =. mean  each" in an explicitly
defined proverb, then at definition I obviously have late binding. What I
want is to control binding at execution, and I proposed that fix, "f.", be
redefined slightly to accomplish this. The forms "mean each", "mean each f.",
"mean (each f.)" and "mean f. each" all would be EVALUATED differently when
executed.

Also, the behavior of tacitly and explicitly defined forms would be more
consistent and predictable. Why can I define "each =. &.>" but am forced
to define "each =. 'x.f.&.>' : '' "? What I proposed is that the explicitly
defined forms "each =. 'x.f.&.>' : '' " and "each =. 'x.&.>' : '' " both
mean something, but each something different. Proverbs produced by the first
would use early-binding and proverbs produced by the second would use late-
binding.

One reason this is important to me is that the code I write in both J and APL
contains a lot of user-defined adverbs and conjunctions which often function
more as 'programs' in their own right than as simple noun or verb modifiers.
When you program like this (a legacy of my LISP days!) there is little real
distinction between 'program' and 'data'. In LISP everything gets evaluated
as much as possible -- the ultimate in early binding! You exercise control
through a single quote which has the effect of suspending evaluation.

I would propose the reverse for J. Everything is 'late' unless forced by a
fix, "f.". To repeat my example (with apologies to those who have seen it
before):

        v1 =. mean each                 -- late binding
        v1
+----+----+
|mean|each|
+----+----+

        v2 =. mean f. each              -- "mean" early, "each" late
        v2
+-----------+----+
|+-----+-+-+|each|
||+-+-+|%|#||    |
|||+|/|| | ||    |
||+-+-+| | ||    |
|+-----+-+-+|    |
+-----------+----+

        v3 =. mean each f.              -- early binding
        v3
+-+-+-----------------+
|<|@|+-----------+-+-+|
| | ||+-----+-+-+|&|>||
| | |||+-+-+|%|#|| | ||
| | ||||+|/|| | || | ||
| | |||+-+-+| | || | ||
| | ||+-----+-+-+| | ||
| | |+-----------+-+-+|
+-+-+-----------------+

        v4 =. mean (each f.)            -- "mean" late, "each" early
        v4
+-+-+----------+
|<|@|+----+-+-+|
| | ||mean|&|>||
| | |+----+-+-+|
+-+-+----------+

My point was that the current design of J does not permit all of the TACIT
forms above, but could if it used late binding on all names that could
be forced to early binding with a fix, "f.".

>   What I would like to see in J are bindings that are under the
>   CONTROL of the programmer.
>
>Well they are.  Not to the extent as they'd be if we had a J compiler
>written in J, but we don't have one of those yet.
>
>   Giving 'f.' the role of forcing early evaluation and making late
>   binding the default seems to me to be the best of both worlds. It
>   does mean that, syntactically, 'f.' would be slightly anomalous
>   -- a 'particle' rather than an 'adverb' -- but it would be worth
>   it.
>
>So why not go with the non-anomalous forms?  ". yields a noun, with
>"call by name" access to values, and : yields a verb, adverb or
>conjunction, here with "call by name" access to the global symbol
>table.
>
Well, to some degree the fix adverb is already anomalous. Of course, one
always has the option of using a combination of explicit definition and
string execution to overcome these deficiencies, but why? They're easy to
correct. Also, the alternatives, even though they may work, can produce
some pretty kludgy code.


>...I don't understand what the missing feature is.
>
Well, I don't think anyone has a 'blind spot' because their own style of
programming or the demands of their programming problems doesn't come up
against this issue. These are MY concerns and, although I don't think that they
are unique, I don't think they are necessarily universal. That being said,
however, I do believe this is an important issue: when I use an identifier
"fred" sometimes I mean the thing called "fred" and sometimes I mean the name
"fred" -- I just wanna control which of those two alternatives is being
applied at any point in time and I wanna do it AT THE POINT OF EXECUTION of
the statement.

Thanks for your comments.

+------------------
| Raul Rockwell
| <ROCKWELL.91Sep25084852@socrates.umd.edu>

Robert J Frey:
   The issue is not just binding at point of definition but at point
   of execution as well.

Something like this:
   x =. 1
   y =. 2
   z =. x + y
   z
3
   x =. 3
   z
5

???

That seems somewhat hellacious to debug. [At a minimum, you're going
to need an operator to extract the function definition associated
with a noun.]

   Or, if you make the 'f.' a meta-adverb, with delayed bindings on
everything till hit with f., you have a new problem:  what is the
syntax of f.?  And, worse, you're going to have to type f. a LOT.
[Every time you want to see results.]
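
The behaviour being questioned here, a noun whose value tracks later changes to the names it was built from, can be modelled in Python as a zero-argument function. This is only an analogy under that assumption, not a proposal for J syntax; env, z_early and z_late are hypothetical names.

```python
# Python analogy (NOT J): an early-bound value vs. a late-bound "noun".
env = {"x": 1, "y": 2}

z_early = env["x"] + env["y"]           # evaluated once, at assignment
z_late = lambda: env["x"] + env["y"]    # re-evaluated at every use

first = (z_early, z_late())
env["x"] = 3                            # later reassignment of x
second = (z_early, z_late())
print(first, second)
```

The late-bound form changes its value after x is reassigned, which is exactly the debugging hazard raised above.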


*========================================
# Re: what is j?

+------------------
| Roger Hui
| <1991May18.004902.29060@yrloc.ipsa.reuter.COM>

a) J is an interpreter, but the subclass of verbs known as tacit
definitions are compiled.  For example, the following verbs are compiled:
   square =. ^&2
   sqrt   =. ^&0.5
   norm   =. sqrt & (+/) & square

[[NB. sws.  to compile norm, you now need to apply fix:
  norm   =. sqrt & (+/) & square f.
]]

b) J is not yet available on any HP platform. If there is a demand
for such a port, then the main obstacle would be getting access
to a machine to do the port. The system is written in C, and by now
most of the portability bugs have been found.

There is no direct support for X Windows, but the related product
LinkJ allows calling between J and C.

c) How is Iverson Software making any money by selling it for $25?
Good question. However, you could have asked a better question,
because we are selling it for only $24.


+------------------
| Robert Bernecky
| <1991May18.042957.704@yrloc.ipsa.reuter.COM>

In article <1991May18.004902.29060@yrloc.ipsa.reuter.COM>
hui@yrloc.ipsa.reuter.COM (Roger Hui) writes:
>a) J is an interpreter, but the subclass of verbs known as tacit
>definitions are compiled.  For example, the following verbs are compiled:
>   square =. ^&2
>   sqrt   =. ^&0.5
>   norm   =. sqrt & (+/) & square
>

Umm, err. "The following verbs are compiled". Please Explain.
My understanding of compilation is that you take advantage of knowledge about
data types, rank, shape, etc,  to produce highly efficient code.
Please tell me HOW J achieves this in tacit definition. It seems to me
that tacit definition appears to be more like C macro expansion than
compilation: e.g., destruction of any evidence of what you originally
wrote, rather than true compilation (Destruction plus substantial
performance improvement).

Furthermore, it appears to my naive view that Tacit Defn also
makes it impossible to debug a system unless you have maintained
source files for everything. For example, if you have  a defn for
a function "foo" which calls "mean", and you define it using tacit defn,
and LATER discover a bug in "mean", you end up fixing "mean", BUT

"foo" is NOT fixed unless you recompile the universe.

This strikes me as :

a. making Real Programs, rather than Toy Programs, harder to develop
   and debug.

b. Increasing the likelihood of bugs in Real Programs.


+------------------
| Roger Hui
| <1991May18.143707.3182@yrloc.ipsa.reuter.COM>

Reply to article by Robert Bernecky, 1991 May 18:

>Umm, err. "The following verbs are compiled". Please Explain.
>My understanding of compilation is that you take advantage of knowledge about
>data types, rank, shape, etc,  to produce highly efficient code.
>Please tell me HOW J achieves this in tacit definition. It seems to me
>that tacit definition appears to be more like C macro expansion than
>compilation: e.g., destruction of any evidence of what you originally
>wrote, rather than true compilation (Destruction plus substantial
>performance improvement).

The verbs I cited are "compiled", in the sense that they are re-executed
without having to reparse the source.  The original question was "Is J
compiled or interpreted?", and I believe my description of J as an
interpreter with certain things compiled is fair and accurate.

Re your parenthetical definition of "true compilation" -- destruction
plus substantial performance improvement:  Some compilers bill themselves
as "optimizing compilers".  Does this mean that non-optimizing
compilers are not performing "true compilation"?  Whether something is
a compiler is a matter of what methods are used to execute sentences.
While compiled code generally performs better than interpreted code,
the performance improvement is not a defining characteristic of a compiler.

As it happens, the technique that J uses for tacit defns does result in
measurable performance improvement, relative to the equivalent explicit defn.

>Furthermore, it appears to my naive view that Tacit Defn also
>makes it impossible to debug a system unless you have maintained
>source files for everything. For example, if you have  a defn for
>a function "foo" which calls "mean", and you define it using tacit defn,
>and LATER discover a bug in "mean", you end up fixing "mean", BUT
>"foo" is NOT fixed unless you recompile the universe.
>
>This strikes me as :
>a. making Real Programs, rather than Toy Programs, harder to develop and debug.
>b. Increasing the likelihood of bugs in Real Programs.

Well, one does maintain source files for C or Fortran or Pascal programs;
and if there is no "make" facility, then one would possibly recompile
everything.  (Does "true compilation" require a "make" facility?)  As well,
the evoke (~) adverb in J allows avoiding this complete recompilation.

I am sure that YOUR compiler would address these issues.  Perhaps we can
carry out further discussions offline.


+------------------
| Raul Rockwell
| <ROCKWELL.91May18095711@socrates.umd.edu>

Roger Hui:
   >a) J is an interpreter, but the subclass of verbs known as tacit
   >definitions are compiled.  For example, the following verbs are
   >compiled:
   >   square =. ^&2
   >   sqrt   =. ^&0.5
   >   norm   =. sqrt & (+/) & square

Robert Bernecky:
   Umm, err. "The following verbs are compiled". Please Explain.  My
   understanding of compilation is that you take advantage of knowledge
   about data types, rank, shape, etc, to produce highly efficient
   code.

   Please tell me HOW J achieves this in tacit definition.

Don't forget knowledge of data values and, similarly, function
properties.  One of my favorite compiled functions, SORTCUT, is an
implementation of  '':(+/  @  (<:/)  &  (>./\) )"1

Now, the pre-processor and the main computation are both O(n^2), where
n is the number of elements in a list, but SORTCUT is O(n) :-)
It is true you get speedups from knowing type, rank, etc., but in this
case the big speed ups come from (*) knowing that >. is associative,
and (*) knowing that the result of >./\ is nondecreasing.
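
The complexity claim can be illustrated with a Python sketch. It rests on one reading of the expression, which is an assumption: both arguments are replaced by their running maxima, a "less than or equal" table is formed between them, and the table is summed so that each item of the right argument gets a count of left items not exceeding it. The actual SORTCUT code is not shown in the thread; the function names below are hypothetical.

```python
from itertools import accumulate

def running_max(xs):
    # One pass suffices because max is associative (the >./\ step).
    return list(accumulate(xs, max))

def count_le_naive(x, y):
    # Builds the full comparison table: O(n*m) comparisons.
    xm, ym = running_max(x), running_max(y)
    return [sum(1 for a in xm if a <= b) for b in ym]

def count_le_merge(x, y):
    # Both running maxima are nondecreasing, so one pointer into xm
    # only ever moves forward: O(n+m) comparisons.
    xm, ym = running_max(x), running_max(y)
    out, i = [], 0
    for b in ym:
        while i < len(xm) and xm[i] <= b:
            i += 1
        out.append(i)
    return out

print(count_le_merge([3, 1, 4, 1, 5], [2, 7, 1, 8]))
```

The two properties cited above (associativity of the maximum, and the nondecreasing result of the running maximum) are what license the single-pass scan and the forward-only pointer respectively.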

   Furthermore, it appears to my naive view that Tacit Defn also makes
   it impossible to debug a system unless you have maintained source
   files for everything.

er... but that's already covered by the language semantics.  Isn't it?


+------------------
| Robert Bernecky
| <1991May20.045753.14595@yrloc.ipsa.reuter.COM>

In article <ROCKWELL.91May18095711@socrates.umd.edu> rockwell@socrates.umd.edu
(Raul Rockwell) writes:
>properties.  One of my favorite compiled functions, SORTCUT, is an
>implementation of  '':(+/  @  (<:/)  &  (>./\) )"1
>
>Now, the pre-processor and the main computation are both O(n^2), where
>n is the number of elements in a list, but SORTCUT is O(n) :-)
>It is true you get speedups from knowing type, rank, etc., but in this
>case the big speed ups come from (*) knowing that >. is associative,


I think my objection to Roger's use of the term "compiled" might be
addressed in a way which also covers your computational complexity argument
by  a definition of the following nature:

Compilation: The process of translating an algorithm expressed in an
  arbitrary notation into another notation which has execution speed
  properties which approximate those of hand-written assembler code.


Now, there ARE problems with this definition:
a. Whose assembler code?

b. Assembler code written for a RISC machine, without attention to
   code-scheduling issues, might execute worse than that resulting
   from a C compiler.

The point I want to make is that I think of compiled code as being
something that runs about the same speed as the same algorithm coded
in C, the above statement notwithstanding. Ten or a hundred times
slower doesn't count.


+------------------
| Roger Hui
| <1991May20.122605.16936@yrloc.ipsa.reuter.COM>

Robert Bernecky writes:

>The point I want to make is that I think of compiled code as being
>something that runs about the same speed as the same algorithm coded
>in C, the above statement notwithstanding. Ten or a hundred times
>slower doesn't count.

Bob, you should try timing  nub0 =. ~.  against  nub1 =. #~ i.@# = i.~
The former, being primitive, is coded in C; the latter is an example
of a "compiled" tacit definition.
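
For readers who do not parse the tacit form at sight: nub1 keeps each item whose first occurrence is at its own position, that is, it compares the index list (i.@#) with the index of each item within the list itself (i.~) and uses the resulting mask to select. A Python rendering of that idea, as an illustration only and with no claim about how either J version is implemented:

```python
def nub1(xs):
    # Keep xs[i] when the first occurrence of that value is at position i.
    # list.index is a linear search, so this literal version is O(n^2).
    return [x for i, x in enumerate(xs) if xs.index(x) == i]

print(nub1([3, 1, 3, 2, 1]))
```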


+------------------
| Raul Rockwell
| <ROCKWELL.91May20074800@socrates.umd.edu>

Robert Bernecky:
   I think my objection to Roger's use of the term "compiled" might be
   addressed in a way which also covers your computational complexity
   argument by a definition of the following nature:

   Compilation: The process of translating an algorithm expressed in
      an arbitrary notation into another notation which has execution
      speed properties which approximate those of hand-written
      assembler code.

Hmmm... but that doesn't conflict with

Roger Hui:
   ... the subclass of verbs known as tacit definitions are compiled.
   For example, the following verbs are compiled:
      square =. ^&2
      ...

After all,  ^&2  is an algorithm (though rather minimalist :-), and it
is getting translated.  You could also say that the code for ^&2 was
compiled at the same time the interpreter was compiled, that fits
right in with what he said.  Seems perfectly clear.

Now if he was speaking and introducing new concepts left and right, as
the ISI people seem to do, I'd hope that he elucidated the concepts a
bit more (or that a transcript would become available so I could see
what I missed when I was thinking :-).

But (in my case, at least) this concept is an old familiar one:
compilation (including partial compilation) as a form of partial
evaluation.  Technically, you could consider any constructed J verb as
compiled, with a probable exception in the case of  :  and ".

That reminds me.  Has anyone besides me noticed that the transitive
form of ] is the same as the transitive form of #: (except for rank)?
Has anyone noticed any other verbs which differ only in rank?


*========================================
# brackets

+------------------
| Greg P. Jaxon
| <1991Feb28.200918.6276@csrd.uiuc.edu>

> Brackets are anomalous in APL

Indeed their effects on the object to their left are basically left up
to that object.  But I associated one semantic action with the OBJ to [
binding in APLB:  The value of #IO to be used when interpreting the
content of the brackets was picked up at this point.  The limit on RANK
was also applied here (if OBJ was data, ['s could have no more than RANK - 1
semicolons, if OBJ was not data, no semicolons, and value(s) in range for
axis numbers (remember laminate?))

The binding of #IO allows an axis specifier to become part of a derived
function and be used in a different #IO binding without changing the
axis being specified.  One advantage of "rank" is that axes aren't numbered,
they are counted!  Does J have a variable #IO, or have we finally been spared
that disaster?


+------------------
| L.J.Dickey
| <1991Mar1.143939.26733@watmath.waterloo.edu>

In article <1991Feb28.200918.6276@csrd.uiuc.edu> jaxon@sp27.csrd.uiuc.edu
        (Greg P. Jaxon) writes:
>...
>
>  ... One advantage of "rank" is that axes aren't numbered, they are counted!
>Does J have a variable #IO, or have we finally been spared that disaster?

No disaster; the index origin is zero.


*========================================
# J3.2

+------------------
| Eythan Weg
| <WEG.91Jul2185344@convx1.convx1.ccit.arizona.edu>

I have downloaded today the PC version from Waterloo.  I tried the new
versions of the statistical functions, but none worked for me.  The
response is `spelling error'.
I also noticed that they are not mentioned in the help file provided
with the software but they are documented in the status file.

Incidentally, can we have some clue for what's in store in many of the
other additions?

Thanks

+------------------
| L.J.Dickey
| <1991Jul4.042529.4315@watmath.waterloo.edu>

In article <WEG.91Jul2185344@convx1.convx1.ccit.arizona.edu>
    weg@convx1.ccit.arizona.edu (Eythan Weg) writes:

>I have downloaded today the PC version from Waterloo.  I tried the new
>versions of the statistical functions, but none worked for me.  The
>response is `spelling error'.

My pc died, so I can not check my PC version.  However, these
results show on my mips version:

   a =. 1+i.5
   m. a
3
   n. a
_1.41421 _0.707107 0 0.707107 1.41421
   b =. a - 3
   b
_2 _1 0 1 2
   b % 0.5 ^~ ( +/ b^2) % ( # b)
_1.41421 _0.707107 0 0.707107 1.41421
   n. b
_1.41421 _0.707107 0 0.707107 1.41421


I guess this does not help much...  It only shows that
m. and n. are in one version and not yours.  But they are
there, and they do work.
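
Judging only from the session above, m. behaves like the arithmetic mean and n. like centring followed by division by the population standard deviation (the divisor is # b, not one less). A Python sketch of the same arithmetic, with hypothetical function names and no claim about how the J primitives are defined beyond what the transcript shows:

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def normalize(xs):
    # Centre on the mean, then divide by the root mean square deviation
    # (population form), mirroring  b % 0.5 ^~ (+/ b^2) % (# b)
    m = mean(xs)
    b = [x - m for x in xs]
    sd = math.sqrt(sum(v * v for v in b) / len(b))
    return [v / sd for v in b]

a = [1, 2, 3, 4, 5]          # 1+i.5
print(mean(a))
print(normalize(a))
```

Because normalize centres its argument first, applying it to a or to a - 3 gives the same values, which matches the identical results for n. a and n. b in the session.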

[[NB. sws.  m., n., and s. are not in J 4.1. They can be easily
  simulated however (see my comments on R. Hui's post) ]]

 