September 1992

                            Statistics and J
                                    
                              Keith Smillie


J, a dialect of APL using the ASCII character
set, has been developed by Kenneth Iverson
and Roger Hui, and is available on a wide
variety of machines as shareware from Iverson
Software Inc. Like APL, J is intended for use
in teaching mathematics and related topics.

     The present paper uses some simple
statistical problems to introduce J. It is
intended to replace the earlier paper "Some
statistical calculations in J", which began,
possibly somewhat pretentiously, with Ludwig
Wittgenstein's remark that "The limits of my
language mean the limits of my world." The
author's experience with J since that paper was
written indicates that we are circumscribed not
only by our language but also by our fluency
with it. It is hoped that the present paper will
reflect an increased familiarity with J and even
some mastery of the language.  

     Much of the material presently
available on J has been written by Kenneth
Iverson although articles and commentary on
the language appear in Vector, the Journal of
the British APL Association. J is completely
but tersely described in Iverson's The ISI
Dictionary of J, and its use as a programming
language in his Programming in J and An
Introduction to J. Readers of the present paper
are encouraged to keep these documents
conveniently at hand and also to develop
applications of the language in areas of
particular interest to them. 


Means
The arithmetic mean of a list of observations is
defined as the sum of the observations divided
by the number of observations. If the
observations are 2.3, 5, 3.5 and 6, for
example, the arithmetic mean may be
expressed in J as
     (2.3+5+3.5+6) % 4 ,
where % represents divide; this expression has
the value 4.2 and may be written more
simply as
     (+/2.3 5 3.5 6) % 4 ,
where the adverb insert / is used to place the
verb + between the items in the list of
observations.

     The expression in the last paragraph
may be generalized by using the copula =. to
assign a name to the list of observations, i.e.,
     x =. 2.3 5 3.5 6 .
Then the expression for the mean may be
written as
     (+/x) % #x ,
where the verb tally # gives the number of
items in its argument. This expression, which
is valid for an arbitrary list x of observations,
may be written more simply as (+/ % #)x,
where the sequence of three verbs within the
parentheses is known as a fork. This form is
commonly used in conventional mathematical
notation, where, for example, (f+g)x represents
the sum f(x)+g(x). Finally, we may define a
verb
     am=. +/ % # ,
and write am x for the arithmetic mean of a
list x of observations.

     To see how am may be applied to an
argument which is not a list, let us introduce
the verb integers i. which gives an array of
non-negative integers whose structure is
determined by its argument. For example,
i. 4 is the four-item list 0 1 2 3, and
a=. i. 3 4 gives a as the table
     0  1  2  3
     4  5  6  7
     8  9 10 11
with 3 rows and 4 columns. Then am a is the
list 4 5 6 7 of column means. The row means
1.5 5.5 9.5 are given by am"1 a, where the
rank conjunction " is used to apply the verb am
to each of the rows of a. The expression
am"2 a applies am along the columns and thus
is equivalent to am a, and am"0 a gives the
individual items of a.
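
     As a cross-check outside J, the column and
row means can be reproduced with a short
Python sketch. Python is not used in this
paper, and the names a, am, col_means and
row_means are chosen here only to mirror the
J text.

```python
def am(xs):
    """Arithmetic mean: sum divided by count, as in the fork +/ % #."""
    return sum(xs) / len(xs)

# The table i. 3 4: three rows and four columns holding 0..11.
a = [[4 * r + c for c in range(4)] for r in range(3)]

col_means = [am(col) for col in zip(*a)]  # like am a (or am"2 a)
row_means = [am(row) for row in a]        # like am"1 a

print(col_means)  # [4.0, 5.0, 6.0, 7.0]
print(row_means)  # [1.5, 5.5, 9.5]
```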

     The geometric mean of a list of n
observations is defined as the nth root of the
product of the observations. Since the verb root
%: gives an arbitrary root, and, for example,
3 %: 125 is 5, the geometric mean may be
represented by the fork # %: */, where the
derived verb */ gives the product of the items
of its argument. Thus we may define the verb
     gm=. # %: */
for the geometric mean, and the expression
gm x is equal to 3.94211.

     The harmonic mean is defined as the
reciprocal of the arithmetic mean of the
reciprocals of the observations. Since we wish
to apply a sequence of verbs, we introduce the
conjunction atop (or after) @, and define the
verb
     hm=. % @ am @ % ,
where % represents the monadic verb
reciprocal. The definition may be read as
"apply % to the result of applying am to the
result of applying % (to each of the
observations)". For example, hm x has the
value 3.6793.

     A three-item list of the arithmetic,
geometric and harmonic means is given by the
verb
     means=. am,gm,hm ,
where , is the verb append, and means x is
equal to 4.2 3.94211 3.6793.
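
     The three means may be checked
independently with the following Python
sketch. It is an illustration only, not part
of the J definitions; the function names are
assumptions that copy the J verb names.

```python
import math

def am(xs):
    """Arithmetic mean, like the J fork +/ % #."""
    return sum(xs) / len(xs)

def gm(xs):
    """Geometric mean: nth root of the product, like # %: */."""
    return math.prod(xs) ** (1 / len(xs))

def hm(xs):
    """Harmonic mean: reciprocal of the mean of the reciprocals."""
    return 1 / am([1 / v for v in xs])

x = [2.3, 5, 3.5, 6]
means = [am(x), gm(x), hm(x)]
print([round(m, 4) for m in means])  # [4.2, 3.9421, 3.6793]
```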

     Finally, let us answer a few questions,
some of which might have been raised
implicitly in the preceding discussion. All
primitive verbs - more commonly called
functions in programming languages - are
represented by a single graphic which may be
suffixed by a period or a colon, and almost all
have both a monadic form with a single right
argument and a dyadic form with left and
right arguments. We have seen the verb %
which represents both reciprocal and divide so
that, for example, %4 is 0.25 and 3%4 is 0.75.
As another example, the monadic and dyadic
verbs negate and minus are represented by -,
and each of 6-8 and -2 is equal to _2, where
the underbar _ represents the negative sign.
Precedence amongst verbs is determined by
parentheses, and in their absence the right
argument is the entire expression on the right
and the left argument is the noun or pronoun -
 constant and variable in most languages -
immediately on the left. For example, 3*4+5 is
27, and (3*4)+5 and 5+3*4 are equal to 17.
Adverbs and conjunctions, however, take
precedence over verbs, and the left argument of
an adverb or conjunction is the entire verb
phrase to the left. For example, the expression
%@+/1 2 3 is evaluated by inserting the verb
%@+ between the items of the list 1 2 3 and is
equal to 0.833333, and is not equal to the
reciprocal of the sum of the items of the list. 
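
     The right-to-left evaluation of an
inserted verb can be imitated in Python. In
this sketch the helper insert and the function
recip_of_sum are hypothetical names introduced
only for illustration; insert folds a
two-argument function over a list from the
right, as f/ does in J.

```python
from functools import reduce

def insert(f, items):
    """Fold f over items from the right: f(a, f(b, c)) for [a, b, c]."""
    return reduce(lambda acc, v: f(v, acc), reversed(items[:-1]), items[-1])

def recip_of_sum(x, y):
    """Dyadic %@+ : the reciprocal of x + y."""
    return 1 / (x + y)

inserted = insert(recip_of_sum, [1, 2, 3])  # 1 %@+ (2 %@+ 3)
naive = 1 / sum([1, 2, 3])                  # reciprocal of the whole sum

print(round(inserted, 6))  # 0.833333
print(round(naive, 6))     # 0.166667
print(3 * (4 + 5))         # 27, the J reading of 3*4+5
```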

 
Variances I 
In this section we shall consider the variance,
covariance and correlation coefficient of lists of
observations and give verbs in J for the
calculation of these statistics. The data used to
illustrate the calculations are from Hoel (1966)
and give the amounts w of irrigation water in
inches and the corresponding yields y of alfalfa
in tons per acre:
     w=. 12 18 24 30 36 42 48
     y=. 5.27 5.68 6.25 7.21 8.02
          8.71 8.42 .

     The variance of a list of observations is
defined as the sum of squares of the deviations
of the observations from the arithmetic mean
divided by one less than the number of
observations.  The deviations from the mean,
for the list w, say, are given by the expression
     w - am w
which has the value
     _18 _12 _6 0 6 12 18
since am w is equal to 30. This expression may
be written more simply as (- am)w, where the
sequence of two verbs within parentheses is a
hook. (In general, if g and h are verbs, then the
hook (g h)y is equal to y g (h y).) Thus we
may define the verb
     dev=. - am
for deviations from the mean, and write dev w
for the deviations of the items of w from their
mean. A verb for the sum of squares of
deviations from the mean may be written as
     ss=. +/ @ *: @ dev , 
where *: is the verb square, and ss w is 1008.
The variance may be expressed as
     var=. ss % <:@# ,
where <: is the verb decrem which subtracts 1
from its argument. We have that var w is 168
and var y is 1.87967. The standard deviation
is defined as the square root of the variance,
i.e.,
     sd=. %:@var ,
where %: is the verb square root, and sd w is
12.9615 and sd y is 1.37101.

     The covariance of two lists of equal
length considered as pairs of observations is
the sum of products of the deviations of
corresponding observations from their
respective means divided by one less than the
number of pairs. To calculate the sum of
products conveniently we introduce the adverb
both ~ which provides its verb with a left
argument equal to its right argument as in
*~ 5 which has the value 25. We shall also
need the verb right ] which gives its right
argument where, for example, 5]8 is 8. (There
is also the verb left [ which gives the left
argument so that 5[8 is 5.) Then we may
define the verb
     sp=. +/ @ (*~ dev) 
for the sum of products, and finally the verb 
     cov=. sp % <:@#@]
for the covariance, and, for example, w cov y
is equal to 17.28. (The verb cov has a dyadic
fork with two arguments which may be
compared with the use of the expression
(f+g)(x,y) in conventional notation to represent
f(x,y)+g(x,y).)  

     The correlation coefficient of two lists of
paired observations is the covariance divided
by the square root of the product of the
variances. It is given by the verb
     corr=. cov % *~&sd ,
where the conjunction with & together with the
adverb ~ provides standard deviations as the
left and right arguments for the verb *. Since
the expression w corr y has the value 0.97,
there is a very strong linear relationship
between yield and the amount of water applied.
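
     The values quoted in this section can be
confirmed with a Python sketch. The function
names copy the J verbs but are otherwise
assumptions, and cov is written here directly
from the definition (products of both sets of
deviations) rather than as a transcription of
the verb sp.

```python
import math

w = [12, 18, 24, 30, 36, 42, 48]
y = [5.27, 5.68, 6.25, 7.21, 8.02, 8.71, 8.42]

def am(xs):
    return sum(xs) / len(xs)

def dev(xs):
    """Deviations from the arithmetic mean."""
    m = am(xs)
    return [v - m for v in xs]

def var(xs):
    """Sum of squared deviations divided by one less than the count."""
    return sum(d * d for d in dev(xs)) / (len(xs) - 1)

def sd(xs):
    return math.sqrt(var(xs))

def cov(xs, ys):
    """Sum of products of paired deviations divided by n - 1."""
    return sum(p * q for p, q in zip(dev(xs), dev(ys))) / (len(ys) - 1)

def corr(xs, ys):
    return cov(xs, ys) / (sd(xs) * sd(ys))

print(round(var(w), 4), round(var(y), 5))  # 168.0 1.87967
print(round(cov(w, y), 4))                 # 17.28
print(round(corr(w, y), 2))                # 0.97
```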


Median and quartiles
The median of a list of observations is defined
as the middle observation when they are
arranged in sorted order if the number of
observations is odd, and the average of the two
middle observations if the number is even. For
example, the median of the list i. 5 is 2, and
the median of i. 6 is 2.5. The three quartiles 
are similarly defined as those statistics which
divide the sorted observations into four equal
groups. In this section we shall develop verbs
for calculating the median and the quartiles,
but first of all we shall introduce the additional
primitive verbs which will be required.

     The dyadic verb lesser of <. gives the
smaller of its arguments so that <./ gives the
smallest element in its list argument, and the
verb larger of >. gives the larger of its
arguments. The monadic verbs floor <. and
ceiling >. give the largest integer less than or
equal to and the smallest integer greater than
or equal to their arguments, and, for example,
<.3.14 is 3 and >.3.14 is 4. The dyadic verb
residue | gives
the remainder when the first argument divides
the second, and, for example, 7|10 is 3 and
1|3.5 is 0.5. Finally, the verb halve -: is
equal to %&2.

     We also need the monadic verb grade
up /: which for a list argument gives the
permutation of the indices which arranges the
items of the list in ascending order. For example,
/: x, where x is the list 2.3 5 3.5 6, is
0 2 1 3 since item 0 of x is the smallest, item
2 is the second smallest, etc. The dyadic form
of /: is sort which sorts the left argument in
the order specified by the grade up of the right
argument. Thus x/:x, or more simply /:~x,
sorts the items of x in non-decreasing order.     

     In the examples which follow in this
section we shall use the ten-item list c whose
items are
     22 14 32 30 19 16 28 21 25 31 
or
     14 16 19 21 22 25 28 30 31 32
in sorted order. The median may be found to be
23.5, and the first and third quartiles are 19
and 30, respectively. The second quartile, of
course, is identical to the median.

      First define the verb
     midpt=. -:@<:@#
which gives the location in zero-origin 
indexing of the middle item of its list
argument, so that for the list c given in the
last paragraph midpt c is 4.5. Thus the 
expression (<. , >.) midpt c will be the
list 4 5 of indices of the two items which are
averaged to give the median. (If c had an odd
number of items, this expression would give
the index of the middle item twice.) Thus, we
may define the verb
     median=. -:@(+/)@
          ((<. , >.)@midpt { /:~)
which will give the median of an arbitrary list
of observations. The alternative form
     median=. +/@((0.5 0.5)&*@
          ((<. , >.)@midpt { /:~))
in which the middle items of the list are
weighted by multiplying by 0.5 before being
summed will be generalized in the next
paragraph to obtain an expression for the
quartiles.

     The quartiles may be calculated in a
similar manner by first defining the dyadic
verb
     qtrpt=. 0.25&*@(-&2)@([*#@])  ,
where the right argument is the list of
observations and the left argument is 1, 2 or 3,
to give the locations of the three quartiles, and
the dyadic verb
     ptwts=. (1&- , ])@(1&|)@qtrpt
to give the weights associated with the two
observations on either side of the quartile. Now
we may define the verb
     quartile=. +/@(ptwts*((<.,>.)@
          qtrpt {  /:~@]))
whose right argument is the list of
observations and left argument specifies the
quartile to be calculated. For convenience, we
may define the monadic verb
     q1=. 1&quartile
for the first quartile, and similar verbs q2 and
q3 for the second and third quartiles. The
median may now be defined simply by the
synonym
     median=. q2 .

     Finally, a five-statistic summary giving
the minimum observation, the three quartiles
and the maximum observation may be defined
as
     five=. <./,q1,q2,q3,>./ 
and the expression five c is the five-item list
14 19 23.5 30 32.
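
     The weighting scheme used by quartile may
be easier to follow in a Python sketch. The
functions quartile and five below are
assumptions that follow the text: the location
is (k*n - 2)/4 in zero-origin indexing, and
the two neighbouring sorted items are weighted
by the fractional part of that location. For
very short lists the location can be negative,
a case this sketch does not guard against.

```python
import math

def quartile(k, xs):
    """k-th quartile (k = 1, 2 or 3) by linear weighting of sorted items."""
    s = sorted(xs)
    loc = (k * len(s) - 2) / 4       # like qtrpt
    frac = loc % 1                   # like 1&| applied to the location
    lo, hi = math.floor(loc), math.ceil(loc)
    return (1 - frac) * s[lo] + frac * s[hi]   # weights as in ptwts

def five(xs):
    """Minimum, three quartiles and maximum."""
    return [min(xs)] + [quartile(k, xs) for k in (1, 2, 3)] + [max(xs)]

c = [22, 14, 32, 30, 19, 16, 28, 21, 25, 31]
print(five(c))  # [14, 19.0, 23.5, 30.0, 32]
```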


Tabulations
Data will be termed discrete if their range of
values is restricted to the non-negative integers
0, 1, 2, ... . We shall give several J verbs for
finding the distinct items in a list of discrete
data and their frequency of occurrence. 

     First we shall introduce the dyadic
adverb table / which gives an array formed by
inserting the verb it modifies between all
possible pairs of items chosen from the two
arguments. For example, if p=. 1 2 3 4 5,
then p+/p  gives the first five rows and
columns of an addition table, and
     10 11 12 */ 5 6 7 8
gives a rectangular segment of the
multiplication table. (Iverson (1991b) suggests
using the verbs
     over=. ({.,.@;}.)@":@,
and
     by=. ' '&;@,.@[,.]
to border a table on the left and top with its
left and right arguments as an aid to
understanding the table. For example, 
     q by q over !/~q
gives a table with Pascal's triangle of binomial
coefficients in the columns on and above the
main diagonal.) 

     The monadic verb nub ~. selects from
its list argument the list of distinct items. We
shall let
     nub=. ~.
so that, for example, if a is the list
     a=. 1 1 0 3 1 3 1 ,
then nub a is 1 0 3 as shown in the table on
the next page which contains most of the
examples discussed here. The monadic verb
self-classify = which we shall represent as
     dis=. =
gives a logical "distribution table" which
relates the items of its argument to the nub of
the argument. Since the row sums of the
distribution table give the frequency of
occurrence of the items of the nub, we define
     fr=. +/"1 @ dis
to give the list of frequencies so that fr a is
4 1 2.

      Now we may define the verb
     onub=. /:~ @ nub
for the ordered nub which gives the items of
the nub in ascending order so that, for
example, onub a is the list 0 1 3. We also
define
     odis=. = @ /:~ 
and
     ofr=. +/"1 @ odis
for the corresponding ordered distribution table
and frequencies.
 
     Now define the range of frequencies for
a list of discrete data as the list of consecutive
non-negative integers from 0 to the maximum
item, and let
     rng=. nni @ (>./) ,
where
     nni=. i. @ >:
gives the list of non-negative integers up to
and including the value of the argument, and
>: is the verb increm which adds 1 to its
argument. For example,   rng a is 0 1 2 3.
Thus we may define
     rdis=. rng=/]
and
     rfr=. +/"1 @ rdis
for the corresponding range distribution table
and frequencies. 

     The dyadic verb efr for extended
frequencies 
     efr=. +/"1 @ ((nni @ [) =/ ])
gives the frequencies of the items in the right
argument for all values up to a maximum
specified by the left argument, and, for
example, 5 efr a is 1 4 0 2 0 0. Finally, 
     frt=. (nni @ [) ,"0 efr
gives a two-column table with the arguments
in the first column and the corresponding
frequencies in the second. For example,
5 frt a gives a table with 0 1 2 3 4 5 in
the first column and 1 4 0 2 0 0 in the
second.
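
     The tabulation verbs have simple Python
counterparts, sketched below. The function
names copy the J verbs, but the bodies are
assumptions that count occurrences directly
instead of building a distribution table.

```python
def nub(xs):
    """Distinct items in order of first appearance, like ~. ."""
    return list(dict.fromkeys(xs))

def fr(xs):
    """Frequencies of the items of the nub."""
    return [xs.count(v) for v in nub(xs)]

def efr(top, xs):
    """Frequencies of 0, 1, ..., top in xs."""
    return [xs.count(v) for v in range(top + 1)]

def frt(top, xs):
    """Two-column table: value, then its frequency."""
    return [[v, n] for v, n in zip(range(top + 1), efr(top, xs))]

a = [1, 1, 0, 3, 1, 3, 1]
print(nub(a), fr(a))   # [1, 0, 3] [4, 1, 2]
print(sorted(nub(a)))  # [0, 1, 3], the ordered nub
print(efr(5, a))       # [1, 4, 0, 2, 0, 0]
```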

     Finally, we shall define the verb frtab
which may be used either monadically or
dyadically. For example, frtab a will give a
two-column table with the arguments of the
range in the first column and the
corresponding frequencies in the second, while
5 frtab a will give the table at the end of the
last paragraph. First we introduce the explicit
definition of verbs.

     All of the verbs given so far in this
paper have been defined tacitly with no explicit
mention of their arguments. The explicit
definition conjunction : provides for a
definition in which the left and right
arguments are represented by x. and y.,
respectively. The general form of explicit
definition is verb=. m : n, where m and n are
character lists (or other arrays to be discussed
in later sections) representing the monadic and
dyadic forms of the verb. (The spaces on either
side of the colon are required.) As an example,
consider the verb dev for deviations from the
arithmetic mean which was defined tacitly as
     dev=. - am .
An explicit definition is given by
     dev=. '(- am) y.' : '' ,
where the empty list '' indicates the dyadic
form is undefined.  Now suppose we wish to
modify the definition so that the verb may be
used monadically as before and also dyadically
with a left argument giving the power to which
the deviations are to be raised. For example,
for the list
     w=. 12 18 24 30 36 42 48
given in an earlier section, either of the
expressions dev w or 1 dev w will give the
deviations
     _18 _12 _6 0 6 12 18 ,
while 2 dev w will give the squared deviations
     324 144 36 0 36 144 324 .
Such a definition is provided by
     dev=. '1 dev y.' : 
             'x. ^~ (- am) y.' ,
where ^ is the verb power, and, for example,
5^2 is 25, and the adverb cross ~
interchanges the arguments of the verb. 
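
     A rough Python analogue of a verb with
both monadic and dyadic forms is a function
with a default argument. In this sketch the
parameter name power is an assumption, and the
optional argument comes second, whereas in J
it is the left argument.

```python
def am(xs):
    return sum(xs) / len(xs)

def dev(xs, power=1):
    """Deviations from the mean, raised to an optional power."""
    m = am(xs)
    return [(v - m) ** power for v in xs]

w = [12, 18, 24, 30, 36, 42, 48]
print(dev(w))     # like dev w or 1 dev w
print(dev(w, 2))  # like 2 dev w
```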

     The desired verb frtab may now be
written as
     frtab=. '(>./y.) frt y.' : 
               'x. frt y.' ,
where frt has already been defined as
     frt=. (nni @ [) ,"0 efr .


Barcharts
In addition to obtaining a frequency table for a
list of discrete data we may wish also to
display the frequencies in a barchart. For
example, with the list a used in the examples
in the previous section we could display the
frequencies both numerically and graphically
as follows:
     +---+----+
     |0 1|*   |
     |1 4|****|
     |2 0|    |
     |3 2|**  |
     +---+----+
First we shall introduce the important notions
of open and boxed nouns and the dyadic verb
copy #.

     The nouns appearing so far have been
open, as opposed to boxed, nouns which are
given by the verb box <. For example, the
expression i. 5 gives the five-item list
0 1 2 3 4, while <i. 5 gives the one-item
boxed list
     +---------+
     |0 1 2 3 4|
     +---------+ .
As another example, <i. 3 4 is 
     +---------+
     |0 1  2  3|
     |4 5  6  7|
     |8 9 10 11|
     +---------+ .
The verb inverse to box is open >, and for any
noun a the expression ><a is equal to a.

     The verb copy # copies items from its
right argument according to its left argument,
and, for example, 1 0 2 0 3 # 'abcde' is
acceee. If one of the arguments is an atom,
then it is extended to the same length as the
other argument, and 1 0 2 0 3 # 'a' is
aaaaaa.

     Now we can define the verb
     bars=. #&'*'
which, with a one-column table of frequencies
as an argument, will give a table whose rows
are a graphic representation of the frequencies.
For example,
      (,.1 4 0 2) # '*' ,
where the verb ravel items ,. gives a one-column
table of its list argument, is equal to
     *   
     ****
    
     **   .

     Finally, the verb
     barchart=. [ ; (bars@}."1@]) ,
where }. is the monadic verb behead, takes a
two-column frequency table as its argument and
gives the boxed representation of the
frequencies shown at the beginning of this
section. Indeed, for the list a and the verb
frtab of the previous section this figure is
given by the expression
     barchart frtab a .
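
     A text barchart of the same kind can be
sketched in Python. The functions frtab and
barchart below are assumptions that imitate
the J verbs; the layout is simplified and does
not draw the box borders or align values of
more than one digit.

```python
def frtab(xs, top=None):
    """(value, frequency) pairs for 0..top; top defaults to max(xs)."""
    top = max(xs) if top is None else top
    return [(v, xs.count(v)) for v in range(top + 1)]

def barchart(table):
    """One text row per table row: value, frequency, then a bar of stars."""
    width = max(n for _, n in table)
    return ["|%d %d|%s|" % (v, n, ("*" * n).ljust(width)) for v, n in table]

a = [1, 1, 0, 3, 1, 3, 1]
for line in barchart(frtab(a)):
    print(line)
# |0 1|*   |
# |1 4|****|
# |2 0|    |
# |3 2|**  |
```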


Simulation
Two verbs, both represented by ?, are available
for random sampling. The monadic form roll
samples with replacement, and ? y gives a
uniform random selection from the population
i. y. The dyadic form deal samples without
replacement, and x ? y is a list of x items
chosen without repetition from i. y. The ten-
item list c of random items used in a previous
section was computed by the expression
10+10?30. 

     Now the expression ?6 will select one
integer at random from i. 6, and >:?6 will
select an integer at random from the first six
positive integers and so may represent the
result of throwing an unbiased die once. Thus
the verb
     dice=. >:@(?@(6&($~)@([,])))  
gives the results of a simulated throwing of a
given number of dice a given number of times,
since the verb shape $ gives an array of items 
of its right argument shaped according to the
left argument. For example, 3 dice 10 which
might have the value
     1 3 3 1 5 5 4 1 6 5
     1 4 1 4 5 6 4 2 5 4
     2 5 4 2 1 4 5 2 6 2
would represent the results of throwing 3 dice
10 times, and the column sums +/ 3 dice 10
would give the sum of numbers occurring on
the faces on each of the throws. If we introduce
the verb drop }. which drops the number of
items from its right argument specified by the
left argument, the expression
       barchart 2}. 12 frtab +/2 dice 50
which could produce the array
     +-----+--------------+
     | 2  3|***           |
     | 3  4|****          |
     | 4  4|****          |
     | 5  5|*****         |
     | 6  5|*****         |
     | 7  7|*******       |
     | 8 14|**************|
     | 9  2|**            |
     |10  1|*             |
     |11  3|***           |
     |12  2|**            |
     +-----+--------------+
gives the sum of the numbers occurring on 2
dice rolled 50 times. Finally, the expression
1 dice 10, which gives the results of rolling
1 die 10 times could be replaced by the more
grammatically correct expression die 10 by
the definition
     die=. 1&dice  .
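
     The dice simulation can be imitated in
Python with the standard random module. The
function dice and the seed 1992 are
assumptions made for this sketch; since the
results are random, no particular output is
shown.

```python
import random

def dice(n_dice, n_throws, rng):
    """Table with one row per die and one column per throw, values 1..6."""
    return [[rng.randint(1, 6) for _ in range(n_throws)]
            for _ in range(n_dice)]

rng = random.Random(1992)  # fixed seed, chosen arbitrarily for repeatability
throws = dice(2, 50, rng)

# Column sums, like +/ 2 dice 50, then frequencies of the totals 2..12.
sums = [sum(col) for col in zip(*throws)]
counts = {total: sums.count(total) for total in range(2, 13)}
for total, n in counts.items():
    print("%2d %2d %s" % (total, n, "*" * n))
```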

     In the following generalization of the
dice-throwing simulation we shall need the
adverb prefix \ which applies its left verb
argument to each prefix of its noun right
argument. For example, if p=. 1 2 3 4 5,
then <\p is the array
     +-+---+-----+-------+---------+ 
     |1|1 2|1 2 3|1 2 3 4|1 2 3 4 5|
     +-+---+-----+-------+---------+
A left argument of the modified verb may be
used to specify the prefix length, and 2<\p is
     +---+---+---+---+
     |1 2|2 3|3 4|4 5|
     +---+---+---+---+
and _2<\p is
     +---+---+-+
     |1 2|3 4|5|
     +---+---+-+  .
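
     The three groupings produced by \ may be
compared with Python list slices. This is a
sketch only; the names prefixes, windows and
chunks are introduced here for illustration.

```python
p = [1, 2, 3, 4, 5]

# Every prefix of p, like <\p.
prefixes = [p[:k] for k in range(1, len(p) + 1)]

# Overlapping groups of length 2, like 2<\p.
windows = [p[i:i + 2] for i in range(len(p) - 1)]

# Non-overlapping groups of length 2, like _2<\p.
chunks = [p[i:i + 2] for i in range(0, len(p), 2)]

print(prefixes)  # [[1], [1, 2], [1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5]]
print(windows)   # [[1, 2], [2, 3], [3, 4], [4, 5]]
print(chunks)    # [[1, 2], [3, 4], [5]]
```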

     Now introduce the random integer verb
     ri=. [+?@>:@(]-[)
which gives random integers between the
limits specified by its arguments, so that, for
example, 1 ri 6 gives a random integer
between 1 and 6. Now we have the alternative
verb
     dice=. 1&ri@(6&($~)@([*]))
for throwing an arbitrary number of dice, and,
for example, the expression 2 dice 5 which
might have the value
     1 4 6 6 4 4 3 2 1 1
represents the results of throwing 2 dice 5
times. The pairs of values obtained on
successive throws are shown by the expression 
     _2<\ 2 dice 5
which might have the value
     +---+---+---+---+---+
     |2 5|5 3|5 1|1 5|2 1|
     +---+---+---+---+---+  ,
and the sums of the pairs of values by
     _2+/\ 2 dice 5 
which could be 10 5 9 6 5.

     Let us now consider another example of
a simulation that will give an estimate of the
value of π. A quadrant of a circle with unit
radius and centre at the origin is inscribed in
a unit square with vertices at (0,0), (1,0), (0,1)
and (1,1). Now consider picking a number of
points at random within the unit square and
determining the fraction which lie within the
quadrant of the circle. This fraction may be
taken as an estimate of the area of the
quadrant of the circle and thus of the value of
π/4. We shall now give some verbs which will
allow us to estimate the value of π by
simulating this random procedure.

     First define the random real verb
     rr=. ] %~ (0&ri@])
for generating uniformly distributed random
numbers between 0 and 1, inclusive. The
argument determines the size of the random
integer used in the generation, and, for
example, the number generated by the
expression rr 100000 is obtained by selecting
a non-negative integer less than or equal to
100000 and dividing it by 100000. The verb  
     coords=.
           rr@(($&100000)@(2&*@]))
gives a list whose successive pairs of items are
the coordinates of the random points,
     incircle=.
      +/@(1&>:)@(_2&(+/\)@*:@coords)
gives the number of points within the quadrant
of the unit circle, and
     pi=. 4&% * incircle
finds the estimate of the value of π for a given
number of random points. For example, five
estimates for 1000 points, each given by the
expression pi 1000, gave the values 3.144,
3.16, 3.08, 3.088 and 3.1.
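
     The same procedure can be sketched in
Python. The functions incircle and
pi_estimate are assumptions named after the J
verbs, and the coordinates are drawn with
random.random(), which differs slightly from
the verb rr (it never returns exactly 1).

```python
import random

def incircle(n, rng):
    """Count how many of n random points in the unit square lie in the quadrant."""
    hits = 0
    for _ in range(n):
        px, py = rng.random(), rng.random()
        if px * px + py * py <= 1:
            hits += 1
    return hits

def pi_estimate(n, rng):
    """Four times the fraction of points falling inside the quadrant."""
    return 4 * incircle(n, rng) / n

rng = random.Random(1992)  # fixed seed, chosen arbitrarily for repeatability
estimates = [pi_estimate(1000, rng) for _ in range(5)]
print(estimates)
```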

     The use of the verb pi for larger values
of its argument, say, 2000, 3000, ..., depending
on the system on which J is running, will
eventually cause a limit error and the
termination of the computation. Therefore, it is
desirable to modify the verb pi so that an
estimate of π based on a large number of trials
may be divided into a number of smaller
simulations and the results pooled. For this
purpose we shall define a dyadic verb pi such
that the expression 10 pi 1000, for example, will
give an estimate of π based on the pooled
results of 10 simulations of 1000 trials each.
Furthermore, the monadic use of the verb as in
pi 1000 will be equivalent to 1 pi 1000.
Since such a verb will be defined explicitly, we
shall consider first some aspects of explicit
definition that have not been discussed
previously.

     An explicitly defined verb may contain
any number of sentences which are numbered
sequentially with the non-negative integers.
The order of execution is determined by the
suite $. which is set initially to i. ns, where
ns is the number of sentences. The suite may
be reset during execution to allow for repetition
of a number of statements, branching or
recursion.  

     The explicit definition of pi is given by
+-------+-+----------------------------+
|1 pi y.|:|r=. 0                       |
|       | |n=. x.                      |
|       | |$.=. > (n = 0) { (<3 4 5),<6|
|       | |r=. r + incircle y.         |
|       | |n=. <: n                    |
|       | |$.=. 2                      |
|       | |4 * r % x. * y.             |
+-------+-+----------------------------+
The variable r is used to accumulate the total
number of random points lying within the
quadrant of the circle calculated by incircle.
By use of the verb from { which selects items
from the list right argument as determined by
the left argument the suite is set in sentence 2
to either 3 4 5 or 6 according as n is greater
than zero or equal to zero, respectively. The
total number of points lying in the quadrant of
the circle is used in sentence 6 to find the
estimate of π. If the left argument x. is 0, this
estimate is 0 since 0%0 in J is equal to 0. Five
evaluations of the expression 10 pi 1000 gave
estimates 3.1388, 3.1596, 3.1228, 3.1472
and 3.1348.   
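
     The pooling logic, without the suite
mechanism, may be sketched in Python as an
ordinary loop. The function pooled_pi is an
assumption introduced here; it returns 0 for
zero runs only to mirror the J result noted
above.

```python
import random

def incircle(n, rng):
    """Count how many of n random points in the unit square lie in the quadrant."""
    return sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 <= 1)

def pooled_pi(runs, n, rng):
    """Pool `runs` simulations of n points each into one estimate."""
    if runs == 0:
        return 0.0  # mirrors J, where 0%0 is 0
    hits = 0
    for _ in range(runs):          # plays the role of the counter n in the J verb
        hits += incircle(n, rng)   # accumulate, like r=. r + incircle y.
    return 4 * hits / (runs * n)

rng = random.Random(1992)  # fixed seed, chosen arbitrarily for repeatability
print(pooled_pi(10, 1000, rng))
```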


The Poisson distribution
A well-known probability distribution is the
Poisson distribution which applies when the
probability of success on any one trial is very
small and the number of trials is so large that
the expected number of successes, the product
of these two quantities, is of moderate size. If
the mean number of successes is $\lambda$, then the
probability of x successes, where x is a
non-negative integer, is equal to
$e^{-\lambda}\lambda^{x}/x!$. The
classical example cited in many statistics texts
is the number of deaths which occurred from
1875 to 1894 in various German army corps
from kicks from horses. The original data,
given in Weaver (1963), are as follows where
the rows refer to army corps and the columns
to years:
     02210011030210010101
     00020302000111020310
     00020200110021100200
     00011120200010121000
     01011110000100001100
     00002100100101111110
     00102001201131110300
     10100010110020021020
     10001001000010001101
     00000211102110120100
     00110102020000213011
     00002401301111213131
     11211304010321021100
     01000001011000220000
We shall use these data to illustrate our brief
discussion of the Poisson distribution.

     The observed frequencies of the number
of deaths may be found very simply using one
of the verbs for tabulating discrete data
developed earlier. If we let the above table be
represented by d, say, then d has 14 rows and
20 columns, and ,d, where , is the monadic
verb ravel, is a list with 280 items. The
expression 5 efr ,d is the list
     144 91 32 11 2 0 
of the observed frequencies of 0 to 5 deaths,
inclusive.      

     To define a verb for the Poisson density
function we need the monadic verb factorial !
which gives the factorial of its non-negative
integer argument, and the monadic exponential
and dyadic power verbs both represented by ^,
such that, for example, ^1 is 2.71828 and 2^3
is 8. Then we may define the verb
     pd=. ^ * ((^&-@[)%(!@]))
to give the probabilities for the Poisson
distribution. For example, the expression
     1.5 pd 0 1 2 3
which is equal to
     0.2231 0.3347 0.2510 0.1255
gives the probabilities rounded to four decimal
places for 0, 1, 2 and 3 successes in a Poisson
distribution with mean 1.5.

     To calculate the theoretical frequencies
on the basis of a Poisson distribution we
estimate the expected number of deaths from
the data as am ,d which has the value 0.7.
Then an estimate of the Poisson probabilities
for 0 to 5 deaths, inclusive, is given by
0.7 pd i. 6 which is the six-item list
     0.49659 0.34761 0.12166 0.02839
          0.00497 0.00070 .
The expected frequencies are given by
     expfr=. 280 * 0.7 pd i. 6
which rounded to one decimal place is the list
     139.0 97.3 34.1 7.9 1.4 0.2 .
If we let t=. 5 frt ,d be the table of
corresponding number of deaths and observed
frequencies, then the calculations may be
conveniently summarized by the expression
     6.0 6.0 8.1 ": t,.expfr
which has the value
     0   144   139.0
     1    91    97.3
     2    32    34.1
     3    11     7.9
     4     2     1.4
     5     0     0.2  .
The verb format ": is used to format the
columns of the right argument according to the
field specifications given in the left argument.
(The expression 1":d was used to display the
table d of data at the beginning of this section.) 
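
     The Poisson probabilities and expected
frequencies can be checked with a Python
sketch. The function pd copies the name of
the J verb, and the observed frequencies are
taken from the list quoted above; nothing
else is assumed.

```python
import math

def pd(mean, x):
    """Poisson probability of x successes for the given mean."""
    return mean ** x * math.exp(-mean) / math.factorial(x)

print([round(pd(1.5, k), 4) for k in range(4)])
# [0.2231, 0.3347, 0.251, 0.1255]

observed = [144, 91, 32, 11, 2, 0]
total_deaths = sum(k * n for k, n in enumerate(observed))
mean = total_deaths / sum(observed)   # 196 / 280, like am ,d
expfr = [sum(observed) * pd(mean, k) for k in range(6)]

for k, (obs, exp) in enumerate(zip(observed, expfr)):
    print("%d %5d %7.1f" % (k, obs, exp))
```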


The binomial distribution
The probability of x successes in n independent
binomial trials with probability p of success in
a single trial is equal to ${}_{n}C_{x}\,p^{x}(1-p)^{n-x}$ for
x = 0, 1, 2, ..., n, where ${}_{n}C_{x}$ is the number of
combinations of n things taken x at a time. A
simple example is the number of heads
occurring when an unbiased coin is tossed 4
times, say, in which the probabilities of 0 to 4
heads, inclusive, are 0.0625, 0.25, 0.375, 0.25
and 0.0625, respectively. In this section we
shall define some verbs that will enable us to
investigate the binomial distribution for
arbitrary numbers of trials and probabilities of
success. 
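
     The coin-tossing probabilities quoted
above follow directly from the formula, as
the following Python sketch shows. The
function binom_probs is an assumption
introduced for illustration and is not one of
the J verbs of this section.

```python
from math import comb

def binom_probs(n, p):
    """Probabilities of 0, 1, ..., n successes in n independent trials."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

print(binom_probs(4, 0.5))  # [0.0625, 0.25, 0.375, 0.25, 0.0625]
```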

     First of all, we shall introduce the
adverb key /. which applies a verb to each
group of items of the noun right argument, the
groups being determined by the corresponding
items of the noun left argument. For
example, if p is the list of the first eleven non-
negative integers, then 2&|p is the list
     0 1 0 1 0 1 0 1 0 1 0
of remainders when the integers are divided by
2, the expression (2&|p) </. p is the list
     +------------+---------+
     |0 2 4 6 8 10|1 3 5 7 9|
     +------------+---------+
of the items of p classified by their 2-residues,
and (3&|p) +//. p is the list 18 22 15 of
the sums of the items of p when classified by
their 3-residues. The phrase f &. g, where &.
is the conjunction under, is equivalent to
f & g followed by the application of the verb
inverse to g. For example, 
     x=. 1 2 3;4 5;6 7 8 ,
where ; is the dyadic verb link, has the value 
     +-----+---+-----+
     |1 2 3|4 5|6 7 8|
     +-----+---+-----+ ,
and +/ &. > x is the boxed list
     +-+-+--+
     |6|9|21|
     +-+-+--+ 
which gives the sums of the items of x.
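
     The grouping performed by key can be
imitated in Python with a dictionary, which
keeps its keys in order of first appearance.
The helper key below is a hypothetical
function written for this sketch; the last
line imitates +/ &. > by summing within each
inner list and keeping the result as a list.

```python
def key(keys, items, f):
    """Apply f to each group of items sharing the same key."""
    groups = {}
    for k, v in zip(keys, items):
        groups.setdefault(k, []).append(v)
    return [f(g) for g in groups.values()]

p = list(range(11))
by_two = key([v % 2 for v in p], p, list)   # like (2&|p) </. p
by_three = key([v % 3 for v in p], p, sum)  # like (3&|p) +//. p

print(by_two)    # [[0, 2, 4, 6, 8, 10], [1, 3, 5, 7, 9]]
print(by_three)  # [18, 22, 15]

boxed = [[1, 2, 3], [4, 5], [6, 7, 8]]
print([sum(b) for b in boxed])  # [6, 9, 21]
```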
     
     One final verb required in this section
is catalog { which is a generalization of the
Cartesian product in that it gives the array
which is formed by selecting in all possible
ways one item from each of the items of its
argument and whose shape is determined by
the shapes of its argument. As a very simple
example, the expression {1 2;3 4 5 is the
array
     +---+---+---+
     |1 3|1 4|1 5|
     +---+---+---+
     |2 3|2 4|2 5|
     +---+---+---+  .
As another example, the expression ,{2#<0 1
is the boxed list 
     +---+---+---+---+
     |0 0|0 1|1 0|1 1|
     +---+---+---+---+
of all possible cases of two boolean arguments.
This expression may be generalized by the
truth table verb
     tt=. ,@{@(]@#&(<0 1))
which gives all boolean arguments of the truth
table whose size is specified by its argument,
so that, for example, tt 2 gives the above list.
We may also define the verb
     ttkey=. >@(+/ &. >)@tt
which may be used to classify the items of the
truth table according to the number of 1s each
contains, so that, for example, ttkey 2 is the
list 0 1 1 2, and (ttkey </. tt) 2 is
     +-----+---------+-----+
     |+---+|+---+---+|+---+|
     ||0 0|||0 1|1 0|||1 1||
     |+---+|+---+---+|+---+|
     +-----+---------+-----+ .
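
     The truth table and its classification by
the number of 1s may be mirrored in Python with
itertools.product. The sketch below is an
illustration only; the names tt and ttkey are
reused from the text, while grouped is a
hypothetical helper.

```python
from itertools import product

def tt(n):
    """All boolean argument lists of length n, last position varying fastest."""
    return list(product((0, 1), repeat=n))

def ttkey(n):
    """Number of 1s in each row of the truth table."""
    return [sum(row) for row in tt(n)]

def grouped(n):
    """Rows of the truth table grouped by their number of 1s."""
    groups = {}
    for row in tt(n):
        groups.setdefault(sum(row), []).append(row)
    return list(groups.values())

print(tt(2))       # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(ttkey(2))    # [0, 1, 1, 2]
print(grouped(2))  # [[(0, 0)], [(0, 1), (1, 0)], [(1, 1)]]
```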

     Now we may return to our brief
consideration of the binomial distribution. The
range of the number of successes is given by
     brng=. i.@>:
so that, for example, brng 2 is 
     +-+-+-+
     |0|1|2|
     +-+-+-+  .
The verb
     bsf=. ({&'FS' &. >)@tt
shows the configurations of successes and
failures for an arbitrary number of trials and
     grbsf=. ttkey </. bsf
shows them grouped by the number of
successes, and, for example, bsf 2 is
     +--+--+--+--+
     |FF|FS|SF|SS|
     +--+--+--+--+
and grbsf 2 is
     +----+-------+----+
     |+--+|+--+--+|+--+|
     ||FF|||FS|SF|||SS||
     |+--+|+--+--+|+--+|
     +----+-------+----+ .
The probabilities and grouped probabilities are
given by
     bpr=. (*/ &. >)@,@{@(] #       
       (<@|@(1&-@[ , [)))
and
     grbpr=. ttkey@] </. >@bpr ,
respectively, where the right argument gives
the number of trials and the left argument the
probability of success in a single trial. For
example, 0.6 bpr 2 is 
     +----+----+----+----+
     |0.16|0.24|0.24|0.36|
     +----+----+----+----+
and 0.6 grbpr 2 is
     +----+---------+----+
     |0.16|0.24 0.24|0.36|
     +----+---------+----+ .
Finally, the binomial probabilities are given by
the verb
     bd=. <"0@>@(+/ &. >)@grbpr
where, for example, 0.6 bd 2 is
     +----+----+----+
     |0.16|0.48|0.36|
     +----+----+----+  .
A summary verb for the binomial distribution
is
     bsum=. (brng@]),"0 1(grbsf@],
                         "0 0 bd)
and 0.6 bsum 2 is
     +-+-------+----+
     |0|+--+   |0.16|
     | ||FF|   |    |
     | |+--+   |    |
     +-+-------+----+
     |1|+--+--+|0.48|
     | ||FS|SF||    |
     | |+--+--+|    |
     +-+-------+----+
     |2|+--+   |0.36|
     | ||SS|   |    |
     | |+--+   |    |
     +-+-------+----+  .
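
     The same summary can be reproduced by
brute-force enumeration in Python, which may
help in checking the J verbs. This is a sketch
only; the function name binomial_summary is
hypothetical, and the argument order (p first,
then the number of trials) merely follows the
J usage above.

```python
from itertools import product

def binomial_summary(p, n):
    """List (successes, configurations, probability) for n trials."""
    rows = {}
    for outcome in product("FS", repeat=n):
        k = outcome.count("S")
        prob = p**k * (1 - p)**(n - k)
        configs, total = rows.get(k, ([], 0.0))
        rows[k] = (configs + ["".join(outcome)], total + prob)
    return [(k, configs, total) for k, (configs, total) in rows.items()]

for k, configs, total in binomial_summary(0.6, 2):
    print(k, configs, round(total, 4))
```

For 0.6 and 2 the three rows carry the
configurations FF, then FS and SF, then SS,
with probabilities 0.16, 0.48 and 0.36 up to
rounding.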

     The binomial coefficients are given by
the verb out of !, and 2!5 is 10, the number of
combinations of 5 things taken 2 at a time.
The verb
     bc=. i.@>:!]
gives a list of binomial coefficients, and, for
example, bc 5 is the list 1 5 10 10 5 1.


Variances II
In a previous section we developed verbs for
calculating the variance of a list of
observations, and the covariance and
correlation coefficient of two lists of paired
observations. In this section we will discuss the
calculation of these statistics for an arbitrary
number of lists of observations and the
presentation of the results in both list and
tabular forms. The illustrative data we shall
use have been taken from Searle (1966) and
represent six observations on each of three
variables, and are given here as the three-item
boxed list  
+-----------+-----------+-----------------+
|1 4 2 2 1 3|0 6 4 3 1 5|10 17 13 14 12 15|
+-----------+-----------+-----------------+
which we shall refer to as d.

     The variance of each of the lists may
be calculated by the verb
     varlist=. var"1@> ,
where var has been defined in the previous
discussion of variances as have been the verbs
cov and corr to be used later in this section,
and varlist d has the value
     1.36667 5.36667 5.9
giving the variances of each of the lists in d.
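
     These values can be confirmed with the
Python standard library. The sketch assumes
that var uses the divisor n-1 (the sample
variance), which is what the quoted values
imply.

```python
from statistics import variance

d = [[1, 4, 2, 2, 1, 3],
     [0, 6, 4, 3, 1, 5],
     [10, 17, 13, 14, 12, 15]]

# statistics.variance is the sample variance (divisor n - 1).
varlist = [variance(row) for row in d]
print([round(v, 5) for v in varlist])   # [1.36667, 5.36667, 5.9]
```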

     The calculation of the covariances and
correlation coefficients for all pairs of variables
requires the enumeration of the combinations
of an arbitrary number of things taken two at
a time. The following brief discussion has been
taken from Iverson (1991c) to which the reader
is referred for further details. We introduce
first the primitive verb atomic permute A.
which selects from its right argument the
permutation or permutations specified by the
left argument. For example, 2 A. 'abcd' is
acbd and 10 A. 'abcd' is bdac. The verb
     PT=. i.@! A. i.
gives an ordered table of all permutations of
order specified by its non-negative integer
argument, and, for example, PT 3 is the table
     0 1 2
     0 2 1
     1 0 2
     1 2 0
     2 0 1
     2 1 0 .
A table of combinations may be obtained from
this table by selecting the required number of
columns, sorting the items within each row, and
then removing duplicate rows. Thus we may
define the verb
     C=. nub@:rsort@([ rtake PT@])
where
     nub=. ~.
has already been defined, 
     rtake=. {."1
and
     rsort=. /:~"1 . 
The conjunction at @: is similar to @ but
applies its left verb to the entire result of its
right verb, that is, with infinite rank. For example,
2 C 3 is a table with rows 0 1, 0 2 and 1 2,
and 4 C 4 is the table 0 1 2 3.
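
     The three steps (take columns, sort each
row, remove duplicates) can be followed in this
Python sketch. It is an illustration of the
idea rather than a translation of the J verbs,
and it reuses the name C only for convenience.

```python
from itertools import combinations, permutations

def C(k, n):
    """Combinations of n things taken k at a time, via the permutation table."""
    seen = []
    for perm in permutations(range(n)):   # rows of the table PT n
        row = tuple(sorted(perm[:k]))     # take k columns, then sort the row
        if row not in seen:               # keep the first occurrence only
            seen.append(row)
    return seen

print(C(2, 3))   # [(0, 1), (0, 2), (1, 2)]
print(C(4, 4))   # [(0, 1, 2, 3)]
```

The method generates all n! permutations and
so is practical only for small n; Python's
itertools.combinations gives the same rows
directly.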

     Now we may define the verb
     dispairs=. (2&C@#) { ]
that will select all possible pairs of lists, and,
for example, dispairs d is
+-----------+-----------------+
|1 4 2 2 1 3|0 6 4 3 1 5      |
+-----------+-----------------+
|1 4 2 2 1 3|10 17 13 14 12 15|
+-----------+-----------------+
|0 6 4 3 1 5|10 17 13 14 12 15|
+-----------+-----------------+  .
The covariances may now be found by the verb
     vclist=. (first cov second)"1 
               @ dispairs
where the verbs
     first=. >@(0&{) 
and
     second=. >@(1&{)
select the first and second items, respectively,
of their list arguments, and vclist d is equal
to 2.56667 2.7 5.3. Similarly, the list of
correlation coefficients may be calculated by
     corrlist=. 
       (first corr second)"1 @
               dispairs
and in our example corrlist d is equal to
0.947733 0.950838 0.941884.  
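
     The covariance and correlation lists can
be checked independently. The Python sketch
below assumes the divisor n-1 for cov, which
agrees with the values quoted above; the
function bodies are this sketch's own and are
not translations of the J verbs cov and corr.

```python
from itertools import combinations
from math import sqrt

def cov(x, y):
    """Sample covariance with divisor n - 1 (assumed from the quoted values)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

def corr(x, y):
    """Correlation coefficient of two lists of paired observations."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))

d = [[1, 4, 2, 2, 1, 3],
     [0, 6, 4, 3, 1, 5],
     [10, 17, 13, 14, 12, 15]]

pairs = list(combinations(d, 2))            # pairs 0 1, 0 2 and 1 2
vclist = [cov(x, y) for x, y in pairs]
corrlist = [corr(x, y) for x, y in pairs]
print([round(v, 5) for v in vclist])
print([round(v, 6) for v in corrlist])
```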
      
     Before giving verbs for variance-
covariance and correlation tables we shall
introduce the two verbs base #. and antibase
#: which are used to convert from one number
representation to another although only the
second one will be used here. For example,
#: 13 is 1 1 0 1, the binary representation of
the decimal number 13, and, conversely,
#. 1 1 0 1 is 13. As another example,
     8 8 8 #: 123
which is equal to 1 7 3 is the three-digit octal
or base-8 representation of the decimal number
123, and 8#.1 7 3 is 123. The number base
may be mixed as in 
     24 60 60 #: 19510
which is 5 25 10, the number of hours,
minutes and seconds in 19510 seconds.
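
     The conversions above amount to repeated
division with remainder. A minimal Python
sketch follows; the names antibase and base
are taken from the text, but unlike monadic #:
this antibase always needs an explicit list of
radices.

```python
def antibase(radices, value):
    """Digits of value in the (possibly mixed) number base given by radices."""
    digits = []
    for r in reversed(radices):
        value, digit = divmod(value, r)
        digits.append(digit)
    return digits[::-1]

def base(radices, digits):
    """Inverse of antibase: the value represented by digits."""
    value = 0
    for r, digit in zip(radices, digits):
        value = value * r + digit
    return value

print(antibase([2, 2, 2, 2], 13))       # [1, 1, 0, 1]
print(antibase([8, 8, 8], 123))         # [1, 7, 3]
print(antibase([24, 60, 60], 19510))    # [5, 25, 10]
print(base([8, 8, 8], [1, 7, 3]))       # 123
```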

     The verb
     allpairs=. ((2&#@#) #:
       (i.@*:@#)) { ]
gives a table similar to that given by
dispairs for all pairs of the lists given as its
argument. For example, for the list d the
indices of the pairs would be 0 0, 0 1, 0 2,
1 0, 1 1, 1 2, 2 0, 2 1 and 2 2. The verbs
     vctable=. (2&#@#) $ 
       (first cov second)"1 @
               allpairs
and     
     corrtable=. (2&#@#) $
       (first corr second)"1 @
               allpairs
give the variance-covariance and correlation
tables, respectively, and 10.5": vctable d is
     1.36667   2.56667   2.70000
     2.56667   5.36667   5.30000
     2.70000   5.30000   5.90000
and 10.5": corrtable d is
     1.00000   0.94773   0.95084
     0.94773   1.00000   0.94188
     0.95084   0.94188   1.00000  .
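
     The construction of the index pairs and
the reshaping into a table can be imitated in
Python as follows. This is a sketch under the
assumption that cov uses the divisor n-1; the
pair indices are obtained with divmod, which
plays the part of antibase here.

```python
def cov(x, y):
    """Sample covariance with divisor n - 1 (assumed from the quoted values)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

d = [[1, 4, 2, 2, 1, 3],
     [0, 6, 4, 3, 1, 5],
     [10, 17, 13, 14, 12, 15]]
n = len(d)

# Index pairs 0 0, 0 1, ..., 2 2: the two base-n digits of 0 .. n*n-1.
index_pairs = [divmod(i, n) for i in range(n * n)]
flat = [cov(d[i], d[j]) for i, j in index_pairs]
vctable = [flat[r * n:(r + 1) * n] for r in range(n)]   # reshape to n by n

for row in vctable:
    print("".join(f"{v:10.5f}" for v in row))
```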


Regression
In this section we shall introduce a conjunction
and two verbs used for matrix operations,
discuss briefly their use in regression
calculations, and finally make a few remarks
about a multiple regression program which is
given in the Appendix.

     The dyadic dot product conjunction .
gives the inner or matrix product of its
arguments. If a and b are tables with the
number of columns of a equal to the number of
rows of b, then a +/ . * b, where the space
on either side of the conjunction symbol is
required, is the conventional matrix product of
a and b.

     The monadic matrix inverse %. gives
the inverse of its non-singular square matrix
argument. The dyadic matrix divide, also
represented by %., may be used to solve a
system of linear equations or perform a least-
squares fit. For example, consider the data
     y=. 5.27 5.68 6.25 7.21 8.02      
          8.71 8.42
and
     w=. 12 18 24 30 36 42 48
on yield as a function of the amount of water
applied which was given earlier in the paper.
Then
     W=. 1,"0 w
is a two-column table with 1s in the first
column and the values of w in the second
column, and 
     b=. y%.W
has the value 3.99429 0.102857 which are
the regression coefficients of the least-squares
linear regression of yield on water. The
estimated values of yield are W +/ . * b or
     5.23 5.85 6.46 7.08 7.70
          8.31 8.93 .
The expression
     y%. 1,"1 w,"0 *:w
gives the regression coefficients
     3.19429 0.166349 _0.0010582
for a quadratic fit, and y%.w gives the single
coefficient 0.217635 for a fit through the
origin.
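
     The three fits can be checked without J.
The Python sketch below solves the normal
equations by Gauss-Jordan elimination; this is
a technique chosen for brevity here and is not
claimed to be how matrix divide is implemented.
The names solve and least_squares are
hypothetical.

```python
def solve(a, b):
    """Solve the square system a z = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def least_squares(y, X):
    """Coefficients of the least-squares fit of y on the columns of X."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * v for row, v in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

y = [5.27, 5.68, 6.25, 7.21, 8.02, 8.71, 8.42]
w = [12, 18, 24, 30, 36, 42, 48]

linear = least_squares(y, [[1, v] for v in w])
quadratic = least_squares(y, [[1, v, v * v] for v in w])
origin = least_squares(y, [[v] for v in w])
estimates = [linear[0] + linear[1] * v for v in w]
print(linear, quadratic, origin)
```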

     Now consider defining an explicit
monadic verb reg whose argument is a boxed
list whose items give the values of one or more
independent variables and whose last item
gives the values of the dependent variable. For
example, for the data given earlier in this
section we could define d0=. w;y which would
have the value
+--------------------+----------------------------------+
|12 18 24 30 36 42 48|5.27 5.68 6.25 7.21 8.02 8.71 8.42|
+--------------------+----------------------------------+ . 
Then the expression reg d0 would give the
regression of yield on water. As another
example, for the three-item list of the previous
section reg d would give a regression with the
first two items of d as independent variables
and the third item as the dependent variable,
and reg 0 2{d would give a simple regression
involving the first and third items of d.
  
     For convenience we will define the
verbs
     indepvar=.
           (1&,"1)@|:@(>@(_1&}.))
and
     depvar=. ;@(_1&{.)  ,
where the dyadic verbs take {. and drop }.
select or remove items from either end of a list,
to generate the independent and dependent
variables in the appropriate format to be used
as the arguments for matrix divide. For
example, indepvar d0 gives the two-column
table W and depvar d0 gives the list y of
values for the yield used in the example above.

     The verb reg gives the regression
coefficients, standard errors and t-values, the
analysis-of-variance table, the standard error of
estimate, and the square of the multiple
correlation coefficient. The Appendix gives a
listing of reg and some examples of its use.
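
     The quantities reported by reg can be
illustrated for the special case of a single
independent variable, where closed-form
expressions are available. The Python sketch
below is not a translation of reg; the name
simple_regression_summary and the dictionary
keys are hypothetical.

```python
from math import sqrt

def simple_regression_summary(w, y):
    """Coefficients, standard errors, F, r-squared and S.E. of estimate."""
    n = len(y)
    mw, my = sum(w) / n, sum(y) / n
    sxx = sum((a - mw) ** 2 for a in w)
    sxy = sum((a - mw) * (b - my) for a, b in zip(w, y))
    slope = sxy / sxx
    intercept = my - slope * mw
    sst = sum((b - my) ** 2 for b in y)                  # total sum of squares
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(w, y))
    ssr = sst - sse                                      # regression sum of squares
    k = 1                                                # independent variables
    mse = sse / (n - k - 1)
    return {
        "coeff": (intercept, slope),
        "se": (sqrt(mse * (1 / n + mw ** 2 / sxx)), sqrt(mse / sxx)),
        "F": (ssr / k) / mse,
        "rsq": ssr / sst,
        "see": sqrt(mse),
    }

w = [12, 18, 24, 30, 36, 42, 48]
y = [5.27, 5.68, 6.25, 7.21, 8.02, 8.71, 8.42]
summary = simple_regression_summary(w, y)
print(summary)
```

For the yield data the results agree, to the
precision shown there, with the output of
reg d0 in the Appendix.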


Analysis of variance
The main computational problem in the
analysis of variance is the partition of the total
variation of a dependent variable as measured
by the sum of squares of deviations from the
arithmetic mean into a number of orthogonal
components for each of several main effects
and their interactions. If the data are
considered to be arranged in a rectangular
array with an axis for each factor and one for
replications, then a major aspect of the
partitioning process is the calculation of some,
or possibly all, of the marginal totals of various
subarrays. For example, for a one-factor design
the data may be arranged in a table, and the
four marginal totals consist of the row and
column totals, the grand total, and the items of
the array. In general, for an n-dimensional
rectangular array there are 2^n marginal totals.
For each set of marginal totals a weighted sum
of squares is found by squaring each item,
summing and dividing by the number of items
occurring in each total. From these weighted
sums of squares the required analysis-of-
variance table may be found relatively simply.
In this section we shall develop verbs for
finding all weighted sums of squares in a
rectangular array of arbitrary rank.

     We shall require the conjunction power
^: which may be used to repeat the verb left
argument a number of times specified by the
right argument. For example, (*:^:2) 5,
where *: is square, is 625, and *:^:(i. 4) 2
is 2 4 16 256. As another example, if x is the
array i. 3 5 which has the value 
      0  1  2  3  4
      5  6  7  8  9
     10 11 12 13 14  ,
then +/^:2 x is 105 which is the sum of the
column sums of x, or, more simply, the sum of
the items of x. We need also the verb transpose
|: which interchanges the axes of the array
right argument as specified by the left
argument, and the verb not -. which
complements its boolean argument. 
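
     The effect of the power conjunction can
be imitated by a function that returns the
n-fold repetition of another function. The
following Python sketch is an illustration
only; power, square and sum_leading_axis are
hypothetical names, and sum_leading_axis
handles only lists and tables.

```python
def power(f, n):
    """Return the function that applies f n times."""
    def repeated(value):
        for _ in range(n):
            value = f(value)
        return value
    return repeated

def square(v):
    return v * v

def sum_leading_axis(a):
    """Sum a table over its first axis; a plain list sums to a number."""
    if isinstance(a[0], list):
        return [sum(col) for col in zip(*a)]
    return sum(a)

x = [[5 * i + j for j in range(5)] for i in range(3)]   # like i. 3 5

print(power(square, 2)(5))                      # 625
print([power(square, k)(2) for k in range(4)])  # [2, 4, 16, 256]
print(power(sum_leading_axis, 2)(x))            # 105
```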

     The calculation of any set of marginal
totals in an arbitrary array may be
accomplished by first transposing the array so
that the axes to be summed over occur first
and then summing over these axes. We shall
specify the axes by a boolean list of length
equal to the rank of the array with 0s
indicating the axes being summed over. For
example, consider summing over the first, third
and fourth axes of an array a, where $a is
2 3 4 5, say, so that the boolean list
specifying the summation is 0 1 0 0. Since
/:0 1 0 0 is equal to 0 2 3 1, the
transposition may be accomplished by the
expression 0 2 3 1|:a which gives an array
of shape 2 4 5 3. Application of the verb +/
three times, the number of 0s in the boolean
list specifying the marginal totals, to this array
will give the desired marginal totals. These
calculations may be accomplished simply by
the explicit verb
     T=. '' :
          '+/^:(+/-.x.)(/:x.)|:y.'
where the right argument gives the array and
the left argument specifies the marginal totals
required. Then the above example is given by
0 1 0 0 T a. The weighted sum of squares is
then given by
     SS=. (+/@((*:@,)@T)) % 
          (*/@(-.@[ # $@]))
where the arguments are the same as for T.
Finally all sums of squares for an arbitrary
array may be calculated by the verb
     allSS=. (>@tt@#@$) SS"1 _ ]  ,
where tt is the truth table verb
     tt=. ,@{@(]@#&(<0 1))
introduced earlier in the paper. We note that
the rank conjunction is used to apply SS to
each row of the table giving the items in the
truth table generated as the left argument and
to an array of arbitrary rank given as the right
argument.

     As an example of the use of the above
verbs consider the array x introduced earlier in
this section. Then 0 0 T x is the sum 105 of
the items of x, 0 1 T x is the column sums 
15 18 21 24 27, 1 0 T x is the row sums
10 35 60, and 1 1 T x is the array x. The
corresponding values of the weighted sums of
squares given by SS are 735, 765, 985 and
1015, respectively. These are given in the same
order by the expression allSS x since the
rank of x is 2 and tt 2 is 
     +---+---+---+---+
     |0 0|0 1|1 0|1 1|
     +---+---+---+---+  .
With these sums of squares the various sums
of squares of deviations from the mean
required in an analysis-of-variance table may
be found. For example, the total sum of
squares for the above data is given by
     -/3 0{allSS x 
which has the value 280.
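
     The marginal totals and weighted sums of
squares can be verified for arrays of any rank
with the Python sketch below. It stores the
array as a dictionary from index tuples to
values, a representation chosen here for
brevity and not taken from the paper; the
function names are hypothetical.

```python
from itertools import product

def marginal_totals(mask, cells):
    """Sum cells over every axis whose mask entry is 0."""
    totals = {}
    for index, value in cells.items():
        kept = tuple(i for i, keep in zip(index, mask) if keep)
        totals[kept] = totals.get(kept, 0) + value
    return totals

def weighted_ss(mask, cells, shape):
    """Sum of squared marginal totals divided by the items in each total."""
    count = 1
    for size, keep in zip(shape, mask):
        if not keep:
            count *= size
    return sum(t * t for t in marginal_totals(mask, cells).values()) / count

def all_ss(cells, shape):
    """Weighted sums of squares for every mask, in truth-table order."""
    return [weighted_ss(mask, cells, shape)
            for mask in product((0, 1), repeat=len(shape))]

shape = (3, 5)
x = {(i, j): 5 * i + j for i in range(3) for j in range(5)}   # like i. 3 5

ss = all_ss(x, shape)
print(ss)              # [735.0, 765.0, 985.0, 1015.0]
print(ss[3] - ss[0])   # 280.0
```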


Acknowledgements
I have followed the advice I gave at the
beginning of this paper and have kept Ken
Iverson's monographs on J "conveniently at
hand" while writing this paper, and have found
them indispensable. I have benefited too from
Donald McIntyre's work on J, and, in
particular, I appreciate his assistance in
correcting my earlier work on the calculation of
variances and covariances. While writing this
paper, I have had some very helpful
correspondence from Eugene McDonnell.
Finally, I would like to thank Roger Hui for
responding so promptly and cheerfully to all of
the email I have sent to him.
   

References
Hoel, P. G., 1966. Elementary Statistics.
     Second edition. John Wiley & Sons,
     Inc., New York.

Iverson, K. E., 1991a. ISI Dictionary of J.
     Iverson Software Inc., Toronto. 

Iverson, K. E., 1991b. Programming in J.
     Iverson Software Inc., Toronto.

Iverson, K. E., 1991c. Arithmetic. Iverson
     Software Inc., Toronto.

Iverson, K. E., 1992. An Introduction to J.
     Iverson Software Inc., Toronto.

McIntyre, D. B., 1991. "Language as an
     intellectual tool: From hieroglyphics to
     APL." IBM Systems Journal, vol. 30,
     no. 4, pp. 554 - 581.

Searle, S. R., 1966. Matrix Algebra for the
     Biological Sciences. John Wiley &
     Sons, Inc., New York.

Weaver, W., 1963. Lady Luck: The Theory of
     Probability. Doubleday & Company,
     Inc., Garden City, N. Y.


Appendix. Multiple regression calculations
     reg
+---------------------------------------------------------+-++
|b=. (y=. depvar y.)%.X=. indepvar y.                     |:||
|sst=. +/*:(y-am y)                                       | ||
|ssr=. sst-sse=. +/*:(y-yest=. X +/ . * b)                | ||
|F=. (msr=. ssr%k)%mse=. sse%_1+(n=. $y)-k=. <:#y.        | ||
|rsq=. ssr%sst                                            | ||
|seb=. %:(0{mse)*(<1 0)|:%.(|:X)+/ . * X                  | ||
|r=. 49{.'             Var.    Coeff.      S.E.         t'| ||
|r=. r, 15.0 12.5 12.5 10.2 ": (i. >:k),. b,. seb,. b%seb | ||
|r=. r, ' '                                               | ||
|r=. r, '  Source     D.F.    S.S.        M.S.         F' | ||
|r=. r, 'Regression', 5.0 12.5 12.5 10.2 ": k, ssr,msr,F  | ||
|r=. r, 'Error     ', 5.0 12.5 12.5": (n-k+1), sse, mse   | ||
|r=. r, 'Total     ', 5.0 12.5 ": (n-1), sst              | ||
|r=. r, ' '                                               | ||
|r=. r, 'S.E. of estimate    ', 10.5":%:mse               | ||
|r=. r, 'Corr. coeff. squared', 10.5": rsq                | ||
+---------------------------------------------------------+-++

      reg d0
             Var.    Coeff.      S.E.         t  
              0     3.99429     0.35656     11.20
              1     0.10286     0.01104      9.32
                                                 
  Source     D.F.    S.S.        M.S.         F  
Regression    1    10.66423    10.66423     86.87
Error         5     0.61377     0.12275          
Total         6    11.27800                      
                                                 
S.E. of estimate       0.35036                   
Corr. coeff. squared   0.94558                   


      reg d
             Var.    Coeff.      S.E.         t  
              0     9.59821     0.94950     10.11
              1     1.18750     1.06075      1.12
              2     0.41964     0.53529      0.78
                                                 
  Source     D.F.    S.S.        M.S.         F  
Regression    2    27.15179    13.57589     17.34
Error         3     2.34821     0.78274          
Total         5    29.50000                      
                                                 
S.E. of estimate       0.88472                   
Corr. coeff. squared   0.92040                   
  
   
      reg 0 2{d
             Var.    Coeff.      S.E.         t  
              0     9.21951     0.77705     11.86
              1     1.97561     0.32173      6.14
                                                 
  Source     D.F.    S.S.        M.S.         F  
Regression    1    26.67073    26.67073     37.71
Error         4     2.82927     0.70732          
Total         5    29.50000                      
                                                 
S.E. of estimate       0.84102                   
Corr. coeff. squared   0.90409                   
