Introduction to arrays
The array is APL’s fundamental data type. Arrays are collections of scalars (atomic data units). There are a few types of scalars: numbers, characters, and references (refs). References are to such things as namespaces (≈JSON objects), GUI objects (WinForms), HTML Renderers, classes, instances, etc. Let’s not worry about all those now.
Characters are denoted by single quotes. 'a' is a scalar letter a. APL doesn’t really have strings, just lists (vectors in APL lingo) of characters. In order to write a literal vector (=list) you just write the items next to each other. 'H' 'e' 'y' will render as Hey:
'H' 'e' 'y'
Hey
Fortunately, there is a shortcut. APL allows you to write 'Hey' and it means the same as 'H' 'e' 'y':
'Hey'
Hey
Similarly, a list of numbers needs no decorators whatsoever: 1 2 3
1 2 3
1 2 3
You can also nest items. 'Hey' 'you!' is a vector of two elements. Each element is itself a vector.
'Hey' 'you!'
┌───┬────┐
│Hey│you!│
└───┴────┘
You can also mix data types: 'APL'360 is a two-element vector. The first element is a three-element vector of chars, the second is a scalar number.
'APL'360 ⍝ Note: no space required
┌───┬───┐
│APL│360│
└───┴───┘
By the way, in APL, a number is a number. APL converts between internal representations on the fly, so you never have to worry about such conversions. Comparisons are even tolerant of tiny floating-point rounding errors, so you rarely have to think about floating-point imprecision either!
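As a small illustration of the point about floating-point imprecision, here is a sketch of a session, assuming Dyalog APL with its default comparison tolerance (the system variable ⎕CT, which is not covered above). Input is indented six spaces and output is flush left; a result of 1 means "true".

```apl
      0.1+0.2          ⍝ internally not exactly 0.3 in binary floating point
0.3
      0.3=0.1+0.2      ⍝ = is a tolerant comparison, so this is true
1
      3=6÷2            ⍝ an integer compared with the result of a division
1
```

In many other languages the second comparison would report false, because the two sides differ in the last binary digit.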
'a'3 is a two-element vector. No space needed here, either.
'a'3
a 3
You can also use parentheses to delimit vectors:
(1 2 3)(4 5)
┌─────┬───┐
│1 2 3│4 5│
└─────┴───┘
Question:
Is there any concept of a “mutable array” in APL?
Nope. You always (appear to) create a new array when modifying an array. Internally, though, APL keeps a reference count and points multiple names to the same memory location where possible. The “reference” types, on the other hand, are all mutable.
The levels of nesting in APL lingo are called depth. A simple scalar has depth 0. A simple vector has depth 1. A vector of vectors has depth 2, etc. If the nesting is uneven (the items are not all nested to the same level), the depth is reported as a negative number whose magnitude is the deepest level of nesting. Note that negative numbers in APL are denoted by a high minus, ¯ (like on TI calculators).
You can have 1-element vectors, but you have to “create” them rather than write them. The prefix function , (comma) takes an array and makes it into a list. So ,6 is a one-element list.
]display ,6 ⍝ Verbose display to demonstrate that ,6 is indeed a vector
]display (,6)1 2
]display (,6)(1 2)
┌→┐
│6│
└~┘
┌→────────┐
│ ┌→┐     │
│ │6│ 1 2 │
│ └~┘     │
└∊────────┘
┌→──────────┐
│ ┌→┐ ┌→──┐ │
│ │6│ │1 2│ │
│ └~┘ └~──┘ │
└∊──────────┘
APL also has a concept of rank. The rank of an array is the number of dimensions in that array. A scalar has rank 0, a vector has rank 1.
However, we can also have a rank-2 array: a matrix, or table. Note that rank ≠ depth. So I can have a matrix where every element is a “string” (i.e. a vector). I can also have a vector of vectors of “strings”.
Rank is always flush. Every row in a matrix must have the same number of columns. Every layer in a 3D block of data must have the same number of rows and columns.
Each APL implementation has a different max number of dimensions. Dyalog allows 15D arrays. If that isn’t enough for you, you may be doing something not quite right. J, which is a dialect of APL (and the mother of Jelly) allows for an unlimited (except by memory) number of dimensions.
To picture an array with many dimensions, imagine a piece of paper with a grid of letters, so we have rows and columns. Each paper is a page in a book. Each book is numbered on its shelf. The shelves are numbered. There are multiple bookcases next to each other, and several such corridors of bookcases, in rooms next to each other. Each floor has multiple numbered corridors, etc.
The infix function reshape, ⍴ (Greek letter “rho” for reshape), takes a list of dimension lengths as left argument and any data as right argument. It returns a new array with the specified dimensions, filled with the data. If there is too much data, the tail just doesn’t get used. If there is too little, it gets recycled from the beginning.
We can create a 3-row, 4-column table with
3 4⍴'abcdefghijkl'
abcd
efgh
ijkl
3 4⍴'abc' ⍝ insufficient data; keep recycling
abca
bcab
cabc
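The examples above cover an exact fit and too little data. A brief sketch of the remaining cases, written as a session transcript (input indented six spaces, output flush left) and assuming Dyalog's default display, might look like this:

```apl
      2 3⍴'abcdefghijkl'     ⍝ too much data: only 'abcdef' is used
abc
def
      2 2 3⍴'abcdefghijkl'   ⍝ three dimensions: 2 layers of 2 rows and 3 columns
abc
def

ghi
jkl
      5⍴1 2                  ⍝ a single dimension length gives a vector
1 2 1 2 1
```

The blank line in the second result separates the two layers of the 3D array.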
Most primitive APL functions have both a monadic (one argument) and a dyadic (two arguments) form. It is always clear from context which one is being applied, as all monadic functions are prefix, and all dyadic ones are infix.
We already addressed the dyadic ⍴ which was Reshape. The monadic ⍴ is Shape. It reports back what the shape is.
3 3⍴⎕A
⍴3 3⍴⎕A ⍝ What is the shape of a 3x3 matrix?
]display ⍴1 2 3 4 ⍝ What is the shape of a vector?
]display ⍴6 ⍝ What is the shape of a scalar?
ABC
DEF
GHI
3 3
┌→┐
│4│
└~┘
┌⊖┐
│0│
└~┘
Note that the shape is always a vector. The shape of a scalar is the empty numeric vector, denoted ⍬.
Monadic ↑ is mix, which ups the rank (at the cost of one level of depth). We can also lower the rank with split, ↓, and thereby gain a level of depth.
(1 2 3)(4 5 6)(7 8 9) ⍝ vector
↑(1 2 3)(4 5 6)(7 8 9) ⍝ mix vector to a matrix
3 4⍴⍳12 ⍝ matrix
↓3 4⍴⍳12 ⍝ split matrix to a vector
┌─────┬─────┬─────┐
│1 2 3│4 5 6│7 8 9│
└─────┴─────┴─────┘
1 2 3
4 5 6
7 8 9
 1  2  3  4
 5  6  7  8
 9 10 11 12
┌───────┬───────┬──────────┐
│1 2 3 4│5 6 7 8│9 10 11 12│
└───────┴───────┴──────────┘
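Since rank is always flush, it is worth seeing what mix does when the inner vectors differ in length. The following is a sketch of a session (input indented six spaces, output flush left), assuming Dyalog's default settings with boxing switched on as in the output above; the padding behaviour is not stated in the text but is how mix behaves there.

```apl
      ↑'Hey' 'you!'      ⍝ 'Hey' is padded with a space to match 'you!'
Hey 
you!
      ⍴↑'Hey' 'you!'     ⍝ 2 rows, 4 columns
2 4
      ↓↑'Hey' 'you!'     ⍝ splitting again keeps the padding
┌────┬────┐
│Hey │you!│
└────┴────┘
```

So mix followed by split is only a round trip when all the inner vectors already have the same length.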
There is no primitive for rank, because if you think about it, the rank is the shape (actually, the tally) of the shape. There is, however, a primitive for depth: ≡
≡(1 2 3)(4 5 6)(7 8 9) ⍝ vector of vectors
2
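To tie this back to the earlier description of depth, here is a short sketch of what ≡ reports for the cases mentioned there. It assumes Dyalog APL with its default migration level (⎕ML, not covered here), under which uneven nesting gives a negative result. Input is indented six spaces and output is flush left.

```apl
      ≡6                 ⍝ simple scalar
0
      ≡1 2 3             ⍝ simple vector
1
      ≡'Hey' 'you!'      ⍝ vector of vectors
2
      ≡(1 2 3)4          ⍝ uneven: a vector next to a scalar
¯2
```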
There is a different primitive for count (called tally): ≢, which looks like a tallying mark.
≢7 5 6 3 2
5
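With both shape and tally available, the earlier remark that the rank is the tally of the shape can be tried out directly. A minimal sketch (session style: input indented six spaces, output flush left):

```apl
      ≢⍴6                ⍝ scalar: the shape is empty, so the rank is 0
0
      ≢⍴1 2 3 4          ⍝ vector
1
      ≢⍴3 3⍴⎕A           ⍝ matrix
2
      ≢⍴2 3 4⍴⎕A         ⍝ 3D block
3
```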
Question:
So… what is the difference between rank and depth?
This is important to understand. Depth is the level of nesting. Rank is the number of dimensions.
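One way to see the difference is to measure both on the same arrays. The sketch below uses ≢⍴ for rank and ≡ for depth on a simple matrix, a matrix of “strings” and a vector of vectors of “strings”; the sample words are just illustrative data. Input is indented six spaces, output is flush left.

```apl
      ≢⍴3 4⍴⎕A                             ⍝ simple matrix: rank 2 ...
2
      ≡3 4⍴⎕A                              ⍝ ... but depth 1
1
      ≢⍴2 2⍴'Hey' 'you' 'there' 'APL'      ⍝ matrix of strings: rank 2 ...
2
      ≡2 2⍴'Hey' 'you' 'there' 'APL'       ⍝ ... and depth 2
2
      ≢⍴('Hey' 'you')('there' 'APL')       ⍝ vector of vectors of strings: rank 1 ...
1
      ≡('Hey' 'you')('there' 'APL')        ⍝ ... but depth 3
3
```

Rank only looks at the outermost array, while depth looks all the way down into the items.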
So now we have discovered monadic ↑, ↓, ≡, ≢, ⍴ and dyadic ⍴. Monadic ⍴ always returns a vector. Monadic ≢ always returns a scalar. ≢ on a matrix returns the number of rows. ≢ on a 3D block returns the number of layers, etc. We prefer to call it the tally of “major cells”. The concept of major cells is important when it comes to manipulating and comparing arrays.
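A quick sketch of the difference between shape and tally on arrays of higher rank (session style: input indented six spaces, output flush left):

```apl
      ≢3 4⍴⎕A            ⍝ matrix: 3 major cells (rows)
3
      ≢2 3 4⍴⎕A          ⍝ 3D block: 2 major cells (layers)
2
      ⍴2 3 4⍴⎕A          ⍝ the shape, by contrast, lists every dimension
2 3 4
      ≢6                 ⍝ a scalar tallies as 1
1
```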
We already saw how dyadic ⍴ can reshape things. Dyadic ↑ is take. To make it easier to talk about its two arguments, we will give them names. The left argument we will call ⍺, as in the leftmost letter of the Greek alphabet, and the right argument we will call ⍵, as in the rightmost letter. In other words, ↑⍵ is monadic ↑ and ⍺↑⍵ is dyadic ↑.
⍺↑⍵ takes the first ⍺ major cells from ⍵:
3↑3 1 4 1 5
3 1 4
We can take major cells from the end of ⍵ by using a negative ⍺:
¯3↑3 1 4 1 5
4 1 5
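Because take works on major cells, the same expressions pick out whole rows when ⍵ is a matrix. A minimal sketch, assuming a current Dyalog APL where a single number is accepted as ⍺ for a matrix (input indented six spaces, output flush left):

```apl
      2↑3 4⍴'abcdefghijkl'      ⍝ the first two rows
abcd
efgh
      ¯1↑3 4⍴'abcdefghijkl'     ⍝ the last row
ijkl
      ⍴¯1↑3 4⍴'abcdefghijkl'    ⍝ the result is still a matrix, with one row
1 4
```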
APL arrays have something called a prototype. The prototype for numbers is 0 and the prototype for chars is a space. The prototype for a mixed-type array is the first element’s prototype. More generally, for an array of arrays, the prototype is the first element, but with all numbers made 0 and all chars made spaces. If you take more than there is, APL will pad with this prototype element:
10↑1 2 3
¯10↑1 2 3
]display 10↑'Hello'
1 2 3 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 2 3
┌→─────────┐
│Hello     │
└──────────┘
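The nested and mixed cases described above can be sketched in the same way. This assumes Dyalog APL with boxing switched on, as in the earlier nested output; input is indented six spaces and output is flush left.

```apl
      3↑(1 2)(3 4)       ⍝ padded with the first element's shape, numbers made 0
┌───┬───┬───┐
│1 2│3 4│0 0│
└───┴───┴───┘
      3↑'ab' 'cd'        ⍝ padded with two spaces
┌──┬──┬──┐
│ab│cd│  │
└──┴──┴──┘
      4↑1 'a'            ⍝ mixed types: the first element is a number, so pad with 0
1 a 0 0
```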