Variant in depth

Variant, ⍠, is a dyadic operator, but it is quite unlike all other operators in APL. Syntactically, though, it is normal: it always takes a function (monadic or dyadic) on its left and an array on its right. Although it is usually called Variant, you can also call it Option. In fact, it has a system operator synonym, ⎕OPT.

Variant is special in that it sets options in an invisible set of options. You can’t access this set directly, only observe modified behaviour in the operand function, because the operand function will check this set to know what to do.

This also means that, uniquely, the operand function will “know” that it is being called as an operand of ⍠. Usually, functions can’t (easily) detect who called them. The left operand (the function) must be one of a fixed set of system functions (or functions derived from system operators).

The right operand must be one of:

  • a scalar (this one is known as the principal option)

  • a 2-element key-value pair

  • a vector of 2-element key-value pairs.

The scalar operand is only allowed if a default key exists, in which case it is equivalent to the pair 'DefaultKey' value. Let’s take an example. You might know about the system function to convert to and from JSON:

⎕JSON⍳3
[1,2,3]

We can use ⍠ with the key 'Compact' to change the white-space behaviour of ⎕JSON. In essence, ⍠ sets the Compact setting to the corresponding value (0 or 1 in this case):

⎕JSON⍠'Compact' 0 ⍳3
[
  1,
  2,
  3
]

There are other options too. By default, ⎕JSON will convert a JSON null to an APL enclosed string ⊂'null':

(⊂'null') ≡ ⎕JSON'null'
1

However, if you instead want it converted to an object-type null, ⎕NULL, you can tell it so:

⎕NULL ≡ ⎕JSON⍠'Null' ⎕NULL ⊢ 'null'
1

Notice the ⊢. Whenever a dyadic operator has an array right operand, that operand will strand together with any literal right argument. There must be a function (or parentheses, or naming) to split them apart.
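The two alternatives mentioned above, parentheses and naming, could look like the following sketch. FromJSON is a hypothetical name chosen only for illustration.

```apl
⍝ Parentheses end the right operand before the argument begins
⎕NULL ≡ (⎕JSON⍠'Null' ⎕NULL) 'null'
⍝ Naming the derived function has the same effect (FromJSON is a made-up name)
FromJSON ← ⎕JSON⍠'Null' ⎕NULL
⎕NULL ≡ FromJSON 'null'
```

Both comparisons should give 1, just like the version that uses a function as separator.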

Another option for ⎕JSON is to convert JSON into an APL matrix that describes the JSON, rather than attempting to actually convert to an equivalent APL structure:

⎕JSON⍠'Format' 'M' ⊢ '[1,null,"hello"]'
┌─┬┬──────┬─┐
│0││      │2│
├─┼┼──────┼─┤
│1││1     │3│
├─┼┼──────┼─┤
│1││┌────┐│5│
│ │││null││ │
│ ││└────┘│ │
├─┼┼──────┼─┤
│1││hello │4│
└─┴┴──────┴─┘

The exact details of this Matrix Format aren’t important here, though. You can check out the docs. Now that we know about a couple of options, we can look at how to specify multiple options. We can create a “dictionary” of key-value pairs:

⎕JSON⍠('Format' 'M')('Null' ⎕NULL) ⊢ '[1,null,"hello"]'
┌─┬┬────────┬─┐
│0││        │2│
├─┼┼────────┼─┤
│1││1       │3│
├─┼┼────────┼─┤
│1││ [Null] │5│
├─┼┼────────┼─┤
│1││hello   │4│
└─┴┴────────┴─┘

Notice how we both got a matrix, and the null became [Null] (the text representation of ⎕NULL) rather than an enclosed 'null'. We can also use ⍠ twice:

⎕JSON⍠'Format' 'M'⍠'Null' ⎕NULL ⊢ '[1,null,"hello"]'
┌─┬┬────────┬─┐
│0││        │2│
├─┼┼────────┼─┤
│1││1       │3│
├─┼┼────────┼─┤
│1││ [Null] │5│
├─┼┼────────┼─┤
│1││hello   │4│
└─┴┴────────┴─┘

If we check the docs for ⎕JSON, we’ll see that 'Format' is the principal option, which means we can specify it as a scalar:

⎕JSON⍠'M'⍠'Null' ⎕NULL ⊢ '[1,null,"hello"]'
┌─┬┬────────┬─┐
│0││        │2│
├─┼┼────────┼─┤
│1││1       │3│
├─┼┼────────┼─┤
│1││ [Null] │5│
├─┼┼────────┼─┤
│1││hello   │4│
└─┴┴────────┴─┘

What happens if we set the same option twice with different values? The rightmost one takes precedence. There are two ways you can think of it, both leading to that same conclusion:

  1. ⍠ (like any operator) modifies its operand function. For simplicity, let’s say we have two monadic operators acting on a function, f op1 op2: op2 gets to modify the derived function f op1. That is, the rightmost has the final say.

  2. When we evaluate, we first have to process the inner derived function’s operator (as in the previous point), which sets the hidden option. Then we proceed to the outer operator, which in turn overwrites the state. Only then is the function allowed to run, picking up the setting set by the rightmost (outer) operator.
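A quick way to see this in practice is to set Compact twice with conflicting values and compare against the default output. This is a minimal sketch, assuming that Compact defaults to 1, as the first ⎕JSON example suggests:

```apl
⍝ Inner ⍠ asks for expanded output, outer (rightmost) ⍠ asks for compact
(⎕JSON⍳3) ≡ ⎕JSON⍠'Compact' 0⍠'Compact' 1⊢⍳3
```

If the rightmost setting wins, the comparison gives 1; swapping the two values should make it 0.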

Variant is also used with ⎕R and its sibling ⎕S. If you’re not familiar with ⎕R: Briefly, it is a dyadic operator, Replacing occurrences of its left operand with its right operand, in the right argument:

's'⎕R'S' ⊢ 'mississippi'
miSSiSSippi

This replaces all lowercase s with uppercase S. Let’s say we only want to replace the first 2. We can set the Match Limit to 2. The option key to use for this is 'ML'.

's'⎕R'S'⍠'ML'2 ⊢ 'mississippi'
miSSissippi

⎕R is an operator. It takes two operands, in our case 's' and 'S', and derives a new function. It is this derived function that ⍠ needs to act upon by taking it as its left operand. So the order is FunctionToBeModified ⍠ options ⊢ argument. Alternatively, we can parenthesise: (FunctionToBeModified ⍠ options) argument.

('s'⎕R'S'⍠'ML'2) 'mississippi'
miSSissippi

A dyadic operator can be bound with its right operand and then named, which gives a derived monadic operator:

ReplaceWithS←⎕R'S'
's'ReplaceWithS 'mississippi'
miSSiSSippi

This also means we can name the combination of ⍠ with one or more options.

OnlyTwo←⍠'ML'2
's'⎕R'S'OnlyTwo 'mississippi'
miSSissippi

We can even do both:

ReplaceWithS←⎕R'S'
OnlyTwo←⍠'ML'2
's'ReplaceWithS OnlyTwo 'mississippi'
miSSissippi

A really common thing with regexes is wanting case insensitivity. That is 'IC'1 (Ignore Case), but it is also the principal option:

'ss'⎕R'__'⍠1⊢'MISSissippi'
MI__i__ippi

But the scalar form only works if that is the only setting you’re changing with that ⍠. You can always use ⍠ twice, though:

's'⎕R'_'⍠'ML'3⍠1⊢'MISSissippi'
MI__i_sippi

Here is another example where we use ⍠ on ⎕R to do something entirely unrelated to regular expressions. Sometimes, your input can be of various forms and you need to normalise it. Say you get some text, but it could be a character scalar, a character vector, a vector of character vectors, an enclosed character vector, or even a character vector with literal newlines. We want to normalise all of these to become a vector of character vectors.

VecOfVecs←''⎕R''⍠'ResultText' 'Nested'
VecOfVecs 'a'
VecOfVecs 'abc'
VecOfVecs 'abc' 'def'
VecOfVecs ⊂'abc'
VecOfVecs 'abc',(⎕UCS 10),'def'
┌─┐
│a│
└─┘
┌───┐
│abc│
└───┘
┌───┬───┐
│abc│def│
└───┴───┘
┌───┐
│abc│
└───┘
┌───┬───┐
│abc│def│
└───┴───┘

Note that Dyalog often adds options to existing system functions based on customer demand. Case in point: in version 18.0, options were added to ⎕JSON to automatically split high-rank arrays so they can be represented as JSON, and to process and generate JSON5. ⎕R and ⎕S gained an option to turn regexes off, so you can do literal replacements without having to escape characters that have special meaning in PCRE.
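For illustration, here is roughly how those additions might be used. The option names and values below ('HighRank' 'Split', 'Dialect' 'JSON5' and 'Regex' 0) are not taken from this text; they are given as best recalled from the Dyalog documentation, so treat them as assumptions and check the docs for your version.

```apl
⍝ Represent a matrix as JSON by splitting it into nested vectors
⎕JSON⍠'HighRank' 'Split'⊢2 2⍴⍳4
⍝ Generate JSON5 rather than plain JSON
⎕JSON⍠'Dialect' 'JSON5'⊢⍳3
⍝ Treat the patterns literally: the dot is a plain dot, not "any character"
'.'⎕R'!'⍠'Regex' 0⊢'a.b.c'
```

Without the last option, the pattern '.' would match every character instead of only the dots.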

One more usage of ⍠ isn’t really related to this, and we can’t demonstrate it easily here, either. When using external .NET methods, APL will coerce its arrays into an appropriate type for the called method. However, .NET methods can be overloaded (different code depending on the type of the argument), and then APL can’t know which one you want. You can use ⍠ with the method and the option 'OverloadTypes' to choose. The value has to be a .NET data type, e.g. Double or Int32. This option is the principal option too, so the call can be written simply as MyDotNetMethod⍠Double argument. If the method takes multiple arguments, you can specify a vector of types: MyDotNetMethod⍠(⊂Double Int32) argument

Notice two things:

  1. The types are not quoted names; they are scalar references to the .NET types.

  2. When specifying a vector of types, it must be enclosed, as the principal option must be a scalar.