Unique mask in depth ≠#

Let’s have a look at monadic ≠, called called unique mask or nub sieve. Note that it isn’t particularly related to the dyadic form (unequal). Instead, it relates to unique, ∪. Unique returns a subset of the major cells of its argument. Unique mask returns a Boolean vector which, when used as left argument to ⌿ and with the original argument as right argument, returns the same as unique would on the original argument:

∪'mississippi'
≠'mississippi'
{(≠⍵)⌿⍵}'mississippi'

misp

1 1 1 0 0 0 0 0 1 0 0

misp


Why might we need such a function? Compared with ∪Y, you can use the results from ≠Y to filter other arrays, or indeed to do other computations. It is as if ∪Y already applied their implied information before you had a chance to use that info for what you wanted. It’s also worth noting that the result of ≠Y is much more light-weight than ∪Y, in that it only ever has one bit per major cell, while ∪Y could end up duplicating a lot of data.

Another thing you can do with the mask is to combine it with other masks:

m←⎕←(≠∧∊∘'aeiou')t←'hello world'
m/t

0 1 0 0 1 0 0 0 0 0 0

eo


which gives the unique vowels. Of course, in this case, you could equally well write 'aeiou'∩'hello world' but this example is to illustrate the concept.

Here’s another example. Given some text (simple character vector) t, return a matrix so that the first instance of each occurring character is “underlined”. Here’s one approach:

F1 ← ↑⊢,⍥⊂'-'\⍨≠

F1 'mississippi'
F1 'hello world'

mississippi
---     -

hello world
--- --- - -


Here’s another,

F2 ← {⎕IO←0⋄↑⍵(' _'[≠⍵])}

F2 'mississippi'
F2 'hello world'

mississippi
___     _

hello world
___ ___ _ _


A final example: given a vector, return the set of elements which have duplicates, preserving order:

F3 ← (∪∩{⍵/⍨~≠⍵}) ⍝ h/t @bubbler

F3 'mississippi'
F3 'hello world'

isp

lo