multimodal – Tutorials for learning q/kdb+

The mode is the value that appears most often in a set of data.

Like the statistical mean and median, the mode is a way of expressing, in a single number, important information about a random variable or a population.

The mode is not necessarily unique, since the same maximum frequency may be attained at different values.

Source: Wikipedia

The mode of a sample is the element that occurs most often in the collection.

For example: Mode of the sample {3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29} is 23.

It is not necessary that the mode for a given sample is always unique. There could be multiple modes of the given dataset, if it have 2 modes then we call the dataset as bimodal and if it have more than 2 modes then we call the dataset as multimodal.

bimodal example : {1, 3, 3, 3, 4, 4, 6, 6, 6, 9} have 2 modes 3 & 6.

There is currently no built in function in q for finding the mode. Lets try writing it, a simple function could be :

q)mode1:{key[d]where max[c]=c:count each value d:group x}
q)mode1  1 2 3 2 4 5 3 5 6 4 3 2
2 3

The above function is simple but we are not utilizing the features which comes automatically with the dictionary datatype in q.

Here is the new definition, which will return all the modes of the input sample:

q)mode:{where max[c]=c:count each d:group x}
q)mode  1 2 3 2 4 5 3 5 6 4 3 2
2 3

Note that the count, max and where works in a different way in case of dictionary. Here “count each” is actually counting the values corresponding to each key and returns a dictionary, max looks up the maximum value of dictionary range and where returns the key for true dictionary range.

The k equivalent of the above mode function is

k)mode:{&:max[c]=c:#:'=:x}

Share this: