Step Function

A mathematical function of a single variable that remains constant within each of a series of adjacent intervals but changes in value from one interval to the next.

Examples:

  • Mathematical functions such as the floor, ceiling and signum functions
  • A constant function, e.g. y=1, is a trivial example of a step function
  • The rectangular function (the normalized boxcar function) is the next simplest step function and is used to model a unit pulse.
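
As a minimal sketch of the rectangular function in q, assuming the common convention that its value is 1 for |x|<0.5, 0.5 at |x|=0.5 and 0 elsewhere (the name rect is hypothetical and not a q built-in):

```q
q)rect:{?[0.5>abs x;1f;?[0.5=abs x;0.5;0f]]}   / nested vector conditional, applied to a list
q)rect -1 -0.5 0 0.25 0.5 1
0 0.5 1 1 0.5 0
```

The function is constant on each of the intervals it distinguishes, which is what makes it a step function.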

Figure: ceiling step function

Step function in KDB

Signum function

The function signum returns -1, 0 or 1 if the argument is negative, zero or positive respectively. It can be applied item-wise to lists, dictionaries and tables, and to all data types except symbol.

q)signum  -1 -3 9 0 2 -2 3
-1 -1 1 0 1 -1 1i
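
A short sketch of the item-wise behaviour described above, on floats and on a dictionary. The dictionary d is made up for illustration, and value is used only to show the resulting values on one line:

```q
q)signum -2.5 0 3.1          / floats
-1 0 1i
q)d:`a`b`c!-4 0 7            / hypothetical dictionary
q)value signum d             / signum is applied to each value; keys are kept
-1 0 1i
```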

Floor function

The floor function maps a real number to the largest previous integer. More precisely, floor(x) = \lfloor x\rfloor is the largest integer not greater than x.

q)floor 8.4 6.3 9.3 5.4 3.8 9.7 8.8 5.8 6.8 4.5
8 6 9 5 3 9 8 5 6 4

Ceiling function

The ceiling function maps a real number to the smallest following integer. More precisely, ceiling(x) = \lceil x \rceil is the smallest integer not less than x.

q)ceiling 8.4 6.3 9.3 5.4 3.8 9.7 8.8 5.8 6.8 4.5
9 7 10 6 4 10 9 6 7 5
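
The two definitions differ most visibly on negative and integer-valued arguments. A small sketch on hand-picked inputs:

```q
q)floor -2.5 -0.1 0 0.1 2.5      / largest integer not greater than x
-3 -1 0 0 2
q)ceiling -2.5 -0.1 0 0.1 2.5    / smallest integer not less than x
-2 0 0 1 3
```

Both functions round toward a fixed direction rather than toward zero, so floor of -2.5 is -3 and not -2.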

Sorted attribute (`s#)

The sorted attribute (`s#), when applied to a dictionary, turns the dictionary into a step function: a lookup returns the value for the largest key not greater than the argument.

q)d:`s#1 3 5 7 9!1 3 25 49 81
q)d[6]
25
q)d[8]
49
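
To make the effect of the attribute concrete, the sketch below compares a plain dictionary with its step-function counterpart. The names plain and step are hypothetical:

```q
q)plain:1 3 5 7 9!1 3 25 49 81
q)plain 6            / no key 6 in a plain dictionary: null
0N
q)step:`s#plain
q)step 6             / value for the largest key not greater than 6
25
q)step 0             / below the first key: null
0N
q)step 100           / above the last key: last value
81
```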

Temporal Data

In traditional RDBMSs, temporal changes in data are often represented by adding valid-time interval information to each relationship. This is usually achieved by adding start and end columns to the relational tables.
Applying the sorted attribute (`s#) to a keyed table gives a similar effect in KDB. See Temporal data for more.
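
As a rough illustration of valid-time lookup, the sketch below uses a date-keyed step dictionary rather than a keyed table, since the dictionary case is the simpler one. The name rate and its values are made up for illustration; each key is the date from which the value applies:

```q
q)rate:`s#2020.01.01 2020.04.01 2020.07.01!1.5 1.75 2.0
q)rate 2020.05.15    / falls in the interval starting 2020.04.01
1.75
q)rate 2019.12.31    / before the first start date: null
0n
```

Only a start date is stored for each interval; the end of an interval is implied by the next key.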

Parted attribute (`p#)

The parted attribute (`p#) indicates that the list represents a step function in which all occurrences of a particular output value are adjacent. The range is an int or temporal type that has an underlying int value, such as years, months, days, etc. You can also partition over a symbol provided it is enumerated.

q)`p#5 3 3 4 4 4 1 1 1 2
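
A small sketch of checking the attribute with attr, and of what happens when equal values are not adjacent. The trap handler and the symbol notParted are hypothetical, used here only to avoid showing a raw error:

```q
q)attr `p#5 3 3 4 4 4 1 1 1 2          / attribute was applied
`p
q)@[{`p#x};5 3 5;{`notParted}]         / the two 5s are not adjacent, so `p# signals an error
`notParted
```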

Mode

The mode is the value that appears most often in a set of data.

Like the statistical mean and median, the mode is a way of expressing, in a single number, important information about a random variable or a population.

The mode is not necessarily unique, since the same maximum frequency may be attained at different values.

Source: Wikipedia

The mode of a sample is the element that occurs most often in the collection.

For example: Mode of the sample {3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29} is 23.

The mode of a given sample is not necessarily unique. A dataset may have multiple modes: with two modes it is called bimodal, and with more than two it is called multimodal.

Bimodal example: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9} has two modes, 3 and 6.

There is currently no built-in function in q for finding the mode. Let's try writing one. A simple function could be:

q)mode1:{key[d]where max[c]=c:count each value d:group x}
q)mode1  1 2 3 2 4 5 3 5 6 4 3 2
2 3
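
As a quick check, mode1 can be applied to the two samples given earlier. The definition is repeated so the snippet stands alone:

```q
q)mode1:{key[d]where max[c]=c:count each value d:group x}
q)mode1 3 7 5 13 20 23 39 23 40 23 14 12 56 23 29    / single mode, returned as a one-item list
,23
q)mode1 1 3 3 3 4 4 6 6 6 9                          / bimodal sample
3 6
```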

The above function is simple, but it does not make use of the behaviour that comes automatically with the dictionary datatype in q.

Here is the new definition, which will return all the modes of the input sample:

q)mode:{where max[c]=c:count each d:group x}
q)mode  1 2 3 2 4 5 3 5 6 4 3 2
2 3

Note that count, max and where behave differently on a dictionary. Here “count each” counts the items in the value for each key and returns a dictionary, max returns the maximum of the dictionary range, and where returns the keys whose range values are true.
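
The intermediate results can be inspected one step at a time. This is a sketch on the same input; d and c are defined at the console here rather than inside the function, and value is used only to show the boolean mask on one line:

```q
q)d:group 1 2 3 2 4 5 3 5 6 4 3 2    / distinct items mapped to their positions
q)c:count each d                     / dictionary: item -> number of occurrences
q)c
1| 1
2| 3
3| 3
4| 2
5| 2
6| 1
q)max c                              / maximum over the dictionary range
3
q)value max[c]=c                     / boolean mask, aligned with the keys 1 2 3 4 5 6
011000b
q)where max[c]=c                     / keys whose mask value is true
2 3
```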

The k equivalent of the above mode function is

k)mode:{&:max[c]=c:#:'=:x}