module analysis::statistics::Frequency

rascal-0.40.16

Frequency distributions.

Usage

import analysis::statistics::Frequency;

Dependencies

import util::Math;

Description

Counting the frequency of events is usually the first step in the statistical analysis of raw data. It involves choosing which events to count, deciding how to group them into categories, and then counting how often each event occurs.

This module provides commonly used functions for counting events. Their output can be used to draw (cumulative) histograms, or it can be fed directly into further statistical processing and visualisation.
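To illustrate how bucket counts feed a cumulative histogram, here is a minimal Python sketch. It is not part of this module: the helper name `cumulative` is hypothetical, and the input is assumed to be a bucket-to-count mapping shaped like the output of `distribution`, with orderable buckets.

```python
from itertools import accumulate

def cumulative(dist):
    """Turn bucket counts into running totals, in ascending bucket order.

    dist: mapping from bucket to count (hypothetical stand-in for the
    map returned by distribution).
    """
    buckets = sorted(dist)
    totals = accumulate(dist[b] for b in buckets)
    return dict(zip(buckets, totals))

# Bucket 0 holds 1 event, bucket 5 holds 2 events.
print(cumulative({5: 2, 0: 1}))
```

Each resulting value is the height of the corresponding bar in a cumulative histogram.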

function distribution

Compute a distribution: count how many events are mapped to each bucket.

map[&T, int] distribution(rel[&U event, &T bucket] input)

map[&T <: num, int] distribution(rel[&U event, &T <: num bucket] input, &T <: num bucketSize)

map[&T, int] distribution(map[&U event, &T bucket] input)

map[&T <: num, int] distribution(map[&U event, &T <: num bucket] input, &T <: num bucketSize)

Examples

rascal>import analysis::statistics::Frequency;
ok
rascal>distribution({<"chicken","animal">,<"bear","animal">,<"oak","plant">,<"tulip","plant">});
map[str, int]: ("plant":2,"animal":2)
rascal>distribution({<"alice",2>,<"bob",3>,<"claire",5>},5);
map[int, int]: (5:2,0:1)
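The counting can be modelled outside Rascal. The Python sketch below is an illustration, not the library's implementation. It assumes that the `bucketSize` variants snap each numeric bucket to the nearest multiple of `bucketSize`, which is consistent with the second example (2 goes to 0, while 3 and 5 go to 5). The tie-breaking rule (halves round up) is also an assumption.

```python
import math
from collections import Counter

def distribution(pairs, bucket_size=None):
    """Count how many events fall into each bucket.

    pairs: iterable of (event, bucket) tuples, standing in for a Rascal rel.
    bucket_size: if given, each numeric bucket is first snapped to the
    nearest multiple of bucket_size (assumed behaviour, ties round up).
    """
    counts = Counter()
    for _event, bucket in set(pairs):  # a rel is a set, so duplicates collapse
        if bucket_size is not None:
            bucket = math.floor(bucket / bucket_size + 0.5) * bucket_size
        counts[bucket] += 1
    return dict(counts)

kinds = [("chicken", "animal"), ("bear", "animal"), ("oak", "plant"), ("tulip", "plant")]
print(distribution(kinds))
print(distribution([("alice", 2), ("bob", 3), ("claire", 5)], 5))
```

For the map-based variants, the same sketch applies to the key/value pairs of the map (in Python, `some_dict.items()`).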

function cumFreq

Cumulative frequency of values less than or equal to a given value.

int cumFreq(list[value] values, num n)

int cumFreq(list[value] values, str s)

Returns the number of values in the list that are less than or equal to the given numeric or string value. Returns 0 if the given value is not comparable to the values in the list.

Examples

rascal>import analysis::statistics::Frequency;
ok
rascal>D = [1, 2, 1, 1, 3, 5];
list[int]: [1,2,1,1,3,5]
rascal>cumFreq(D, 1);
int: 3
rascal>cumFreq(D, 2);
int: 4
rascal>cumFreq(D, 10);
int: 6
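As a rough model of these semantics, the Python sketch below counts the values that are at most the given value. It is an illustration rather than the library's implementation, and it assumes that "not comparable" means a numeric argument against non-numeric values, or a string argument against non-string values.

```python
from numbers import Number

def cum_freq(values, v):
    """Count the values in the list that are <= v.

    Values of a different kind than v (number vs. string) are skipped,
    so a non-comparable argument yields 0 (assumed interpretation).
    """
    kind = str if isinstance(v, str) else Number
    return sum(1 for x in values if isinstance(x, kind) and x <= v)

D = [1, 2, 1, 1, 3, 5]
print(cum_freq(D, 1), cum_freq(D, 2), cum_freq(D, 10))
```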

function cumPct

Cumulative percentage of values less than or equal to a given value.

num cumPct(list[value] values, num n)

num cumPct(list[value] values, str s)

Returns the cumulative percentage of values less than or equal to the given value (n or s), expressed as a proportion between 0 and 1.

Examples

rascal>import analysis::statistics::Frequency;
ok
rascal>D = [1, 2, 1, 1, 3, 5];
list[int]: [1,2,1,1,3,5]
rascal>cumPct(D, 1);
num: 0.5
rascal>cumPct(D, 2);
num: 0.6666666666666666
rascal>cumPct(D, 10);
num: 1.0
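The proportion is the cumulative count divided by the number of values. A minimal Python sketch of that arithmetic follows. It is not the library's code, and it assumes a non-empty list of mutually comparable values.

```python
def cum_pct(values, v):
    """Proportion of values that are <= v, between 0 and 1.

    Assumes values is non-empty and every element is comparable to v.
    """
    at_most = sum(1 for x in values if x <= v)
    return at_most / len(values)

D = [1, 2, 1, 1, 3, 5]
print(cum_pct(D, 1), cum_pct(D, 2), cum_pct(D, 10))
```

For `D` above, 3 of the 6 values are at most 1, 4 of 6 are at most 2, and all 6 are at most 10.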

function pct

Percentage of values that are equal to a given value.

num pct(list[value] values, num n)

num pct(list[value] values, str s)

Returns the percentage of values that are equal to the given value (n or s), expressed as a proportion between 0 and 1.

Examples

rascal>import analysis::statistics::Frequency;
ok
rascal>D = [1, 2, 1, 1, 3, 5];
list[int]: [1,2,1,1,3,5]
rascal>pct(D, 1);
num: 0.5
rascal>pct(D, 2);
num: 0.16666666666666666
rascal>pct(D, 10);
num: 0.0
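Unlike `cumPct`, only exact matches are counted. A minimal Python sketch of the arithmetic follows. It is an illustration, not the library's implementation, and it assumes a non-empty list.

```python
def pct(values, v):
    """Proportion of values that are equal to v, between 0 and 1.

    Assumes values is non-empty.
    """
    matches = sum(1 for x in values if x == v)
    return matches / len(values)

D = [1, 2, 1, 1, 3, 5]
print(pct(D, 1), pct(D, 2), pct(D, 10))
```

For `D` above, the value 1 occurs 3 times out of 6, the value 2 occurs once, and 10 does not occur at all.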