Skip to main content

module Set

rascal-0.40.16

Library functions for sets.

Usage

import Set;

Dependencies

import Exception;
import List;
import util::Math;

Description

The following library functions are defined for sets:

function classify

Classify elements in a set.

map[&K,set[&V]] classify(set[&V] input, &K (&V) getClass)

Examples

We classify animals by their number of legs.

rascal>import Set;
ok

Create a map from animals to number of legs.

rascal>legs = ("bird": 2, "dog": 4, "human": 2, "snake": 0, "spider": 8, "millepede": 1000, "crab": 8, "cat": 4);
map[str, int]: ("snake":0,"spider":8,"human":2,"crab":8,"cat":4,"bird":2,"dog":4,"millepede":1000)

Define function nLegs that returns the number of legs for each animal (or 0 when the animal is unknown):

rascal>int nLegs(str animal){
>>>>>>> return legs[animal] ? 0;
>>>>>>>}
int (str): function(|prompt:///|(0,53,<1,0>,<3,1>))

Now classify a set of animals:

rascal>classify({"bird", "dog", "human", "spider", "millepede", "zebra", "crab", "cat"}, nLegs);
map[int, set[str]]: (
8:{"spider","crab"},
2:{"human","bird"},
4:{"cat","dog"},
1000:{"millepede"},
0:{"zebra"}
)

function group

Group elements in a set given an equivalence function.

set[set[&T]] group(set[&T] input, bool (&T a, &T b) similar)

Examples

We classify animals by their number of legs.

rascal>import Set;
ok

Create a map from animals to number of legs.

rascal>legs = ("bird": 2, "dog": 4, "human": 2, "snake": 0, "spider": 8, "millepede": 1000, "crab": 8, "cat": 4);
map[str, int]: ("snake":0,"spider":8,"human":2,"crab":8,"cat":4,"bird":2,"dog":4,"millepede":1000)

Define function nLegs that returns the number of legs fro each animal (or 0 when the animal is unknown):

rascal>int nLegs(str animal){
>>>>>>> return legs[animal] ? 0;
>>>>>>>}
int (str): function(|prompt:///|(0,53,<1,0>,<3,1>))
rascal>bool similar(str a, str b) = nLegs(a) == nLegs(b);
bool (str, str): function(|prompt:///|(0,50,<1,0>,<1,50>))

Now group a set of animals:

rascal>group({"bird", "dog", "human", "spider", "millepede", "zebra", "crab", "cat"}, similar);
set[set[str]]: {
{"spider"},
{"zebra"},
{"human"},
{"crab"},
{"cat"},
{"bird"},
{"dog"},
{"millepede"}
}

WARNING: check compiler.

function index

Map set elements to a fixed index.

map[&T,int] index(set[&T] s)

Examples

rascal>import Set;
ok
rascal>index({"elephant", "zebra", "snake"});
map[str, int]: ("snake":0,"zebra":1,"elephant":2)

function isEmpty

Test whether a set is empty.

bool isEmpty(set[&T] st)

Yields true if s is empty, and false otherwise.

Examples

rascal>import Set;
ok
rascal>isEmpty({1, 2, 3});
bool: false
rascal>isEmpty({});
bool: true

function mapper

Apply a function to all set elements and return set of results.

set[&U] mapper(set[&T] st, &U (&T) fn)

Return a set obtained by applying function fn to all elements of set s.

Examples

rascal>import Set;
ok
rascal>int incr(int x) { return x + 1; }
int (int): function(|prompt:///|(0,33,<1,0>,<1,33>))
rascal>mapper({1, 2, 3, 4}, incr);
set[int]: {5,3,2,4}

function max

Determine the largest element of a set.

&T max(set[&T] st)

Examples

rascal>import Set;
ok
rascal>max({1, 3, 5, 2, 4});
int: 5
rascal>max({"elephant", "zebra", "snake"});
str: "zebra"
---
zebra
---

function min

Smallest element of a set.

&T min(set[&T] st)

Examples

rascal>import Set;
ok
rascal>min({1, 3, 5, 2, 4});
int: 1
rascal>min({"elephant", "zebra", "snake"});
str: "elephant"
---
elephant
---

Examples

rascal>import Set;
ok
rascal>min({1, 3, 5, 4, 2});
int: 1

function power

Determine the powerset of a set.

set[set[&T]] power(set[&T] st)

Returns a set with all subsets of s.

Examples

rascal>import Set;
ok
rascal>power({1,2,3,4});
set[set[int]]: {
{},
{1,2,4},
{1},
{3,2,4},
{3},
{1,3,2,4},
{1,3},
{2},
{4},
{1,2},
{1,4},
{3,2},
{3,4},
{1,3,2},
{1,3,4},
{2,4}
}

function power1

The powerset (excluding the empty set) of a set value.

set[set[&T]] power1(set[&T] st)

Returns all subsets (excluding the empty set) of s.

Examples

rascal>import Set;
ok
rascal>power1({1,2,3,4});
set[set[int]]: {
{1,2,4},
{1},
{3,2,4},
{3},
{1,3,2,4},
{1,3},
{2},
{4},
{1,2},
{1,4},
{3,2},
{3,4},
{1,3,2},
{1,3,4},
{2,4}
}

function reducer

Apply a function to successive elements of a set and combine the results.

danger

deprecated: marked for future deletion Use a reducer expression instead, such as (init | fn(it,e) | e <- st).

&T reducer(set[&T] st, &T (&T,&T) fn, &T unit)

&T reducer(set[&T] _:{})

Apply the function fn to successive elements of set s starting with unit.

Examples

rascal>import Set;
ok
rascal>int add(int x, int y) { return x + y; }
int (int, int): function(|prompt:///|(0,39,<1,0>,<1,39>))
rascal>reducer({10, 20, 30, 40}, add, 0);
int: 100

function size

Determine the number of elements in a set.

int size(set[&T] st)

Examples

rascal>import Set;
ok
rascal>size({1,2,3,4});
int: 4
rascal>size({"elephant", "zebra", "snake"});
int: 3
rascal>size({});
int: 0

function sum

(&T <:num) sum(set[(&T <:num)] _:{})

function sum

Sum the elements of a set.

default (&T <:num) sum({(&T <: num) e, *(&T <: num) r})

Examples

rascal>import Set;
ok
rascal>sum({3, 1, 4, 5});
int: 13
rascal>sum({3, 1.5, 4, 5});
num: 13.5

function getOneFrom

Pick an arbitrary element from a set.

&T getOneFrom(set[&T] st)

This randomly picks one element from a set, unless the set is empty.

danger

Use Get Single From if you want the element from a singleton set. Get One From will silently continue even if there are more element present, which can be a serious threat to the validity of your analysis algorithm (arbitrary data is not considered).

Examples

rascal>import Set;
ok
rascal>getOneFrom({"elephant", "zebra", "snake"});
str: "elephant"
---
elephant
---
rascal>getOneFrom({"elephant", "zebra", "snake"});
str: "elephant"
---
elephant
---
rascal>getOneFrom({"elephant", "zebra", "snake"});
str: "snake"
---
snake
---
rascal>getOneFrom({"elephant", "zebra", "snake"});
str: "zebra"
---
zebra
---

Benefits

  • Random sampling can be an effective test input selection strategy.

Pitfalls

  • The name Get One From does not convey randomness.
  • Get One From drops all the other elements. If you are sure there is only one element and you need it, then use Get Single From. It will fail fast if your assumption is wrong.
  • If you need more then one element, then repeatedly calling Get One From will be expensive. Have a look at Sampling for more effective sampling utilities.

function getFirstFrom

Get "first" element from a set.

&T getFirstFrom(set[&T] st)

Get "first" element of a set. Of course, sets are unordered and do not have a first element. However, we could assume that sets are internally ordered in some way and this ordering is reproducible (it's deterministic up to hashing collisions). Applying getFirstFrom on the same set will always returns the same element.

danger

Use Get Single From if you want the element from a singleton set. Get First From will silently continue even if there are more element present, which can be a serious threat to the validity of your analysis algorithm (arbitrary data is not considered).

Benefits

This function helps to make set-based code more deterministic, for instance, for testing purposes.

Pitfalls

  • The deterministic order is undefined. This means it may be stable between runs, but not between releases of Rascal.
  • There are much better ways to iterate over the elements of a set:
    • Use the <- enumerator operator in a for loop or a comprehension.
    • Use list matching
  • Get First From drops all the other elements
    • If you are sure there is only one element and you need it, then use Get Single From. It will fail fast if your assumption is wrong.

function getSingleFrom

Get the only element from a singleton set.

&T getSingleFrom(set[&T] st)

Get the only element of a singleton set. This fails with a Call Failed exception when the set is not a singleton.

Benefits

  • Get Single From fails fast if the assumption (parameter st is a singleton) is not met.
  • If a binary relation r is injective (i.e. it models a function where each key only has one value) then all projections r[key] should produce singleton values: {v}. Using Get Single From to get the element out makes sure we fail fast in case our assumptions were wrong, or they have changed.
  • Never use Get First From or Take One From if you can use Get Single From.

Pitfalls

  • Call Failed exceptions are sometimes hard to diagnose. Look at the stack trace to see that it was Get Single From that caused it, and then look at the parameter of Call Failed to see that the set was not a singleton.

function takeOneFrom

Remove an arbitrary element from a set, returns the element and a set without that element.

tuple[&T, set[&T]] takeOneFrom(set[&T] st)

Remove an arbitrary element from set s and return a tuple consisting of the element and a set without that element. Also see GetOneFrom.

Examples

rascal>import Set;
ok
rascal>takeOneFrom({1, 2, 3, 4});
tuple[int,set[int]]: <4,{1,3,2}>
rascal>takeOneFrom({1, 2, 3, 4});
tuple[int,set[int]]: <4,{1,3,2}>
rascal>takeOneFrom({1, 2, 3, 4});
tuple[int,set[int]]: <3,{1,2,4}>

function takeFirstFrom

Remove "first" element from a set, returns the element and a set without that element.

tuple[&T, set[&T]] takeFirstFrom({&T f, *&T r})

tuple[&T, set[&T]] takeFirstFrom(set[&T] _:{})

element of a set.

function toList

Convert a set to a list.

list[&T] toList(set[&T] st)

Examples

rascal>import Set;
ok
rascal>toList({1, 2, 3, 4});
list[int]: [1,3,2,4]
rascal>toList({"elephant", "zebra", "snake"});
list[str]: ["snake","zebra","elephant"]

Note that the same result can be obtained using splicing:

rascal>s = {1,2,3,4};
set[int]: {1,3,2,4}
rascal>l = [*s];
list[int]: [1,3,2,4]

Pitfalls

Recall that the elements of a set are unordered and that there is no guarantee in which order the set elements will be placed in the resulting list.

function toMap

Convert a set of tuples to a map; each key is associated with a set of values.

map[&A,set[&B]] toMap(rel[&A, &B] st)

Convert a set of tuples to a map in which the first element of each tuple is associated with the set of second elements of all tuples with the same first element.

Examples

rascal>import Set;
ok
rascal>toMap({<"a", 1>, <"b", 2>, <"a", 10>});
map[str, set[int]]: (
"a":{10,1},
"b":{2}
)

function toMapUnique

Convert a set of tuples to a map (provided that there are no multiple keys).

map[&A,&B] toMapUnique(rel[&A, &B] st) throws MultipleKey

Convert a set of tuples to a map. The result should be a legal map (i.e., without multiple keys).

Examples

rascal>import Set;
ok
rascal>toMapUnique({<"a", 1>, <"b", 2>, <"c", 10>});
map[str, int]: ("a":1,"b":2,"c":10)

Now explore an erroneous example:

rascal>toMapUnique({<"a", 1>, <"b", 2>, <"a", 10>});
|file:///home/runner/actions-runner/_work/rascal/rascal/src/org/rascalmpl/library/Set.rsc|(11682,520,<412,0>,<427,70>): MultipleKey("a",10,1)
at *** somewhere ***(|file:///home/runner/actions-runner/_work/rascal/rascal/src/org/rascalmpl/library/Set.rsc|(11682,520,<412,0>,<427,70>))
at toMapUnique(|prompt:///|(39,2,<1,39>,<1,41>))
ok

function toString

Convert a set to a string.

str toString(set[&T] st)

Examples

rascal>import Set;
ok
rascal>toString({1, 2, 3});
str: "{1,3,2}"
---
{1,3,2}
---
rascal>toString({"elephant", "zebra", "snake"});
str: "{\"snake\",\"zebra\",\"elephant\"}"
---
{"snake","zebra","elephant"}
---

Pitfalls

Recall that the elements of a set are unordered and that there is no guarantee in which order the set elements will be placed in the resulting string.

function itoString

Convert a set to an indented string.

str itoString(set[&T] st)

Examples

rascal>import Set;
ok
rascal>toString({1, 2, 3});
str: "{1,3,2}"
---
{1,3,2}
---
rascal>toString({"elephant", "zebra", "snake"});
str: "{\"snake\",\"zebra\",\"elephant\"}"
---
{"snake","zebra","elephant"}
---

Pitfalls

Recall that the elements of a set are unordered and that there is no guarantee in which order the set elements will be placed in the resulting string.

function sort

Sort the elements of a set. Sort the elements of a set: Use the built-in ordering on values to compare list elements. Give an additional lessThan function that will be used to compare elements. This function lessThan (<) function should implement a strict partial order, meaning: that it is not reflexive, i.e. never a < a is anti-symmetric, i.e. never a < b && b < a. * is transitive, i.e. if a < b and b < c then a < c.

list[&T] sort(set[&T] s)

list[&T] sort(set[&T] l, bool (&T a, &T b) less)

Examples

rascal>import Set;
ok
rascal>import String;
ok
rascal>sort({10, 4, -2, 11, 100, 5});
list[int]: [-2,4,5,10,11,100]
rascal>fruits = {"mango", "strawberry", "pear", "pineapple", "banana", "grape", "kiwi"};
set[str]: {"mango","banana","pear","pineapple","grape","strawberry","kiwi"}
rascal>sort(fruits);
list[str]: ["banana","grape","kiwi","mango","pear","pineapple","strawberry"]
rascal>sort(fruits, bool(str a, str b){ return size(a) > size(b); });
list[str]: ["strawberry","pineapple","banana","mango","grape","kiwi","pear"]

function top

Produce the smallest k elements of a set as sorted by the less function.

list[&T] top(int k, set[&T] l, bool (&T a, &T b) less)

list[&T] top(int k, set[&T] l)

This function is fast if k is relatively small, say 10 out of a 1000 elements. It operates in O(n*k) time where n is the size of the set.

If k is a larger value, say k > 10, then it's perhaps better to just sort the entire set using the asympotically faster (n*log^2(n)) sort function and take the first k elements of the resulting list.

If k is a negative number, top will return the largest abs(k) elements of the set instead of the smallest.

function union

Flatten a set of sets into a single set.

set[&T] union(set[set[&T]] sets)

function jaccard

Compute the Jaccard similarity between two sets.

real jaccard(set[value] x, set[value] y)