Reaching Definitions

rascal-0.40.17

Reaching definitions and life variables are common analyses in compilers and static analysis tools. See:

A.V. Aho, R. Sethi, J.D. Ullman, Compilers, Prinicples, Techniques and Tools, Addison-Wesley, 1986, Section 10.5 Introduction to global data-flow analysis.

In this demo we select an extremely simple intermediate representation for programs such that the algorithm can get all the focus. Statements are represented by integers (think the starting line of the statement in the source file), variables are strings, definitions and uses are simply the name of the variable tupled with the statement that the variable is defined or used in:

import Relation;
import analysis::graphs::Graph;

alias Stat = int;
alias Var = str;
alias Def  = tuple[Stat, Var];
alias Use = tuple[Stat, Var];

In an actual analysis we would use more complex representations, for example qualified names as Location and statements as Algebraic Data Type syntax trees.

The following three helper functions represent queries on the intermediate representation that the main algorithm requires:

rel[Stat, Def] definition(rel[Stat, Var] defs)
    = {<s, <s, v>> | <Stat s, Var v> <- defs};

rel[Stat, Def] use(rel[Stat, Var] uses)
    = {<s, <s, v>> | <Stat s, Var v> <- uses};

rel[Stat, Def] kill(rel[Stat, Var] defs) { 
    return {<s1, <s2, v>> | <Stat s1, Var v> <- defs, <Stat s2, v> <- defs, s1 != s2};
}

The reachingDefinitions algorithm is almost straight out of the book:

rel[Stat, Def] reachingDefinitions(rel[Stat, Var] defs, rel[Stat, Stat] pred) {
    set[Stat] statement = carrier(pred);
    rel[Stat, Def] def  = definition(defs);
    rel[Stat, Def] kill = kill(defs);

    // The set of mutually recursive dataflow equations that has to be solved:

    rel[Stat, Def] \in = {};
    rel[Stat, Def] out = def;
    
    solve (\in, out) {
       \in =  {<s, d> | int s <- statement, Stat p <- predecessors(pred, s), Def d <- out[p]};
       out = {<s, d> | int s <- statement, Def d <- def[s] + (\in[s] - kill[s])};
    };

    return \in;
}

Let's take this for a spin. In the ASU dragon book on page 626 we find the following example:

rascal>PRED = { <1,2>, <2,3>, <3,4>, <4,5>, <5,6>, <5,7>, <6,7>, <7,4>};
rel[int,int]: {
  <7,4>,
  <1,2>,
  <3,4>,
  <2,3>,
  <4,5>,
  <6,7>,
  <5,7>,
  <5,6>
}
rascal>DEFS = { <1, "i">, <2, "j">, <3, "a">, <4, "i">, <5, "j">, <6, "a">, <7, "i">};
rel[int,str]: {
  <5,"j">,
  <7,"i">,
  <1,"i">,
  <3,"a">,
  <2,"j">,
  <4,"i">,
  <6,"a">
}

let's test the kill function:

rascal>assert kill(DEFS) ==  {<1, <4, "i">>, <1, <7, "i">>, <2, <5, "j">>, <3, <6, "a">>, 
>>>>>>>                       <4, <1, "i">>, <4, <7, "i">>, <5, <2, "j">>, <6, <3, "a">>, 
>>>>>>>                       <7, <1, "i">>, <7, <4, "i">>};
bool: true

and what we expect from reachingDefinitions according to the Dragon book:

rascal>RES = reachingDefinitions(DEFS, PRED);
rel[int,tuple[int,str]]: {
  <2,<1,"i">>,
  <5,<3,"a">>,
  <5,<2,"j">>,
  <5,<4,"i">>,
  <5,<6,"a">>,
  <5,<5,"j">>,
  <7,<3,"a">>,
  <7,<4,"i">>,
  <7,<6,"a">>,
  <7,<5,"j">>,
  <3,<1,"i">>,
  <3,<2,"j">>,
  <4,<7,"i">>,
  <4,<1,"i">>,
  <4,<3,"a">>,
  <4,<2,"j">>,
  <4,<6,"a">>,
  <4,<5,"j">>,
  <6,<3,"a">>,
  <6,<4,"i">>,
  <6,<6,"a">>,
  <6,<5,"j">>
}
rascal>assert RES == {<2, <1, "i">>, <3, <2, "j">>, <3, <1, "i">>, <4, <3, "a">>, 
>>>>>>>               <4, <2, "j">>, <4, <1, "i">>, <4, <7, "i">>, <4, <5, "j">>, 
>>>>>>>               <4, <6, "a">>, <5, <4, "i">>, <5, <3, "a">>, <5, <2, "j">>, 
>>>>>>>               <5, <5, "j">>, <5, <6, "a">>, <6, <5, "j">>, <6, <4, "i">>, 
>>>>>>>               <6, <3, "a">>, <6, <6, "a">>, <7, <5, "j">>, <7, <4, "i">>, 
>>>>>>>               <7, <3, "a">>, <7, <6, "a">>};
bool: true

Next up is liveVariables which resembles the same kind of fixed-point solution as reachingDefinitions, because that is how it is defined in the book:

rel[Stat, Def] liveVariables(rel[Stat, Var] defs, rel[Stat, Var] uses, rel[Stat, Stat] pred) {
    set[Stat] statement = carrier(pred);
    rel[Stat, Def] def  = definition(defs);
    rel[Stat, Def] use = use(uses);
    
    rel[Stat, Def] lin = {};
    rel[Stat, Def] lout = def;
    
    solve (lin, lout) {
        lin  =  {<s, d> | Stat s <- statement,  Def d <- use[s] + (lout[s] - (def[s]))};
        lout =  {<s, d> | Stat s <- statement,  Stat succ <- successors(pred, s), Def d <- lin[succ] };
    }
    return lin;
}

On page 619 is an extended example we can use to test liveVariables:

    1: i := m-2                         i := m-2
        |
  2: j := n                             j := n
        |
  3: a := u1                            a := u1
          |                             do
          V
    4: i :=i+1  <--------------         i := i+1
        |                      |
      5: j :=j-1               |            j := j-1
     /          \              |            if e1 then
    /            \             |               a := u2
  V              V             |            else
  6: a := u2 ---> 7: i := u3 ---               i := u3
                                        while e2

We can represent this in our simplified intermediate representation like so:

rascal>ROOT = 1;
int: 1
rascal>PRED= { <1,2>, <2,3>, <3,4>, <4,5>, <5,6>,<5,7>,<6,7>,<7,4>};
rel[int,int]: {
  <7,4>,
  <1,2>,
  <3,4>,
  <2,3>,
  <4,5>,
  <6,7>,
  <5,7>,
  <5,6>
}
rascal>DEFS= { <"i",1>,<"j",2>,<"a",3>,<"i",4>,<"j",5>,<"a",6>,<"i",7>}<1,0>;
rel[int,str]: {
  <5,"j">,
  <7,"i">,
  <1,"i">,
  <3,"a">,
  <2,"j">,
  <4,"i">,
  <6,"a">
}
rascal>USES= {<"m",1>,<"n",2>,<"u1",3>,<"i",4>,<"j",5>,<"u2",6>,<"u3",7>}<1,0>;
rel[int,str]: {
  <5,"j">,
  <7,"u3">,
  <1,"m">,
  <3,"u1">,
  <2,"n">,
  <4,"i">,
  <6,"u2">
}

And now we can test if the implementation matches expectations for this example:

rascal>assert liveVariables(DEFS, USES, PRED) ==
>>>>>>>           {<1, <6, "u2">>, <1, <7, "u3">>, <1, <5, "j">>, <1, <4, "i">>,
>>>>>>>            <1, <3, "u1">>, <1, <2, "n">>, <1, <1, "m">>, 
>>>>>>>            <2, <7, "u3">>, <2, <6, "u2">>, <2, <5, "j">>, <2, <4, "i">>, 
>>>>>>>            <2, <3, "u1">>, <2, <2, "n">>, 
>>>>>>>            <3, <7, "u3">>, <3, <6, "u2">>, <3, <5, "j">>, <3, <4, "i">>, 
>>>>>>>            <3, <3, "u1">>, 
>>>>>>>            <5, <4, "i">>, <5, <7, "u3">>, <5, <6, "u2">>, <5, <5, "j">>, 
>>>>>>>            <6, <5, "j">>, <6, <4, "i">>, <6, <7, "u3">>, <6, <6, "u2">>, 
>>>>>>>            <7, <6, "u2">>, <7, <5, "j">>, <7, <4, "i">>, <7, <7, "u3">>, 
>>>>>>>            <4, <7, "u3">>, <4, <6, "u2">>, <4, <5, "j">>, <4, <4, "i">>};
bool: true