Chapter 10

Sequences, Globals, and Termination

Programming Logic: Sequences, Global Variables, and Termination

In our conclusion of programming logic, we’ll examines programs with sequences and global variables. We’ll also learn the idea of termination in programs, and why this is a much more difficult thing to demonstrate for a general program that it might seem.

Subsections of Sequences, Globals, and Termination

Logika Modes

Logika has different modes for programming logic verification. We can switch between modes by going to File->Settings->Tools->Sireum->Logika.

Logika’s “manual” mode

Up to now, we have been running Logika in “manual mode”, which uses these Logika settings:

manual mode manual mode

We are now reaching the point where additional practice in manual mode may no longer be a learning activity, and where the proof-blocks after claim transformations can become dozens of lines long.

Logika’s symexe mode

In Chapter 10, we will be switching to Logika’s “symexe mode”, which uses these Logika settings:

symexe mode symexe mode

Symexe mode allows us to reason about our program by using ONLY invariants and function contracts. While the same work has to be done for a program to be verified (the precondition must be true before a function call, the loop invariant must be true before the loop begins, etc.), symexe mode does the work of analyzing your program statements to ensure that all parts of your loop invariant and function contract are satisfied. When you use symexe mode, you will only need to include a function contract for each function and a loop invariant block for each loop, and it will do the grunt work.

Multiplication example

In section 9.3, we did a verification of a multiplication program using Logika’s manual mode. Here is how we would write the verification of the same program using Logika’s symexe mode:

import org.sireum.logika._

def mult(x: Z, y: Z) : Z = {
    //function contract
    l"""{
        requires y >= 0         //precondition: y should be nonnegative
        ensures result == x*y   //postcondition (we promise to return x*y)
    }"""

    var sum: Z = 0
    var count: Z = 0

    while (count != y) {
        l"""{
            invariant sum == count*x
            modifies sum, count
        }"""

        sum = sum + x
        count = count + 1
    }

    return sum
}

//////////// Test code //////////////

var one: Z = 3
var two: Z = 4

var answer: Z = mult(one, two)
assert(answer == 12)

Note that the only logic blocks we needed to provide were the function contract and the loop invariant block.

Pitfalls

When using this more advanced mode, it is not always obvious why Logika will not verify. Sometimes semantic errors in the program keep it from verifying; i.e. Logika has found a corner or edge case for which the program does not account. Other times the invariants and conditions do not actually help prove the goal in an assert or postcondition. Inevitably, sometimes it will be both.

In either case an option is to uncheck “auto” and begin typing each proof-block as if in manual mode (this can be done with symexe enabled) until you find the logical or programming error.

Intro to Sequences

Sequences in Logika are fairly similar to lists in Python and arrays in Java and C#. As with lists and arrays, Logika sequences are ordered. Each element is stored at a particular (zero-based) position, and there are no gaps in the sequence.

Logika sequences can either store integers (type ZS) or booleans (type BS).

Sequence syntax

We can create new sequence variables like this:

//creates the sequence (5, 10, 15)
var seq: ZS = ZS(5,10,15)

//creates the sequence (false, true)
var bools: BS = BS(false, true)

//creates an empty sequence of integers
var empty: ZS = ZS()

Given the following sequence:

var a: ZS = ZS(1, 2, 3)

Here is a table of the different sequence operations we will use in this course:

Operation Explanation
Indexing: a(pos) Accesses the value in the sequence at position pos.
Sequences are zero-based, and Logika will show an error if you have not proven (or if it cannot infer, in symexe mode) that the position lies within the sequence range.

For example, a(0) is 1.
a(0) = 11 would change the first value to be 11, so the sequence would be [11,2,3].
a(3) would give an error, as position 3 is past the end of the sequence.
Size: a.size Evaluates to the number of elements in the sequence: a.size == 3
Reassignment Sequences instantiated as var can be reassigned.

For example, after a = ZS(5,6), a is now [5,6].

Sample program with a sequence

Here is a sample Logika program that uses a sequence. The makeFirstZero function sets the first element in a sequence to 0:

import org.sireum.logika._

//"Unit" is like a void return type
def makeFirstZero(seq: ZS): Unit = {
    seq(0) = 0
}

///// Test code ///////////

var nums: ZS = ZS(1,2,3)

makeFirstZero(nums)
assert(nums == ZS(0,2,3))

This program will not run (or be verified) as we have not yet provided a function contract for makeZeroFirst. We will complete the verification for the program later in the section.

Predicate logic statements with sequences

When we write function contracts and loop invariants with sequences, we will need to make statements about all or some elements in a sequence. We can do this with predicate logic statements.

Statements about all sequence elements

As we did in chapters 4 and 5, we will use the universal () quantifier for statements involving all elements in a sequence. The basic forms of specifying some claim P(a(x)) holds for every element in a sequence a are:

Statement Explanation
∀ x: (lower..upper) P(a(x)) P(a(x)) holds for every element in a from position lower to position upper (including both lower and upper)
∀ x: (lower..<upper) P(a(x)) P(a(x)) holds for every element in a from position lower up to but not including position upper (lower but not upper)

Here are several sample claims and explanations about integer sequence a:

Claim Explanation
∀ x: (0..<a.size) a(x) > 0 Every element in a is greater than 0
∀ x: (1..3) a(x) == 0 All elements in a between positions 1 and 3 (inclusive of 1 and 3) have value 0
∀ x: (0..<a.size) a(x) < 0 → a(x) == -10 All negative elements in a have value -10

Statements about some sequence elements

We will use the existential () quantifier for statements involving one or more elements in a sequence. The basic forms of specifying claims is the same as for the universal quantifier, but using the existential quantifier instead of the universal quantifier.

Here are several sample claims and explanations about integer sequences a and b:

Claim Explanation
∃ x: (0..<a.size) a(x) > 0 There is an element in a that is greater than 0
∃ x: (2..4) a(x) == a(x-1) * 2 There is an element in a between positions 2 and 4 (inclusive) that is twice as big as the previous element
∀ x: (0..<a.size) (∃ y: (0..<b.size) a(x) == b(y)) Every value in a appears somewhere in b

Sequences in Functions

Sequences in Logika are passed to functions by reference. This means that if a function makes changes to a sequence parameter, then it will also change whatever sequence object was passed to the function. For example, the test code above passes nums, which has the value ZS(1,2,3), to the makeFirstZero function. The makeFirstZero function changes the first position in its parameter (seq) to be 0, which means that the nums sequence in the test code will also have its first position set to 0 (making it have the value ZS(0,2,3)).

Preconditions with sequences

When writing the precondition for a function that takes a sequence parameter, we must consider whether our function will only work correctly for particular sequence values or sizes. For example, our makeFirstZero function will not work if the size of the seq parameter is zero. We would need to add this requirement to the function’s precondition:

requires seq.size > 0

If we wanted to require that all values in a sequence parameter (say, nums) be between 10 and 20, we would write:

requires ∀ x: (0..<nums.size) (nums(x) >= 10 ∧ nums(x) <= 20)

Sometimes, functions with sequence parameters will work for any size/values – in those cases, we don’t need to list anything about the sequence in the precondition.

Function modifies clause

We learned in chapter 9 that the format of a function contract is:

l"""{
    requires (preconditions)
    modifies (sequences/globals changed in this function)
    ensures (postconditions)
}"""

Until this point, we have been leaving off the modifies clause because our functions have not used sequences or global variables. We now need to include that clause whenever a function CHANGES the values in a sequence parameter. For example, the makeFirstZero function DOES change its sequence parameter, seq, as it sets its first position to 0. makeFirstZero should therefore include this modifies clause:

modifies seq

If the function modifies more than one sequence or global variable, they are listed together in a comma-separated list.

Postconditions with sequences

When writing the postcondition for a function that uses a sequence parameter, we must consider two things:

  • How the return value relates to the sequence
  • How the function will change the sequence

Relating return values to sequence elements

We will still use the result keyword for describing the function’s return value in the postcondition. For example, if a function was returning the smallest value in the sequence nums, we would say:

ensures ∀ x: (0..<nums.size) result <= nums(x)
    ∃ x: (0..<a.size) result == nums(x)

Here, the first postcondition states that our return value will be less than or equal to every value in the sequence, and the second postcondition states that our return value is one of the sequence elements. (The second postcondition prevents us from sneakily returning some large negative number and claiming that it was the smallest element in the sequence, when in fact it wasn’t one of the sequence elements.)

Sometimes, our postconditions will promise to return a particular value if some claim about the sequence is true. Suppose we have a function that returns whether or not (i.e., a bool) all elements in the sequence a are negative. Our postcondition would be:

ensures (∀ x: (0..<nums.size) nums(x) < 0) → (result == true)
    (∃ x: (0..<nums.size) nums(x) >= 0) → (result == false)

Here, the first postcondition promises that if all sequence elements are negative, then the function will return true. The second postcondition promises the opposite – that if there is a nonnegative sequence element, then the function will return false.

Describing how the function changes the sequence

Consider again the makeFirstZero function:

def makeFirstZero(seq: ZS): Unit = {
    seq(0) = 0
}

This function doesn’t return anything (hence the Unit return type), but we do need to describe what impact calling this function will have on the sequence. We can partially accomplish our goal with this postcondition:

ensures seq(0) == 0

Which promises that after the function ends, the first value in the sequence will be 0. However, suppose we wrote this instead for the makeFirstZero function:

def makeFirstZero(seq: ZS): Unit = {
    seq(0) = 0
    seq(1) = 100
}

This version of the function DOES satisfy the postcondition – the first element is indeed set to 0 – but it changes other elements, too. The postcondition should be descriptive enough that anyone calling it can be certain EXACTLY what every single value in the sequence will be afterwards. Our postcondition needs to describe exactly what values in the sequence WILL change and exactly what values WON’T change.

This means that our makeFirstZero function needs to state that the first element in seq gets set to 0, and that every other value in the sequence stays the same as its original value. To help us describe the original value of a sequence, we can use the special sequenceName_in syntax, which holds the value of a sequence parameter sequenceName at the time the function was called. (This _in syntax can only be used in logic blocks, not in the code.)

We can specify exactly what happens to each sequence element in our first version of makeFirstZero like this:

ensures seq(0) == 0
    ∀ x: (1..<seq.size) seq(x) == seq_in(x)

The second postcondition says: “All elements from position 1 on keep their original values”.

Postcondition: size doesn’t change

Logika has an oddity with programs that modify sequence parameters – in those cases, you must also include as a postcondition that the size of the sequence will not change (i.e., that the resulting sequence size will equal the original sequence size. For example, if a function modified the sequence seq, we would need to add the postcondition:

seq.size == seq_in.size

Logika is concerned that any function that modifies a sequence might also change the size of that sequence. While it is possible to append and prepend to Logika sequences (much like with Python lists), we cannot do so to sequence parameters. As a rule, Logika functions cannot assign to ANY parameter value (they are read-only). However, we still must state that the size doesn’t change or the program will not be verified. Whenever you list a sequence in the modifies clause to a function, you must also include a postcondition to say that its size doesn’t change.

If you are writing a function that uses a sequence parameter but doesn’t change that parameter, you should not list that sequence in a modifies clause, and you should not state that the sequence size doesn’t change (or anything else about the ..._in value of the sequence).

Example: finished makeFirstZero verification

Now that we have seen all the pieces of writing function contracts for functions that work with sequences, we can put together the full function contract for our makeFirstZero function. The assert statement in the test code will be verified in Logika’s symexe mode:

import org.sireum.logika._

//"Unit" is like a void return type
def makeFirstZero(seq: ZS): Unit = {
    l"""{
        requires seq.size >= 1  //we need at least 1 element
        modifies seq            //we are changing the sequence
        ensures
            //we promise the first element will be a 0
            seq(0) == 0

            //we promise every other element is the same as its original value
            A x: (1..<seq.size) seq(x) == seq_in(x)

            //we promise the sequence size won't change
            seq.size == seq_in.size
    }"""

    seq(0) = 0
}

///// Test code ///////////

var nums: ZS = ZS(1,2,3)
makeFirstZero(nums)

//we want to claim that nums is what it was, but with the first
//element as a 0
assert(nums == ZS(0,2,3))

Example: swap program

Suppose we have the following swap program:

import org.sireum.logika._

def swap(list: ZS, pos1: Z, pos2: Z): Unit = {
    var temp: Z = list(pos1)
    list(pos1) = list(pos2)
    list(pos2) = temp
}

///////////// Calling code ///////////////////

var testList: ZS = ZS(1,2,3,4)
swap(testList,0,3)

//the values in positions 0 and 3 should be swapped
//all other elements should be the same
assert(testList == ZS(4,2,3,1))

Here, swap takes an integer sequence (list) and two positions (pos1 and pos2). It uses a temp variable to swap the values in list at pos1 and pos2. We would like to write an appropriate function contract so the assert statement in the test code holds.

We must first consider the precondition – does swap have any requirements about its parameters? Since swap uses pos1 and pos2 as positions within list, we can see that swap will crash if either position is out of bounds – either negative or past the end of the sequence.

The function is changing the sequence, so we will need a modifies clause. Finally, we must consider the postcondition. This function isn’t returning a value, but it is changing the sequence – so we should describe exactly what values HAVE changed (and their new values) and what values have NOT changed. We want to say that:

  • list(pos1) has the value that was originally at list(pos2) (i.e, the value at list_in(pos2))
  • list(pos2) has the value that was originally at list(pos1) (i.e, the value at list_in(pos1))
  • All other positions are unchanged (i.e., they are the same as they were in list_in)
  • The size doesn’t change (which we must always list if a sequence is modified)

We can now complete the function contract for swap:

import org.sireum.logika._

def swap(list: ZS, pos1: Z, pos2: Z): Unit = {
    l"""{
        //pos1 and pos2 need to be valid positions
        requires pos1 >= 0
            pos2 >= 0
            pos1 < list.size
            pos2 < list.size
        modifies list
        ensures
            list(pos1) == list_in(pos2)
            list(pos2) == list_in(pos1)
            list.size == list_in.size

            //all the other spots stay the same
            A x:(0..<list.size) (x != pos1 ^ x != pos2) -> list(x) == list_in(x)
    }"""

    var temp: Z = list(pos1)
    list(pos1) = list(pos2)
    list(pos2) = temp
}

///////////// Calling code ///////////////////

var testList: ZS = ZS(1,2,3,4)
swap(testList,0,3)

//the values in positions 0 and 3 should be swapped
//all other elements should be the same
assert(testList == ZS(4,2,3,1))

If we test this program in Logika’s symexe mode, the final assert will hold – we have enough information to make a claim about EXACTLY what the sequence will look like after calling swap.

Sequences in Loops

We also must consider sequences when writing loop invariants. Typically, we must include the following in our invariant:

  • If the sequence changes in the loop
    • Describe what sequence elements have already changed in the loop (and what their new values are)
    • Describe what sequence elements still have their original value
    • State that the sequence size does not change
    • Prove lower and upper bounds for whatever variable is being used as a sequence position (so we can be certain we will not go past the bounds of the sequence)
    • List the sequence along with other changing variables in the loop invariant block’s modifies clause
  • If the sequence does not change in the loop
    • Consider what we are doing with each sequence element as we look at them. Usually we have another variable that is storing our progress (and often, this variable is returned from the function after the loop). Express how the variable’s value relates to the part of the sequence we’ve looked at so far – this statement should look very similar to your postcondition, but should only describe part of the sequence.
    • Prove lower and upper bounds for whatever variable is being used as a sequence position (so we can be certain we will not go past the bounds of the sequence)

Example: add one to all program

Suppose we have the following program, which adds one to every element in a sequence parameter:

import org.sireum.logika._

def addOne(list: ZS): Unit = {
    var i: Z = 0
    while (i < list.size) {
        list(i) = list(i) + 1
        i = i + 1
    }
}

////////////// Calling code ///////////////////

var test: ZS = ZS(1,2,3,4)
addOne(test)

assert(test == ZS(2,3,4,5))

We would like to write an appropriate function contract and loop invariant block so the assert statement in the test code holds (which asserts that the sequence ZS(1,2,3,4) becomes the sequence ZS(2,3,4,5) after calling the function).

For the function contract, we must consider:

  • Precondition: this function will work correctly on all sequences – even empty ones. We can leave the requires clause off.
  • modifies clause: this function is changing the list sequence parameter, so we must list it in a modifies clause.
  • Postcondition: the function is not returning anything, but we must describe that all sequence parameters will be one bigger than their original values. Since the function modifies the sequence, the postcondition must also state that the sequence size does not change.

For the loop invariant block, we notice that the loop is changing the sequence. We must include:

  • Which elements have already changed. Since i is tracking our position in the sequence, we know that at the end of each iteration, all elements from position 0 up to but not including position i have been changed to be one bigger than their original values.
  • Which elements have not changed. All other elements in the sequence – from position i to the end of the sequence – still have their original values
  • State that the sequence size does not change.
  • Upper and lower bounds for position variables. Since i is tracking our position, we must state that i is always a valid sequence index. Here, we need to claim that i will always be greater than or equal to 0 and less than or equal to the sequence size. (While the sequence size itself is not a valid sequence index, we see from looking at the loop that i is incrementing as the very last step in the loop. On the last iteration, i will start off at list.size-1, and we will correctly access and modify the last element in list. Then we will increment i, making it EQUAL list.size – at that point, the loop ends. If we made part of our invariant be i < list.size, it would be incorrect because of that last iteration.)

We can now complete the function contract and loop invariant for addOne:

import org.sireum.logika._

def addOne(list: ZS): Unit = {
    l"""{
        //function contract
        modifies list
        ensures
            list.size == list_in.size
            A x: (0..<list.size) list(x) == list_in(x) + 1
    }"""

    var i: Z = 0
    while (i < list.size) {
        l"""{
            invariant
                i >= 0
                i <= list.size
                list.size == list_in.size

                //what I HAVE changed
                A x:(0..<i) list(x) == list_in(x) + 1

                //what I haven't changed
                A x: (i..<list.size) list(x) == list_in(x)

            modifies list, i
        }"""


        list(i) = list(i) + 1
        i = i + 1
    }
}

////////////// Calling code ///////////////////

var test: ZS = ZS(1,2,3,4)
addOne(test)

assert(test == ZS(2,3,4,5))

If we test this program in Logika’s symexe mode, the final assert will hold – we have enough information to know what the sequence will look like after calling addOne.

Example: min program

In our next example, we examine a function that does not modify its sequence parameter, and that does return a value. Consider the following min function and test code:

import org.sireum.logika._

//return the smallest element in list
def min(list: ZS): Z = {
    var small: Z = list(0)
    var i: Z = 1

    while (i < list.size) {
        if (list(i) < small) {
            small = list(i)
        }

        i = i + 1
    }

    return small
}

////////////// Calling code ///////////////////

var test: ZS = ZS(8,1,0,10,9,2,0)
var testMin: Z = min(test)

assert(testMin == 0)

Here, our min function is supposed to find and return the smallest value in an integer sequence. We can see that our test code passes min the sequence (ZS(8,1,0,10,9,2,0)), and that we are trying to assert that min correctly returns 0 as the smallest value in the sequence. We need to add an appropriate function contract and loop invariant block to make this assert hold.

For the function contract, we must consider:

  • Precondition: this function starts by saving out the element at position 0. If the sequence was empty, the function would crash. We need to require that the sequence size be at least 1.
  • modifies clause: this function is NOT modifying its sequence parameter, so we can omit this clause
  • Postcondition: the function is not changing the sequence, so we do not need to describe the final values of each sequence element. We do need to describe what value we are returning, and how it relates to the sequence. We want to describe that result (our return value) is the smallest element in list, so that:
    • result is less than or equal to every element in list
    • There is an element in list that equals result (i.e., we really are returning one of our sequence values)

For the loop invariant block, we notice that the loop is NOT changing the sequence. We must include:

  • What we are doing with each sequence element, and how that relates to another variable. We can see that small tracks the smallest element we’ve seen in the sequence so far – up to but not including position i. Similar to the postcondition, we want to claim that:
    • small is less than or equal to every element in list that we have seen so far
    • There is an element we have already seen in list that equals small
  • Upper and lower bounds for position variables. Here, i is our position variable. We see that it is initialized to 1, so we will claim that it is always greater than or equal to 1 and less than or equal to the list size.

We can now complete the function contract and loop invariant for min:

import org.sireum.logika._

//return the smallest element in list
def min(list: ZS): Z = {
    l"""{
        requires list.size > 0
        ensures
            //result is <= every element in list
            A x: (0..<list.size) result <= list(x)

            //there exists a list element that matches my result
            E x: (0..<list.size) result == list(x)
    }"""

    var small: Z = list(0)
    var i: Z = 1

    while (i < list.size) {
        l"""{
            invariant
                i <= list.size
                i >= 1

                //small is the smallest up to position i
                A x: (0..<i) small <= list(x)

                //small is one of the elements up to position i
                E x: (0..<i) small == list(x)
            modifies small, i
        }"""

        if (list(i) < small) {
            small = list(i)
        }

        i = i + 1
    }

    return small
}

////////////// Calling code ///////////////////

var test: ZS = ZS(8,1,0,10,9,2,0)
var testMin: Z = min(test)

assert(testMin == 0)

If we test this program in Logika’s symexe mode, the final assert will hold – we have enough information to know what the sequence will look like after calling addOne.

Logika Facts, revisited

As in other programs with loops, programs with sequences sometimes necessitate the use of a Logika fact to describe how a particular value is calculated. For example, consider the following program that finds and returns the sum of all elements in an integer sequence:

import org.sireum.logika._

def seqSum(list: ZS): Z = {
    var i: Z = 0
    var total: Z = 0

    while (i < list.size) {
        total = total + list(i)
        i = i + 1
    }

    return total
}

////////////// Calling code ///////////////////

var test: ZS = ZS(1,2,3,4)
var added: Z = seqSum(test)

assert(added == 10)

In the seqSum function contract, we need to describe that the return value equals the sum of all sequence elements – that is, that result == list(0) + list(1) + ... + list(list.size - 1). This is the same situation that we encountered when trying to specify something like a factorial. We know the pattern that we want to describe, but aren’t able to do so without using the “…” notation. When you need to describe a pattern in this way, you will almost always want to use a Logika fact.

Blueprint for Logika facts with sequences

When writing a Logika fact that works with a sequence, we will have this general recursive definition:

  • Base case: we have already processed all elements in the sequence
  • Recursive case: process one sequence element, and recursively process the rest

We need some way to track what element we are ready to process, so in addition to having the Logika proof function take a sequence parameter, we will also have it take a parameter that stores how many elements we have left to process. We will use this template for our Logika fact:

l"""{
    fact
        def factName(seqName: seqType, count: Z): returnType
            = (baseCaseValue), if count == 0                    (factName0)
            = (recursiveCaseValue), if count > 0                (factNameN)
}"""

Where we have that:

  • factName is the name we give our Logika proof function
  • seqName is the name of our sequence parameter, and seqType is its type (either ZS or ZB)
  • returnType is the type of what we are calculating (most likely Z or B)
  • count is the number of items left to process in the sequence. We will see that we will initially pass our sequence size as this parameter, so that all elements in the sequence will be processed.
  • (baseCaseValue) is the value we want for our base case – if we have already processed all sequence elements
  • (recursiveCaseValue) is the value we want for our recursive case. In this step, we want to process the current sequence element, seqName(count-1), and then use the proof function to recursively evaluate the rest of the sequence (passing count-1 as the number of items left to process).
  • (factName0) is the name we are giving out base case definition, and (factNameN) is the name we are giving our recursive case definition. We will not need to refer to these name in our verification when we use symexe mode.

You may notice that this format is slightly different than the format we used for Logika facts in section 9.4. We could also write our sequence Logika facts in that format, using quantifiers to describe ranges, but you will find that the above format is much more straightforward.

Logika fact for sequence sum

We are now ready to write a Logika fact to describe the calculation of adding all elements in a sequence – the sum list(0) + list(1) + ... + list(list.size - 1) for some sequence list. We use the template above to write:

l"""{
    fact
        def sumFact(seq: ZS, pos: Z): Z
            = 0, if pos == 0                                    (sum0)
            = seq(pos-1) + sumFact(seq, pos-1), if pos > 0          (sumN)
}"""

How does the sequence sum fact work?

To see how this Logika proof function works for a particular sequence, seq is ZS(5,4,2). If we were to calculate:

sumFact(seq, seq.size)

Then sumFact would initially go into the recursive case, since pos is 3 (and is greater than 0). Thus we would have that:

sumFact(seq, seq.size) = 2 + sumFact(seq, 2)

sumFact(seq, 2) would also go into the recursive case. Since pos is 2 in this case, it would evaluate to be: 4 + sumFact(seq, 1). Next, sumFact(seq, 1) would evaluate to be: 5 + sumFact(seq, 0), and then sumFact(seq, 0) would reach our base case and would evaluate to 0.

Since sumFact(seq, 0) is 0, we now have that:

sumFact(seq, 1) = 5 + sumFact(seq, 0) = 5 + 0 = 5

And then we can plug in 5 for sumFact(seq, 1) to get:

sumFact(seq, 2) = 4 + sumFact(seq, 1) = 4 + 5 = 9

Lastly, we can use sumFact(seq, 2) == 9 in our top-level calculation to get that:

sumFact(seq, seq.size) = 2 + sumFact(seq, 2) = 2 + 9 = 11

And we see that sumFact has correctly described that the sum of all elements in our ZS(5,4,2) sequence is 11.

Finishing the `seqSum`` example

Now that we have a Logika fact to describe the sum of all the elements in a sequence, we have enough information to write the postcondition and loop invariant for our seqSum function.

For the function contract, we must consider:

  • Precondition: this function will work for all sizes of sequences. For an empty sequence, it will correctly return a sum of 0. We can omit the requires clause from the function contract.
  • modifies clause: this function is NOT modifying its sequence parameter, so we can omit this clause
  • Postcondition: the function is not changing the sequence, so we do not need to describe the final values of each sequence element. We do need to describe what value we are returning, and how it relates to the sequence. We want to use our sumFact proof function to describe that result (our return value) is the sum of all elements in list – that is, that result == sumFact(list, list.size).

For the loop invariant block, we notice that the loop is NOT changing the sequence. We must include:

  • What we are doing with each sequence element, and how that relates to another variable. We can see that total tracks the sum of all elements in the sequence so far – up to but not including position i. Similar to the postcondition, we want to use our sumFact proof function to claim that small is the sum of the first i elements in list – that is, that small == sumFact(list, i).
  • Upper and lower bounds for position variables. Here, i is our position variable. We see that it is initialized to 0, so we will claim that it is always greater than or equal to 0 and less than or equal to the list size.

We can now complete the function contract and loop invariant for seqSum:

import org.sireum.logika._

//What is this Logika fact saying?

//add all elements from position 0 up to but not including pos
//sum(seq, seq.size) - defines adding ALL elements in seq

l"""{
   fact
       def sum(seq: ZS, pos: Z): Z
           = 0, if pos == 0                                    (sum0)
           = seq(pos-1) + sum(seq, pos-1), if pos > 0          (sumN)
}"""

def seqSum(list: ZS): Z = {
   l"""{
       ensures
           result == sum(list, list.size)
   }"""

   var i: Z = 0
   var total: Z = 0

   while (i < list.size) {
       l"""{
           invariant
               //total is the sum of the first i elements
               //total = list(0) + list(1) + ... + list(i-1)
               total == sum(list, i)
               i >= 0
               i <= list.size
           modifies total, i
       }"""

       total = total + list(i)
       i = i + 1
   }

   return total
}

////////////// Calling code ///////////////////

var test: ZS = ZS(1,2,3,4)
var added: Z = seqSum(test)

assert(added == 10)

If we test this program in Logika’s symexe mode, the final assert will hold – we have enough information to know exactly the sum of a specific sequence.

Global Variables

Motivation

We will now consider programs with multiple functions that modify a shared pool of global variables. (This is very similar to the concerns in general classes in Java or C#, where multiple methods might edit fields/property values for an object). We want to be sure that global variables will maintain desired ranges and relationships between one another, even as multiple functions modify their values.

Global variables in Logika

A global variable in Logika exists before any function call, and still exists after any function ends.

Functions that access global variables

Consider the following Logika program:

import org.sireum.logika._

//global variable
var feetPerMile: Z = 5280  // feet in a mile mile

def convertToFeet(m : Z): Z = {
    val feet: Z = m * feetPerMile
    return feet
}

/////////// Calling code ////////////////////

var miles: Z = readInt()

var totalFeet: Z = 0
if (miles >= 0){
    totalFeet = convertToFeet(miles)
}

Here, feetPerMile is a global variable – it exists before the convertToFeet function is called, and still exists after convertToFeet ends. In contrast, the feet variable inside convertToFeet is NOT global – its scope ends when the convertToFeet function returns.

(The miles and totalFeet variables in the calling code do not behave as global variables, as they were declared after any function definition. However, if we did add additional functions after our calling code, then miles and totalFeet would be global to those later functions. In Logika, the scope for any variable declared outside of a function begins at the point in the code where it is declared.)

In the example above, convertToFeet only accesses the feetPerMile global variable. A global variable that is read (but not updated) by a function body can be safely used in the functions precondition and postcondition – it acts just like an extra parameter to the function. We might edit convertToFeet to have this function contract:

import org.sireum.logika._

//global variable
var feetPerMile: Z = 5280  // feet in a mile mile

def convertToFeet(m : Z): Z = {
    l"""{
        //only do conversions on nonnegative distances
        requires m >= 0    
            //not needed, but demonstrates using global variables in preconditions     
            feetPerMile > 5200  

        //can use global variable in postcondition    
        ensures result == m * feetPerMile
    }"""

    val feet: Z = m * feetPerMile
    return feet
}

/////////// Calling code ////////////////////

var miles: Z = readInt()

var totalFeet: Z = 0
if (miles >= 0){
    totalFeet = convertToFeet(miles)
}

However, we cannot assign to a global variable the result of calling a function. That is, totalFeet = convertToFeet(5) is ok, and so is totalFeet = convertToFeet(feetPerMile), but feetPerMile = convertToFeet(5) is not.

Functions that modify global variables

In the Logika language, every global variable that is modified by a function must be listed in that function’s modifies clause. Such functions must also describe in their postconditions how these global variables will be changed by the function from their original (pre-function call) values. We will use the notation globalVariableName_in for the value of global variable globalVariableName at the start of the function, just as we did for sequences.

Here is an example:

import org.sireum.logika._

//global variable
var time: Z = 0

def tick(): Z = {
    l"""{
        requires time > 0
        modifies time
        ensures time == time_in + 1
    }"""

    time = time + 1
}

Here, we have a global time variable and a tick function that increases the time by 1 with each function call. Since the tick function changes the time global variable, we must include two things in its function contract:

  • A modifies clause that lists time as one of the global variables modified by this function
  • A postcondition that describes how the value of time after the function call compares to the value of time just before the function call. The statement time == time_in + 1 means: “the value of time after the function call equals the value of time just before the function call, plus one”.

Global invariants

When we have a program with global variables that are modified by multiple functions, we often want some way to ensure that the global variables always stay within a desired range, or always maintain a particular relationship among each other. We can accomplish these goals with global invariants, which specify what must always be true about global variables.

Bank example

For example, consider the following partial program that represents a bank account:

import org.sireum.logika._

//global variables
var balance: Z = 0
var elite: B = false
val eliteMin: Z = 1000000 //$1M is the minimum balance for elite status

//global invariants
l"""{
    invariant
        //balance should be non-negative
        balance >= 0

        //elite status should reflect if balance is at least a million
        elite == (balance >= eliteMin)
}"""

def deposit(amount: Z): Unit = {
    l"""{
        //We still need to complete the function contract
    }"""

    balance = balance + amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

def withdraw(amount: Z): Unit = {
    l"""{
        //We still need to complete the function contract
    }"""

    balance = balance - amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

Here, we have three global variables: balance (the bank account balance), elite (whether or not the customer has “elite status” with the bank, which is given to customers maintaining above a certain balance threshold), and eliteMin (a value representing the minimum account balance to achieve elite status). We have two global invariants describing what must always be true about these global variables:

  • balance >= 0, which states that the account balance must never be negative
  • elite == (balance >= eliteMin), which states that the elite boolean flag should always accurately represent whether the customer’s current account balance is over the minimum threshold for elite status

Global invariants must hold before each function call

In any program with global invariants, we either must prove (in manual mode) or their must be sufficient evidence (in symexe mode) that each global invariant holds immediately before any function call (including when the program first begins, before any function call). In our bank example, we see that the global variables are initialized as follows:

var balance: Z = 0
var elite: B = false
val eliteMin: Z = 1000000

In symexe mode, there is clearly enough evidence that the global invariants all hold with those initial values – the balance is nonnegative, and the customer correctly does not have elite status (because they do not have about the $1,000,000 threshold).

Global invariants must still hold at the end of each function call

Since we must demonstrate that global invariants hold before each function call, functions themselves can assume the global invariants are true at the beginning of the function. If we were using manual mode, we could list each global invariant as a premise at the beginning of the function – much like we do with preconditions. Then, it is the job of each function to ensure that the global invariants STILL hold when the function ends. In manual mode, we would need to demonstrate that each global invariant claim globalInvariant still held in a logic block just before the end of the function:

l"""{
    //each global invariant must still hold at the end of the function

    1. globalInvariant              (some justification)
}"""

In symexe mode, we do not need to include such logic blocks, but there must be sufficient detail in the function contract to infer that each global invariant will hold no matter what at the end of the function.

Bank function contracts

Consider the deposit function in our bank example:

def deposit(amount: Z): Unit = {
    l"""{
        //We still need to complete the function contract
    }"""

    balance = balance + amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

Since deposit is modifying the global variables balance and elite, we know we must include two things in its function contract:

  • A modifies clause that lists balance and elite as global variables modified by this function
  • A postcondition that describes how the value of balance after the function call compares to the value of balance just before the function call. We want to say, balance == balance_in + amount, because the value of balance at the end of the function equals the value of balance at the beginning of the function, plus amount.

We also must consider how the elite variable changes as a result of the function call. In the code, we use an if/else statement to ensure that elite gets correctly updated if the customer’s new balance is above or below the threshold for elite status. If we were to write a postcondition that summarized how elite was updated by the function, we would write: elite == (balance >= eliteMin) to say that the value of elite after the function equaled whether the new balance was above the threshold. However, this claim is already a global invariant, which already must hold at the end of the function. We do not need to list it again as a postcondition.

Consider this potential function contract for deposit:

def deposit(amount: Z): Unit = {
    l"""{
        //this function contract is not quite correct

        modifies balance, elite
        ensures balance == balance_in + amount
    }"""

    balance = balance + amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

This function contract is close to correct, but contains a major flaw. In symexe mode, the function contract must be tight enough to guarantee that the global invariants will still hold after the function ends. Suppose balance still has its starting value of 0, and that we called deposit(-100). With no other changes, the function code would dutifully update the balance global variable to be -100…which would violate the global invariant that balance >= 0. In order to guarantee that the balance will never be negative after the deposit function ends, we must restrict the deposit amounts to be greater than or equal to 0. Since functions are can assume that the global invariants hold when they are called, we know that balance will be 0 at minimum at the beginning of deposit. If amount is also nonnegative, we can guarantee that the value of balance at the end of the deposit function will be greater than or equal to 0 – thus satisfying our global invariant.

Here is the corrected deposit function:

def deposit(amount: Z): Unit = {
    l"""{
        requires amount >= 0
        modifies balance, elite
        ensures balance == balance_in + amount
    }"""

    balance = balance + amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

We can similarly write the function contract for the withdraw function. Since withdraw is subtracting an amount from the balance, we must require that the withdraw amount be less than or equal to the account balance – otherwise, the account balance might become negative, and we would violate the global invariant. We will also require that our withdrawal amount be nonnegative, as it doesn’t make any sense to withdraw a negative amount from a bank account:

def withdraw(amount: Z): Unit = {
    l"""{
        requires amount >= 0
            amount <= balance
        modifies balance, elite
        ensures
            balance == balance_in - amount
    }"""

    balance = balance - amount
    if (balance >= eliteMin) {
        elite = true
    } else {
        elite = false
    }
}

Bank calling code

When we call a function in a program with global invariants (whether in the calling code or from another function), we must consider four things:

  • We must demonstrate that all global variables hold before the function call
  • We must demonstrate that the preconditions for the function holds
  • We can assume that all global variables hold after the function call (as the function itself if responsible for showing that the global invariants still hold just before the function ends)
  • We can assume the postcondition for the function holds after the function call

Suppose we had this test code at the end of our bank program:

deposit(500000)

//Assert will hold
assert(balance == 500000 & elite == false)

deposit(500000)

//Assert will hold
assert(balance == 1000000 & elite == true)

//Precondition will not hold
withdraw(2000000)

We already showed how our global invariants initially held for the starting values of the global variables (balance = 0 and elite = false). When we consider the first function call, deposit(500000), we can also see that the precondition holds (we are depositing a non-negative amount). The deposit postcondition tells us that the new value of balance is 500000 more than it was before the function call, so we know balance is now 500000. We can also assume that all global invariants hold after the deposit call, so we can infer that elite is still false (since the balance is not more than the threshold). Thus the next assert statement:

assert(balance == 500000 & elite == false)

will hold in Logika’s symexe mode.

The very next statement in the calling code is another call to deposit. Since we could assume the global invariants held immediately after the last call to deposit, we can infer that they still hold before the next deposit call. We also see that the function’s precondition is satisfied, as we are depositing another nonnegative value. Just as before, we can use the deposit postcondition to see that balance will be 1000000 after the next function call (the postcondition tells us that balance is 500000 more than it was just before the function call). We also know that the global invariants hold, so we are sure elite has been updated to true. Thus our next assert holds as well:

assert(balance == 1000000 & elite == true)

Our final function call, withdraw(2000000), will not be allowed. We are trying to withdraw $2,000,000, but our account balance at this point is $1,000,000. We will get an error saying that the withdraw precondition has not been satisfied, as that function requires that our withdrawal amount be less than or equal to the account balance.

Termination

What is termination?

In this section, we will consider the notion of termination – whether a function will ever finish running.

Partial correctness vs total correctness

Up to this point, we have proved partial correctness for functions – IF the function’s precondition holds, AND if it terminates, THEN we promise that its postcondition will hold.

Example of partial correctness

Consider the following version of our mult function, which uses repeated addition to multiply two numbers:

def mult(m: Z, n: Z): Z = {
    l"""{
        ensures result == m*n
    }"""

    var sum: Z = 0
    var count: Z = 0

    while (count != n) {
        l"""{
            invariant sum == m*count
            modifies sum, count
        }"""

        sum = sum + m
        count = count + 1
    }

    return sum
}

This function will be verified in Logika’s symexe mode, but in fact it has a subtle flaw. If we were to pass in -1 for our second parameter (n), then we would get stuck in an infinite loop. count would be initially 0, and we would increment count each time in the loop, but of course it NEVER equal -1.

This is an example of partial correctness – if our function DOES finish (which it would for nonnegative values of n), then we have shown it will return the correct value. We can see that we will need to require that the n parameter be nonnegative .

Total correctness definition

Total correctness goes a step further than partial correctness – it says that IF the function’s precondition holds, THEN we promise that it will terminate and that its postcondition will hold.

In order to show total correctness for our mult function, we must show that it always terminates.

Process of proving termination

We will see later in this section that the idea of termination is a much more challenging topic than it might seem. There is no button in Logika that will check for termination, but we can often insert manual assertions which, if they are verified, will guarantee termination. We will show how to create such manual assertions for simple loops that execute a set number of times.

First, we need to come up with a way to measure (as an integer) how much work the loop has left to do. Using this measure of work, we want to show two things:

  • Each loop iteration decreases the integer measure (i.e., the amount of work left to do is strictly decreasing)
  • When our integer measure is 0 or less, then we are certain that we are done (i.e., the loop exits)

Termination in mult

In our mult example, let’s first try to establish an integer measure of work for the loop. We know that the loop is computing m + m + ... + m, for a total of n additions. When count is 0, we know we have n more additions to do (n more iterations of the loop). When count is 1, we know we have n-1 more additions…and when count is n, we know that we have no more additions to do (and the loop ends). Our measure of work should be the number of additions left to do, which is:

Measure of work: n - count

We can calculate this measure at the beginning of each iteration and again at the end of each iteration:

while (count != n) {
    l"""{
        invariant sum == m*count
        modifies sum, count
    }"""

    //get measure value at beginning of iteration
    val measureBegin: Z = n-count

    sum = sum + m
    count = count + 1

    //get measure value at end of iteration
    val measureEnd: Z = n-count
}

Next, we want to assert that measureEnd < measureBegin – that the amount of work decreases with each iteration. We can also assert that measureEnd > 0 | count == n – that either we have more work to do, or our loop condition is false (meaning that if we have no more work to do, then our loop condition must be false and thus terminate):

def mult(m: Z, n: Z): Z = {
    l"""{
        requires n >= 0         //needed for termination
        ensures result == m*n
    }"""

    var sum: Z = 0
    var count: Z = 0

    while (count != n) {
        l"""{
            invariant sum == m*count
            modifies sum, count
        }"""

        //get measure value at beginning of iteration
        val measureBegin: Z = n-count

        sum = sum + m
        count = count + 1

        //get measure value at end of iteration
        val measureEnd: Z = n-count

        //we are making progress
        //the amount of work decreases with each iteration
        assert(measureEnd < measureBegin)

        //we either have more work, or the loop will terminate
        //(if there is no work work to do, then the loop condition must be false)
        assert(measureEnd > 0 | count == n)     //NOTE: will not hold!
    }

    return sum
}

If we try verifying this program in Logika, the second assert, assert(measureEnd > 0 | count == n) will not hold. To see why, let’s suppose that measureEnd <= 0. For the assert to be true, we would need to be certain that count == n (since the left side of the OR would be false). Because measureEnd = n-count, we can infer that count >= n when measureEnd <= 0. However, Logika is unable to infer that count == n from the knowledge that count >= n unless it also knows that count <= n always holds. We can add this knowledge by strengthening our loop invariant to provide a range for the loop counter – count >= 0 and count <= n. Even if not required, it is a good habit anyway to include the loop counter range as part of the loop invariant.

We strengthen our loop invariant, and both asserts will hold – thus demonstrating termination:

def mult(m: Z, n: Z): Z = {
    l"""{
        requires n >= 0         //needed for termination
        ensures result == m*n
    }"""

    var sum: Z = 0
    var count: Z = 0

    while (count != n) {
        l"""{
            invariant sum == m*count
                count >= 0
                count <= n      //bound loop counter
                                //needed for assert to hold
            modifies sum, count
        }"""

        //get measure value at beginning of iteration
        val measureBegin: Z = n-count

        sum = sum + m
        count = count + 1

        //get measure value at end of iteration
        val measureEnd: Z = n-count

        //we are making progress
        //the amount of work decreases with each iteration
        assert(measureEnd < measureBegin)

        //we either have more work, or the loop will terminate
        //(if there is no work work to do, then the loop condition must be false)
        assert(measureEnd > 0 | count == n) 
    }

    return sum
}

We could similarly use measures of work and manual assert statements to prove termination in some recursive functions. Here, we would demonstrate that a parameter value decreased with each recursive call, and that we either had more work to do or had reached the base case of our recursion (with no more recursive calls needed).

Collatz function

While it is possible to prove termination for certain kinds of programs – those that loop or make recursive calls a set number of times – it is not possible to prove termination for all programs.

Consider the collatz function below:

import org.sireum.logika._
def collatz(m: Z): Z = {
    l"""{
        requires m > 0
        ensures result == 1
    }"""

    var n: Z = m
    while (n > 1) {
        l"""{
            invariant n >= 1
            modifies n
        }"""

        if (n % 2 == 0) {
            n = n / 2
        } else {
            n = 3 * n + 1
        }
    }

    return n
}

We see that we must pass collatz a positve parameter, and that it promises to return 1 (no matter what the parameter is). It contains a loop that repeatedly modifies a current value (which is initially the parameter value):

  • If the current number is even, we divide the number by 2
  • If the current number is odd, we triple the number and add 1

Suppose we compute collatz(17). We can track the value of n as follows: 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. We can see that n does eventually reach 1, and that the program terminates in that case. We can similarly try other parameters, and will again see that we always end up with 1 (sometimes after a surprising number of iterations). But in fact:

  • No one has proved that the Collatz function terminates for all positive numbers; and
  • No one has found a positive number on which the Collatz function does not terminate

Decidability and the Halting problem

It is an obvious question whether we could write a program to check whether another program always terminates. Unfortunately, this (the Halting problem) turns out to be impossible, as was demonstrated by Alan Turing. The Halting problem is an example of an undecidable problem in computer science – a decision problem (a problem with a yes/no answer) that we can’t correctly answer one way or another on all inputs, even if we have unlimited resources.