Imperative Programming
In contrast to functional programming, in which you calculate a value by applying a
function to its arguments without caring how the operations are carried out, imperative
programming is closer to the machine representation, as it introduces memory state
which the execution of the program’s actions will modify. We call these actions of
programs instructions, and an imperative program is a list, or sequence, of instructions.
The execution of each operation can alter the memory state. We consider input-output
actions to be modifications of memory, video memory, or files.
This style of programming is directly inspired by assembly programming. You find it
in the earliest general-purpose programming languages (Fortran, C, Pascal, etc.). In
Objective Caml the following elements of the language fit into this model:
- modifiable data structures, such as arrays, or records with mutable fields;
- input-output operations;
- control structures such as loops and exceptions.
Certain algorithms are easier to write in this programming style. Take for instance
the computation of the product of two matrices. Even though it is certainly possible
to translate it into a purely functional version, in which lists replace vectors, this is
neither natural nor efficient compared to an imperative version.
The motivation for the integration of imperative elements into a functional language
is to be able to write certain algorithms in this style when it is appropriate. The two
principal disadvantages, compared to the purely functional style, are:
- complicating the type system of the language, and rejecting certain programs
which would otherwise be considered correct;
- having to keep track of the memory representation and of the order of calculations.
68 Chapter 3 : Imperative Programming
Nevertheless, with a few guidelines in writing programs, the choice between several
programming styles offers the greatest flexibility for writing algorithms, which is the
principal objective of any programming language. Besides, a program written in a style
which is close to the algorithm used will be simpler, and hence will have a better chance
of being correct (or at least, rapidly correctable).
For these reasons, the Objective Caml language has some types of data structures whose
values are physically modifiable, structures for controlling the execution of programs,
and an I/O library in an imperative style.
Plan of the Chapter
This chapter continues the presentation of the basic elements of the Objective Caml
language begun in the previous chapter, but this time focusing on imperative constructions.
There are five sections. The first is the most important; it presents the different
modifiable data structures and describes their memory representation. The second describes
the basic I/O of the language, rather briefly. The third section is concerned
with the new iterative control structures. The fourth section discusses the impact of
imperative features on the execution of a program, and in particular on the order of
evaluation of the arguments of a function. The final section returns to the calculator
example from the last chapter, to turn it into a calculator with a memory.
Modifiable Data Structures
Vectors, character strings, records with mutable fields, and references are the data
structures whose parts can be physically modified.
We have seen that an Objective Caml variable bound to a value keeps this value to
the end of its lifetime. You can only modify this binding with a redefinition—in which
case we are not really talking about the “same” variable; rather, a new variable of the
same name now masks the old one, which is no longer directly accessible, but which
remains unchanged. With modifiable values, you can change the value associated with
a variable without having to redeclare the latter. You have access to the value of a
variable for writing as well as for reading.
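The difference between masking a variable and modifying a value can be made concrete with a short sketch. The names get_old, get_cell, old_value and new_value are illustrative assumptions, not part of the original text.

```ocaml
(* Redefinition: the second x is a new variable that masks the first. *)
let x = 1
let get_old () = x            (* this closure captured the first binding *)
let x = 2                     (* the first x still exists, unchanged *)
let old_value = get_old ()    (* the closure still sees the old value *)

(* Modification: the same array cell is updated in place. *)
let a = [| 1 |]
let get_cell () = a.(0)
let () = a.(0) <- 2
let new_value = get_cell ()   (* the closure sees the new content *)
```

Here old_value is 1 although x now denotes 2, while new_value is 2 because the array itself was changed.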
Vectors
Vectors, or one dimensional arrays, collect a known number of elements of the same
type. You can write a vector directly by listing its values between the symbols [| and
|], separated by semicolons as for lists.
let v = [| 3.14; 6.28; 9.42 |] ;;
val v : float array = [|3.14; 6.28; 9.42|]
The creation function Array.create takes the number of elements in the vector and
an initial value, and returns a new vector.
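As a minimal sketch of creation, access and update, assuming a more recent OCaml release in which Array.make is the preferred name for the function the text calls Array.create (the names zeros and len are illustrative):

```ocaml
(* Array.make n x builds an array of n elements, all equal to x. *)
let zeros = Array.make 3 0

(* Elements are indexed from 0; v.(i) <- e updates element i in place. *)
let () = zeros.(1) <- 7

(* Array.length returns the number of elements. *)
let len = Array.length zeros
```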
val m : int array array = [|[|0; 0; 0|]; [|0; 0; 0|]; [|0; 0; 0|]|]
Figure 3.1: Memory representation of a vector sharing its elements.
If you modify one of the fields of vector v, which was used in the creation of m, then
you automatically modify all the “rows” of the matrix together (see figures 3.1 and 3.2).
v.(0) <- 1 ;;
m ;;
- : int array array = [|[|1; 0; 0|]; [|1; 0; 0|]; [|1; 0; 0|]|]
Figure 3.2: Modification of shared elements of a vector.
Duplication occurs if the initialization value of the vector (the second argument passed
to Array.create) is an atomic value and there is sharing if this value is a structured
value.
Values whose size does not exceed the standard size of Objective Caml values—that
is, the memory word—are called atomic values. These are the integers, characters,
booleans, and constant constructors. The other values (structured values) are represented
by a pointer into a memory area. This distinction is detailed in chapter 9 (page
Vectors of floats are a special case. Although floats are structured values, the creation
of a vector of floats causes the initial value to be copied. This is for reasons of
optimization. Chapter 12, on the interface with the C language (page 315), describes
this special case.
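The following sketch contrasts sharing with per-row allocation. It assumes a recent OCaml (Array.make rather than Array.create), and the names row, shared and fresh are illustrative.

```ocaml
(* A structured initial value is shared: all three cells point to row. *)
let row = Array.make 3 0
let shared = Array.make 3 row
let () = row.(0) <- 1              (* visible through every "row" of shared *)

(* Array.init calls its function once per index, so each row is distinct. *)
let fresh = Array.init 3 (fun _ -> Array.make 3 0)
let () = fresh.(0).(0) <- 1        (* only the first row changes *)
```

Physical equality (==) can be used to check the sharing: shared.(0) == shared.(2) holds, while fresh.(0) == fresh.(1) does not.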
Non-Rectangular Matrices
A matrix, a vector of vectors, does not need to be rectangular. In fact, nothing
stops you from replacing one of the vector elements with a vector of a different length.
This is useful to limit the size of such a matrix. The following value t constructs a
triangular matrix for the coefficients of Pascal’s triangle.
let t = [|
  [|1|];
  [|1; 1|];
  [|1; 2; 1|];
  [|1; 3; 3; 1|];
  [|1; 4; 6; 4; 1|];
  [|1; 5; 10; 10; 5; 1|]
|] ;;
val t : int array array = [|[|1|]; [|1; 1|]; [|1; 2; 1|]; [|1; 3; 3; 1|]; [|1; 4; 6; 4; ...|]; ...|]
t.(3) ;;
- : int array = [|1; 3; 3; 1|]
In this example, the element of vector t with index i is a vector of integers with size
i + 1. To manipulate such matrices, you have to calculate the size of each element
vector.
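As a sketch of what computing the size of each element vector looks like in practice, the functions below build such a triangular matrix and traverse it. The names pascal and sum_all are hypothetical, and Array.make is assumed in place of Array.create.

```ocaml
(* Build the first n rows of Pascal's triangle; row i has i + 1 entries. *)
let pascal n =
  let t = Array.make n [||] in
  for i = 0 to n - 1 do
    let row = Array.make (i + 1) 1 in       (* both ends stay at 1 *)
    for j = 1 to i - 1 do
      row.(j) <- t.(i - 1).(j - 1) + t.(i - 1).(j)
    done;
    t.(i) <- row
  done;
  t

(* Sum every coefficient, asking each row for its own length. *)
let sum_all t =
  let s = ref 0 in
  for i = 0 to Array.length t - 1 do
    for j = 0 to Array.length t.(i) - 1 do
      s := !s + t.(i).(j)
    done
  done;
  !s
```

The inner loop bound is Array.length t.(i), not a fixed column count, which is the point of the remark above.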
Copying Vectors
When you copy a vector, or when you concatenate two vectors, the result obtained is
a new vector. A modification of the original vectors does not result in the modification
of the copies, unless, as usual, there are shared values.
let v2 = Array.copy v ;;
val v2 : int array = [|1; 0; 0|]
let m2 = Array.copy m ;;
val m2 : int array array = [|[|1; 0; 0|]; [|1; 0; 0|]; [|1; 0; 0|]|]
v.(1) <- 352 ;;
v2 ;;
- : int array = [|1; 0; 0|]
m2 ;;
- : int array array = [|[|1; 352; 0|]; [|1; 352; 0|]; [|1; 352; 0|]|]
We notice in this example that copying m only copies the pointers to v. If one of the
elements of v is modified, m2 is modified too.
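If independent rows are wanted, one possible approach (an assumption of this sketch, not something the text prescribes) is to copy each row as well as the outer vector. The names shallow and deep are illustrative.

```ocaml
let v = [| 1; 0; 0 |]
let m = Array.make 3 v

(* Array.copy duplicates only the outer vector: the rows are still v. *)
let shallow = Array.copy m

(* Mapping Array.copy over the rows duplicates each row too. *)
let deep = Array.map Array.copy m

let () = v.(1) <- 352
```

After the assignment, shallow reflects the change to v, while deep keeps the values it had when it was built.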
Concatenation creates a new vector whose size is equal to the sum of the sizes of the
two others.
let mm = Array.append m m ;;
val mm : int array array = [|[|1; 352; 0|]; [|1; 352; 0|]; [|1; 352; 0|]; [|1; 352; 0|]; [|1; 352; ...|]; ...|]
Array.length mm ;;
Mutable Fields of Records
Fields of a record can be declared mutable. All you have to do is to show this in the
declaration of the type of the record using the keyword mutable.
Syntax : type name = {... ; mutable namei : t ;... }
Here is a small example defining a record type for points in the plane:
type point = { mutable xc : float; mutable yc : float } ;;
type point = { mutable xc: float; mutable yc: float }
let p = { xc = 1.0; yc = 0.0 } ;;
val p : point = {xc=1; yc=0}
Thus the value of a field which is declared mutable can be modified using the syntax:
Syntax : expr1 . name <- expr2
The expression expr1 should be of a record type which has the field name. The modification
operator returns a value of type unit.
p.xc <- 3.0 ;;
p ;;
We can write a function for moving a point by modifying its components. We use a
local declaration with pattern matching in order to sequence the side-effects.
let moveto p dx dy =
  let () = p.xc <- p.xc +. dx
  in p.yc <- p.yc +. dy ;;
val moveto : point -> float -> float -> unit = <fun>
moveto p 1.1 2.2 ;;
p ;;
- : point = {xc=4.1; yc=2.2}
It is possible to mix mutable and non-mutable fields in the definition of a record. Only
those specified as mutable may be modified.
type t = { c1 : int; mutable c2 : int } ;;
type t = { c1: int; mutable c2: int }
let r = { c1 = 0; c2 = 0 } ;;
val r : t = {c1=0; c2=0}
r.c1 <- 1 ;;
Characters 0-9: The label c1 is not mutable
r.c2 <- 1 ;;
r ;;
On page 82 we give an example of using records with modifiable fields and arrays to
implement a stack structure.
References
Objective Caml provides a polymorphic type ref which can be seen as the type of a
pointer to any value; in Objective Caml terminology we call it a reference to a value.
A referenced value can be modified. The type ref is defined as a record with one
modifiable field:
type 'a ref = {mutable contents:'a}
This type is provided as a syntactic shortcut. We construct a reference to a value using
the function ref. The referenced value can be reached using the prefix function (!).
The function modifying the content of a reference is the infix function (:=).
let x = ref 3 ;;
val x : int ref = {contents=3}
x ;;
!x ;;
x := 4 ;;
!x ;;
x := !x+1 ;;
!x ;;
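Because a reference is a record with one mutable field, the operators can be read as shorthand for field access. The sketch below illustrates this; the names via_bang and via_field are illustrative, and incr is the standard library function that adds one to an integer reference.

```ocaml
let x = ref 3
let via_bang = !x                        (* reads the content *)
let via_field = x.contents               (* same read, through the field *)

let () = x := !x + 1                     (* content is now 4 *)
let () = x.contents <- x.contents * 2    (* same kind of update: now 8 *)
let () = incr x                          (* now 9 *)
```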
Polymorphism and Modifiable Values
The type ref is parameterized. This is what lets us use it to create references to values
of any type whatever. However, it is necessary to place certain restrictions on the type
of referenced values; we cannot allow the creation of a reference to a value with a
polymorphic type without taking some precautions.
Let us suppose that there were no restriction; then someone could declare:
let x = ref [] ;;
Likewise, when you apply a polymorphic function to a polymorphic value, you get a
weak type variable, because you must not exclude the possibility that the function may
construct physically modifiable values. In other words, the result of the application is
always monomorphic.
(function x -> x) [] ;;
You get the same result with partial application:
let f a b = a ;;
val f : 'a -> 'b -> 'a = <fun>
let g = f 1 ;;
val g : '_a -> int = <fun>
To get a polymorphic type back, you have to abstract the second argument of f and
then apply it:
let h x = f 1 x ;;
val h : 'a -> int = <fun>
In effect, the expression which defines h is the functional expression function x ->
f 1 x. Its evaluation produces a closure which does not risk producing a side effect,
because the body of the function is not evaluated.
In general, we distinguish so-called “non-expansive” expressions, whose calculation we
are sure carries no risk of causing a side effect, from other expressions, called “expansive.”
Objective Caml’s type system classifies expressions of the language according to
their syntactic form:
- “non-expansive” expressions include primarily variables, constructors of non-mutable
values, and abstractions;
- “expansive” expressions include primarily applications and constructors of modifiable
values. We can also include here control structures like conditionals and
pattern matching.
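The practical consequence of this classification can be sketched as follows: an abstraction stays polymorphic and can be used at several types, whereas the result of an application gets a weak type variable that the first use fixes. The names r1, r2 and r3 are illustrative; the exact way weak variables are printed differs between versions of the system.

```ocaml
let f a _b = a             (* 'a -> 'b -> 'a *)

(* Abstraction: non-expansive, so h keeps a polymorphic argument type. *)
let h x = f 1 x
let r1 = h "text"
let r2 = h 3.14            (* h used at a second, different type *)

(* Application: expansive, so the argument type of g is a weak variable. *)
let g = f 1
let r3 = g true            (* this first use fixes g to bool -> int *)
(* let r4 = g "text"       would now be rejected by the type checker *)
```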
Input-Output
Input-output functions do calculate a value (often of type unit) but during their
calculation they cause a modification of the state of the input-output peripherals:
modification of the state of the keyboard buffer, outputting to the screen, writing
in a file, or modification of a read pointer. The following two types are predefined:
in_channel and out_channel for, respectively, input channels and output channels.
When an end of file is met, the exception End_of_file is raised. Finally, the following
three constants correspond to the standard channels for input, output, and error in
Unix fashion: stdin, stdout, and stderr.
Channels
The input-output functions from the Objective Caml standard library manipulate communication
channels: values of type in_channel or out_channel. Apart from the three
standard predefined values, the creation of a channel uses one of the following functions:
open_in ;;
- : string -> in_channel = <fun>
open_out ;;
- : string -> out_channel = <fun>
open_in opens the file if it exists^2, and otherwise raises the exception Sys_error.
open_out creates the specified file if it does not exist or truncates it if it does.
let ic = open_in "koala" ;;
val ic : in_channel = <abstr>
let oc = open_out "koala" ;;
val oc : out_channel = <abstr>
The functions for closing channels are:
close_in ;;
close_out ;;
Reading and Writing
The most general functions for reading and writing are the following:
input_line ;;
- : in_channel -> string = <fun>
input ;;
- : in_channel -> string -> int -> int -> int = <fun>
output ;;
- : out_channel -> string -> int -> int -> unit = <fun>
- input_line ic: reads from input channel ic all the characters up to the first
carriage return or end of file, and returns them in the form of a character string
(excluding the carriage return).
- input ic s p l: attempts to read l characters from an input channel ic and
stores them in the string s starting from the p-th character. The number of characters
actually read is returned.
- output oc s p l: writes on an output channel oc part of the string s, starting at
the p-th character, with length l.
2. With appropriate read permissions, that is.
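A minimal round trip through a file, under some assumptions: the sketch uses a temporary file created with Filename.temp_file, writes with output_string rather than output, and relies on the match ... with exception form of more recent OCaml releases (where input and output operate on byte sequences rather than strings). The names path, read_lines and lines are illustrative.

```ocaml
(* Write two lines to a temporary file, then read them back. *)
let path = Filename.temp_file "koala" ".txt"

let () =
  let oc = open_out path in            (* creates or truncates the file *)
  output_string oc "first line\n";
  output_string oc "second line\n";
  close_out oc                         (* flushes and closes the channel *)

(* Collect lines with input_line until End_of_file is raised. *)
let read_lines file =
  let ic = open_in file in
  let rec loop acc =
    match input_line ic with
    | line -> loop (line :: acc)
    | exception End_of_file -> close_in ic; List.rev acc
  in
  loop []

let lines = read_lines path
let () = Sys.remove path               (* clean up the temporary file *)
```

The strings returned by input_line do not include the end-of-line character.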
          let () = print_string "Higher"
          in print_newline ()
        else
          let () = print_string "Lower"
          in print_newline ()
      in hilo n ;;
val hilo : int -> unit = <fun>
Here is an example session:
# hilo 64;;
type a number: 88
Lower
type a number: 44
Higher
type a number: 64
BRAVO
Control Structures
Input-output and modifiable values produce side-effects. Their use is made easier by
an imperative programming style furnished with new control structures. We present in
this section the sequence and iteration structures.
We have already met the conditional control structure on page 18, whose abbreviated
form if ... then is modeled on the imperative world. We will write, for example:
let n = ref 1 ;;
val n : int ref = {contents=1}
if !n > 0 then n := !n - 1 ;;
Sequence
The first of the typically imperative structures is the sequence. This permits the left-to-right
evaluation of a sequence of expressions separated by semicolons.
Syntax : expr1 ; ... ; exprn
A sequence of expressions is itself an expression, whose value is that of the last expression
in the sequence (here, exprn). Nevertheless, all the expressions are evaluated, and
in particular their side-effects are taken into account.
print_string "2 = "; 1+1 ;;
2 = - : int = 2
With side-effects, we get back the usual construction of imperative languages.
let x = ref 1 ;;
val x : int ref = {contents=1}
x:=!x+1 ; x:=!x*4 ; !x ;;
As the value preceding a semicolon is discarded, Objective Caml gives a warning when
it is not of type unit.
print_int 1 ; 2 ; 3 ;;
Characters 14-15: Warning: this expression should have type unit.
1- : int = 3
To avoid this message, you can use the function ignore:
print_int 1 ; ignore 2 ; 3 ;;
1- : int = 3
A different message is obtained if the value has a functional type, as Objective Caml
suspects that you have forgotten a parameter of a function.
let g x y = x := y ;;
val g : 'a ref -> 'a -> unit = <fun>
let a = ref 10 ;;
val a : int ref = {contents=10}
let u = 1 in g a ; g a u ;;
Characters 13-16: Warning: this function application is partial, maybe some arguments are missing.
let u = !a in ignore (g a) ; g a u ;;
As a general rule we parenthesize sequences to clarify their scope. Syntactically,
parenthesizing can take two forms:
Syntax : ( expr )
Syntax : begin expr end
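Why the scope matters can be seen with a conditional: without parentheses, only the first expression of a sequence belongs to the then branch. The sketch below records what runs in a buffer; the names log, unparenthesized, parenthesized and result are illustrative.

```ocaml
let log = Buffer.create 16

(* Parsed as (if c then add "a"); add "b": the second call always runs. *)
let unparenthesized c =
  if c then Buffer.add_string log "a"; Buffer.add_string log "b"

(* begin ... end places the whole sequence under the then branch. *)
let parenthesized c =
  if c then begin
    Buffer.add_string log "a";
    Buffer.add_string log "b"
  end

let () = unparenthesized false     (* appends "b" *)
let () = parenthesized false       (* appends nothing *)
let result = Buffer.contents log
```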
We can now write the Higher/Lower program from page 78 more naturally:
let rec hilo n =
  print_string "type a number: ";
  let i = read_int () in
It is important to understand that loops are expressions like the previous ones which
calculate the value () of type unit.
let f () = print_string "-- end\n" ;;
val f : unit -> unit = <fun>
f (for i=1 to 10 do print_int i; print_string " " done) ;;
1 2 3 4 5 6 7 8 9 10 -- end
Note that the string "-- end\n" is output after the integers from 1 to 10 have been
printed: this is a demonstration that the arguments (here the loop) are evaluated before
being passed to the function.
In imperative programming, the body of a loop (expr2) does not calculate a value, but
advances by side effects. In Objective Caml, when the body of a loop is not of type
unit the compiler prints a warning, as for the sequence:
let s = [5; 4; 3; 2; 1; 0] ;;
val s : int list = [5; 4; 3; 2; 1; 0]
for i=0 to 5 do List.tl s done ;;
Characters 17-26: Warning: this expression should have type unit.
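As a sketch of loops that advance by side effects, the two functions below compute the same sum with a for loop and a while loop, and the last line shows one way to discard a non-unit body value with ignore. The names sum_for and sum_while are illustrative.

```ocaml
(* Sum 1..n with a for loop; the body only updates the reference s. *)
let sum_for n =
  let s = ref 0 in
  for i = 1 to n do s := !s + i done;
  !s

(* The same computation with a while loop and an explicit counter. *)
let sum_while n =
  let s = ref 0 and i = ref 1 in
  while !i <= n do
    s := !s + !i;
    incr i
  done;
  !s

(* A body that computes a value: ignore gives it type unit. *)
let () = for _i = 0 to 5 do ignore (List.tl [1; 2]) done
```

In both functions the loop itself has type unit; the result is read from the reference after the loop.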
Example: Implementing a Stack
The data structure 'a stack will be implemented in the form of a record containing
an array of elements and the first free position in this array. Here is the corresponding
type:
type 'a stack = { mutable ind:int; size:int; mutable elts : 'a array } ;;
The field size contains the maximal size of the stack.
The operations on these stacks will be init_stack for the initialization of a stack,
push for pushing an element onto a stack, and pop for returning the top of the stack
and popping it off.
let init_stack n = {ind=0; size=n; elts =[||]} ;;
val init_stack : int -> 'a stack = <fun>
This function cannot create a non-empty array, because you would have to provide it
with the value with which to construct it. This is why the field elts gets an empty
array.
Two exceptions are declared to guard against attempts to pop an empty stack or to
add an element to a full stack. They are used in the functions pop and push.
exception Stack_empty ;;
exception Stack_full ;;
let pop p =
  if p.ind = 0 then raise Stack_empty
  else (p.ind <- p.ind - 1; p.elts.(p.ind)) ;;
val pop : 'a stack -> 'a = <fun>
let push e p =
  if p.elts = [||] then
    (p.elts <- Array.create p.size e;
     p.ind <- 1)
  else if p.ind >= p.size then raise Stack_full
  else (p.elts.(p.ind) <- e; p.ind <- p.ind + 1) ;;
val push : 'a -> 'a stack -> unit = <fun>
Here is a small example of the use of this data structure:
let p = init_stack 4 ;;
val p : '_a stack = {ind=0; size=4; elts=[||]}
push 1 p ;;
for i = 2 to 5 do push i p done ;;
Uncaught exception: Stack_full
p ;;
- : int stack = {ind=4; size=4; elts=[|1; 2; 3; 4|]}
pop p ;;
pop p ;;
If we want to prevent raising the exception Stack_full when attempting to add an
element to the stack, we can enlarge the array. To do this the field size must be
modifiable too:
type 'a stack =
  {mutable ind:int ; mutable size:int ; mutable elts : 'a array} ;;
let init_stack n = {ind=0; size=max n 1 ; elts = [||]} ;;
let n_push e p =
  if p.elts = [||] then
    begin
      p.elts <- Array.create p.size e;
      p.ind <- 1
    end
  else if p.ind >= p.size then
    begin
      let nt = 2 * p.size in
      let nv = Array.create nt e in
      for j=0 to p.size-1 do nv.(j) <- p.elts.(j) done ;
      p.elts <- nv;
      p.size <- nt;
      p.ind <- p.ind + 1
    end
  else
The sum of two matrices a and b is a matrix c such that $c_{ij} = a_{ij} + b_{ij}$.
let add_mat p q =
  if p.n = q.n && p.m = q.m then
    let r = create_mat p.n p.m in
    for i = 0 to p.n-1 do
      for j = 0 to p.m-1 do
        mod_mat r i j (p.t.(i).(j) +. q.t.(i).(j))
      done
    done ;
    r
  else failwith "add_mat : dimensions incompatible" ;;
val add_mat : mat -> mat -> mat = <fun>
add_mat a a ;;
- : mat = {n=3; m=3; t=[|[|0; 0; 0|]; [|0; 4; 2|]; [|0; 2; 0|]|]}
The product of two matrices a and b is a matrix c such that $c_{ij} = \sum_{k=1}^{m_a} a_{ik} \cdot b_{kj}$
let mul_mat p q =
  if p.m = q.n then
    let r = create_mat p.n q.m in
    for i = 0 to p.n-1 do
      for j = 0 to q.m-1 do
        let c = ref 0.0 in
        for k = 0 to p.m-1 do
          c := !c +. (p.t.(i).(k) *. q.t.(k).(j))
        done;
        mod_mat r i j !c
      done
    done;
    r
  else failwith "mul_mat : dimensions incompatible" ;;
val mul_mat : mat -> mat -> mat = <fun>
mul_mat a a ;;
- : mat = {n=3; m=3; t=[|[|0; 0; 0|]; [|0; 5; 2|]; [|0; 2; 1|]|]}
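To try the product on its own, here is a self-contained sketch. The definitions of the type mat and of create_mat and mod_mat are not shown in this section, so the versions below are assumptions inferred from how the fields n, m and t are used; Array.make_matrix is the standard library function that allocates distinct rows. The matrix a is rebuilt so that its entries are consistent with the results displayed above.

```ocaml
(* Assumed representation: n rows, m columns, t holds the coefficients. *)
type mat = { n : int; m : int; t : float array array }

let create_mat n m = { n; m; t = Array.make_matrix n m 0.0 }
let mod_mat a i j e = a.t.(i).(j) <- e

(* c.(i).(j) is the sum over k of p.(i).(k) *. q.(k).(j). *)
let mul_mat p q =
  if p.m <> q.n then failwith "mul_mat : dimensions incompatible";
  let r = create_mat p.n q.m in
  for i = 0 to p.n - 1 do
    for j = 0 to q.m - 1 do
      let c = ref 0.0 in
      for k = 0 to p.m - 1 do
        c := !c +. p.t.(i).(k) *. q.t.(k).(j)
      done;
      mod_mat r i j !c
    done
  done;
  r

(* Rows of a: [0 0 0], [0 2 1], [0 1 0]. *)
let a = create_mat 3 3
let () = mod_mat a 1 1 2.0; mod_mat a 1 2 1.0; mod_mat a 2 1 1.0
let aa = mul_mat a a
```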
Order of Evaluation of Arguments
In a pure functional language, the order of evaluation of the arguments does not matter.
As there is no modification of memory state and no interruption of the calculation, there
is no risk of the calculation of one argument influencing another. On the other hand, in
Objective Caml, where there are physically modifiable values and exceptions, there is
a danger in not taking account of the order of evaluation of arguments. The following
example is specific to version 2.04 of Objective Caml for Linux on Intel hardware:
let new_print_string s = print_string s; String.length s ;;
val new_print_string : string -> int = <fun>
(+) (new_print_string "Hello ") (new_print_string "World!") ;;
World!Hello - : int = 12
The printing of the two strings shows that the second string is output before the first.
It is the same with exceptions:
try (failwith "function") (failwith "argument") with Failure s -> s ;;
If you want to specify the order of evaluation of arguments, you have to make local
declarations forcing this order before calling the function. So the preceding example
can be rewritten like this:
let e1 = (new_print_string "Hello ")
in let e2 = (new_print_string "World!")
in (+) e1 e2 ;;
Hello World!- : int = 12
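The same technique can be checked without looking at the terminal by recording the side effects in a buffer instead of printing them. This is only a sketch; the names log, note, total and order are illustrative.

```ocaml
let log = Buffer.create 16

(* Like new_print_string, but appends to a buffer instead of printing. *)
let note s = Buffer.add_string log s; String.length s

(* Each let ... in is completed before the next one starts, so the
   order of the two calls is fixed regardless of how (+) is evaluated. *)
let total =
  let e1 = note "Hello " in
  let e2 = note "World!" in
  e1 + e2

let order = Buffer.contents log
```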
In Objective Caml, the order of evaluation of arguments is not specified. As it happens,
the implementation used for the example above evaluates arguments from right to left: the
second string is printed first. All the same, making use of this implementation feature could
turn out to be dangerous if future versions of the language modify the implementation.
We come back to the eternal debate over the design of languages. Should certain features
of the language be deliberately left unspecified, with programmers asked
not to use them, on pain of getting different results from their program according to
the compiler implementation? Or should everything be specified, with programmers
allowed to use the whole language, at the price of complicating compiler implementation
and forbidding certain optimizations?
Calculator With Memory
We now reuse the calculator example described in the preceding chapter, but this
time we give it a user interface, which makes our program more usable as a desktop
calculator. The interface is a loop that lets the user enter operations directly and see the
results displayed without having to explicitly apply a transition function for each keypress.
We attach four new keys: C, which resets the display to zero; M, which memorizes a
result; m, which recalls this memory; and OFF, which turns off the calculator. This
corresponds to the following type:
type key = Plus | Minus | Times | Div | Equals | Digit of int
         | Store | Recall | Clear | Off ;;
It is necessary to define a translation function from characters typed on the keyboard
to values of type key. The exception Invalid_key handles the case of characters that
do not represent any key of the calculator. The function code of module Char translates
a character to its ASCII-code.
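One possible shape for such a translation function is sketched below. The function name translation and the choice of the characters 'o' and 'O' for the OFF key are assumptions of this sketch; the digit case uses Char.code to turn a character into the integer it denotes.

```ocaml
type key = Plus | Minus | Times | Div | Equals | Digit of int
         | Store | Recall | Clear | Off

exception Invalid_key

let translation c = match c with
  | '+' -> Plus
  | '-' -> Minus
  | '*' -> Times
  | '/' -> Div
  | '=' -> Equals
  | 'C' -> Clear
  | 'M' -> Store
  | 'm' -> Recall
  | 'o' | 'O' -> Off                    (* assumed characters for OFF *)
  | '0' .. '9' -> Digit (Char.code c - Char.code '0')
  | _ -> raise Invalid_key
```

For instance, translation '7' yields Digit 7, and a character such as '?' raises Invalid_key.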