As with any power that we are given, it is one thing to know it's there, it is another thing to understand it and yet another to use it effectively in your programs without abusing it. In this post I would like to go through a great example of this power and its correct use.
The example is taken from the book Structure and Interpretation of Computer Programs (SICP) section 1.3.3 Procedures as General Methods. The example code is written in Clojure instead of Scheme. All of my example code is a Clojure version of the examples in that section of the book so that you will be able to compare implementations. This whole exercise is not meant to be practical but illustrative. I am not trying to write the most efficient and robust code. The main point of this post is to show a style of programming that is enabled by the skilled use of first-class functions which allows the developer to work at the correct level of abstraction and create the maximum amount of reusable code.
Before we dive in, we will need a few helper functions to simplify the code below.
(def tolerance 0.001)
(defn sin [x]
(defn cos [x]
(defn square [x]
(* x x))
(defn cube [x]
(* x x x))
(defn average [x y]
(/ (+ x y) 2.0))
(defn abs [x]
(if (< x 0) (- x) x))
What Are First-Class Functions?
I wish I had studied SICP as a student. It would have saved me a lot of frustration. In that book, first-class elements of a language are described as elements which may be:
named by variables
passed as arguments to procedures
returned as the results of procedures
included in data structures
Below we will look at the use of the first three features.
A Function to Find the Square Root of a Number
Imagine that we are creating some software which will need to solve various types of math problems. We may decide that we need to function to calculate the square root of a number and therefore write the following code.
(defn sqrt [x]
(loop [guess 1.0]
(let [good-enough? #(< (abs (- (square %) x)) tolerance)
improve #(average % (/ x %))]
(if (good-enough? guess)
(recur (improve guess))))))
That’s not bad. It’s efficient and fairly clear. We may, at this point, decide that we are done and move on the next problem. If we program in this manner, we are going to end up with a lot more code than we actually need. The problem with this code is that there are hidden patterns which can be extracted, named and reused. Once we do this, we will not only have all of that reusable code, but we will also have a cleaner version of
sqrtand more readable code in general.
Finding the Fixed Point of a Function
Since we are writing a program that is going to be solving math problems, we may also need to find the fixed point of a function. From section 1.3.3 of SICP we have the definition of a fixed point of a function: “A number
xis called a fixed point of a function
xsatisfies the equation
f(x) = x. For some functions
fwe can locate a fixed point by beginning with an initial guess and applying
We can write a function that does this...
(defn fixed-point [f first-guess]
(loop [guess first-guess]
(let [next (f guess)]
(if (close-enough? guess next)
...and try it out by finding solutions to y = cos y and y = sin y + cos y.
; find solution to y = cos y
(fixed-point cos 1.0)
; find solution to y = sin y + cos y
(fixed-point #(+ (sin %) (cos %)) 1.0)
This is a useful technique which we are sure to use throughout our software.
What does this have to do with square roots? Well, if we have a number
xthen we are looking for a number
y * y = x. If we divide both sides of this equation by y we get
y = x / y. Given an
x, we need to find a solution to
y = x / yand we may be able to do this with our new
If we try this...
(defn sqrt [x]
(fixed-point #(/ x %) 1.0))
...it will not work. If we apply that function to a value of
xover and over again we just oscillate between two values forever.
To fix the above problem we can do this.
(defn sqrt [x]
(fixed-point #(average % (/ x %)) 1.0))
Instead of using the new value of the function as the new guess, we use the average of the old guess and the new value of the function. This is called average damping. We could again stop here. But average damping is a concept that we may use again in our software. We can extract this into a method...
(defn average-damp [f]
#(average % (f %)))
...and rewrite our
sqrtfunction as shown below.
"solve y -> x / y to find the square root"
(fixed-point (average-damp #(/ x %)) 1.0))
We could also create a cube-root function using the same pattern but finding a solution to
y = x / (y * y).
"solve y -> x / (y * y) to find the cube root"
(fixed-point (average-damp #(/ x (square %))) 1.0))
average-dampfunction is a function that returns a function. You pass it a function and it returns an average damped version of that function. We now have another powerful concept that we can use anywhere thanks to first-class functions. Also, our implementation of sqrt is much better. As stated in SICP “Notice how this formulation makes explicit the three ideas in the method: fixed-point search, average-damping, and the function
y -> x / y.”
If we are going to be finding solutions to equations then we will surely be using Newton’s method at some point. From SICP we have this description of Newton’s method.
x -> g(x)is a differentiable function, then a solution of the equation
g(x) = 0is a fixed point of the function
x -> f(x)where
Dg(x)is the derivative of
Notice that Newton’s method will use the fixed-point function that we have created. Nice!
To implement Newton’s method we will need a derivative function. Notice that the derivative function will need to be a function that takes a function
gand returns a function of
Here is my Clojure implementation.
(defn deriv [g]
(let [dx 0.00001]
#(/ (- (g (+ % dx)) (g %)) dx)))
We now use this to create the
newton-transformfunction which is also a high-order function...
(defn newton-transform [g]
#(- % (/ (g %) ((deriv g) %))))
...and finally we create the
(defn newtons-method [g guess]
(fixed-point (newton-transform g) guess))
Newton’s method can find the solution to the equation
g(x) = 0. If we wanted to use Newton’s method to find square roots then we would need to find the
(y * y) - x = 0.
"find the fixed point of y -> (y * y) - x"
(newtons-method #(- (square %) x) 1.0))
The Final Abstraction
Notice that our two different implementations of
sqrtfollow a pattern: find the fixed point of a transformed function based on an initial guess. Let’s extract that pattern for reuse.
(defn fixed-point-of-transform [g transform guess]
(fixed-point (transform g) guess))
We can now use this to implement
sqrtusing two different functions and function transformations.
(defn sqrt [x]
(fixed-point-of-transform #(/ x %)
(defn sqrt [x]
(fixed-point-of-transform #(- (square %) x)
These are clear implementations of
sqrtusing high level concepts that are reusable in many areas of our software.
Think about the simplicity, conciseness and maintainability of the software that you could write if you could master this kind of programming.
I have already noticed a big difference in the kind of code that I write as I have started to use Clojure. Many of the most commonly used concepts have already been extracted to functions and are included in the Clojure API. Someone who has internalized this API and knows how to use it correctly can do amazing things with very little code. For exmaple, take a look at this code from a blog post by Tim Bray entitled Idiomatic Clojure.
"Return a map of pages to hit counts in filename."
(apply merge-with +
(line-seq (reader filename))))))
In another post entitled Eleven Theses on Clojure Tim, who is otherwise inclined to like Clojure, writes that it is a handicap that Clojure is a Lisp and that some human minds may react poorly to the
find-widelyfunction above. This is true for a reader who is new to Clojure and not already familiar with the concepts that are being used. But for someone who has internalized the API, this is more readable than most other ways you could code this same functionality. This is made possible by embracing and mastering the use of first-class functions.