Dynamic programming
Not to be confused with Dynamic programming language or Dynamic problem.
Dynamic programming is both a mathematical optimization method and a computer programming method.
The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics.
In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner.
While some decision problems cannot be taken apart this way, decisions that span several points in time do often break apart recursively.
Likewise, in computer science, if a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, then it is said to have optimal substructure.
If sub-problems can be nested recursively inside larger problems, so that dynamic programming methods are applicable, then there is a relation between the value of the larger problem and the values of the sub-problems.
In the optimization literature this relationship is called the Bellman equation.
Overview
Mathematical optimization
In terms of mathematical optimization, dynamic programming usually refers to simplifying a decision by breaking it down into a sequence of decision steps over time.
This is done by defining a sequence of value functions V1, V2, ..., Vn taking y as an argument representing the state of the system at times i from 1 to n. The definition of Vn(y) is the value obtained in state y at the last time n. The values Vi at earlier times i = n −1, n − 2, ..., 2, 1 can be found by working backwards, using a recursive relationship called the Bellman equation.
For i = 2, ..., n, Vi−1 at any state y is calculated from Vi by maximizing a simple function (usually the sum) of the gain from a decision at time i − 1 and the function Vi at the new state of the system if this decision is made.
Since Vi has already been calculated for the needed states, the above operation yields Vi−1 for those states.
Finally, V1 at the initial state of the system is the value of the optimal solution.
The optimal values of the decision variables can be recovered, one by one, by tracking back the calculations already performed.
Control theory
Alternatively, the continuous process can be approximated by a discrete system, which leads to a following recurrence relation analog to the Hamilton–Jacobi–Bellman equation:
Example from economics: Ramsey's problem of optimal saving
See also: Ramsey–Cass–Koopmans model
which can be simplified to
We see that it is optimal to consume a larger fraction of current wealth as one gets older, finally consuming all remaining wealth in period T, the last period of life.
Computer programming
There are two key attributes that a problem must have in order for dynamic programming to be applicable: optimal substructure and overlapping sub-problems.
If a problem can be solved by combining optimal solutions to non-overlapping sub-problems, the strategy is called "divide and conquer" instead.
This is why merge sort and quick sort are not classified as dynamic programming problems.
Optimal substructure means that the solution to a given optimization problem can be obtained by the combination of optimal solutions to its sub-problems.
Such optimal substructures are usually described by means of recursion.
For example, given a graph G=(V,E), the shortest path p from a vertex u to a vertex v exhibits optimal substructure: take any intermediate vertex w on this shortest path p. If p is truly the shortest path, then it can be split into sub-paths p1 from u to w and p2 from w to v such that these, in turn, are indeed the shortest paths between the corresponding vertices (by the simple cut-and-paste argument described in Introduction to Algorithms).
Hence, one can easily formulate the solution for finding shortest paths in a recursive manner, which is what the Bellman–Ford algorithm or the Floyd–Warshall algorithm does.
Overlapping sub-problems means that the space of sub-problems must be small, that is, any recursive algorithm solving the problem should solve the same sub-problems over and over, rather than generating new sub-problems.
For example, consider the recursive formulation for generating the Fibonacci series: Fi = Fi−1 + Fi−2, with base case F1 = F2 = 1.
Then F43 = F42 + F41, and F42 = F41 + F40.
Now F41 is being solved in the recursive sub-trees of both F43 as well as F42.
Even though the total number of sub-problems is actually small (only 43 of them), we end up solving the same problems over and over if we adopt a naive recursive solution such as this.
Dynamic programming takes account of this fact and solves each sub-problem only once.
This can be achieved in either of two ways:
- Top-down approach: This is the direct fall-out of the recursive formulation of any problem. If the solution to any problem can be formulated recursively using the solution to its sub-problems, and if its sub-problems are overlapping, then one can easily memoize or store the solutions to the sub-problems in a table. Whenever we attempt to solve a new sub-problem, we first check the table to see if it is already solved. If a solution has been recorded, we can use it directly, otherwise we solve the sub-problem and add its solution to the table.
- Bottom-up approach: Once we formulate the solution to a problem recursively as in terms of its sub-problems, we can try reformulating the problem in a bottom-up fashion: try solving the sub-problems first and use their solutions to build-on and arrive at solutions to bigger sub-problems. This is also usually done in a tabular form by iteratively generating solutions to bigger and bigger sub-problems by using the solutions to small sub-problems. For example, if we already know the values of F41 and F40, we can directly calculate the value of F42.
Some programming languages can automatically memoize the result of a function call with a particular set of arguments, in order to speed up call-by-name evaluation (this mechanism is referred to as call-by-need).
Some languages make it possible portably (e.g. Scheme, Common Lisp, Perl or D).
Some languages have automatic memoization built in, such as tabled Prolog and J, which supports memoization with the M. adverb.
In any case, this is only possible for a referentially transparent function.
Memoization is also encountered as an easily accessible design pattern within term-rewrite based languages such as Wolfram Language.
Bioinformatics
Dynamic programming is widely used in bioinformatics for the tasks such as sequence alignment, protein folding, RNA structure prediction and protein-DNA binding.
The first dynamic programming algorithms for protein-DNA binding were developed in the 1970s independently by Charles DeLisi in USA and Georgii Gurskii and Alexander Zasedatelev in USSR.
Recently these algorithms have become very popular in bioinformatics and computational biology, particularly in the studies of nucleosome positioning and transcription factor binding.
Examples: Computer algorithms
Dijkstra's algorithm for the shortest path problem
From a dynamic programming point of view, Dijkstra's algorithm for the shortest path problem is a successive approximation scheme that solves the dynamic programming functional equation for the shortest path problem by the Reaching method.
In fact, Dijkstra's explanation of the logic behind the algorithm, namely
is a paraphrasing of Bellman's famous Principle of Optimality in the context of the shortest path problem.
Fibonacci sequence
Using dynamic programming in the calculation of the nth member of the Fibonacci sequence improves its performance greatly.
Here is a naïve implementation, based directly on the mathematical definition:
Notice that if we call, say, fib(5), we produce a call tree that calls the function on the same value many different times:
- fib(5)
- fib(4) + fib(3)
- (fib(3) + fib(2)) + (fib(2) + fib(1))
- ((fib(2) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
- (((fib(1) + fib(0)) + fib(1)) + (fib(1) + fib(0))) + ((fib(1) + fib(0)) + fib(1))
In particular, fib(2) was calculated three times from scratch.
In larger examples, many more values of fib, or subproblems, are recalculated, leading to an exponential time algorithm.
Now, suppose we have a simple map object, m, which maps each value of fib that has already been calculated to its result, and we modify our function to use it and update it.
The resulting function requires only O(n) time instead of exponential time (but requires O(n) space):
This technique of saving values that have already been calculated is called memoization; this is the top-down approach, since we first break the problem into subproblems and then calculate and store values.
In the bottom-up approach, we calculate the smaller values of fib first, then build larger values from them.
This method also uses O(n) time since it contains a loop that repeats n − 1 times, but it only takes constant (O(1)) space, in contrast to the top-down approach which requires O(n) space to store the map.
In both examples, we only calculate fib(2) one time, and then use it to calculate both fib(4) and fib(3), instead of computing it every time either of them is evaluated.
A type of balanced 0–1 matrix
Checkerboard
Sequence alignment
In genetics, sequence alignment is an important application where dynamic programming is essential.
Typically, the problem consists of transforming one sequence into another using edit operations that replace, insert, or remove an element.
Each operation has an associated cost, and the goal is to find the sequence of edits with the lowest total cost.
The problem can be stated naturally as a recursion, a sequence A is optimally edited into a sequence B by either:
- inserting the first character of B, and performing an optimal alignment of A and the tail of B
- deleting the first character of A, and performing the optimal alignment of the tail of A and B
- replacing the first character of A with the first character of B, and performing optimal alignments of the tails of A and B.
The partial alignments can be tabulated in a matrix, where cell (i,j) contains the cost of the optimal alignment of A[1..i] to B[1..j].
The cost in cell (i,j) can be calculated by adding the cost of the relevant operations to the cost of its neighboring cells, and selecting the optimum.
Different variants exist, see Smith–Waterman algorithm and Needleman–Wunsch algorithm.
Tower of Hanoi puzzle
The Tower of Hanoi or Towers of Hanoi is a mathematical game or puzzle.
It consists of three rods, and a number of disks of different sizes which can slide onto any rod.
The puzzle starts with the disks in a neat stack in ascending order of size on one rod, the smallest at the top, thus making a conical shape.
The objective of the puzzle is to move the entire stack to another rod, obeying the following rules:
- Only one disk may be moved at a time.
- Each move consists of taking the upper disk from one of the rods and sliding it onto another rod, on top of the other disks that may already be present on that rod.
- No disk may be placed on top of a smaller disk.
The dynamic programming solution consists of solving the functional equation
- S(n,h,t) = S(n-1,h, not(h,t)) ; S(1,h,t) ; S(n-1,not(h,t),t)
where n denotes the number of disks to be moved, h denotes the home rod, t denotes the target rod, not(h,t) denotes the third rod (neither h nor t), ";" denotes concatenation, and
- S(n, h, t) := solution to a problem consisting of n disks that are to be moved from rod h to rod t.
For n=1 the problem is trivial, namely S(1,h,t) = "move a disk from rod h to rod t" (there is only one disk left).
The number of moves required by this solution is 2 − 1.
If the objective is to maximize the number of moves (without cycling) then the dynamic programming functional equation is slightly more complicated and 3 − 1 moves are required.
Egg dropping puzzle
The following is a description of the instance of this famous puzzle involving N=2 eggs and a building with H=36 floors:
- Suppose that we wish to know which stories in a 36-story building are safe to drop eggs from, and which will cause the eggs to break on landing (using U.S. English terminology, in which the first floor is at ground level). We make a few assumptions:
- An egg that survives a fall can be used again.
- A broken egg must be discarded.
- The effect of a fall is the same for all eggs.
- If an egg breaks when dropped, then it would break if dropped from a higher window.
- If an egg survives a fall, then it would survive a shorter fall.
- It is not ruled out that the first-floor windows break eggs, nor is it ruled out that eggs can survive the 36th-floor windows.
- If only one egg is available and we wish to be sure of obtaining the right result, the experiment can be carried out in only one way. Drop the egg from the first-floor window; if it survives, drop it from the second-floor window. Continue upward until it breaks. In the worst case, this method may require 36 droppings. Suppose 2 eggs are available. What is the lowest number of egg-droppings that is guaranteed to work in all cases?
To derive a dynamic programming functional equation for this puzzle, let the state of the dynamic programming model be a pair s = (n,k), where
- n = number of test eggs available, n = 0, 1, 2, 3, ..., N − 1.
- k = number of (consecutive) floors yet to be tested, k = 0, 1, 2, ..., H − 1.
For instance, s = (2,6) indicates that two test eggs are available and 6 (consecutive) floors are yet to be tested.
The initial state of the process is s = (N,H) where N denotes the number of test eggs available at the commencement of the experiment.
The process terminates either when there are no more test eggs (n = 0) or when k = 0, whichever occurs first.
If termination occurs at state s = (0,k) and k > 0, then the test failed.
Now, let
- W(n,k) = minimum number of trials required to identify the value of the critical floor under the worst-case scenario given that the process is in state s = (n,k).
Then it can be shown that
- W(n,k) = 1 + min{max(W(n − 1, x − 1), W(n,k − x)): x = 1, 2, ..., k }
with W(n,0) = 0 for all n > 0 and W(1,k) = k for all k. It is easy to solve this equation iteratively by systematically increasing the values of n and k.
Faster DP solution using a different parametrization
Matrix chain multiplication
Credits to the contents of this page go to the authors of the corresponding Wikipedia page: en.wikipedia.org/wiki/Dynamic programming.