An Introduction to Optimization, 4th Edition: Table of Contents
The book guides and leads the reader through the learning path. Basic definitions and notation are provided, in addition to the related fundamental background in linear algebra, geometry, and calculus. This new edition explores the essential topics of unconstrained optimization problems, linear programming problems, and nonlinear constrained optimization. The authors also present an optimization perspective on global search methods and include discussions on genetic algorithms, particle swarm optimization, and the simulated annealing algorithm. Featuring an elementary introduction to artificial neural networks, convex optimization, and multi-objective optimization, the Fourth Edition also offers: a new chapter on integer programming; expanded coverage of one-dimensional methods; updated and expanded sections on linear matrix inequalities; numerous new exercises at the end of each chapter; MATLAB exercises and drill problems to reinforce the discussed theory and algorithms; numerous diagrams and figures that complement the written presentation of key concepts; and MATLAB M-files for implementation of the discussed theory and algorithms, available via the book's website. An Introduction to Optimization, Fourth Edition is an ideal textbook for courses on optimization theory and methods.
In addition, the book is a useful reference for professionals in mathematics, operations research, electrical engineering, economics, statistics, and business.
With respect to this basis, the matrix A is diagonal. Taking the inner product of Ax with x yields one expression; on the other hand, another follows from the definition of the inner product on R^n. We prove the result for the case when the n eigenvalues are distinct. For a general proof, see [62, p. If the basis {v1, v2, …, vn} is normalized so that each element has unit norm, then defining the matrix T whose columns are v1, …, vn, we have T^T T = I, and hence T^T = T^{-1}. A matrix whose transpose is its inverse is said to be an orthogonal matrix. Furthermore, the dimension of a subspace is equal to the maximum number of linearly independent vectors in it. The orthogonal complement of a subspace is also a subspace see Exercise 3. We call the representation above the orthogonal decomposition of x with respect to that subspace. In the subsequent discussion we use the following notation. Let the range, or image, of A be denoted R(A), and the nullspace, or kernel, of A be denoted N(A). Note that R(A) and N(A) are subspaces see Exercise 3.
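As a small numerical check (my own Python sketch; the matrix A and its eigenvectors are an invented 2 × 2 example, not one from the text), the matrix T whose columns are normalized eigenvectors of a symmetric A satisfies T^T T = I and diagonalizes A:

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]      # symmetric; eigenvalues 1 and 3
s = 1 / math.sqrt(2)
T = [[s, s], [-s, s]]             # columns are the normalized eigenvectors

def matmul(X, Y):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Tt = [[T[j][i] for j in range(2)] for i in range(2)]   # transpose of T
D = matmul(Tt, matmul(A, T))      # T^T A T: should be diag(1, 3)
I = matmul(Tt, T)                 # T^T T: should be the identity, so T^T = T^{-1}
print(D, I)
```

Here T^T A T recovers the diagonal matrix of eigenvalues, confirming that T is orthogonal.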
Hence, x A. Recall that the minors of a matrix Q are the determinants of the matrices obtained by successively removing rows and columns from Q. The principal minors are det Q itself and the determinants of matrices obtained by successively removing an ith row and an ith column. The leading principal minors are det Q and the minors obtained by successively removing the last row and the last column. Let {e1, e2, …, en} be the natural basis for R^n, and let x be a given vector in R^n. Let {v1, v2, …, vn} be another basis for R^n. Then, the vector x is represented in the new basis through the corresponding change of coordinates, and the quadratic form can be written accordingly in the new coordinates. Hence, if the quadratic form is positive definite, then all leading principal minors must be positive. A necessary condition for a real quadratic form to be positive semidefinite is that the leading principal minors be nonnegative.
However, this is not a sufficient condition see Exercise 3. In fact, a real quadratic form is positive semidefinite if and only if all principal minors are nonnegative for a proof of this fact, see [44, p. A symmetric matrix Q is said to be positive definite if the quadratic form x^T Qx is positive definite. The symmetric matrix Q is indefinite if it is neither positive semidefinite nor negative semidefinite. An alternative method involves checking the eigenvalues of Q, as stated below. For this, the result follows. For this, we use T as above and define a suitable matrix, which is easily verified to have the desired properties. In summary, we have presented two tests for definiteness of quadratic forms and symmetric matrices. We point out again that nonnegativity of leading principal minors is a necessary but not a sufficient condition for positive semidefiniteness. Because the set of m × n matrices can be viewed as the real vector space R^{mn}, matrix norms should be no different from regular vector norms.
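The leading-principal-minor test can be coded directly. The following is a minimal Python sketch of my own (the matrix Q is a hypothetical example, not one from the text):

```python
def leading_principal_minors(Q):
    """Determinants of the top-left 1x1, 2x2, ..., nxn submatrices of Q."""
    def det(M):
        if len(M) == 1:
            return M[0][0]
        # Laplace expansion along the first row
        return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
                   for j in range(len(M)))
    return [det([row[:k] for row in Q[:k]]) for k in range(1, len(Q) + 1)]

Q = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]

minors = leading_principal_minors(Q)
print(minors)   # all positive, so Q is positive definite by the leading-minor test
```

All three leading principal minors come out positive here, which is exactly the positive-definiteness criterion stated above; for positive *semi*definiteness one would have to check all principal minors, not just the leading ones.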
An example of a matrix norm is the Frobenius norm, defined as the square root of the sum of the squares of the entries of A ∈ R^{m×n}. Note that the Frobenius norm is equivalent to the Euclidean norm on R^{mn}. For our purposes, we consider only matrix norms that satisfy the following additional condition: 4. It turns out that the Frobenius norm satisfies condition 4 as well. In many problems, both matrices and vectors appear simultaneously. Therefore, it is convenient to construct the norm of a matrix in such a way that it is related to vector norms. To this end we consider a special class of matrix norms, called induced norms. Let || · ||_n and || · ||_m be vector norms on R^n and R^m, respectively. We say that the matrix norm is induced by, or is compatible with, the given vector norms if for any matrix A ∈ R^{m×n} and any vector x ∈ R^n the inequality ||Ax||_m <= ||A|| ||x||_n is satisfied. We can define an induced matrix norm as the maximum of ||Ax||_m over all x with unit norm; that is, ||A|| is the maximum of the norms of the vectors Ax, where the vector x runs over the set of all vectors with unit norm.
When there is no ambiguity, we omit the subscripts m and n from || · ||_m and || · ||_n. Because of the continuity of a vector norm see Exercise 2. This fact follows from the theorem of Weierstrass see Theorem 4. The induced norm satisfies conditions 1 to 4 and the compatibility condition, as we prove below. Proof of Condition 1. Proof of Condition 2. Proof of Compatibility Condition. Proof of Condition 3. Then, we have which shows that condition 3 holds. Proof of Condition 4. Then, we have which shows that condition 4 holds. We have The matrix A^T A is symmetric and positive semidefinite. Using arguments similar to the above, we can deduce the following important inequalities. If an n × n matrix P is real symmetric positive definite, then λmin(P)||x||^2 <= x^T P x <= λmax(P)||x||^2 for all x, where λmin(P) denotes the smallest eigenvalue of P and λmax(P) denotes the largest eigenvalue of P. Example 3. For a more complete but still basic treatment of topics in linear algebra as discussed in this and the preceding chapter, see [47], [66], [95], [].
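The eigenvalue bounds just quoted can be spot-checked numerically. This Python sketch (the matrix P is my own example) samples random vectors x and verifies λmin(P)||x||^2 <= x^T P x <= λmax(P)||x||^2:

```python
import math
import random

P = [[3.0, 1.0], [1.0, 3.0]]          # real symmetric positive definite
# Closed-form eigenvalues of a 2x2 matrix from trace and determinant
tr = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
disc = math.sqrt(tr * tr - 4 * det)
lam_min, lam_max = (tr - disc) / 2, (tr + disc) / 2

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    quad = sum(x[i] * P[i][j] * x[j] for i in range(2) for j in range(2))
    norm2 = x[0] ** 2 + x[1] ** 2
    # Rayleigh-quotient bounds: lam_min ||x||^2 <= x^T P x <= lam_max ||x||^2
    assert lam_min * norm2 - 1e-12 <= quad <= lam_max * norm2 + 1e-12
print(lam_min, lam_max)
```

For this P the extreme eigenvalues are 2 and 4, and every sampled quadratic-form value falls between the two bounds.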
For a treatment of matrices, we refer the reader to [44], [62]. Numerical aspects of matrix computations are discussed in [41], [53]. EXERCISES 3. Hint: Use Exercise 3. Show that a. Is this matrix positive definite, negative definite, or indefinite? Is this matrix positive definite, negative definite, or indefinite on the subspace 3. Show that <·,·>_Q satisfies conditions 1 to 4 for inner products see Section 2. Show that the matrix norm induced by these vector norms is given by the stated expression, where a_ij is the (i, j)th element of A ∈ R^{m×n}. Define the norm || · ||_1 on R^m similarly. Chapter 4 CONCEPTS FROM GEOMETRY 4. The line segment between two points x and y in R^n is the set of points on the straight line joining x and y see Figure 4. Note that if z lies on the line segment between x and y, then Figure 4.
where α is a real number from the interval [0,1]. Hence, the line segment between x and y can be represented as 4. Thus, straight lines are hyperplanes in R^2. In R^3 (three-dimensional space), hyperplanes are ordinary planes. By translating a hyperplane so that it contains the origin of R^n, it becomes a subspace of R^n see Figure 4. Figure 4. We call the vector u the normal to the hyperplane H. A linear variety is a set of the form {x : Ax = b} for some matrix A ∈ R^{m×n} and vector b ∈ R^m. If the dimension of the linear variety is less than n, then it is the intersection of a finite number of hyperplanes. Examples of convex sets include the following: the empty set; a set consisting of a single point; a line or a line segment; a subspace; a hyperplane; a linear variety; a half-space; R^n. Theorem 4.
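To make the segment parametrization concrete, here is a short Python sketch (the endpoints and half-space are my own example) verifying that a segment whose endpoints lie in a half-space stays entirely in that half-space, as convexity requires:

```python
def segment_point(x, y, alpha):
    """The point alpha*x + (1 - alpha)*y on the segment joining x and y."""
    return [alpha * xi + (1 - alpha) * yi for xi, yi in zip(x, y)]

def in_halfspace(u, v, z):
    """True if z lies in the half-space {z : u^T z <= v}."""
    return sum(ui * zi for ui, zi in zip(u, z)) <= v

x, y = [0.0, 0.0], [2.0, 4.0]
u, v = [1.0, 1.0], 6.0    # both endpoints satisfy u^T z <= 6
all_inside = all(in_halfspace(u, v, segment_point(x, y, a / 100))
                 for a in range(101))
print(all_inside)
```

Every convex combination of the two endpoints satisfies the same half-space inequality, illustrating why half-spaces appear in the list of convex sets above.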
If Θ is a convex set and β is a real number, then the set βΘ = {βv : v ∈ Θ} is also convex. If Θ1 and Θ2 are convex sets, then the set Θ1 + Θ2 is also convex. The intersection of any collection of convex sets is convex see Figure 4. Let βv1, βv2 ∈ βΘ, where v1, v2 ∈ Θ. Hence, and thus βΘ is convex. Let C be a collection of convex sets. Let x1 and x2 belong to the intersection of all the elements in C. Then, x1, x2 ∈ Θ for each Θ ∈ C. For example, in Figure 4. The neighborhood is also called a ball with radius ε and center x. A point x ∈ S is said to be an interior point of the set S if the set S contains some neighborhood of x; that is, if all points within some neighborhood of x are also in S see Figure 4. The set of all the interior points of S is called the interior of S. A point x is said to be a boundary point of the set S if every neighborhood of x contains a point in S and a point not in S see Figure 4.
Note that a boundary point of S may or may not be an element of S. The set of all boundary points of S is called the boundary of S. A set S is said to be open if it contains a neighborhood of each of its points; that is, if each of its points is an interior point or, equivalently, if S contains no boundary points. A set S is said to be closed if it contains its boundary see Figure 4. We can show that a set is closed if and only if its complement is open. A set that is contained in a ball of finite radius is said to be bounded. A set is compact if it is both closed and bounded. Compact sets are important in optimization problems for the following reason. Theorem 4. In other words, f achieves its minimum on Ω. See [, p. A hyperplane passing through y is called a hyperplane of support (or supporting hyperplane) of the set Θ if the entire set Θ lies completely in one of the two half-spaces into which this hyperplane divides the space R^n.
Recall that by Theorem 4. In what follows we are concerned with the intersection of a finite number of half-spaces. A set that can be expressed as the intersection of a finite number of half-spaces is called a convex polytope see Figure 4. A nonempty bounded polytope is called a polyhedron see Figure 4. Furthermore, there exists only one k-dimensional linear variety containing Θ, called the carrier of the polyhedron Θ, and k is called the dimension of Θ. For example, a zero-dimensional polyhedron is a point of R^n, and its carrier is itself. A one-dimensional polyhedron is a segment, and its carrier is the straight line on which it lies. For example, the boundary of a one-dimensional polyhedron consists of two points that are the endpoints of the segment.
A zero-dimensional face of a polyhedron is called a vertex, and a one-dimensional face is called an edge. EXERCISES 4. CHAPTER 5 ELEMENTS OF CALCULUS 5. Thus, a sequence of real numbers can be viewed as a set of numbers {x1, x2,…, xk,…}, which is often also denoted as {xk} or sometimes as , to indicate explicitly the range of values that k can take. Similarly, we can define decreasing and nonincreasing sequences. Nonincreasing or nondecreasing sequences are called monotone sequences. In this case we write or A sequence that has a limit is called a convergent sequence. The notion of a sequence can be extended to sequences with elements in n. Specifically, a sequence in n is a function whose domain is the set of natural numbers 1, 2,…, k,… and whose range is contained in n.
We use the notation {x^(1), x^(2), …} or {x^(k)} for sequences in R^n. For limits of sequences in R^n, we need to replace absolute values with vector norms. Theorem 5. We prove this result by contradiction. Suppose that a sequence {x^(k)} has two different limits, say x1 and x2. In this case, we say that {xk} is bounded above. In this case, we say that {xk} is bounded below. Clearly, a sequence is bounded if it is both bounded above and bounded below. Any sequence {xk} in R that has an upper bound has a least upper bound (also called the supremum), which is the smallest number B that is an upper bound of {xk}.
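The supremum property is easy to illustrate numerically. A minimal sketch, using the example sequence x_k = 1 - 1/k (my own choice, not the book's):

```python
# x_k = 1 - 1/k is nondecreasing and bounded above by 1, its least upper bound
xs = [1 - 1 / k for k in range(1, 10001)]

monotone = all(xs[i] <= xs[i + 1] for i in range(len(xs) - 1))
gap = 1 - xs[-1]          # distance from the supremum after 10,000 terms
print(monotone, gap)
```

The sequence never exceeds 1, yet gets arbitrarily close to it; this is exactly the behavior the monotone convergence theorem below formalizes.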
Similarly, any sequence {xk} in R that has a lower bound has a greatest lower bound (also called the infimum). We prove the theorem for nondecreasing sequences. The proof for nonincreasing sequences is analogous. Suppose that we are given a sequence {x^(k)} and an increasing sequence of natural numbers {m_k}. The sequence {x^(m_k)} is called a subsequence of the sequence {x^(k)}. A subsequence of a given sequence can thus be obtained by neglecting some elements of the given sequence. Let {x^(m_k)} be a subsequence of {x^(k)}, where {m_k} is an increasing sequence of natural numbers. Next, we proceed by induction, assuming that m_k >= k; because {m_k} is increasing, it follows that m_{k+1} >= k + 1. It turns out that any bounded sequence contains a convergent subsequence.
This result is called the Bolzano-Weierstrass theorem see [2, p. Consider a function f and a point x0 ∈ R^n. It turns out that f is continuous at x0 if and only if for any convergent sequence {x^(k)} with limit x0, the sequence {f(x^(k))} converges to f(x0) see [2, p. Therefore, using the notation introduced above, the function f is continuous at x0 if and only if this sequential condition holds. We end this section with some results involving sequences and limits of matrices. These results are useful in the analysis of algorithms e.g. We say that a sequence {Ak} of m × n matrices converges to the m × n matrix A if the corresponding norm condition holds. Lemma 5. To prove this theorem, we use the Jordan form see, e.g. Lemma 5. The necessity of the condition is obvious. By Lemma 5. Thus, which completes the proof. A matrix-valued function is continuous at a point ξ0 ∈ R^r if the corresponding limit condition holds. Lemma 5. We follow []. We have where Thus, and Because A is continuous at ξ0, for all ξ close enough to ξ0, we have where θ ∈ (0,1). Then, and exists. Therefore, Because we obtain which completes the proof.
We wish to find an affine function that approximates f near the point x0. The function f is said to be differentiable on Ω if f is differentiable at every point of its domain Ω. The affine function is therefore given as above; this affine function is tangent to f at x0 see Figure 5. Figure 5. But Lej is the jth column of the matrix L. On the other hand, the vector xj differs from x0 only in the jth coordinate, and in that coordinate the difference is just the number t. Therefore, the left side of the preceding equation is the partial derivative ∂f/∂xj. Because vector limits are computed by taking the limit of each coordinate function, it follows that the columns of L are the partial derivatives of f. The matrix L is called the Jacobian matrix, or derivative matrix, of f at x0, and is denoted Df(x0). For convenience, we often refer to Df(x0) simply as the derivative of f at x0.
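Because the columns of the derivative matrix are partial derivatives, Df(x0) can be approximated column-by-column with central differences. A sketch of my own (the function f and the evaluation point are invented for illustration):

```python
def jacobian(f, x, h=1e-6):
    """Finite-difference approximation of the Jacobian Df(x) for f: R^n -> R^m."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)   # central difference in coordinate j
    return J

f = lambda x: [x[0] ** 2 * x[1], x[0] + 3 * x[1]]
J = jacobian(f, [1.0, 2.0])
# Analytic Jacobian at (1, 2): [[2*x0*x1, x0^2], [1, 3]] = [[4, 1], [1, 3]]
print(J)
```

Each column j is obtained by perturbing only the jth coordinate, mirroring the definition of the partial derivative in the text.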
We summarize the foregoing discussion in the following theorem. The columns of the derivative matrix Df(x0) are vector partial derivatives. The vector ∂f/∂xj(x0) is a tangent vector at x0 to the curve obtained by varying only the jth coordinate of x. The matrix D^2 f(x) is called the Hessian matrix of f at x, and is often also denoted F(x). In this case, we write f ∈ C^1. If the components of f have continuous partial derivatives of order p, then we write f ∈ C^p. However, if the second partial derivatives of f are not continuous, then there is no guarantee that the Hessian is symmetric, as shown in the following well-known example. Example 5. We now proceed with computing the components of the Hessian and evaluating them at the point [0,0] one by one. By definition, each mixed partial is a limit of difference quotients, if the limit exists.
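The standard example of this phenomenon is f(x, y) = xy(x^2 - y^2)/(x^2 + y^2) with f(0, 0) = 0; I am supplying this formula on the assumption that it is the intended well-known example, since the original expression did not survive extraction. A numerical check that the two mixed partials at the origin disagree:

```python
def f(x, y):
    """Classic example with discontinuous second partials at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

h, k = 1e-6, 1e-3   # inner and outer difference steps (h << k)

def fx(x, y):       # central-difference approximation of df/dx
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y):       # central-difference approximation of df/dy
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

fxy = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)   # d/dy of f_x at the origin
fyx = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)   # d/dx of f_y at the origin
print(fxy, fyx)
```

The two mixed second partials come out near -1 and +1 respectively, so the Hessian at the origin is not symmetric; away from the origin, where the second partials are continuous, the usual symmetry holds.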
By Theorem 5. Then, h is also differentiable and We end this section with a list of some useful formulas from multivariable calculus. In each case, we compute the derivative with respect to x. Then, It follows from the first formula above that if y n, then It follows from the second formula above that if Q is a symmetric matrix, then In particular, 5. A plot of the function f is shown in Figure 5. The level sets of f at levels 0. These level sets have a particular shape resembling bananas. But since γ lies on S, we have that is, h is constant. The curve on the shaded surface in Figure 5.
The notion of the gradient of a function has an alternative useful interpretation in terms of the tangent hyperplane to its graph. The corresponding point lies on the graph of f. If f is differentiable at ξ, then the graph admits a nonvertical tangent hyperplane at that point. Observe that for the hyperplane to be tangent to the graph of f, the functions f and z must have the same partial derivatives at the point x0. We have Denote by gm(x) an auxiliary function obtained from Rm by replacing a by x. Applying the mean-value theorem yields where θ ∈ (0,1). To discuss this property further, we introduce the order symbols O and o. Then, we write 1. Suppose that f ∈ C^m. Suppose that f ∈ C^2. It is easy to see that M is a matrix whose rows are the rows of Df evaluated at points that lie on the line segment joining x and y (these points may differ from row to row).
For further reading in calculus, consult [13], [81], [83], [], [], []. EXERCISES 5. Find the Hessian F(x). Evaluate using the chain rule. Evaluate and using the chain rule. Find in terms of t. Neglect terms of order three or higher. The variables x1, …, xn are often referred to as decision variables. The set Ω is a subset of R^n called the constraint set or feasible set. This vector is called a minimizer of f over Ω. It is possible that there may be many minimizers. In this case, finding any of the minimizers will suffice. There are also optimization problems that require maximization of the objective function, in which case we seek maximizers. Minimizers and maximizers are also called extremizers. Therefore, we can confine our attention to minimization problems without loss of generality. The problem above is a general form of a constrained optimization problem, because the decision variables are constrained to be in the constraint set Ω.
In this chapter we discuss basic properties of the general optimization problem above, which includes the unconstrained case. We refer to such constraints as functional constraints. In Parts III and IV we consider constrained optimization problems with functional constraints. In considering the general optimization problem above, we distinguish between two kinds of minimizers, as specified by the following definitions. Definition 6. In Figure 6. Figure 6. In other words, given a real-valued function f, the notation arg min f(x) denotes the argument that minimizes the function f (a point in the domain of f), assuming that such a point is unique (if there is more than one such point, we pick one arbitrarily).
Strictly speaking, an optimization problem is solved only when a global minimizer is found. However, global minimizers are, in general, difficult to find. Therefore, in practice, we often have to be satisfied with finding local minimizers. Given an optimization problem with constraint set Ω, a minimizer may lie either in the interior or on the boundary of Ω. To study the case where it lies on the boundary, we need the notion of feasible directions. To compute the directional derivative above, suppose that x and d are given. Example 6. We are now ready to state and prove the following theorem. Theorem 6. An alternative way to express the FONC is d^T ∇f(x*) >= 0 for all feasible directions d. Using directional derivatives, an alternative proof of Theorem 6.
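To illustrate the boundary-point form of the FONC, the following sketch (the objective and constraint set are my own example, not from the text) checks d^T grad f(x*) >= 0 over sampled feasible directions at a minimizer on the boundary of the nonnegative orthant:

```python
import random

# f(x) = (x1 + 1)^2 + x2^2 on Omega = {x : x1 >= 0, x2 >= 0};
# the minimizer over Omega is x* = (0, 0), which lies on the boundary
grad = lambda x: [2 * (x[0] + 1), 2 * x[1]]
g = grad([0.0, 0.0])                 # gradient at x* is [2, 0]: nonzero!

random.seed(1)
ok = True
for _ in range(1000):
    d = [random.random(), random.random()]     # feasible directions at x* have d >= 0
    ok = ok and (d[0] * g[0] + d[1] * g[1] >= 0)   # FONC: d^T grad f(x*) >= 0
print(ok)
```

Note that the gradient itself is not zero at this boundary minimizer; only the directional condition d^T grad f(x*) >= 0 holds, which is exactly why the boundary case needs the feasible-direction formulation rather than the interior-case corollary.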
In this case, any direction is feasible, and we have the following result. Corollary 6. A plot of the level sets of f is shown in Figure 6. A mobile user (also called a mobile) is located at position x see Figure 6. There are two base station antennas, one for the primary base station and another for the neighboring base station. Both antennas transmit signals to the mobile user at equal power. However, the power of the received signal as measured by the mobile is the reciprocal of the squared distance from the associated antenna (primary or neighboring base station).
We are interested in finding the position of the mobile that maximizes the signal-to-interference ratio, which is the ratio of the signal power received from the primary base station to the signal power received from the neighboring base station. We use the FONC to solve this problem. The next example illustrates that in some problems the FONC is not helpful for eliminating candidate local minimizers. However, in such cases, there may be a recasting of the problem into an equivalent form that makes the FONC useful. Which points in Ω satisfy the FONC for this set-constrained problem? Based on part b, is the FONC for this set-constrained problem useful for eliminating local-minimizer candidates?
Solution: a. Because of part a, all points in Ω satisfy the FONC for this set-constrained problem. No, the FONC for this set-constrained problem is not useful for eliminating local-minimizer candidates. We now derive a second-order necessary condition that is satisfied by a local minimizer. We prove the result by contradiction. Thus, Corollary 6. The result then follows from Corollary 6. In the examples below, we show that the necessary conditions are not sufficient. The point 0 satisfies the FONC but not the SONC; this point is not a minimizer. It is a strict local minimizer. In this chapter we presented a theoretical basis for the solution of nonlinear unconstrained problems. In the following chapters we are concerned with iterative methods for solving such problems. Such methods are of great importance in practice.
Indeed, suppose that one is confronted with a highly nonlinear function of 20 variables. Then, the FONC requires the solution of 20 nonlinear simultaneous equations for 20 variables. These equations, being nonlinear, will normally have multiple solutions. We begin our discussion of iterative methods in the next chapter with search methods for functions of one variable. EXERCISES 6. Suppose that 0 is an interior point of Ω. Suppose that 0 is a local minimizer. Give an example of f such that the FONC, SONC, and TONC in part a hold at the interior point 0, but 0 is not a local minimizer of f over Ω. Show that your example is correct. Suppose that f is a third-order polynomial.
If 0 satisfies the FONC, SONC, and TONC in part a, then is this sufficient for 0 to be a local minimizer? Find the gradient and Hessian of f at the point [1,1]. Find the directional derivative of f at [1,1] with respect to a unit vector in the direction of maximal rate of increase. Find a point that satisfies the FONC (interior case) for f. Does this point satisfy the SONC for a minimizer? Find the directional derivative of f at [0,1] in the direction [1,0]. Find all points that satisfy the first-order necessary condition for f. Does f have a minimizer? If it does, then find all minimizer(s); otherwise, explain why it does not. Answer each of the following questions, showing complete justification. Find all point(s) satisfying the FONC. Which of the point(s) in part a satisfy the SONC? Which of the point(s) in part a are local minimizers?
Hint: Draw a picture with the constraint set and level sets of f. At what point(s), if any, is the second-order necessary condition satisfied? Find the number such that the sum of the squared differences between it and the numbers above is minimized (assuming that the solution exists). Hint: (1) Maximizing θ is equivalent to maximizing tan θ. A heartbeat sensor is located at position x see Figure 6. The speeds at which the vehicle travels on land and water are v1 and v2, respectively. Suppose that the vehicle traverses a path that minimizes the total time taken to travel from A to B. Use the first-order necessary condition to show that for the optimal path, the angles θ1 and θ2 in Figure 6.
Does the minimizer for the problem in part a satisfy the second-order sufficient condition? If the first buyer receives a fraction x1 of the piece of land, the buyer will pay you U1(x1) dollars. Similarly, the second buyer will pay you U2(x2) dollars for a fraction x2 of the land. Your goal is to sell parts of your land to the two buyers so that you maximize the total dollars you receive. Other than the constraint that you can only sell whatever land you own, there are no restrictions on how much land you can sell to each buyer. Formulate the problem as an optimization problem of the kind above by specifying f and Ω. Draw a picture of the constraint set. Find all feasible points that satisfy the first-order necessary condition, giving full justification. Among the points in the answer of part b, find all that also satisfy the second-order necessary condition. Suppose that we wish to minimize f over R^2. Find all points satisfying the FONC. Do these points satisfy the SONC?
Show that we cannot have a solution lying in the interior of Ω. This is a linear programming problem see Part III. Find the vector x ∈ R^n such that the average squared distance (norm) between x and the points x^(1), …, x^(p) is minimized. Use the SOSC to prove that the vector found above is a strict local minimizer. How is it related to the centroid (or center of gravity) of the given set of points {x^(1), …, x^(p)}? The constants q and r reflect the relative weights of these two objectives. The approach is to use an iterative search algorithm, also called a line-search method. One-dimensional search methods are of interest for the following reasons. First, they are special cases of search methods used in multivariable problems. Second, they are used as part of general multivariable algorithms as described later in Section 7.
In an iterative algorithm, we start with an initial candidate solution x 0 and generate a sequence of iterates x 1 , x 2 ,…. The only property that we assume of the objective function f is that it is unimodal, which means that f has only one local minimizer. An example of such a function is depicted in Figure 7. Figure 7. The methods we discuss are based on evaluating the objective function at different points in the interval [a0,b0]. We choose these points in such a way that an approximation to the minimizer of f may be achieved in as few evaluations as possible. Consider a unimodal function f of one variable and the interval [a0,b0].
If we evaluate f at only one intermediate point of the interval, we cannot narrow the range within which we know the minimizer is located. We have to evaluate f at two intermediate points, as illustrated in Figure 7. We choose the intermediate points in such a way that the reduction in the range is symmetric, in the sense that Figure 7. where We then evaluate f at the intermediate points. However, we would like to minimize the number of objective function evaluations while reducing the width of the uncertainty interval. Because a1 is already in the uncertainty interval and f a1 is already known, we can make a1 coincide with b2.
Thus, only one new evaluation of f at a2 would be necessary. To find the value of ρ that results in only one new evaluation of f, see Figure 7. Without loss of generality, imagine that the original range [a0, b0] is of unit length. Then, to have only one new evaluation of f it is enough to choose ρ so that Figure 7. This rule was referred to by ancient Greek geometers as the golden section. Using the golden section rule means that at every stage of the uncertainty range reduction except the first , the objective function f need only be evaluated at one new point. Hence, N steps of reduction using the golden section method reduces the range by the factor Example 7. We wish to locate this value of x to within a range of 0.
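A minimal golden-section implementation might look as follows. This is my own sketch; the quartic test function below is a typical unimodal example on [0, 2] and is an assumption on my part, not necessarily the function used in the text:

```python
import math

def golden_section(f, a, b, tol):
    """Golden-section search for the minimizer of a unimodal f on [a, b]."""
    rho = (3 - math.sqrt(5)) / 2          # golden-section ratio, about 0.382
    a1 = a + rho * (b - a)
    b1 = a + (1 - rho) * (b - a)
    fa1, fb1 = f(a1), f(b1)
    while (b - a) > tol:
        if fa1 < fb1:
            # Minimizer is in [a, b1]; old a1 becomes the new b1 (no re-evaluation)
            b, b1, fb1 = b1, a1, fa1
            a1 = a + rho * (b - a)
            fa1 = f(a1)
        else:
            # Minimizer is in [a1, b]; old b1 becomes the new a1
            a, a1, fa1 = a1, b1, fb1
            b1 = a + (1 - rho) * (b - a)
            fb1 = f(b1)
    return (a + b) / 2

f = lambda x: x ** 4 - 14 * x ** 3 + 60 * x ** 2 - 70 * x
x_min = golden_section(f, 0.0, 2.0, 1e-6)
print(x_min)
```

After the first stage, each iteration costs exactly one new function evaluation, which is the whole point of the golden-section choice of ρ.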
After N stages the range [0,2] is reduced by 0. Iteration 1. We evaluate f at two intermediate points a1 and b1. Hence, the uncertainty interval is further reduced to Iteration 4. To derive the strategy for selecting evaluation points, consider Figure 7. From this figure we see that it is sufficient to choose the ρk such that Figure 7. Suppose that we are given a sequence ρ1, ρ2, … satisfying the conditions above and we use this sequence in our search algorithm. Then, after N iterations of the algorithm, the uncertainty range is reduced by a factor of Depending on the sequence ρ1, ρ2, …, we get a different reduction factor. The natural question is as follows: What sequence ρ1, ρ2, … minimizes the reduction factor above?
This problem is a constrained optimization problem that can be stated formally as follows. Before we give the solution to the optimization problem above, we need to introduce the Fibonacci sequence F1, F2, F3, …. This sequence is defined as follows. The resulting algorithm is called the Fibonacci search method. We present a proof of the optimality of the Fibonacci search method later in this section. In the Fibonacci search method, the uncertainty range is reduced by the factor 1/F_{N+1}. Because the Fibonacci method uses the optimal values of ρ1, ρ2, …, the reduction factor above is less than that of the golden section method. In other words, the Fibonacci method is better than the golden section method in that it gives a smaller final uncertainty range. We point out that there is an anomaly in the final iteration of the Fibonacci search method, because ρN = 1/2. Recall that we need two intermediate points at each stage, one that comes from a previous iteration and another that is a new evaluation point.
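For a concrete comparison, the sketch below assumes the reduction factor 1/F_{N+1} after N iterations, with the Fibonacci numbers indexed so that F_0 = F_1 = 1, F_2 = 2, F_3 = 3, F_4 = 5 (an indexing assumption on my part):

```python
def fib(n):
    """Fibonacci numbers indexed F_0 = F_1 = 1, F_2 = 2, F_3 = 3, F_4 = 5, ..."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

golden_factor = (5 ** 0.5 - 1) / 2        # per-iteration reduction of golden section
for N in (3, 5, 10):
    fib_red = 1 / fib(N + 1)              # Fibonacci reduction after N iterations
    gold_red = golden_factor ** N         # golden-section reduction after N iterations
    print(N, fib_red, gold_red, fib_red < gold_red)
```

For every N the Fibonacci factor comes out smaller than the golden-section factor, consistent with the optimality claim; asymptotically the two differ only by a constant, since F_N grows like a power of the golden ratio.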
In other words, the new evaluation point is just to the left or right of the midpoint of the uncertainty interval. This modification to the Fibonacci method is, of course, of no significant practical consequence. Therefore, in the worst case, the reduction factor in the uncertainty range for the Fibonacci method is Example 7. We start with We then compute The range is reduced to Iteration 2. We compute The range is reduced to Iteration 4. We now turn to a proof of the optimality of the Fibonacci search method. Skipping the rest of this section does not affect the continuity of the presentation. Therefore, the constraints above reduce to To proceed, we need the following technical lemmas.
In the statements of the lemmas, we assume that r1, r2, … is a sequence that satisfies Lemma 7. We proceed by induction. We have where we used the formation law for the Fibonacci sequence. Lemma 7. We have By Lemma 7. We are now ready to prove the optimality of the Fibonacci search method and the uniqueness of this optimal solution. Theorem 7. In other words, the values of r1, …, rN used in the Fibonacci search method form a unique solution to the optimization problem. By substituting expressions for r1, …, rN from Lemma 7. Hence, From the above we see that if and only if This is simply the value of r1 for the Fibonacci search method.
Note that fixing r1 determines r2, …, rN uniquely. For further discussion on the Fibonacci search method and its variants, see []. As before, we assume that the objective function f is unimodal. The bisection method is a simple algorithm for successively reducing the uncertainty interval based on evaluations of the derivative. In other words, we reduce the uncertainty interval to [a0, x0]. In this case, we reduce the uncertainty interval to [x0, b0]. With the new uncertainty interval computed, we repeat the process iteratively. At each iteration k, we compute the midpoint of the uncertainty interval. Call this point xk. Two salient features distinguish the bisection method from the golden section and Fibonacci methods.
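The bisection procedure just described can be sketched in a few lines of Python (the function name `bisection_min` is mine; the book's code is MATLAB). It assumes f is unimodal on [a, b] with f′(a) < 0 < f′(b), so the sign of the derivative at the midpoint tells which half contains the minimizer.

```python
def bisection_min(df, a, b, tol=1e-6):
    """Locate the minimizer of a unimodal f on [a, b] by bisecting on the
    sign of its derivative df (assumes df(a) < 0 < df(b))."""
    while b - a > tol:
        x = (a + b) / 2
        if df(x) > 0:        # f increasing at x: minimizer lies to the left
            b = x
        elif df(x) < 0:      # f decreasing at x: minimizer lies to the right
            a = x
        else:
            return x         # df(x) == 0 exactly: x is the minimizer
    return (a + b) / 2
```

Each iteration halves the interval, so N steps reduce the uncertainty range by (1/2)^N, a smaller factor than the roughly (0.618)^N of the golden section method, at the cost of requiring derivative evaluations.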
This factor is smaller than in the golden section and Fibonacci methods. Example 7. The golden section method requires at least four stages of reduction. We can fit a quadratic function through xk that matches its first and second derivatives with those of the function f. Then, instead of minimizing f, we minimize its approximation q. Thus, an initial approximation to the root is very important. We obtain Example 7. Given measurements V1, …, Vn of the voltage at times t1, …, tn, respectively, we wish to find the best estimate of R. By the best estimate we mean the value of R that minimizes the total squared error between the measured voltages and the voltages predicted by the model. We derive an algorithm to find the best estimate of R using the secant method.
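The secant iteration can be sketched as follows, applied to the derivative f′ so that its zero is a candidate minimizer of f. This is a minimal Python sketch under my own naming (`secant_min`); it replaces the second derivative used by Newton's method with a finite-difference approximation built from the two most recent points.

```python
def secant_min(df, x0, x1, tol=1e-8, max_iter=100):
    """Find a zero of df (a stationary point of f) by the secant iteration:
    x_{k+1} = x_k - df(x_k) * (x_k - x_{k-1}) / (df(x_k) - df(x_{k-1}))."""
    for _ in range(max_iter):
        d0, d1 = df(x0), df(x1)
        if d1 == d0:                 # flat secant: cannot take another step
            break
        x0, x1 = x1, x1 - d1 * (x1 - x0) / (d1 - d0)
        if abs(x1 - x0) < tol:
            break
    return x1
```

Like Newton's method, the iteration converges only from a good enough starting pair, which is why the choice of initial approximation matters.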
Three points are needed to initialize the iterations. The method is also sometimes called inverse parabolic interpolation. A similar approach based on fitting or interpolating higher-order polynomials is also possible. In practice it is often advantageous to combine multiple methods to overcome the limitations of any one method. For example, the golden section method is more robust but slower than inverse parabolic interpolation. This interval is also called a bracket, and procedures for finding such a bracket are called bracketing methods. A simple bracketing procedure is as follows. If it holds, then again we are done—the desired bracket is [x1, x3]. Otherwise, we continue this process until the function increases. Typically, each new point chosen involves an expansion of the distance between successive test points. For example, we could double the distance between successive points, as illustrated in Figure 7. In particular, the iterative algorithms for solving such optimization problems to be discussed in the following chapters typically involve a line search at every iteration.
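The doubling bracketing procedure can be sketched as follows (a Python sketch with hypothetical names `bracket`, `x0`, `step`): march away from the starting point, doubling the step, until the function value increases; the last three points then form a bracket.

```python
def bracket(f, x0=0.0, step=0.1):
    """Find x1 < x2 < x3 with f(x2) < f(x1) and f(x2) < f(x3), doubling the
    step until the function increases."""
    x1, x2 = x0, x0 + step
    f1, f2 = f(x1), f(x2)
    if f2 > f1:
        # function increases to the right: search in the other direction
        x1, x2, f1, f2 = x2, x1, f2, f1
        step = -step
    while True:
        step *= 2                    # expand the distance between test points
        x3 = x2 + step
        f3 = f(x3)
        if f3 > f2:
            # function has increased: [x1, x3] brackets a minimizer
            return (x1, x2, x3) if x1 < x3 else (x3, x2, x1)
        x1, x2, f1, f2 = x2, x3, f2, f3
```

A bracket found this way can then be handed to the golden section, Fibonacci, or interpolation methods above.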
Note that the choice of αk involves a one-dimensional minimization. This choice ensures that, under appropriate conditions, Figure 7. Any of the one-dimensional methods discussed in this chapter (including bracketing) can be used to minimize ϕk. We may, for example, use the secant method to find αk. In this case we need the derivative of ϕk, which is obtained using the chain rule. Of course, other one-dimensional search methods may be used for line search see, e. Line search algorithms used in practice involve considerations that we have not discussed thus far. First, determining the value of αk that exactly minimizes ϕk may be computationally demanding; even worse, the minimizer of ϕk may not even exist.
Second, practical experience suggests that it is better to spend more computational effort on iterating the multidimensional optimization algorithm than on performing exact line searches. These considerations led to the development of conditions for terminating line search algorithms that result in low-accuracy line searches while still securing a sufficient decrease in the value of f from one iteration to the next. The basic idea is to ensure that the step size αk is neither too small nor too large. Some commonly used termination conditions are as follows. We start with some candidate value for the step size αk. If this candidate value satisfies a prespecified termination condition (usually the first Armijo inequality), then we stop and use it as the step size. Otherwise, the algorithm backtracks from the initial value until the termination condition holds. For more information on practical line search methods, we refer the reader to [43, pp. C], [49], and [50].
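The backtracking scheme based on the first Armijo inequality can be sketched as follows (a Python sketch; the function name `backtracking` and the default constants ε = 10⁻⁴ and contraction factor τ = 0.5 are illustrative choices, not the book's): shrink the candidate step until f(x + αd) ≤ f(x) + εα∇f(x)ᵀd.

```python
def backtracking(f, grad, x, d, alpha0=1.0, eps=1e-4, tau=0.5):
    """Backtracking line search: shrink alpha until the first Armijo
    inequality f(x + alpha d) <= f(x) + eps * alpha * grad(x)^T d holds.
    x and d are plain lists; d must be a descent direction."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad(x), d))  # directional derivative, < 0
    alpha = alpha0
    while f([xi + alpha * di for xi, di in zip(x, d)]) > fx + eps * alpha * slope:
        alpha *= tau          # backtrack: try a smaller step
    return alpha
```

Because the accepted step need not minimize ϕk exactly, the search is cheap while still guaranteeing a sufficient decrease in f.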
Give an example of a desired final uncertainty range where the golden section method requires at least four iterations, whereas the Fibonacci method requires only three. You may choose an arbitrarily small value of ε for the Fibonacci method. Plot f(x) versus x over the interval [1, 2]. Display all intermediate steps using a table: c. Display all intermediate steps using a table: d. Use MATLAB to plot f(x) versus x over the interval [1, 2], and verify that f is unimodal over [1, 2]. Write a simple MATLAB program to implement the golden section method that locates the minimizer of f over [1, 2] to within an uncertainty of 0. Display all intermediate steps using a table as in Exercise 7. Note that 0 is the unique zero of g. Find an initial condition x0 such that the algorithm cycles [i. You need not explicitly calculate the initial condition; it suffices to provide an equation that the initial condition must satisfy.
Hint: Draw a graph of g. For what values of the initial condition does the algorithm converge? You might also find it useful to experiment with your algorithm by writing a MATLAB program. Note that three points are needed to initialize the algorithm. Also determine the value of g at the solution obtained. The arguments to this function are the name of the M-file for the gradient, the current point, and the search direction. m is the M-file containing the gradient, x is the starting line search point, d is the search direction, and alpha is the value returned by the function [which we use in the following chapters as the step size for iterative algorithms see, e. The rationale for the stopping criterion above is that we want to reduce the directional derivative of f in the direction d by the specified fraction ε. To initialize the line search, apply the bracketing procedure in Figure 7.
Apply the golden section method to reduce the width of the uncertainty region to 0. Organize the results of your computation in a table format similar to that of Exercise 7. Repeat the above using the Fibonacci method. Goodman for furnishing us with references [49] and [50].

CHAPTER 8: GRADIENT METHODS

These methods use the gradient of the given function. In our discussion we use such terms as level sets, normal vectors, and tangent vectors. These notions were discussed in some detail in Part I. Figure 8. Thus, the direction of maximum rate of increase of a real-valued differentiable function at a point is orthogonal to the level set of the function through that point.
Hence, the direction of negative gradient is a good direction to search if we want to find a function minimizer. We proceed as follows. To formulate an algorithm that implements this idea, suppose that we are given a point x k. This procedure leads to the following iterative algorithm: We refer to this as a gradient descent algorithm or simply a gradient algorithm. The gradient varies as the search proceeds, tending to zero as we approach the minimizer. We have the option of either taking very small steps and reevaluating the gradient at every step, or we can take large steps each time. The first approach results in a laborious method of reaching the minimizer, whereas the second approach may result in a more zigzag path to the minimizer.
The advantage of the second approach is possibly fewer gradient evaluations. Among many different methods that use this philosophy the most popular is the method of steepest descent, which we discuss next. Gradient methods are simple to implement and often perform well. For this reason, they are used widely in practical applications. For a discussion of applications of the steepest descent method to the computation of optimal controllers, we recommend [85, pp. In Chapter 13 we apply a gradient method to the training of a class of neural networks. A typical sequence resulting from the method of steepest descent is depicted in Figure 8.
Observe that the method of steepest descent moves in orthogonal steps, as stated in the following proposition. Proposition 8. Hence, using the FONC and the chain rule gives us which completes the proof. Note that as each new point is generated by the steepest descent algorithm, the corresponding value of the function f decreases, as stated below. Hence, which completes the proof. In Proposition 8. We can use the above as the basis for a stopping (termination) criterion for the algorithm. To avoid dividing by very small numbers, we can modify these stopping criteria as follows: or Note that the stopping criteria above are relevant to all the iterative algorithms we discuss in this part. Example 8. We perform three iterations. Thus, Figure 8. A plot of ϕ2(α) versus α is shown in Figure 8. The reader should be cautioned not to draw any conclusions from this example about the number of iterations required to arrive at a solution in general.
It goes without saying that numerical computations, such as those in this example, are performed in practice using a computer rather than by hand. The calculations above were written out explicitly, step by step, for the purpose of illustrating the operations involved in the steepest descent algorithm. The computations themselves were, in fact, carried out using a MATLAB program see Exercise 8. Let us now see what the method of steepest descent does with a quadratic function of the form f(x) = (1/2)xᵀQx − bᵀx, where Q ∈ ℝn×n is a symmetric positive definite matrix, b ∈ ℝn, and x ∈ ℝn. There is no loss of generality in assuming Q to be a symmetric matrix. In summary, the method of steepest descent for the quadratic takes the form xk+1 = xk − αk gk, where gk = Qxk − b and αk = (gkᵀgk)/(gkᵀQgk). Example 8.
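For the quadratic case, the exact line-search step has a closed form, so the whole method can be sketched compactly. A Python sketch (the book's programs are MATLAB M-files; `steepest_descent_quadratic` and the iteration cap are my own choices), using plain lists rather than a matrix library:

```python
def steepest_descent_quadratic(Q, b, x, iters=100):
    """Steepest descent for f(x) = 0.5 x^T Q x - b^T x, with Q symmetric
    positive definite, using the exact step alpha = g^T g / (g^T Q g),
    where g = Q x - b is the gradient."""
    def matvec(A, v):
        return [sum(a, ) if False else sum(ai * vi for ai, vi in zip(row, v)) for row in A for a in [row]][:len(A)] if False else [sum(ai * vi for ai, vi in zip(row, v)) for row in A]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    for _ in range(iters):
        g = [gi - bi for gi, bi in zip(matvec(Q, x), b)]   # gradient Qx - b
        gg = dot(g, g)
        if gg < 1e-30:               # gradient numerically zero: at the minimizer
            break
        alpha = gg / dot(g, matvec(Q, g))
        x = [xi - alpha * gi for xi, gi in zip(x, g)]      # x <- x - alpha g
    return x
```

The iterates converge to the unique minimizer x* satisfying Qx* = b; successive steps are orthogonal, producing the zigzag path depicted in the figures.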
See Figure 8. This example illustrates a major drawback in the steepest descent method. More sophisticated methods that alleviate this problem are discussed in subsequent chapters. To understand better the method of steepest descent, we examine its convergence properties in the next section. This means that the algorithm generates a sequence of points, each calculated on the basis of the points preceding it. The method is a descent method because as each new point is generated by the algorithm, the corresponding value of the objective function decreases in value i. We say that an iterative algorithm is globally convergent if for any arbitrary starting point the algorithm is guaranteed to generate a sequence of points converging to a point that satisfies the FONC for a minimizer.
When the algorithm is not globally convergent, it may still generate a sequence that converges to a point satisfying the FONC, provided that the initial point is sufficiently close to the point. In this case we say that the algorithm is locally convergent. How close to a solution point we need to start for the algorithm to converge depends on the local convergence properties of the algorithm. A related issue of interest pertaining to a given locally or globally convergent algorithm is its rate of convergence; that is, how fast the algorithm converges to a solution point. We can investigate important convergence characteristics of a gradient method by applying the method to quadratic problems. We begin our analysis with the following useful lemma that applies to a general gradient algorithm.
Lemma 8. We are now ready to state and prove our key convergence theorem for gradient methods. Theorem 8. Let γk be as defined in Lemma 8. From Lemma 8. Note that by Lemma 8. We proceed by contraposition. This completes the proof. The assumption in Theorem 8. On the other hand, it is clear that for all k. Using the general theorem above, we can now establish the convergence of specific cases of the gradient algorithm, including the steepest descent algorithm and algorithms with fixed step size. Then, for any x ∈ ℝn, we have Proof. Furthermore, by Lemma 8. The resulting algorithm is of the form xk+1 = xk − α∇f(xk). We refer to the algorithm above as a fixed-step-size gradient algorithm. The algorithm is of practical interest because of its simplicity.
In particular, the algorithm does not require a line search at each step to determine αk, because the same step size α is used at each step.
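A Python sketch of the fixed-step-size algorithm (the name `fixed_step_gradient` and the iteration count are mine). For a quadratic with Hessian Q, the convergence analysis of this section requires the fixed step to satisfy roughly 0 < α < 2/λmax(Q); the example in the test uses a step well inside that range.

```python
def fixed_step_gradient(grad, x, alpha, iters=1000):
    """Fixed-step-size gradient algorithm: x <- x - alpha * grad(x), with the
    same step size alpha at every iteration (no line search required)."""
    for _ in range(iters):
        g = grad(x)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x
```

The simplicity is the whole appeal: each iteration costs one gradient evaluation and a vector update, at the price of having to choose α correctly in advance.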