Course assignment ENG5322M: Introduction to computer programming
October 28, 2019

Instructions

You should submit the following through the ENG5322M Moodle submission system by 4pm, 22nd November 2019:

1. Python notebook (.ipynb format) used to solve each of the assignment problems. This can be a separate notebook for each problem or a single file.

You will be marked according to the following criteria:

• Program correctness (60%): Does your program solve the problem given to you?
• Program readability (20%): Have you written your program in such a way that it is easy to follow? Have you included appropriate comments?
• Testing (20%): Have you done appropriate testing to show your program works as intended?

You are given two problems to solve for your project assignment, which will be outlined in more detail shortly. The two problems are focussed on:

1. Conjugate gradient method
2. Multivariate least squares method

Notebook format

I expect to see you explain your code throughout your notebook, using both text and images, much like I have done in the lectures during this course. As a quick guide, the default type of a cell is code, but if you change it to 'markdown' (look in the toolbar) and then hit shift and enter, you will see it become text in the notebook.

1. Testing: You should write appropriate code that performs tests on your program.

1 Problem 1: Conjugate gradient method

Overview of task: Write a program that solves equations of the form Ax = b using the conjugate gradient method. Use it to solve the equation:

\begin{bmatrix} 4 & -1 & 1 \\ -1 & 4 & -2 \\ 1 & -2 & 4 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
\begin{bmatrix} 12 \\ -1 \\ 5 \end{bmatrix}    (1)

1.1 Derivation

The problem involves finding the vector x that minimizes the scalar function

f(x) = \frac{1}{2} x^T A x - b^T x    (2)

where the matrix A is symmetric and positive definite. Because f(x) is minimized when its gradient ∇f = Ax - b vanishes, the minimization is equivalent to solving

Ax = b    (3)

Gradient methods solve this minimization problem by iteration, starting with an initial estimate x_0 and then iterating for k computations using

x_{k+1} = x_k + \alpha_k s_k    (4)

where the step length α_k is chosen so that x_{k+1} minimizes f(x_{k+1}) in the search direction s_k; that is, x_{k+1} should satisfy

A(x_k + \alpha_k s_k) = b    (5)

If we now introduce a residual

r_k = b - A x_k    (6)

then pre-multiplying both sides by s_k^T and solving for α_k gives

\alpha_k = \frac{s_k^T r_k}{s_k^T A s_k}    (7)

Intuitively we could choose s_k = -∇f = r_k, since this is the direction of the largest negative change in f(x) (the method of steepest descent), but it converges very slowly. A better choice is the conjugate gradient method, which uses the search direction

s_{k+1} = r_{k+1} + \beta_k s_k    (8)

The constant β_k is then chosen so that two successive search directions are conjugate to each other, i.e.

s_{k+1}^T A s_k = 0    (9)

Substituting Eq. 8 into Eq. 9 gives:

\beta_k = -\frac{r_{k+1}^T A s_k}{s_k^T A s_k}    (10)

The algorithm is then as follows:

Choose x_0 (any vector will do).
Let r_0 ← b - A x_0.
Let s_0 ← r_0 (lacking a previous search direction, choose the direction of steepest descent).
do with k = 0, 1, 2, ...:
    α_k ← (s_k^T r_k) / (s_k^T A s_k).
    x_{k+1} ← x_k + α_k s_k.
    r_{k+1} ← b - A x_{k+1}.
    if |r_{k+1}| ≤ ε, exit the loop (ε is the error tolerance).
    β_k ← -(r_{k+1}^T A s_k) / (s_k^T A s_k).
    s_{k+1} ← r_{k+1} + β_k s_k.

It can be shown that the residual vectors r_1, r_2, r_3, ... produced by the algorithm are mutually orthogonal; that is, r_i · r_j = 0 for i ≠ j.
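As a quick aside (my own addition, not part of the brief): the derivation assumes A is symmetric and positive definite, and for the example system in Eq. (1) this is easy to confirm numerically before implementing anything:

    import numpy as np

    A = np.array([[4.0, -1.0, 1.0],
                  [-1.0, 4.0, -2.0],
                  [1.0, -2.0, 4.0]])

    print(np.allclose(A, A.T))      # True, so A is symmetric
    print(np.linalg.eigvalsh(A))    # all eigenvalues are positive, so A is positive definite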
Now suppose that we have carried out enough iterations to have computed the whole set of n residual vectors. The residual resulting from the next iteration must then be a null vector (r_{n+1} = 0), indicating that the solution has been obtained. It thus appears that the conjugate gradient algorithm is not an iterative method at all, because it reaches the exact solution after n computational cycles. In practice, however, convergence is usually achieved in fewer than n iterations. The conjugate gradient method is not competitive with direct methods for small sets of equations; its strength lies in handling large, sparse systems (where most elements of A are zero). It is important to note that A enters the algorithm only through its multiplication by a vector, i.e. in the form Av, where v is a vector (either x_{k+1} or s_k). If A is sparse, an efficient routine for this multiplication can be written and passed, rather than A itself, to the conjugate gradient algorithm. In a typical implementation the maximum allowable number of iterations is set to n (the number of unknowns); the user supplies the starting vector x_0 and the constant (right-hand-side) vector b, and the function returns the solution vector x and the number of iterations.

Figure 1: Pseudo-algorithm for the conjugate gradient method (reproduced above).

Please note that, while writing the algorithm, if you want to multiply the transpose of a vector with a regular vector, say x^T y, you can use np.dot(x, y) in numpy. While implementing the algorithm, set the tolerance very low: tol=1.0e-9.

1.2 Hint

The algorithm requires calculating r, α, and β (you can work these back from the worked example), along with the modulus of r. You can then break the whole algorithm down into very simple matrix/vector multiplications, additions and subtractions, carried out in a loop, whether a for or a while loop:

a) Multiply the matrix A by x and subtract the result from the vector b to get r. This can all be done using numpy arrays:

    A = np.array([[ 4, -1,  1],
                  [-1,  4, -2],
                  [ 1, -2,  4]])
    b = np.array([12, -1, 5])
    x = np.ones(3)

Then r = b - Ax is nothing but

    r = b - np.dot(A, x)

b) You also have to solve for α and β. Taking α, the numerator is nothing but the transpose of s multiplied by r (which can be done using np.dot). The denominator is the transpose of s multiplied by As, which you can again compute using np.dot, provided you precalculate A times s with np.dot:

    numerator = np.dot(s, r)
    As = np.dot(A, s)
    denominator = np.dot(s, As)
    alpha = numerator / denominator

You can take similar hints from the above to solve for β.

c) x_{k+1} = x_k + α_k s_k, i.e. calculating the next value of x, is nothing but

    x = x + alpha * s

or, in short form,

    x += alpha * s

We have also covered the += operator in the lectures. You can update s similarly.

d) To calculate |r|, you can simply use rnorm = np.sqrt(np.dot(r, r)) (np.dot(r, r) on its own gives |r|^2, which can equally well be compared against a tolerance), and then use an if statement to check whether rnorm is less than a tolerance, say 1e-5, to break from the loop.
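Putting hints (a) to (d) together, here is a minimal sketch of one way the pieces might be assembled. The function name conj_grad, the zero starting vector and the cap of n iterations are my own choices following the notes above, not a prescribed solution:

    import numpy as np

    def conj_grad(A, b, x, tol=1.0e-9):
        # Conjugate gradient solver following the pseudo-algorithm above.
        # At most n iterations are performed, n being the number of unknowns.
        n = len(b)
        r = b - np.dot(A, x)                       # initial residual, Eq. (6)
        s = r.copy()                               # first search direction: steepest descent
        for k in range(n):
            As = np.dot(A, s)
            alpha = np.dot(s, r) / np.dot(s, As)   # step length, Eq. (7)
            x = x + alpha * s                      # update the estimate, Eq. (4)
            r = b - np.dot(A, x)                   # new residual
            if np.sqrt(np.dot(r, r)) <= tol:       # converged when |r| is small enough
                break
            beta = -np.dot(r, As) / np.dot(s, As)  # Eq. (10)
            s = r + beta * s                       # next search direction, Eq. (8)
        return x, k + 1

    A = np.array([[4.0, -1.0, 1.0],
                  [-1.0, 4.0, -2.0],
                  [1.0, -2.0, 4.0]])
    b = np.array([12.0, -1.0, 5.0])

    x, num_iter = conj_grad(A, b, np.zeros(3))
    print(x, num_iter)                             # x should be close to [3, 1, 1]

Recomputing r as b - np.dot(A, x) each iteration follows the pseudo-algorithm above; the equivalent update r = r - alpha*As is also commonly used.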
If I have said not to use a numpy library function, then it means not to use an equivalent function such as np.linalg.solve() to cheat. We already covered it in the lectures, and it acts as a one-line solution:

    def conjugate_gradient_cheating(A, b):
        x = np.linalg.solve(A, b)
        return x

Instead, I want you to implement your own algorithm:

    def conjugate_gradient_snazzy(A, b):
        # your own conjugate gradient implementation goes here
        return x
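Since 20% of the marks are for testing, one possible check (my own suggestion, not something the brief prescribes) is to confirm that the solution returned by your own implementation satisfies Ax = b and, purely as a reference, agrees with np.linalg.solve. This assumes a solver such as the conj_grad sketch above is defined earlier in the notebook:

    import numpy as np

    def test_conjugate_gradient():
        # Assumes conj_grad (or your own solver) has been defined earlier in the notebook.
        A = np.array([[4.0, -1.0, 1.0],
                      [-1.0, 4.0, -2.0],
                      [1.0, -2.0, 4.0]])
        b = np.array([12.0, -1.0, 5.0])
        x, num_iter = conj_grad(A, b, np.zeros(3))
        assert np.allclose(np.dot(A, x), b)           # the returned x satisfies Ax = b
        assert np.allclose(x, np.linalg.solve(A, b))  # reference check only, not the solution method
        print("All tests passed in", num_iter, "iterations")

    test_conjugate_gradient()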
2 Problem 2: Multivariate least squares method

We want to find the regression coefficients a_0, a_1, ..., a_m for the following equation:

y_i = a_0 + a_1 x_{1i} + a_2 x_{2i} + a_3 x_{3i} + ... + a_m x_{mi}    (11)

To solve this, let us suppose we take the case m = 2 and add an error term ε_i to our equation:

y_i + \epsilon_i = a_0 + a_1 x_{1i} + a_2 x_{2i}    (12)

We can then obtain an objective function S to minimize as:

\epsilon_i = a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i    (13)

\epsilon_i^2 = (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)^2    (14)

S = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)^2    (15)

Minimizing S with respect to a_0, a_1 and a_2, we get:

\frac{\partial S}{\partial a_0} = 2 \sum (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)    (16)

\frac{\partial S}{\partial a_1} = 2 \sum (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i) x_{1i}    (17)

\frac{\partial S}{\partial a_2} = 2 \sum (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i) x_{2i}    (18)

Equating these to zero, we obtain three equations in the three unknowns a_0, a_1, a_2:

a_0 + a_1 <x_1> + a_2 <x_2> = <y>    (19)

a_0 <x_1> + a_1 <x_1 x_1> + a_2 <x_1 x_2> = <x_1 y>    (20)

a_0 <x_2> + a_1 <x_1 x_2> + a_2 <x_2 x_2> = <x_2 y>    (21)

where <.> represents an average quantity defined as:

<x> = \frac{1}{n} \sum_{i=1}^{n} x_i    (22)

We can write this in matrix form, so that matrix methods can be used:

\begin{bmatrix} 1 & <x_1> & <x_2> \\ <x_1> & <x_1 x_1> & <x_1 x_2> \\ <x_2> & <x_1 x_2> & <x_2 x_2> \end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} =
\begin{bmatrix} <y> \\ <x_1 y> \\ <x_2 y> \end{bmatrix}    (23)

or XA = Y. The matrix of unknown coefficients, A, is obtained by finding the inverse of X, so that A = X^{-1} Y. Extending to m variables, we get:

X = \begin{bmatrix}
1 & <x_1> & <x_2> & \dots & <x_m> \\
<x_1> & <x_1 x_1> & <x_1 x_2> & \dots & <x_1 x_m> \\
<x_2> & <x_1 x_2> & <x_2 x_2> & \dots & <x_2 x_m> \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
<x_m> & <x_1 x_m> & <x_2 x_m> & \dots & <x_m x_m>
\end{bmatrix},
\quad
Y = \begin{bmatrix} <y> \\ <x_1 y> \\ <x_2 y> \\ \vdots \\ <x_m y> \end{bmatrix}    (24)

Overview of task: The assignment is to use the above logic to find the coefficients a_0, a_1, a_2 for the data given in Table 1 (also provided as a tab-delimited text file). The exact solution is y = -2.145 - 3.117 x_1 + 1.486 x_2.

2.1 Hint

To program this efficiently in python, we can make use of a trick. Imagine we get the values of x and y from the file; since we are reading one line at a time, rather than calculating the averages we can simply maintain the sums, by taking 1/n out and cancelling it on both sides:

\frac{1}{n}
\begin{bmatrix}
n & \sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{2i} \\
\sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{1i} x_{1i} & \sum_{i=1}^{n} x_{1i} x_{2i} \\
\sum_{i=1}^{n} x_{2i} & \sum_{i=1}^{n} x_{1i} x_{2i} & \sum_{i=1}^{n} x_{2i} x_{2i}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} =
\frac{1}{n}
\begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{1i} y_i \\ \sum_{i=1}^{n} x_{2i} y_i \end{bmatrix}    (25)

If you import all the functions from the numpy library with from numpy import *, and if you have a variable m for the number of independent variables, and you define x=zeros((m+1,m+1)), y=zeros((m+1,1)) and xi=[0.0]*(m+1), then you can populate both matrices x and y in a nested loop with two statements, x[j,k] += xi[j]*xi[k] and y[j,0] += yi*xi[j], provided you set xi[0]=1, xi[1] to the value of the first predictor (first column), xi[2] to the value of the second predictor (second column), and so on. yi is the dependent variable (last column) in Table 1. Finally, you can simply convert the numpy arrays x and y to numpy matrices so that you can get the inverse automatically: X=mat(x.copy()), Y=mat(y.copy()) and A=X.I*Y. Also, the data file provided is tab-delimited; look up how to split a string using the .split() function. A sketch along these lines is given after Table 1.

Table 1: Data to be fitted

    x1      x2      y
    0.408   0.58    -2.39
    0.429   0.51    -2.53
    0.492   0.53    -3.38
    0.529   0.6     -2.72
    0.569   0.58    -2.95
    0.677   0.64    -3.32
    0.703   0.61    -3.45
    0.841   0.66    -3.81
    0.911   0.73    -3.9
    0.936   0.72    -4.11
    0.96    0.57    -4.24
    0.978   0.84    -4.81
    0.993   0.69    -4.1
    1.038   0.75    -4.28
    1.051   0.76    -4.05
    1.119   0.85    -4.23
    1.134   0.75    -4.58
    1.16    0.85    -4.4
    1.209   0.72    -5.05
    1.272   0.82    -4.79
    1.303   0.86    -4.76
    1.353   0.98    -4.65
    1.367   0.9     -5.18
    1.388   0.71    -5.5
    1.425   1.04    -5.01
    1.453   0.9     -5.15
    1.484   0.77    -5.8
    1.503   0.93    -5.33
    1.536   0.81    -5.23
    1.563   0.83    -6.11
    1.655   1.2     -5.34
    1.684   0.96    -6.13
    1.897   1.24    -6.45
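Following the hint above, here is a minimal sketch of one possible implementation. The file name 'data.txt' is a placeholder for the tab-delimited file supplied with the assignment, and I use np.linalg.solve in place of the matrix-inverse route (X.I*Y) from the hint; both give the same coefficients:

    import numpy as np

    m = 2                                      # number of independent variables
    X = np.zeros((m + 1, m + 1))
    Y = np.zeros((m + 1, 1))

    # 'data.txt' is a placeholder name for the tab-delimited file provided with
    # the assignment; each data line holds x1, x2 and y, as in Table 1.
    with open('data.txt') as infile:
        for line in infile:
            fields = line.split()              # split() handles tabs and spaces
            if len(fields) != m + 1:
                continue                       # skip blank lines
            try:
                values = [float(v) for v in fields]
            except ValueError:
                continue                       # skip a header line such as "x1 x2 y"
            xi = [1.0] + values[:m]            # xi[0] = 1, then the predictors
            yi = values[m]                     # last column is the dependent variable
            for j in range(m + 1):
                Y[j, 0] += yi * xi[j]
                for k in range(m + 1):
                    X[j, k] += xi[j] * xi[k]

    # Solve X A = Y for the coefficients (equivalent to A = X^-1 Y above).
    A = np.linalg.solve(X, Y)
    print(A.ravel())                           # expect roughly [-2.145, -3.117, 1.486]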
