
Let’s start by importing necessary packages. We’ll use numpy as np, scipy.optimize as opt, and matplotlib.pyplot as plt
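In code, that first cell is simply:
import numpy as np
import scipy.optimize as opt
import matplotlib.pyplot as plt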
In [ ]:

Solving Systems of Nonlinear Equations
Now that we have covered root finding algorithms for univariate scalar functions (Assignment 12), optimization algorithms for univariate scalar functions (Assignment 13), and optimization algorithms for multivariate scalar functions (Assignment 14), it's time to tackle solving systems of equations. Broadly speaking, these are classified as sets of multivariate functions. Our goal in "solving" them is to find the values of the complete set of variables that results in each of our equations being equal to zero. It is high-dimensional root finding. A less abstract example is something we're used to: a system of nonlinear equations:
Important: To be clear, this assignment covers best practices for solving systems of nonlinear equations. You can apply these methods to a system of linear equations, but for linear systems there are much more efficient, closed-form analytical solutions based on linear algebra and matrix inversions. If you are dealing with a linear system, you want to use those linear methods, since they are more efficient and stable than the iterative solutions we'll use here.
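For contrast, here is a minimal sketch of the linear case (the 2x2 system below is made up purely for illustration); numpy's linear algebra routines solve it directly, with no iteration and no initial guess:
import numpy as np
A = np.array([[3.0, 2.0], [1.0, -1.0]]) #coefficient matrix for the made-up system 3x + 2y = 5, x - y = 1
b = np.array([5.0, 1.0])
print(np.linalg.solve(A, b)) #direct (non-iterative) solution; returns [1.4, 0.4], i.e., x = 1.4, y = 0.4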
$$x^2 - 3y = 14.75z - w^4$$$$25 = x^2 - z$$$$0 = \ln(z) - z^2 + 2x + 3$$$$z + xw = 74$$
It is clear that we cannot solve any one equation above to determine the values for w, x, y, and z that satisfy all of these equations. We have to solve all four of them together, as a system, to do that. We do have analytical methods that can be useful; most students default to a substitution method, for example. In general, this is fine if you want an analytical solution, but we should note two problems with the approach. First, solving large nonlinear systems this way requires extensive algebraic manipulation, which is both extremely tedious and extremely error-prone. Second, it isn't always possible to solve a system of nonlinear equations using analytical methods. More frequently, we will rely on numerical methods to solve nonlinear systems. Similar to the cases we've already considered (e.g., a Newton-Raphson iteration), they are based on iterative solutions that proceed until the system reaches some convergence threshold. Just like with univariate equations, our convergence criterion is that we want the equations to evaluate to zero; it's just that in this case, we want each and every equation to equal zero.
Now that we have a basic familiarity, we’ll jump straight into the root finding algorithms available in Python. Solving nonlinear systems numerically is, in general, very difficult, and it requires both good algorithms and good initial guesses, so we’ll skip writing our own routines like we did with univariate functions, but it is important to know that the basic approach of iteratively updating variable values until reaching your convergence threshold is the same.
When we solve a system of equations numerically in Python (or in most software), we need to write that system as a set of equations that should evaluate to zero once we find our solution. So no matter how the functions are presented to me, I will convert them into a form where the left-hand side is zero and the right-hand side specifies the function. Following this logic, I would write the above system as:
$$0 = -x^2 + 3y + 14.75z - w^4$$$$0 = x^2 - z - 25$$$$0 = \ln(z) - z^2 + 2x + 3$$$$0 = z + xw - 74$$
That is the form we want to work with moving forward.
Just as in root finding and optimization for scalar equations, we want to construct a function that encodes this system and returns the left-hand side of our equation(s). No problem, right? We know how to create a multivariate function, so this is straightforward. We will walk through the way we would probably do this intuitively, and then we'll explain why that won't work for a numerical solution.
def F(w,x,y,z):
    eq1 = -x**2 + 3*y + 14.75*z - w**4
    eq2 = x**2 - z - 25
    eq3 = np.log(z) - z**2 + 2*x + 3
    eq4 = z + x*w - 74
    return eq1, eq2, eq3, eq4 #Technically, this will return a tuple (eq1, eq2, eq3, eq4).

Now, I can run that function for any (w,x,y,z) set and see the LHS value returned by each equation. There is about a 0% chance they are all equal to zero.
F(1,2,3,4)
In [ ]:

Technically, there is nothing wrong with our function. It evaluates the system. It is just not written in the form that Python's numerical root finding routines expect, so this function is not appropriate for a multivariate root finding algorithm. We have to do two things to make this work:
1. We have to recognize that our root finding algorithm is iterative. It is going to vary the values of w, x, y, and z until it converges. As in the case with optimization of multivariate functions, when we intend to vary these values iteratively, we usually need to pass them as the first argument to the function as a collection (a vector, array, list, tuple, etc.). That means we will pack w, x, y, and z into a single vector-like variable, just like we did when regressing $K_m$ and $V_{max}$ with the Michaelis-Menten model in Assignment 14.
2. Similar to the way our root finding algorithm wants all of our unknown variables input as a vector-like quantity, it wants our function to return the left-hand-side solution to each equation in a vector-like set of identical size to our set of variables.
Taking both of these things together, we are now working with a multivariate vector function. It is multivariate because it depends on 4 different independent variables, and it is a vector function because it returns a vector quantity instead of a scalar quantity. It's easy to do; we just need to address the two points above, which describe what the root finding algorithm needs from us. With that in mind, we'll construct our system as a multivariate vector function (with a vector argument). Python is actually pretty forgiving in terms of whether you provide inputs as lists, arrays, tuples, or a mix thereof, but in general, I prefer to use a consistent type and dimensionality for inputs and outputs:
def F(var): #the name "var" is arbitrary; it is the name of the local variable that contains w, x, y, and z

    #We'll work with our "vector" argument var

    w = var[0] #I like to relabel elements of the vector variable just so I can recognize the equations at a glance
    x = var[1]
    y = var[2]
    z = var[3]

    #Now we'll define our "vector" output; I'll work with a list, but I usually use arrays since it is Matlab-like

    LHS1 = -x**2 + 3*y + 14.75*z - w**4
    LHS2 = x**2 - z - 25
    LHS3 = np.log(z) - z**2 + 2*x + 3
    LHS4 = z + x*w - 74
    retval = [LHS1, LHS2, LHS3, LHS4] #I'm using a list, but you can return a tuple or array just as easily.
    return retval

Now, if we want to run this function for the same input as last time, we need to provide a vector argument. Again, I’ll use a list since that is mostly what we’ve worked with in the course.
F([1,2,3,4]) #Using a 4-element list because it is consistent with my return structure.
In [ ]:

Now that we’ve constructed the function, the syntax for the root finding algorithm is straightforward. We’ll use scipy.optimize.root. This is the multivariate analog of scipy.optimize.root_scalar, which we used in Assignment 12. As usual, I have aliased scipy.optimize as opt, so my syntax is:
var0 = [10,10,10,10] #This is a list with my 4 initial guesses at the "roots" for w, x, y, and z
opt.root(F, var0)

That’s about it! If you’re familiar with Matlab, fsolve is also available in scipy, and it has a very similar syntax:
opt.fsolve(F, var0)

According to the documentation, opt.root is the preferred root finding interface in Python, while fsolve is an older wrapper around the same MINPACK routines. opt.root is quite flexible; it includes several root finding algorithms (such as hybrid ('hybr'), Levenberg-Marquardt, Krylov, and even 'excitingmixing'). Method selection is easy with opt.root; similar to everything we've seen in scipy.optimize.minimize (opt.minimize), you can select the algorithm using the method keyword. So if I wanted to change from the default hybr method to Levenberg-Marquardt, I would do so by adding the method keyword argument.
opt.root(F, var0, method = 'LM')
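opt.root returns a solution object much like opt.minimize does; here is a quick sketch of pulling out the fields you usually care about (these attribute names are standard for scipy's OptimizeResult):
sol = opt.root(F, var0)
print(sol.x)       #the converged values of w, x, y, and z
print(sol.fun)     #the LHS of each equation at the solution; all should be near zero
print(sol.success) #True if the solver reports convergence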

That said, I suspect many of these are tailor-made for specific types of problems. Hybrid and LM are fairly general purpose, tried and true algorithms. As far as I can tell, hybr is based on MINPACK, and it is similar to fsolve in both Python and Matlab. LM uses a least-squares method; it is also one of the solvers used by Matlab's fsolve. I have had pretty good luck with both 'hybr' and 'LM' using their default options. The other algorithms may be problem-specific and/or require more inputs from the user, like a full Jacobian for the system of equations.
As with minimization routines, the algorithms are all highly configurable through various options and keyword arguments, but these n-dimensional root finding algorithms tend to have algorithm-specific options rather than options that are universal to all methods. You'll need to consult the documentation for a specific algorithm to see what options are available for that particular method.
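As an example of what that looks like (the option names below, xtol and maxfev, are ones accepted by the default 'hybr' solver; other methods use different names), algorithm-specific settings go in the options dictionary:
opt.root(F, var0, method = 'hybr', options = {'xtol': 1e-10, 'maxfev': 5000}) #tighter tolerance, more function evaluations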
In [ ]:

A Word of Caution
Solving algebraic systems of nonlinear equations using numerical methods is incredibly difficult; you need a good algorithm and good-to-very-good initial guesses. Unfortunately, it can be difficult to provide good initial guesses. We should use our physical insight about the systems we are studying to make initial guesses that we think will be very close to the actual solution; otherwise, it is unlikely that our solvers will converge for difficult nonlinear systems. In fact, my original example was so difficult that I couldn't solve it, and I backed off to the one in the current worksheet, which still is only easily solved by the 'hybr' and 'LM' algorithms. I was not able to get the others to converge easily.
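One pragmatic (if brute-force) strategy when you lack physical insight is to scan several initial guesses and keep whichever ones converge; a minimal sketch, with arbitrary guesses chosen for illustration:
guesses = [[1, 1, 1, 1], [10, 10, 10, 10], [5, 5, 20, 2], [2, 6, 30, 8]] #arbitrary starting points
for guess in guesses:
    sol = opt.root(F, guess, method = 'hybr')
    if sol.success:
        print('Converged from', guess, 'to', sol.x)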
One final useful skill
Providing a Jacobian is usually a good idea, and it can be essential to performance and stability for difficult systems. Most of the algorithms available in opt.root will accept a Jacobian input. Since it is such an essential skill, we'll demonstrate its usage. The Jacobian is the set of partial derivatives of each function in your system with respect to each variable. For the system above, the Jacobian would be a 4x4 matrix, since we'd have to take the partial derivative of each of the four equations with respect to w, x, y, and z. It isn't hard, but it is tedious (you may want to look at symbolic and numerical methods for generating Jacobians, which can make this much easier; we won't cover that here). We'll go through a simpler example and create the corresponding Jacobian.
Here’s a simple system of equations written as functions of (a,b)
$$0 = a^3 - e^{-b}$$$$0 = ab - b^2 + 5$$
As a multivariate vector function; I’m returning a numpy array:
def q(v):
    a = v[0]
    b = v[1]
    eq1 = a**3 - np.exp(-b)
    eq2 = a*b - b**2 + 5
    return np.array([eq1, eq2])

Now for the Jacobian, we need to calculate partial derivatives. Here, a is my first variable and b is my second variable, so I need to create the matrix:
$$J = \begin{bmatrix} \frac{\partial f_1}{\partial a} & \frac{\partial f_1}{\partial b} \\ \frac{\partial f_2}{\partial a} & \frac{\partial f_2}{\partial b} \end{bmatrix}$$
In other words, the first row contains the partial derivatives of the first function with respect to each variable, the second row contains the partial derivative of the second function with respect to each variable, and so forth. For this system, that becomes:
$$J = \begin{bmatrix} 3a^2 & e^{-b}\\ b & a - 2b \end{bmatrix}$$
As a python function (using np.arrays):
def jacobian(v):
    a = v[0]
    b = v[1]
    jac11 = 3*a**2
    jac12 = np.exp(-b)
    jac21 = b
    jac22 = a - 2*b
    jac = np.array([[jac11, jac12], [jac21, jac22]])
    return jac

Note: The reason I’m using np.arrays here is that they index like a matrix, and I’m most comfortable treating a Jacobian as a matrix. But you could do this with a list or a tuple if you wanted to.
Then we add the Jacobian to the root finding routine using the jac keyword argument:
v0 = np.array([1,1])
opt.root(q, v0, method = 'hybr', jac = jacobian)
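As a quick sanity check (optional, but a good habit), confirm that the converged values actually drive both equations to zero:
sol = opt.root(q, v0, method = 'hybr', jac = jacobian)
print(sol.x)    #converged values of a and b
print(q(sol.x)) #both residuals should be very close to zero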
In [ ]:

Now that we have started working with multivariate vector functions, there is one final tool I want to cover from scipy.optimize: least_squares.
This routine accomplishes the same things that we did in Assignment 14 for the Michaelis-Menten example; however, it is specifically designed for nonlinear least squares and offers some nice options like robust fitting, whereas opt.minimize just minimizes whatever function we supply to it. In general, writing alternative "least squares" algorithms and properly scaling data requires a good understanding of the methods and the data, and the opt.least_squares algorithm is built to handle some of this for us. The major difference between opt.minimize and opt.least_squares (other than the latter being specifically designed for least squares problems) is that opt.minimize works on a scalar function, while opt.least_squares works on a vector function. Specifically, we provide opt.least_squares with a function that calculates the vector of residuals (the difference between measurement and model) for every data point, and opt.least_squares then generates the loss function automatically from that full vector of residuals. Revisiting that Michaelis-Menten example, let's load the data again:
In [5]:
CS = np.array([0, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1, 2, 5]) #mmol/L
rate = np.array([0, 0.017537467, 0.030941975, 0.080327165, 0.1643835, 0.26569368, 0.745442547, 1.295792328, 2.419014706, 4.0402125, 5.534947297, 5.127217742, 7.074911496]) #mmol/L/min

Now we’ll set up the least squares problem by creating a function that returns a vector of residuals for each data point; we still need it to take variable parameters ($V_{max}$ and $K_m$) as a vector input, and we have to pass that input as the first argument to the function. We also need to pass our experimental concentrations and rates in order to calculate model predictions and residuals:
def resid(par, Cexp, rexp):
    Vmax = par[0]
    Km = par[1]
    rmod = (Vmax*Cexp)/(Km + Cexp)
    residual = (rexp - rmod)
    return residual
resid(np.array([7,0.4]), CS, rate)

The bottom line in that section of code is just running the function for an initial guess of $V_{max} = 7$ and $K_m = 0.4$.
In [ ]:

Now that we have a function that returns residuals when provided with 1) the parameter set, 2) experimental concentrations, and 3) experimental rates, we are ready to use opt.least_squares to find optimal values for $V_{max}$ and $K_m$. I'm using the args keyword to pass CS and rate to the function, and I'm adding the verbose keyword to ask for a detailed output of iteration progress.
par0 = np.array([7, 0.4])
opt.least_squares(resid, par0, args = (CS, rate), verbose = 2)
In [ ]:

You can add bounds to this routine when using either trf or dogbox algorithms:
opt.least_squares(resid, par0, method = 'trf', args = (CS, rate), verbose = 2, bounds = [(6.8, 0.1), (7.5, 0.6)])

Note that the structure of the bounds here is different from opt.minimize; it follows the convention used by opt.curve_fit. Here, the bounds argument is always formatted as $[(\textrm{lower bounds}), (\textrm{upper bounds})]$, where the first tuple contains the lower bound on each parameter (in the same sequence as the parameters are packed into par) and the second tuple contains the corresponding upper bounds.
In [ ]:

One of the nicest features of opt.least_squares is that it provides you with a few options for generating the loss function for more robust curve fitting. The default option is 'linear', which generates a standard SSE as the cost function. There are also 'soft_l1', 'huber', 'cauchy', and 'arctan', which may do a better job of handling outliers in the data set. As an illustration, here is the optimal fit generated using default options; we will plot it against the experimental data, and we'll also print out the parameter values and the loss function value at the optimum.
import matplotlib.pyplot as plt
sol = opt.least_squares(resid, par0, args = (CS, rate))
print('Vmax and Km are', sol.x)
print('The minimum of the objective function is', sol.cost)
modfun = lambda Vmax, Km, Cexp: (Vmax*Cexp)/(Km + Cexp)
Cmodel = np.linspace(0, 5, 100)
plt.plot(Cmodel, modfun(sol.x[0], sol.x[1], Cmodel), color = 'k', linewidth = 1)
plt.scatter(CS, rate)
In [ ]:

Now let’s contrast that with soft_l1 and cauchy loss functions.
import matplotlib.pyplot as plt
default = opt.least_squares(resid, par0, args = (CS, rate))
soft = opt.least_squares(resid, par0, args = (CS, rate), loss = 'soft_l1')
cauchy = opt.least_squares(resid, par0, args = (CS, rate), loss = 'cauchy')
print('Vmax and Km using standard sum of squares are', default.x)
print('Vmax and Km using soft_l1 are', soft.x)
print('Vmax and Km using cauchy are', cauchy.x)
print('The minimum of the objective function using standard sum of squares is', default.cost)
print('The minimum of the objective function using soft_l1 is', soft.cost)
print('The minimum of the objective function using cauchy is', cauchy.cost)
modfun = lambda Vmax, Km, Cexp: (Vmax*Cexp)/(Km + Cexp)
Cmodel = np.linspace(0, 5, 100)
plt.plot(Cmodel, modfun(default.x[0], default.x[1], Cmodel), color = 'k', linewidth = 1)
plt.plot(Cmodel, modfun(soft.x[0], soft.x[1], Cmodel), color = 'b', linewidth = 1)
plt.plot(Cmodel, modfun(cauchy.x[0], cauchy.x[1], Cmodel), color = 'g', linewidth = 1)
plt.scatter(CS, rate)
In [ ]:

You should see a small difference in the parameter values and model fits, and notice that the soft_l1 and cauchy cost functions minimize the influence of what appears to be an outlier at about 2.0 mmol/L. I would not take a decrease in the cost function from 'linear' to 'soft_l1' to 'cauchy' as evidence of a better fit. They formulate different cost functions, so there is no good way to compare them directly. You would need to look at an $R^2$ for the model and consider whether outlier rejection is appropriate or not based on statistical analysis, all of which is far beyond the scope of this Assignment.
Added Bonus opt.least_squares even lets you generate your own cost function by providing a function with the loss keyword argument, but that is a fairly advanced topic.
Alternative loss functions use the f_scale parameter to set the margin that defines what counts as an outlier, and it is usually of critical importance. Its default value is 1.0, which is what we used above since we never specified it. Try some different values for f_scale in the cell below. You can clearly see the difference it makes when we set f_scale to 2, 1, 0.5, and 0.1 for the soft_l1 loss function. Again, this is an advanced topic, and I would caution against too much manipulation without a pretty good background in statistics. I'm presenting it here just to give you a sense of the flexibility of opt.least_squares for advanced nonlinear regression.
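Here is a minimal sketch of that comparison (the f_scale values are just the ones suggested above):
for fs in [2, 1, 0.5, 0.1]:
    fit = opt.least_squares(resid, par0, args = (CS, rate), loss = 'soft_l1', f_scale = fs)
    print('f_scale =', fs, '--> Vmax, Km =', fit.x)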
In [ ]:

Problem Set
Solve the following problems using opt.root (solving nonlinear systems) or opt.least_squares (nonlinear regression).
Problem 1
$$9x^2 + y^2 = 9$$$$y^2 = 3x + 3$$
This is a nonlinear system of equations; you can easily solve this one analytically to verify your answer. Note that there is likely more than one solution; the (x,y) pair you find will depend on your initial guesses. If you want to find all roots, solve it analytically or graphically. For a graphical solution, solve each equation as y(x) and graph y vs. x. You should be able to locate intersections, which correspond to solutions.
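If you want to try the graphical route, here is one possible sketch of overlaying the two curves (upper and lower branches plotted separately); it only locates intersections visually and is not a substitute for the opt.root solution the problem asks for:
xplot = np.linspace(-1, 1, 200)
y1 = np.sqrt(9 - 9*xplot**2) #upper branch of 9x^2 + y^2 = 9
y2 = np.sqrt(3*xplot + 3)    #upper branch of y^2 = 3x + 3
plt.plot(xplot, y1, 'k', xplot, -y1, 'k') #ellipse
plt.plot(xplot, y2, 'b', xplot, -y2, 'b') #parabola
plt.show()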
In [ ]:

Problem 2
$$x^2 + 4y^2 = 9 – 3z$$$$18y – 14x^2 + 45 = 0$$$$x + y^2 – z^3 = 0$$
Again, you may have multiple solutions. An analytical solution for x, y, and z is tractable here, as is a graphical solution (though it is a bit more of a brain bender).
In [ ]:

Problem 3
The following system describes the concentrations of species A, B, C, and D found in the exit stream of a continuous stirred tank reactor in which the following reactions are occurring:
$$A \leftrightarrow B$$$$A + B \leftrightarrow 2C$$$$2B \leftrightarrow D + C$$
First, I can describe the rates of these three reactions as shown below:
\begin{align} r_1 &= k_{1f}C_A - k_{1r}C_B \\ r_2 &= k_{2f}C_AC_B - k_{2r}C_C^2 \\ r_3 &= k_{3f}C_B^2 - k_{3r}C_DC_C \\ \end{align}
Now, I can write material balances on each species:
\begin{align} 0 &= F_{Af} - F_A - (r_1 + r_2)V \\ 0 &= F_{Bf} - F_B + (r_1 - r_2 - 2r_3)V \\ 0 &= F_{Cf} - F_C + (2r_2 + r_3)V \\ 0 &= F_{Df} - F_D + r_3V \\ \end{align}
Here, $F_j$ is a molar flowrate, which you can express as a product of concentration, $C_j$, and volumetric flowrate, Q, for any species in the reactor:
$$F_j = C_jQ$$
For this particular reactor, the relevant parameters required to solve the system are given in the table below:
| Parameter | Value | Units |
|-----------|-------|-------|
| $F_{Af}$ | 47534 | mol hr$^{-1}$ |
| $F_{Bf}$ | 0 | mol hr$^{-1}$ |
| $F_{Cf}$ | 0 | mol hr$^{-1}$ |
| $F_{Df}$ | 0 | mol hr$^{-1}$ |
| $Q$ | 1400 | L min$^{-1}$ |
| $V$ | 12 | m$^3$ |
| $k_{1f}$ | 1.5 | hr$^{-1}$ |
| $k_{1r}$ | 0.48 | hr$^{-1}$ |
| $k_{2f}$ | 9.74 | L mol$^{-1}$ hr$^{-1}$ |
| $k_{2r}$ | 2.52 | L mol$^{-1}$ hr$^{-1}$ |
| $k_{3f}$ | 0.89 | L mol$^{-1}$ hr$^{-1}$ |
| $k_{3r}$ | 3.75 | L mol$^{-1}$ hr$^{-1}$ |
Solve for the concentrations of A, B, C, and D at the reactor exit. Hint: This is a bit more in depth than anything we have done in class, but your goal is still to create a function that accepts the 4 concentrations in a vector argument and returns the left-hand side of the 4 material balances in a vector-like structure. Everything else in this problem is either a fixed parameter that can be hard-coded into the function (like V or $k_{1f}$) or is a function of $C_A$, $C_B$, $C_C$, and $C_D$, so it can be calculated inside the body of the function. Once you pass the 4 concentrations to your function, you use them to calculate molar flowrates and reaction rates with the above equations. Once that is done, you can evaluate the left-hand side of your system of 4 equations (material balances) and return the LHS of each equation in a vector-type structure. Solve using opt.root.
In [ ]:

Problem 4
In Assignment 12, we worked with chemical equilibrium:
We introduced the idea of an equilibrium constant (K). Briefly, we can calculate an equilibrium constant for any reaction i (K$_i$) provided that we know the enthalpy change of reaction ($\Delta$H) and the entropy change of reaction ($\Delta$S) at a given reaction temperature:
$$\Delta G_i = \Delta H_i - T\Delta S_i$$$$K_i = \exp\left(\frac{-\Delta G_i}{RT}\right)$$
Let’s say we have a reaction, ethylene dimerization to make 1-butene. It has $\Delta H = -45.7$ kJ mol$^{-1}$ and $\Delta S = -150$ J mol$^{-1}$ K$^{-1}$.
If we know the equilibrium constant, we can easily calculate the mole fraction of 1-butene in the reactor at equilibrium using the following equation:
$$K = \frac{y_B}{\left(1-y_B\right)^2} \cdot \frac{P_0}{P}$$
Here, y$_B$ is the mole fraction of 1-butene in the reactor, K is the equilibrium constant for ethylene dimerization, and P$_0$ is a reference state pressure of $1.01325 \times 10^5$ Pa.
If we operate the reactor at 15 atm, what is the mole fraction of 1-butene at chemical equilibrium for all temperatures between 373K and 1000K? Graph your solution (butene mole fraction vs. temperature).
Note: In Assignment 12 you used a scalar root finding algorithm to solve this type of problem. Here, formulate it as a system of equations, encode it using a multivariate vector function, and use opt.root to solve it (instead of opt.newton or opt.root_scalar). Instead of using a loop, you should have a system of roughly 630 equations here. Good luck!
In [ ]:

Problem 5
We solved this problem in Assignment 14 (optimization); most likely, you used opt.minimize. We will repeat it here using opt.least_squares, and then solve something a bit more advanced.
Download the data in the Kinetics1.csv file; this is a single experiment wherein you measure the concentration of a component in your reaction chamber as a function of time. Time is in minutes, and concentration is in mol/L.
You know that the reaction is first order in the concentration of this species, and you expect its concentration to be well-described by the following model:
$$C(t) = C_0e^{-kt}$$
Where $C_0$ is the initial concentration of this component and k is a rate constant. Based on your data, estimate the values of these two parameters. If you don't immediately recognize this type of problem, it is a nonlinear least squares regression; you should develop a function that returns a vector of residuals (one for each measurement) and then pass that function to opt.least_squares for parameter estimation.
In [ ]:

Problem 6
Repeat the analysis for the data set in Kinetics2.csv (using opt.least_squares).
In [ ]:

Problem 7
Download the data set in Kinetics3.csv. This one includes data measured at various temperatures and reaction times. The “rate constant” is therefore no longer “constant” for this reaction, so you have to model it as a function of temperature, namely:
$$k = Ae^{\left(\frac{-E}{RT}\right)}$$
Otherwise, you can model the time-dependence of concentration as usual once the rate constant is evaluated at a specific temperature:
$$C = C_0e^{-kt}$$
Estimate values for $C_0$, A, and E simultaneously using opt.least_squares.
In [ ]:

Problem 8
Revisit the CSTR material balances in Problem 3. You suspect that your kinetic parameters are not entirely correct. In particular, you think the forward rate constant for reaction 1 ($k_{1f}$) is inaccurate. You make a quick measurement of the exit concentrations coming out of the reactor, and you find (all units in moles per liter):
\begin{align} C_A &= 0.362 \\ C_B &= 0.116 \\ C_C &= 0.102 \\ C_D &= 1.48\times10^{-3}\\ \end{align}
Can you estimate a better value for $k_{1f}$ from this data? Hint: You will need to solve the system of equations from Problem 3 in order to calculate the residuals that you want to use in opt.least_squares or opt.minimize.
In a nutshell, you are going to need to solve a system of equations using opt.root for every iteration taken by opt.least_squares. Your opt.least_squares routine is thus a "wrapper" around your opt.root routine. We have not solved a problem like this in the course, but it is a useful method to learn, and it will challenge you to abstract your understanding of both opt.root and opt.least_squares a bit more.
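To illustrate just the nesting pattern (the toy model, parameter, and "measurement" below are made up and have nothing to do with the CSTR), here is a sketch of a residual function that solves an inner problem with opt.root on every call, with opt.least_squares driving the outer regression:
#Toy implicit model: for a given parameter k, the observable y satisfies y + k*y**3 - 2 = 0
def toy_model(y, k):
    return y + k*y**3 - 2

y_meas = np.array([1.18]) #a single made-up "measurement" of y

def resid_k(par):
    k = par[0]
    inner = opt.root(toy_model, [1.0], args = (k,)) #inner solve: find y for this value of k
    return y_meas - inner.x                         #residual vector for the outer least squares problem

fit = opt.least_squares(resid_k, [0.1]) #outer regression over k
print(fit.x)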