Source: Deep Learning on Medium
Rules of calculus — Multivariate
In the real world, it is very difficult to explain behavior as a function of only one variable, and economics is no different.
First, define the functions themselves. We want to describe behavior where one variable depends on two or more other variables. Every rule and notation described from now on is the same for two variables, three variables, four variables, and so on, so we’ll use the simplest case: a function of two independent variables. Conventionally, z is the dependent variable (like y in univariate functions) and x and y are the independent variables (like x in univariate functions): z = f(x, y).
For example, suppose that the following function describes some behavior:
Differentiating this function still means the same thing: we are still looking for functions that give us the slope, but now we have more than one variable, and more than one slope.
Visualize this by recalling from graphing what a function with two independent variables looks like. Whereas a 2-dimensional picture can represent a univariate function, our z function above can be represented as a 3-dimensional shape. Think of the x and y variables as being measured along the sides of a chessboard. Then every combination of x and y would map onto a square somewhere on the chessboard. For example, suppose x=1 and y=1. Start at one of the corners of the chessboard. Then move one square in on the x side for x=1, and one square up into the board to represent y=1. Now, calculate the value of z.
The function z takes on a value of 4, which we graph as a height of 4 over the square that represents x=1 and y=1. Map out the entire function this way, and the result will be a shape, usually looking like a mountain peak in typical economic analysis problems.
Now back to slope. Imagine standing on the mountain shape, facing parallel to the x side of the chessboard. If you allow x to increase, while holding y constant, then you would move forward in a straight line along the mountain shape. We define the slope in this direction as the change in the z variable, or a change in the height of the shape, in response to a movement along the chessboard in one direction, or a change in the variable x, holding y constant.
Formally, the definition is: the partial derivative of z with respect to x is the change in z for a given change in x, holding y constant. Notation, like before, can vary. Here are some common choices:
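For reference, the notations most often seen for this partial derivative, together with the limit that makes "holding y constant" precise, can be written in LaTeX as follows. The function name f comes from z = f(x, y); the increment symbol h is introduced here only for illustration.

```latex
\[
\frac{\partial z}{\partial x}, \qquad
\frac{\partial f}{\partial x}, \qquad
f_x(x, y), \qquad
z_x
\]
\[
\frac{\partial z}{\partial x}
  = \lim_{h \to 0} \frac{f(x + h,\, y) - f(x,\, y)}{h}
\]
```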
Go back to the mountain shape, turn 90 degrees, and do the same experiment. This time we define a second slope as the change in the height of the z function in response to a movement forward on the chessboard (perpendicular to the movement measured by the first slope calculation), that is, a change in the y variable, holding the x variable constant. Typical notation for this operation would be:
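The corresponding standard notations for the slope in the y direction, again with an illustrative increment h that is not part of the original text, are:

```latex
\[
\frac{\partial z}{\partial y}, \qquad
\frac{\partial f}{\partial y}, \qquad
f_y(x, y), \qquad
z_y
\]
\[
\frac{\partial z}{\partial y}
  = \lim_{h \to 0} \frac{f(x,\, y + h) - f(x,\, y)}{h}
\]
```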
Therefore, calculus of multivariate functions begins by taking partial derivatives: finding a separate formula for each of the slopes associated with a change in one of the independent variables, one at a time. Before we discuss economic applications, let’s review the rules of partial differentiation.
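The idea of "one slope per variable" can also be checked numerically. The sketch below is a minimal illustration, not part of the original material: the helper `partial` and the example function `z` are assumptions made for this example, and the slopes are estimated with a central difference rather than derived symbolically.

```python
def partial(f, point, index, h=1e-6):
    """Estimate the partial derivative of f at `point` with respect to the
    variable at position `index`, holding every other variable fixed."""
    up = list(point)
    down = list(point)
    up[index] += h      # nudge only the chosen variable upward
    down[index] -= h    # and downward; all other variables stay put
    return (f(*up) - f(*down)) / (2 * h)


# Hypothetical example function, chosen only for illustration.
def z(x, y):
    return x**2 + 3 * x * y - y**2


slope_x = partial(z, (1.0, 2.0), 0)  # analytic slope in x: 2x + 3y
slope_y = partial(z, (1.0, 2.0), 1)  # analytic slope in y: 3x - 2y
print(slope_x, slope_y)
```

Each call moves along one side of the "chessboard" only, which is exactly what holding the other variable constant means.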
Basic rules of partial differentiation
The rules of partial differentiation follow exactly the same logic as univariate differentiation. The only difference is that we have to decide how to treat the other variable. Recall that in the previous section, slope was defined as a change in z for a given change in x or y, holding the other variable constant. There’s our clue as to how to treat the other variable. If we hold it constant, that means that no matter what we call it or what variable name it has, we treat it as a constant. Suppose, for example, we have the following equation:
If we are taking the partial derivative of z with respect to x, then y is treated as a constant. Because y is multiplied by 2 and by x and is held constant, it is treated as part of the coefficient of x. Therefore,
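As a worked illustration, assume the equation is z = 2xy, which matches the description of y being multiplied by 2 and by x (the exact form is an assumption made here). Treating 2y as the coefficient of x, and by the same argument 2x as the coefficient of y, gives:

```latex
\[
z = 2xy
\quad\Longrightarrow\quad
\frac{\partial z}{\partial x} = 2y,
\qquad
\frac{\partial z}{\partial y} = 2x
\]
```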
Once all other variables are held constant, the rules for dealing with coefficients, simple powers of variables, constants, and sums/differences of functions remain the same as in the univariate case, and they are used to determine the slope function for each independent variable. Let’s use the function from the previous section to illustrate.
First, differentiate with respect to x, holding y constant:
Note that there were no y variables in the first term, so differentiation was exactly like the univariate process. In the last term there were no x variables, so its derivative is zero by the constant rule, since y is treated as a constant.
Now, take the partial derivative with respect to y, holding x constant:
Again, note that the first term had no “variables” in it, since x is being treated as a constant; therefore the derivative of that term is 0.
To make sure you have a clear picture of more than one slope in a function, let’s evaluate the two partial derivatives at the point on the function where x = 1 and y = 2:
How do we interpret this information? First, note that when x = 1 and y = 2, the function z takes on a value of 3. At this point on our “mountain” or 3-dimensional shape, we can evaluate the change in the function z in 2 different directions. First, the change in z with respect to x is 10. In other words, the slope in a direction parallel to the x-axis is 10. Now turn 90 degrees. The slope in a direction perpendicular to our previous slope is 6, therefore not quite as steep. Also, note that although each slope depends on the change in only one variable, the position or fixed value of the other variable does matter, since you need both x and y to actually calculate the numerical values of slope. We’ll come back to this in the next section, and look at the economic meaning behind this relatedness. But first, back to the rules.
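To see how the fixed value of the other variable changes a slope, here is a small numerical sketch. The function `z` is a hypothetical stand-in chosen for illustration, so its numbers differ from the ones quoted above; the point is only that the slope in x at x = 1 is not the same for different values of y.

```python
# Hypothetical example function, chosen only for illustration.
def z(x, y):
    return x**2 + 3 * x * y - y**2


def slope_in_x(x, y, h=1e-6):
    """Central-difference estimate of the slope of z in the x direction,
    with y held constant."""
    return (z(x + h, y) - z(x - h, y)) / (2 * h)


y_values = (0.0, 1.0, 2.0)
slopes = [slope_in_x(1.0, y) for y in y_values]

# Analytically the slope in x is 2x + 3y, so it grows as y grows.
for y, s in zip(y_values, slopes):
    print(f"y = {y}: slope in x at x = 1 is about {s:.4f}")
```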
The product and quotient of functions rules follow exactly the same logic: hold all variables constant except for the one that is changing in order to determine the slope of the function with respect to that variable. To illustrate the product rule, first let’s redefine the rule, using partial differentiation notation:
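In partial-derivative notation the product rule takes the standard form sketched here, where g and h are names assumed for the two factor functions:

```latex
\[
z = g(x, y)\, h(x, y)
\]
\[
\frac{\partial z}{\partial x}
  = g(x, y)\,\frac{\partial h}{\partial x} + h(x, y)\,\frac{\partial g}{\partial x},
\qquad
\frac{\partial z}{\partial y}
  = g(x, y)\,\frac{\partial h}{\partial y} + h(x, y)\,\frac{\partial g}{\partial y}
\]
```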
Now use the product rule to determine the partial derivatives of the following function:
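As a sanity check on the product rule, the sketch below compares the rule against a finite-difference estimate. The factors `g` and `h`, the helper `partial`, and the evaluation point are all assumptions made for this illustration.

```python
# Hypothetical factors, chosen only for illustration.
def g(x, y):
    return x**2 + y


def h(x, y):
    return 3 * x - 2 * y


def z(x, y):
    return g(x, y) * h(x, y)


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 2.0, 1.0

# Product rule with the factor derivatives worked out by hand:
# dg/dx = 2x, dh/dx = 3, dg/dy = 1, dh/dy = -2.
rule_x = g(x0, y0) * 3 + h(x0, y0) * (2 * x0)
rule_y = g(x0, y0) * (-2) + h(x0, y0) * 1

numeric_x = partial(z, x0, y0, "x")
numeric_y = partial(z, x0, y0, "y")
print(rule_x, numeric_x)
print(rule_y, numeric_y)
```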
To illustrate the quotient rule, first redefine the rule using partial differentiation notation:
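The quotient rule in partial-derivative notation has the standard form below, where g (numerator) and h (denominator, assumed nonzero) are names introduced here for illustration:

```latex
\[
z = \frac{g(x, y)}{h(x, y)}
\]
\[
\frac{\partial z}{\partial x}
  = \frac{h(x, y)\,\dfrac{\partial g}{\partial x} - g(x, y)\,\dfrac{\partial h}{\partial x}}
         {\bigl[h(x, y)\bigr]^{2}},
\qquad
\frac{\partial z}{\partial y}
  = \frac{h(x, y)\,\dfrac{\partial g}{\partial y} - g(x, y)\,\dfrac{\partial h}{\partial y}}
         {\bigl[h(x, y)\bigr]^{2}}
\]
```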
Use the new quotient rule to take the partial derivatives of the following function:
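The quotient rule can be checked the same way. In this sketch the numerator `g`, the denominator `h`, and the evaluation point are hypothetical choices made for illustration (the point is picked so that the denominator is not zero).

```python
# Hypothetical numerator and denominator, chosen only for illustration.
def g(x, y):
    return x**2 + y


def h(x, y):
    return 3 * x - 2 * y


def z(x, y):
    return g(x, y) / h(x, y)


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 2.0, 1.0
g0, h0 = g(x0, y0), h(x0, y0)

# Quotient rule with the hand-derived pieces:
# dg/dx = 2x, dh/dx = 3, dg/dy = 1, dh/dy = -2.
rule_x = (h0 * (2 * x0) - g0 * 3) / h0**2
rule_y = (h0 * 1 - g0 * (-2)) / h0**2

numeric_x = partial(z, x0, y0, "x")
numeric_y = partial(z, x0, y0, "y")
print(rule_x, numeric_x)
print(rule_y, numeric_y)
```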
Not-so-basic rules of partial differentiation
Just as in the previous univariate section, we have two specialized rules that we now can apply to our multivariate case.
First, the generalized power function rule. Again, we need to adjust the notation, and then the rule can be applied in exactly the same manner as before.
When a multivariate function takes the following form:
Then the rule for taking the derivative is:
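In its standard form, the generalized power function rule can be written as below. The inner function name g and the exponent n are symbols assumed here for illustration.

```latex
\[
z = \bigl[g(x, y)\bigr]^{n}
\]
\[
\frac{\partial z}{\partial x}
  = n\,\bigl[g(x, y)\bigr]^{n-1}\,\frac{\partial g}{\partial x},
\qquad
\frac{\partial z}{\partial y}
  = n\,\bigl[g(x, y)\bigr]^{n-1}\,\frac{\partial g}{\partial y}
\]
```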
Use the power rule on the following function to find the two partial derivatives:
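A quick numerical check of the generalized power rule, using a hypothetical function z = (x^2 + 3y)^4 that is an assumption for this sketch rather than the example from the text:

```python
N = 4  # exponent of the hypothetical power function


# Hypothetical inner function, chosen only for illustration.
def g(x, y):
    return x**2 + 3 * y


def z(x, y):
    return g(x, y) ** N


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 1.0, 1.0

# Power rule: n * g^(n-1) * (partial of g), with dg/dx = 2x and dg/dy = 3.
rule_x = N * g(x0, y0) ** (N - 1) * (2 * x0)
rule_y = N * g(x0, y0) ** (N - 1) * 3

numeric_x = partial(z, x0, y0, "x")
numeric_y = partial(z, x0, y0, "y")
print(rule_x, numeric_x)
print(rule_y, numeric_y)
```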
The composite function chain rule notation can also be adjusted for the multivariate case:
Then the partial derivatives of z with respect to its two independent variables are defined as:
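Written out, the composite function (chain) rule for this case has the standard form below. The outer function f, the inner function g, and the intermediate variable u are labels assumed here, chosen to be consistent with the step of replacing u with g described later.

```latex
\[
z = f(u), \qquad u = g(x, y)
\]
\[
\frac{\partial z}{\partial x}
  = \frac{dz}{du}\,\frac{\partial u}{\partial x},
\qquad
\frac{\partial z}{\partial y}
  = \frac{dz}{du}\,\frac{\partial u}{\partial y}
\]
```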
Let’s do the same example as above, this time using the composite function notation where functions within the z function are renamed. Note that either rule could be used for this problem, so when is it necessary to go to the trouble of presenting the more formal composite function notation? As problems become more complicated, renaming parts of a composite function is a better way to keep track of all parts of the problem. It is slightly more time-consuming, but mistakes within the problem are less likely.
The final step is the same; replace u with the function g:
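To show the renaming procedure end to end, here is a worked sketch on a hypothetical function, z = (x^2 + 3y)^4, which is an assumption made for illustration and not necessarily the example used in the text.

```latex
\[
z = (x^{2} + 3y)^{4}
\quad\text{rename}\quad
u = g(x, y) = x^{2} + 3y, \qquad z = u^{4}
\]
\[
\frac{dz}{du} = 4u^{3}, \qquad
\frac{\partial u}{\partial x} = 2x, \qquad
\frac{\partial u}{\partial y} = 3
\]
\[
\frac{\partial z}{\partial x} = 4u^{3} \cdot 2x = 8x\,(x^{2} + 3y)^{3},
\qquad
\frac{\partial z}{\partial y} = 4u^{3} \cdot 3 = 12\,(x^{2} + 3y)^{3}
\]
```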
Special cases in multivariate functions
The last two special cases in multivariate differentiation also follow the same logic as their univariate counterparts.
The rule for differentiating multivariate natural logarithmic functions, with appropriate notation changes, is as follows:
Then the partial derivatives of z with respect to its independent variables are defined as:
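The standard form of the natural logarithm rule is sketched below, with g an assumed name for the inner function (taken to be positive so the logarithm is defined):

```latex
\[
z = \ln\bigl[g(x, y)\bigr]
\]
\[
\frac{\partial z}{\partial x}
  = \frac{1}{g(x, y)}\,\frac{\partial g}{\partial x},
\qquad
\frac{\partial z}{\partial y}
  = \frac{1}{g(x, y)}\,\frac{\partial g}{\partial y}
\]
```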
Let’s do an example. Find the partial derivatives of the following function:
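The logarithm rule can be verified numerically on a hypothetical function, here z = ln(x^2 + y^2). The function, the helper `partial`, and the evaluation point are assumptions for this sketch.

```python
import math


# Hypothetical inner function, chosen only for illustration.
def g(x, y):
    return x**2 + y**2


def z(x, y):
    return math.log(g(x, y))


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 1.0, 2.0

# Log rule: (1 / g) * (partial of g), with dg/dx = 2x and dg/dy = 2y.
rule_x = (2 * x0) / g(x0, y0)
rule_y = (2 * y0) / g(x0, y0)

numeric_x = partial(z, x0, y0, "x")
numeric_y = partial(z, x0, y0, "y")
print(rule_x, numeric_x)
print(rule_y, numeric_y)
```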
The rule for taking partials of exponential functions can be written as:
Then the partial derivatives of z with respect to its independent variables are defined as:
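For the natural exponential case, the standard rule is sketched below; g is an assumed name for the function in the exponent.

```latex
\[
z = e^{\,g(x, y)}
\]
\[
\frac{\partial z}{\partial x}
  = e^{\,g(x, y)}\,\frac{\partial g}{\partial x},
\qquad
\frac{\partial z}{\partial y}
  = e^{\,g(x, y)}\,\frac{\partial g}{\partial y}
\]
```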
One last time, we look for partial derivatives of the following function using the exponential rule:
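A numerical check of the exponential rule, using the hypothetical function z = e^(xy). The function and evaluation point are assumptions chosen for illustration.

```python
import math


# Hypothetical exponent function, chosen only for illustration.
def g(x, y):
    return x * y


def z(x, y):
    return math.exp(g(x, y))


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 1.0, 2.0

# Exponential rule: e^g * (partial of g), with dg/dx = y and dg/dy = x.
rule_x = math.exp(g(x0, y0)) * y0
rule_y = math.exp(g(x0, y0)) * x0

numeric_x = partial(z, x0, y0, "x")
numeric_y = partial(z, x0, y0, "y")
print(rule_x, numeric_x)
print(rule_y, numeric_y)
```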
Higher order partial and cross partial derivatives
The story becomes more complicated when we take higher order derivatives of multivariate functions. The interpretation of the first derivative remains the same, but there are now two kinds of second-order derivatives to consider: direct second-order partials and cross-partials.
First, there is the direct second-order derivative. In this case, the multivariate function is differentiated once, with respect to an independent variable, holding all other variables constant. Then the result is differentiated a second time, again with respect to the same independent variable. In a function such as the following:
There are 2 direct second-order partial derivatives, as indicated by the following examples of notation:
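For a function z = f(x, y), the two direct second-order partial derivatives are commonly written as:

```latex
\[
\frac{\partial^{2} z}{\partial x^{2}}
  = \frac{\partial}{\partial x}\!\left(\frac{\partial z}{\partial x}\right)
  = f_{xx},
\qquad
\frac{\partial^{2} z}{\partial y^{2}}
  = \frac{\partial}{\partial y}\!\left(\frac{\partial z}{\partial y}\right)
  = f_{yy}
\]
```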
These second derivatives can be interpreted as the rates of change of the two slopes of the function z.
Now the story gets a little more complicated. The cross-partials, f_xy and f_yx, are defined in the following way. For f_xy, first take the partial derivative of z with respect to x. Then take the derivative again, but this time take it with respect to y, holding x constant. For f_yx, reverse the order: differentiate with respect to y first, then with respect to x. Spatially, think of the cross-partial as a measure of how the slope (the change in z with respect to x) changes when the y variable changes. The following are examples of notation for cross-partials:
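In the subscript convention, the order of the letters matches the order of differentiation, so f_xy means "x first, then y". A common way to write the two cross-partials of z = f(x, y) is:

```latex
\[
f_{xy}
  = \frac{\partial}{\partial y}\!\left(\frac{\partial z}{\partial x}\right)
  = \frac{\partial^{2} z}{\partial y\,\partial x},
\qquad
f_{yx}
  = \frac{\partial}{\partial x}\!\left(\frac{\partial z}{\partial y}\right)
  = \frac{\partial^{2} z}{\partial x\,\partial y}
\]
```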
We’ll discuss economic meaning further in the next section, but for now, we’ll just show an example, and note that in a function where the cross-partials are continuous, they will be identical. For the following function:
Take the first and second partial derivatives.
Now, starting with the first partials, find the cross partial derivatives:
Note that the cross partials are indeed identical, a fact that will be very useful to us in future optimization sections.
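The equality of the cross-partials can also be seen numerically. In the sketch below, the function z = x^3 y^2 + 2xy is a hypothetical example, and its first partials are written out by hand; each first partial is then differentiated numerically with respect to the other variable.

```python
# Hypothetical function z = x^3 * y^2 + 2xy, chosen only for illustration.
# Its first partials, derived by hand:
def z_x(x, y):
    return 3 * x**2 * y**2 + 2 * y   # dz/dx, holding y constant


def z_y(x, y):
    return 2 * x**3 * y + 2 * x      # dz/dy, holding x constant


def partial(f, x, y, wrt, step=1e-6):
    """Central-difference slope of f in x (wrt="x") or in y (wrt="y")."""
    if wrt == "x":
        return (f(x + step, y) - f(x - step, y)) / (2 * step)
    return (f(x, y + step) - f(x, y - step)) / (2 * step)


x0, y0 = 1.0, 2.0

f_xy = partial(z_x, x0, y0, "y")  # x first, then y
f_yx = partial(z_y, x0, y0, "x")  # y first, then x

# Both should be close to the analytic value 6 * x^2 * y + 2.
print(f_xy, f_yx)
```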
Source: Columbia Education