There has to be a way to transform it. Here come two more mathematical characters. Meet exponential and logarithms.
Now let us look at exponential. This character is again a common character in high school math. An exponential is a function that has two operators. A base (b) and an exponent (n). It is defined as bn. it takes the form:
f(x) = bx
The base can be any positive number. Again Euler’s number (e) is a common base used in statistics.
Geometrically, an exponential relationship has following structure:
- An increase in x doesn’t yield a corresponding increase in y. Until a threshold is reached.
- After the threshold, the value of y shoots up rapidly for a small increase in x.
The logarithm is an interesting character. Let us only understand its personality applicable for regression models. The fundamental property of a logarithm is its base. The typical base of the logarithm is 2, 10 or e.
Let us take an example:
- How many 2s do we multiply to get 8? 2 x 2 x 2 = 8 i.e. 3
- This can also be expressed as: log2(8) = 3
The logarithm of 8 with base 2 is 3
There is another common base for logarithms. It is called as “Euler’s number (e).” Its approximate value is 2.71828. It is widely used in statistics. The logarithm with base e is called as Natural Logarithm.
It also has interesting transformative capabilities. It transforms an exponential relation into a linear relation. Let us look at an example:
The diagram below, shows an exponential relationship between y and x:
If logarithms are applied to both x and y, the relationship between log(x) and log(y) is linear. It looks something like this:
Elasticity is the measurement of how responsive an economic variable is to a change in another.
Say that we have a function: Q = f(P) then the elasticity of Q is defined as:
E = P/Q x dQ/dP
- dq/dP is the average change of Q wrt change in P.
Bringing it all together:
Now let us bring these three mathematical characters together. Derivatives, Logarithms and Exponential. Their rules of engagement are as follows:
- Logarithm of e is 1 i.e. log(e) = 1
- The logarithm of an exponential is exponent multiplied by the base.
- Derivative of log(x) is : 1/x
Let us take an example. Imagine a function y expressed as follows:
- y = b^x.
- => log(y) = x log (b)
So does it mean for linear regression models? Can we do mathematical juggling to make use of derivatives, logarithms, and exponents? Can we rewrite the linear model equation to find the rate of change of y wrt change in x?
First, let us define relationship between y and x as an exponential relationship
- y = α x^β
- Let us first express this as a function of log-log: log(y) = log(α) + β.log(x)
- Doesn’t equation #1 look similar to regression model: Y= β0 + β1 . x1 ? where β0 = log(α); β1 = β. This equation can be now rewritten as: log(y) = β0 + β1. log(x1)
- But how does it represent elasticity? Let us take derivative of log(y) wrt x, we get the following:
- d. log(y)/ dx = β1. log(x1)/dx.=> 1/y . dy/dx = β1 . 1/x => β1 = x/y . dy/dx.
- The equation of β1 is the elasticity.
Now that we understand the concept, let us see how Fernando build a model. He builds the following model:
log(price) = β0 + β1. log(engine size) + β2. log(horse power) + β3. log(width)
He wants to estimate the change in car price as a function of the change in engine size, horse power, and width.
Fernando trains the model in his statistical package and gets the following coefficients.
The equation of the model is:
log(price) = -21.6672 + 0.4702.log(engineSize) + 0.4621.log(horsePower) + 6.3564 .log(width)
Following is the interpretation of the model:
- All coefficients are significant.
- Adjusted r-squared is 0.8276 => the model explains 82.76% of variation in data.
- If the engine size increases by 4.7% then the price of the car increases by 10%.
- If the horse power increases by 4.62% then the price of the car increases by 10%.
- If the width of the car increases by 6% then the price of the car increases by 1 %.
Fernando has now built the log-log regression model. He evaluates the performance of the model on both training and test data.
Recall, that he had split the data into the training and the testing set. The training data is used to create the model. The testing data is the unseen data. Performance on testing data is the real test.
On the training data, the model performs quite well. The adjusted R-squared is 0.8276 => the model can explain 82.76% variation on training data. For the model to be acceptable, it also needs to perform well on testing data.
Fernando tests the model performance on test data set. The model computes the adjusted r-squared as 0.8186on testing data. This is good. It means that model can explain 81.86% of variation even on unseen data.
Note that the model estimates the log(price) and not the price of the car. To convert the estimated log(price) into the price, there needs to be a transformation.
The transformation is treating the log(price) as an exponent to the base e.
e^log(price) = price
The last few posts have been quite a journey. Statistical learning laid the foundations. Hypothesis testing discussed the concept of NULL and alternate hypothesis. Simple linear regression models made regression simple. We then progressed into the world of multivariate regression models. Then discussed model selection methods. In this post, we discussed the log-log regression models.
So far the regression models built had only numeric independent variables. The next post we will deal with concepts of interactions and qualitative variables.
Originally published here