R Introduction Part 3

6 minute read | Published: December 13, 2015

This is my lecture note written for undergraduate R programming course, which is a supplementary course of the Advanced Macroeconomics course.
Most figures and tables are extracted from R in Action.
All exercises are copied from my teacher Jing Fang’s R Workshop, you can find the data for exercises on the website of R Workshop.

Nonlinear Regression

What is Nonlinear Regression?

In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables.
1.Linear Regression $Y\_i=\beta\_1+\beta\_2X\_i+\mu\_i$ 2.Nonlinear Regression $Y\_i=\beta\_1e^{\beta\_2X\_i}+\mu\_i$

How to cope with Nonlinear Regression?

1.Delta Method

Single parameter situation
If we know the standard error of parameter $\hat{\theta}$, however, what we are really interested is the standard error of parameter $\gamma=g(\theta)$, how to estimate its standard error?
1.If $g(\theta)$ is a linear function, you know how to get it.
2.If $g(\theta)$ is not a linear function, then:
We assume the standard error of $\hat{\theta}$ is $S_{\theta}$, and the standard error of $\hat{\gamma}$ is $S_{\gamma}$, the relationship between them is $S\_{\gamma}\equiv|g^\prime(\hat{\theta})|S\_{\theta}$
Multi-parameters situation
If $\theta$ and $\gamma$ are both vectors, then how? We assume the first one is a $k$-dimension vector and the latter is a $l$-dimension vector, $l\leqq k$. $\gamma$ is a function of $\theta$: $\gamma=g(\theta)$, where $g(\theta)$ is a monotonic and continuous $l$-dimension vector function. Then we can get the covariance matrix of $\hat{\gamma}$ using the following equation: $\widehat{Var}(\hat{\gamma})\equiv\hat{G}\widehat{Var}(\hat{\theta})\hat{G}^T \qquad (1)$ where $\widehat{Var}(\hat{\theta})$ is the estimation of covariance matrix, $\hat{G}$ is Jacoby matrix.
Then, how to get $\widehat{Var}(\hat{\theta})$ ?
Regardless of the form of the error covariance matrix $\Omega$, the covariance matrix of $\hat{\theta}$ equals to: $\widehat{Var}(\hat{\theta})=E((\hat{\theta}-\theta\_0)(\hat{\theta}-\theta\_0)^T)=(X^TX)^{-1}X^T{\Omega}X(X^TX)^{-1} \qquad (2)$ where $X$ is the matrix of independent variables, $\Omega$ is error covariance matrix.

Example
Suppose we have the following regression: $Y\_i=\theta\_0+\frac{\alpha}{1-\alpha}X\_1+\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2}X\_2$ So, in this case we can get $\theta=\begin{pmatrix} \alpha \\ \sigma \end{pmatrix}$ $\gamma=\begin{pmatrix} \frac{\alpha}{1-\alpha} \\ \frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2} \end{pmatrix}$
$\gamma=g(\theta) \Rightarrow \begin{cases}\gamma\_1=g\_1(\alpha,\sigma)\\\ \gamma\_2=g\_2(\alpha,\sigma) \end{cases}$ Construction of Jacoby matrix
$\hat{G} = \begin{bmatrix} \frac{\partial\gamma\_1}{\partial\theta\_1} & \frac{\partial\gamma\_1}{\partial\theta\_2} \\\ \frac{\partial\gamma\_2}{\partial\theta\_1} & \frac{\partial\gamma\_2}{\partial\theta\_2} \end{bmatrix}$ So, $\hat{G} = \begin{bmatrix} \frac{\partial(\frac{\alpha}{1-\alpha})}{\partial\alpha} & \frac{\partial(\frac{\alpha}{1-\alpha})}{\partial\sigma} \\\ \frac{\partial(\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2})}{\partial\alpha} & \frac{\partial(\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2})}{\partial\sigma} \end{bmatrix}$ By conducting OLS regression, we can get $\hat{\gamma_1}$, $\hat{\gamma_2}$ and the error covariance matrix $\Omega$.
Following equation (2), we can get: $\widehat{Var}(\hat{\gamma})=(X^TX)^{-1}X^T{\Omega}X(X^TX)^{-1}$ Then, according to equation (1), we can get: $\widehat{Var}(\hat{\theta})\equiv(\hat{G})^{-1}\widehat{Var}(\hat{\gamma})(\hat{G}^T)^{-1}$

2. nls(.) function

nls is a build-in function in R, which is used to determine the nonlinear (weighted) least-squares estimates of the parameters of a nonlinear model.

Usage of nls:

nls(formula, data, start, control, algorithm, trace, subset, weights, na.action, model, lower, upper, ...)

Note: You can find out more explanation on parameter settings by browsing R help documentation.

Example
Suppose we still have the following regression: $Y\_i=\theta\_0+\frac{\alpha}{1-\alpha}X\_1+\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2}X\_2$ We can use nls to get the estimation of parameters $\alpha$, $\sigma$ directly.

#In this method, we need to tell R the approximate initial value of alpha and sigma in the first place.
lm=nls(y~constant+alpha/(1-alpha)*x1+1/2*(sigma-1)/sigma*alpha/(1-alpha)^2*x2,start=list(constant=5,alpha=0.3,sigma=1.5),trace=F)
summary(lm)

Notice on paper replication

You need to replicate the paper The Solow Model with CES Technology: Nonlinearities and Parameter Heterogeneity as the final assignment in this course. Related data can be download from here.
You only need to replicate the Table 1, Table 2 and Table 4.
You need to use Delta method to estimate the standard errors of parameters of Restricted Basic Solow-CD Model, Restricted Extended Solow-CD Model and Restricted Basic Solow-CES Model in Table 1 and Table 2.
You need to use dur_john.Rdata for Table 1. And you need to delete those following rows of data to get the right answer: 13, 14, 6, 19, 36, 44, 45, 50, 51, 56, 59, 62, 66, 68, 69, 72, 78, 81, 82, 91, 111, 114, 118.
You need to use datamp2.Rdata to replicate Table 2.
You need to use datamp1.Rdata to replicate Table 4.
Data description in the original file are not totally correct. Please see the following data description, which I have amended.

Data description

Winford H. Masanjala and Chris Papageorgiou, “The Solow Model with CES Technology: Nonlinearities and Parameter Heterogeneity”, Journal of Applied Econometrics, Vol. 19, No. 2, 2004, pp. 171-201.

Documentation for data in dur_john.Rdata

CODE=Country number in Summers-Heston dataset.
NONOIL=1 for nonoil producing countries.
INTER=1 for countries with better quality data.
OECD=1 for OECD countries.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
GDPGRO=Average growth rate of per capita GDP (1960-1985).
POPGRO=Average growth rate of working-age population (1960-1985).
IONY=Average ratio of investment (including Government Investment) to GDP(1960-1985).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1985).
LIT60=fraction of the population over 15 years old that is able to read and write in 1960.
NA indicates that the observation is missing. This dataset has also being used in Durlauf and Johnson (JAE 1995). There are 121 observations for each variable. All of the data with the exception of LIT60 are from Mankiw, Romer and Weil (QJE 1992), who in turn constructed the data from Penn World Tables 4.0. LIT60 is from the World Bank’s World Development Report.

Documentation for data in datamp1.txt

CODE=Country number in Summers-Heston dataset.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
POPGRO=Average growth rate of working-age population (1960-1985). IONY=Average ratio of investment to GDP (1960-1985).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1985).
LIT60=fraction of the population over 15 years old that is able to read and write in 1960.
There are 96 observations for each variable. All of the data with the exception of LIT60 are from Mankiw, Romer and Weil (QJE 1992) who in turn constructed the data from Penn World Tables 4.0. LIT60 is from the World Bank’s World Development Report.

Documentation for data in datamp2.txt

CODE=Country number in Summers-Heston dataset.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
IONY=Average ratio of investment to GDP (1960-1995).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1995).
POPGRO=Average growth rate of working-age population (1960-1995).
There are 90 observations for each variable. All of the data are from Bernanke and Gurkaynak (NBER Macroeconomics Annual 2001) who constructed the data from Penn World Tables 6.0.