Introduction to Panel Threshold Model

• Part One: Slides

• Part Two: R code for this lecture

Python Quiz

• If you need to read the original questions, please visit the website of Quantitative Economics.
• Warning: This is my practice code for the Quiz. I do not guarantee the accuracy of my answers.

Largest palindrome product

• A palindromic number reads the same both ways. The largest palindrome made from the product of two 2-digit numbers is 9009 = 91 × 99.
• Find the largest palindrome made from the product of two 3-digit numbers.
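One possible approach is a brute-force search over factor pairs with early exits. This is only a sketch, not necessarily the practice code mentioned above, and the function name `largest_palindrome_product` is a hypothetical choice for illustration.

```python
def largest_palindrome_product(digits):
    """Return the largest palindrome that is a product of two `digits`-digit numbers."""
    low, high = 10 ** (digits - 1), 10 ** digits - 1
    best = 0
    for a in range(high, low - 1, -1):
        # Even the largest possible partner cannot beat the current best: stop.
        if a * high <= best:
            break
        for b in range(high, a - 1, -1):
            product = a * b
            # Products only get smaller as b decreases, so stop early.
            if product <= best:
                break
            text = str(product)
            if text == text[::-1]:
                best = product
    return best


# The 2-digit case from the problem statement gives 9009.
print(largest_palindrome_product(2))
print(largest_palindrome_product(3))
```

The search runs from the largest factors downwards, so both `break` statements prune most of the pairs.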

10001st prime

• By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
• What is the 10001st prime number?
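A simple way to attempt this is trial division by the primes found so far. The sketch below assumes that approach; the name `nth_prime` is hypothetical.

```python
def nth_prime(n):
    """Return the n-th prime (1-indexed) using trial division by earlier primes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    primes = []
    candidate = 2
    while len(primes) < n:
        is_prime = True
        for p in primes:
            # Divisors above sqrt(candidate) need not be checked.
            if p * p > candidate:
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 1
    return primes[-1]


# The 6th prime is 13, as in the problem statement.
print(nth_prime(6))
print(nth_prime(10001))
```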

Special Pythagorean triplet

• A Pythagorean triplet is a set of three natural numbers, a < b < c, for which, a^2+b^2=c^2
• There exists exactly one Pythagorean triplet for which a+b+c=1000. Find the product abc.
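Because c is fixed once a and b are chosen (c = total - a - b), a double loop over a and b is enough. The following is a minimal sketch under that assumption; the function name `pythagorean_triplet` is hypothetical.

```python
def pythagorean_triplet(total):
    """Return (a, b, c) with a < b < c, a^2 + b^2 = c^2 and a + b + c = total, or None."""
    for a in range(1, total // 3):
        for b in range(a + 1, (total - a) // 2 + 1):
            c = total - a - b  # c is determined by the sum constraint
            if b < c and a * a + b * b == c * c:
                return a, b, c
    return None


triplet = pythagorean_triplet(1000)
if triplet is not None:
    a, b, c = triplet
    print(a, b, c, a * b * c)
```

The bounds follow from a < b < c: a must be below total / 3 and b below (total - a) / 2.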

Largest product in a series

• The four adjacent digits in the 1000-digit number that have the greatest product are 9 × 9 × 8 × 9 = 5832.

731671765313306249192251196744265747423553491949349698352031277450632623957831 801698480186947885184385861560789112949495459501737958331952853208805511125406 987471585238630507156932909632952274430435576689664895044524452316173185640309 871112172238311362229893423380308135336276614282806444486645238749303589072962 904915604407723907138105158593079608667017242712188399879790879227492190169972 088809377665727333001053367881220235421809751254540594752243525849077116705560 136048395864467063244157221553975369781797784617406495514929086256932197846862 248283972241375657056057490261407972968652414535100474821663704844031998900088 952434506585412275886668811642717147992444292823086346567481391912316282458617 866458359124566529476545682848912883142607690042242190226710556263211111093705 442175069416589604080719840385096245544436298123098787992724428490918884580156 166097919133875499200524063689912560717606058861164671094050775410022569831552 0005593572972571636269561882670428252483600823257530420752963450

• Find the thirteen adjacent digits in the 1000-digit number that have the greatest product. What is the value of this product?
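A sliding window over the digits is one straightforward way to do this. The sketch below takes the number as a string and ignores any non-digit characters, so the 1000-digit number can be pasted in with its spaces. The function name `max_adjacent_product` is hypothetical, and the demonstration uses a short made-up string rather than the full number.

```python
from math import prod


def max_adjacent_product(number_text, window):
    """Return the greatest product of `window` adjacent digits in `number_text`."""
    # Keep only the digit characters so spaces and line breaks are harmless.
    digits = [int(ch) for ch in number_text if ch.isdigit()]
    best = 0
    for start in range(len(digits) - window + 1):
        best = max(best, prod(digits[start:start + window]))
    return best


# Short illustrative input: the best three adjacent digits are 7, 8, 9.
print(max_adjacent_product("123456789", 3))
```

Calling it with the 1000-digit number and `window=13` answers the question; `window=4` should reproduce the 5832 from the problem statement.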

Summation of primes

• The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
• Find the sum of all the primes below two million.

I am fairly sure my method is not the most efficient one. The code still finishes when the limit is 200,000, but it does not return a result when the limit grows to 2,000,000. I am still looking for alternative methods.
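One commonly used alternative is the Sieve of Eratosthenes, which marks the multiples of each prime instead of testing every number separately. The following is a sketch of that technique, not the practice code described above; the name `sum_primes_below` is hypothetical.

```python
from math import isqrt


def sum_primes_below(limit):
    """Return the sum of all primes strictly below `limit` using a sieve."""
    if limit <= 2:
        return 0
    # sieve[i] == 1 means "i is still considered prime".
    sieve = bytearray([1]) * limit
    sieve[0] = sieve[1] = 0
    for i in range(2, isqrt(limit - 1) + 1):
        if sieve[i]:
            # Cross out i*i, i*i + i, ... ; smaller multiples are already crossed out.
            sieve[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return sum(i for i, flag in enumerate(sieve) if flag)


# The primes below 10 sum to 17, as in the problem statement.
print(sum_primes_below(10))
print(sum_primes_below(2_000_000))
```

The sieve needs about `limit` bytes of memory, which is modest for a limit of two million.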

R Introduction Part 3

• These are my lecture notes for an undergraduate R programming course, which supplements the Advanced Macroeconomics course.
• Most figures and tables are taken from R in Action.
• All exercises are copied from my teacher Jing Fang’s R Workshop; you can find the data for the exercises on the R Workshop website.

Nonlinear Regression

What is Nonlinear Regression?

In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables.
1.Linear Regression
$$Y_i=\beta_1+\beta_2X_i+\mu_i$$
2.Nonlinear Regression
$$Y_i=\beta_1e^{\beta_2X_i}+\mu_i$$

How to cope with Nonlinear Regression?

1.Delta Method

• Single parameter situation
If we know the standard error of the estimate $\hat{\theta}$ but are really interested in the parameter $\gamma=g(\theta)$, how do we estimate the standard error of $\hat{\gamma}=g(\hat{\theta})$?
1.If $g(\theta)$ is a linear function, you know how to get it.
2.If $g(\theta)$ is not a linear function, then:
Let the standard error of $\hat{\theta}$ be $S_{\theta}$ and the standard error of $\hat{\gamma}$ be $S_{\gamma}$. The relationship between them is
$$S_{\gamma}\equiv|g^\prime(\hat{\theta})|S_{\theta}$$
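As a quick illustration (the numbers are hypothetical and chosen only to make the arithmetic easy), take $g(\theta)=\frac{\theta}{1-\theta}$ with $\hat{\theta}=0.5$ and $S_{\theta}=0.1$:
$$g^\prime(\theta)=\frac{1}{(1-\theta)^2} \quad\Rightarrow\quad S_{\gamma}=\frac{S_{\theta}}{(1-\hat{\theta})^2}=\frac{0.1}{0.25}=0.4$$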
• Multi-parameters situation
If $\theta$ and $\gamma$ are both vectors, the same idea applies. Assume $\theta$ is a $k$-dimensional vector and $\gamma$ is an $l$-dimensional vector with $l\leq k$, and $\gamma=g(\theta)$, where $g(\theta)$ is a monotonic and continuously differentiable $l$-dimensional vector function. Then the covariance matrix of $\hat{\gamma}$ can be estimated with the following equation:
$$\widehat{Var}(\hat{\gamma})\equiv\hat{G}\widehat{Var}(\hat{\theta})\hat{G}^T \qquad (1)$$
where $\widehat{Var}(\hat{\theta})$ is the estimated covariance matrix of $\hat{\theta}$ and $\hat{G}$ is the $l\times k$ Jacobian matrix of $g$ evaluated at $\hat{\theta}$.
Then, how do we get $\widehat{Var}(\hat{\theta})$?
Regardless of the form of the error covariance matrix $\Omega$, the covariance matrix of $\hat{\theta}$ equals:
$$\widehat{Var}(\hat{\theta})=E((\hat{\theta}-\theta_0)(\hat{\theta}-\theta_0)^T)=(X^TX)^{-1}X^T{\Omega}X(X^TX)^{-1} \qquad (2)$$
where $X$ is the matrix of independent variables and $\Omega$ is the error covariance matrix.

Example
Suppose we have the following regression:
$$Y_i=\theta_0+\frac{\alpha}{1-\alpha}X_1+\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2}X_2$$
So, in this case we can get $\theta=\begin{pmatrix} \alpha \\ \sigma \end{pmatrix}$    $\gamma=\begin{pmatrix} \frac{\alpha}{1-\alpha} \\ \frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2} \end{pmatrix}$
$$\gamma=g(\theta) \Rightarrow \begin{cases}\gamma_1=g_1(\alpha,\sigma)\\ \gamma_2=g_2(\alpha,\sigma) \end{cases}$$
Construction of the Jacobian matrix
$$\hat{G} = \begin{bmatrix} \frac{\partial\gamma_1}{\partial\theta_1} & \frac{\partial\gamma_1}{\partial\theta_2} \\ \frac{\partial\gamma_2}{\partial\theta_1} & \frac{\partial\gamma_2}{\partial\theta_2} \end{bmatrix}$$
So,
$$\hat{G} = \begin{bmatrix} \frac{\partial(\frac{\alpha}{1-\alpha})}{\partial\alpha} & \frac{\partial(\frac{\alpha}{1-\alpha})}{\partial\sigma} \\ \frac{\partial(\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2})}{\partial\alpha} & \frac{\partial(\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2})}{\partial\sigma} \end{bmatrix}$$
By conducting OLS regression, we can get $\hat{\gamma_1}$, $\hat{\gamma_2}$ and the error covariance matrix $\Omega$.
Following equation (2), we can get:
$$\widehat{Var}(\hat{\gamma})=(X^TX)^{-1}X^T{\Omega}X(X^TX)^{-1}$$
Then, according to equation (1), we can get:
$$\widehat{Var}(\hat{\theta})\equiv(\hat{G})^{-1}\widehat{Var}(\hat{\gamma})(\hat{G}^T)^{-1}$$
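The partial derivatives in $\hat{G}$ can be worked out by hand from the definitions of $\gamma_1$ and $\gamma_2$. The following derivation is a sketch that is worth re-checking before you rely on it:
$$\hat{G} = \begin{bmatrix} \frac{1}{(1-\alpha)^2} & 0 \\ \frac{1}{2}\frac{\sigma-1}{\sigma}\frac{1+\alpha}{(1-\alpha)^3} & \frac{1}{2}\frac{1}{\sigma^2}\frac{\alpha}{(1-\alpha)^2} \end{bmatrix}$$
with every entry evaluated at $\hat{\alpha}$ and $\hat{\sigma}$. Because $\hat{G}$ is lower triangular, its determinant is the product of the diagonal entries, $\frac{\alpha}{2\sigma^2(1-\alpha)^4}$, so $\hat{G}^{-1}$ exists whenever $\alpha\neq 0$ (with $\alpha\neq 1$ and $\sigma\neq 0$ already required for $\gamma$ to be defined).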

2. nls(.) function

• nls is a built-in function in R, which is used to determine the nonlinear (weighted) least-squares estimates of the parameters of a nonlinear model.

• Usage of nls:

Note: You can find out more explanation on parameter settings by browsing R help documentation.

Example
Suppose we still have the following regression:
$$Y_i=\theta_0+\frac{\alpha}{1-\alpha}X_1+\frac{1}{2}\frac{\sigma-1}{\sigma}\frac{\alpha}{(1-\alpha)^2}X_2$$
We can use nls to get the estimates of the parameters $\alpha$ and $\sigma$ directly.

Notice on paper replication

• You need to replicate the paper The Solow Model with CES Technology: Nonlinearities and Parameter Heterogeneity as the final assignment in this course. Related data can be downloaded from here.
• You only need to replicate Table 1, Table 2 and Table 4.
• You need to use the Delta method to estimate the standard errors of the parameters of the Restricted Basic Solow-CD Model, the Restricted Extended Solow-CD Model and the Restricted Basic Solow-CES Model in Table 1 and Table 2.
• You need to use dur_john.Rdata for Table 1. You also need to delete the following rows of data to get the right answer: 13, 14, 6, 19, 36, 44, 45, 50, 51, 56, 59, 62, 66, 68, 69, 72, 78, 81, 82, 91, 111, 114, 118.
• You need to use datamp2.Rdata to replicate Table 2.
• You need to use datamp1.Rdata to replicate Table 4.
• The data descriptions in the original files are not entirely correct. Please see the amended data descriptions below.

Data description

Winford H. Masanjala and Chris Papageorgiou, “The Solow Model with CES
Technology: Nonlinearities and Parameter Heterogeneity”, Journal of Applied
Econometrics, Vol. 19, No. 2, 2004, pp. 171-201.

Documentation for data in dur_john.Rdata

CODE=Country number in Summers-Heston dataset.
NONOIL=1 for nonoil producing countries.
INTER=1 for countries with better quality data.
OECD=1 for OECD countries.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
GDPGRO=Average growth rate of per capita GDP (1960-1985).
POPGRO=Average growth rate of working-age population (1960-1985).
IONY=Average ratio of investment (including Government Investment) to GDP (1960-1985).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1985).
LIT60=Fraction of the population over 15 years old that is able to read and write in 1960.
NA indicates that the observation is missing. This dataset has also been used in Durlauf and Johnson (JAE 1995).

There are 121 observations for each variable. All of the data with the exception of LIT60 are from Mankiw, Romer and Weil (QJE 1992), who in turn constructed the data from Penn World Tables 4.0. LIT60 is from the World Bank’s World Development Report.

Documentation for data in datamp1.txt

CODE=Country number in Summers-Heston dataset.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
POPGRO=Average growth rate of working-age population (1960-1985).
IONY=Average ratio of investment to GDP (1960-1985).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1985).
LIT60=fraction of the population over 15 years old that is able to read and write in 1960.

There are 96 observations for each variable. All of the data with the exception of LIT60 are from Mankiw, Romer and Weil (QJE 1992) who in turn constructed the data from Penn World Tables 4.0. LIT60 is from the World Bank’s World Development Report.

Documentation for data in datamp2.txt

CODE=Country number in Summers-Heston dataset.
GDP60=Per capita GDP in 1960.
GDP85=Per capita GDP in 1985.
IONY=Average ratio of investment to GDP (1960-1995).
SCHOOL=Average fraction of working-age population enrolled in secondary school (1960-1995).
POPGRO=Average growth rate of working-age population (1960-1995).

There are 90 observations for each variable. All of the data are from Bernanke and Gurkaynak (NBER Macroeconomics Annual 2001) who constructed the data from Penn World Tables 6.0.

R Introduction Part 2

Regression with R

As you can see in the following table, the term regression can be confusing because there are so many specialized varieties. In this class we will only focus on OLS, Nonparametric Regression and Robust Regression.

OLS regression

$$\hat{Y_i}=\hat{\beta_0}+\hat{\beta_1}X_{1,i}+\dots+\hat{\beta_k}X_{k,i} \qquad i=1,\dots,n$$
where $n$ is the number of observations and $k$ is the number of predictor variables. In this equation:
$\hat{Y_i}$: is the predicted value of the dependent variable for observation $i$ (specifically, it is the estimated mean of the $Y$ distribution, conditional on the set of predictor values).
$X_{j,i}$: is the $j^{th}$ predictor value for the $i^{th}$ observation.
$\hat{\beta_0}$: is the intercept (the predicted value of $Y$ when all the predictor variables equal 0).
$\hat{\beta_j}$: is the regression coefficient for the $j^{th}$ predictor (slope representing the change in $Y$ for a unit change in $X_j$).

To properly interpret the coefficients of the OLS model, the data must satisfy a number of statistical assumptions:

• Normality—For fixed values of the independent variables, the dependent
variable is normally distributed.
• Independence—The $Y_i$ values are independent of each other.
• Linearity—The dependent variable is linearly related to the independent
variables.
• Homoscedasticity—The variance of the dependent variable doesn’t vary with the
levels of the independent variables.

lm(.) function

• OLS regression can be conducted with the lm function.
• Usage of lm:

Note: You can find out more explanation on parameter settings by browsing R help documentation.

• An example:
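Suppose you need to conduct an OLS regression on $y=c+x_1+x_2+\dots+x_n$, and the data have been stored in a data frame named mydata.
You type a command of the following form in RStudio to tell R to conduct the OLS regression: lm(y ~ x1 + x2 + ... + xn, data = mydata), where the dots stand for the remaining predictors.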

There are some symbols listed in the following table, which are commonly used in regression.

Regression diagnostics

• Test whether the coefficients in the regression model satisfy certain constraints.

• Test whether some variables in your regression model are jointly significant.
1.Use the anova function

Heteroskedasticity

• What is heteroskedasticity?
One of our assumptions for linear regression is that the random error terms in the regression have the same variance. However, sometimes a regression exhibits heteroskedasticity, which means the random error terms have different variances.

• What may happen if heteroskedasticity does exist in our regression?
In this situation, the t-test and F-test may become unreliable and misleading. For example, coefficients that should be significant may no longer appear significant.

• How to discern heteroskedasticity?

• How to get robust standard error when heteroskedasticity appears?

R Introduction Part 1

What is R ?

R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, and is currently developed by the R Development Core Team, of which John Chambers is a member. The R language is widely used among statisticians and data miners for developing statistical software and data analysis. For more information, please see Wiki: R (programming language).

Why choose R?

1. R is a GNU project and totally free for everyone.
2. R runs on a wide array of platforms, like Windows, Unix and Mac OS. You can even run R programming on your iPhone or iPad!
3. R can easily import data from various types of data sources, including text files, database management systems, statistical packages, and specialized data repositories. It can write data out to these systems as well.
4. R contains advanced statistical routines not yet available in other packages. In fact, new methods become available for download on a weekly basis.
5. R has state-of-the-art graphics capabilities (try the package: ggplot2).
6. R has a global community of more than 2 million users and developers who voluntarily contribute their time and technical expertise to maintain, support and extend the R language and its environment, tools and infrastructure.

Comparing R with other mainstream software

• Stata: widely used in the field of Economics.
• SAS: excels at coping with big data, but the syntax is awkward.
• Matlab: strong at numerical computation, but the installation is incredibly large.
• SPSS: widely used in Business Management.
• Eviews: good at Time Series Analysis.

Most important: they are all commercial software, which means they are not free for individual users.

Installation of R

• How to get R?
• RStudio
RStudio is an amazing IDE for R. I strongly recommend running your R programs in RStudio.

Packages of R

One feature of R is that most functionality is provided by installable packages. There are 6523 packages on the CRAN website, and the number is still growing.

• Install packages
1.Install packages manually
2.Install packages by using the command install.packages()
(1) Use the command install.packages() with no arguments, then choose the package you need.
(2) Use the command install.packages("lmtest"); R will install the "lmtest" package and its supporting packages automatically.
• Load packages by using the command library()
Use library(lmtest) to load the "lmtest" package before you use it.
Note: You only need to install a package once, but you need to reload the packages you need every time you restart RStudio.
• Update packages
Use the command update.packages() to update the packages you have installed.

Other notes on using R

• R is a case sensitive language.
• = and <- can both be used as assignment symbol.
Personally, I do not think there is a big difference between them and I highly recommend using = instead of <-. For more information on the subtle difference between these two symbols, please see stackoverflow and Zhihu.
• We use # in R to indicate that the rest of the line is a comment and will not be executed.
• R cannot read setwd("c:\myprogram") because the backslash is an escape character in strings; use setwd("c:/myprogram") or setwd("c:\\myprogram") instead.
• Please make good use of help() or ?the_name_of_function to read the help documentation. The website Rseeker and Google are also really helpful.

Create dataset

A dataset is usually a rectangular array of data with rows representing observations
and columns representing variables. The following table provides an example of a hypothetical
patient dataset.

• How to construct a dataset?
1.Import or type data into the data structure.
2.Choose a type of data structure to store the data.

• Data input
From the following figure, we can see that R can cope with different data formats.

read.table() #Reads a txt file; use the argument na.strings="." to convert missing values (coded as "." in the txt file) to NA.
read.csv() #Reads a CSV file, highly recommended.
load() #Loads an Rdata file.
Note: You can find out more information on how to input other different types of data in the book R in Action (R语言实战).

• Data structures
R has a wide variety of objects for holding data, including scalars, vectors, matrices,
arrays, data frames, and lists. They differ in terms of the type of data they can hold,
how they are created, their structural complexity, and the notation used to identify and
access individual elements. The figure shows a diagram of these data structures.

1.Vectors
Vectors are one-dimensional arrays that can hold numeric data, character data, or logical
data. The combine function c() is used to form the vector.

The common operations on a matrix:
Dimensions of matrix x: dim(x)
Rows of matrix x: nrow(x)
Columns of matrix x: ncol(x)
Transpose of matrix x: t(x)
The value of the determinant of matrix x: det(x)
If the determinant is not 0, then we can get the inverse of matrix x: solve(x)
Eigenvalues and eigenvectors: y=eigen(x), then y$values holds the eigenvalues and y$vectors holds the eigenvectors.
Multiplication of matrices: a %*% b
Arithmetic operations and power operation on every element of a matrix: + - * / ^

3.Data frames
Personally, I think the data frame is the most important and most common data structure you will deal with in R. A data frame is more general than a matrix in that different columns can contain different modes of data (numeric, character, etc.).

The different ways to read data from data frame:

Useful data operation functions

1.summary
summary is a generic function used to produce result summaries of the results of various model fitting functions.
summary(data$example) or summary(data)

2.sum
sum returns the sum of all the values present in its arguments.
sum(data$example, na.rm=TRUE)

3.nrow and ncol
nrow and ncol return the number of rows or columns present in x.

4.cbind and rbind
Take a sequence of vector, matrix or data-frame arguments and combine by columns or rows, respectively.

5.replace
replace replaces the values in x with indices given in list by those given in values.
data$V1=replace(data$V1,data$V1<5,0)
In this example, if the data in column V1 is less than 5, then those data will be converted to 0. This function is really useful when you want to convert your numerical variable to a dummy variable.

6.subset
Return subsets of vectors, matrices or data frames which meet conditions.
subset(data, subset, select)

7.aggregate
Splits the data into subsets, computes summary statistics for each, and returns the result in a convenient form.
aggregate(data, by=list(data$variable), mean)
In this example, based on the column variable, R splits the data into subsets and computes mean of those subsets.