Simple Linear Regression

Suppose that a researcher wants to determine how increasing the amount $ x$ of a certain chemical in the soil affects the amount $ y$ of that chemical found in the plants. To study this, $ n$ independent experiments are conducted with different amounts of the chemical, producing data that pair the amounts $ Y_1,\ldots,Y_n$ of the chemical in the plants with the respective amounts $ x_1,\ldots,x_n$ of the chemical in the soil.

Amount in soil    Amount in plant
$ x_1$            $ Y_1$
$ \vdots$         $ \vdots$
$ x_n$            $ Y_n$

We call the $ x_i$'s the independent (or explanatory) variable and the $ Y_i$'s the dependent (or response) variable. When the relationship between the independent variable $ x_i$ and the dependent variable $ Y_i$ is approximated by

$\displaystyle Y_i = \beta_0 + \beta_1 x_i + \epsilon_i,$

it is called a simple linear regression model. Here $ \beta_0$ and $ \beta_1$ are the intercept and slope of the line, and $ \epsilon_i$ is a "random error" term that accounts for differences among plants, soils, and other possible factors in the experiment.
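
To make the roles of the terms in the model concrete, the following minimal sketch generates data of the form $ Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$. Everything specific in it is an assumption made for illustration and does not come from the text: the function name `simulate_simple_linear_regression`, the soil amounts, the values of $ \beta_0$ and $ \beta_1$, and in particular the choice of independent normal errors with mean 0 and standard deviation `sigma`.

```python
import random


def simulate_simple_linear_regression(x, beta0, beta1, sigma, seed=0):
    """Generate Y_i = beta0 + beta1 * x_i + eps_i for each x_i.

    Assumption (not from the text): the errors eps_i are independent
    normal draws with mean 0 and standard deviation sigma.
    """
    rng = random.Random(seed)  # seeded so the run is reproducible
    eps = [rng.gauss(0.0, sigma) for _ in x]
    y = [beta0 + beta1 * xi + ei for xi, ei in zip(x, eps)]
    return y, eps


# Hypothetical amounts of the chemical in the soil (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Hypothetical parameter values; in practice beta0 and beta1 are unknown.
y, eps = simulate_simple_linear_regression(x, beta0=0.5, beta1=2.0, sigma=0.3)

# Print the simulated data as (amount in soil, amount in plant) pairs.
for xi, yi in zip(x, y):
    print(f"x = {xi:.1f}  Y = {yi:.3f}")
```

Setting `sigma=0.0` removes the random error, so every point lies exactly on the line $ \beta_0 + \beta_1 x$. Increasing `sigma` scatters the points around that line, which is the behavior that $ \epsilon_i$ is meant to capture.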

© TTU Mathematics