Multivariate adaptive regression splines
From Wikipedia, the free encyclopedia
In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991.[1] It is a non-parametric regression technique and can be seen as an extension of linear models that automatically models nonlinearities and interactions between variables.
The term "MARS" is trademarked and licensed to Salford Systems. To avoid trademark infringement, many open-source implementations of MARS are called "Earth".[2][3]
The basics
This section introduces MARS using a few examples. We start with a set of data: a matrix of input variables x, and a vector of the observed responses y, with a response for each row in x. For example, the data could be:
x y
10.5 16.4
10.7 18.8
10.8 19.7
... ...
20.6 77.0
Here there is only one independent variable, so the x matrix is just a single column. Given these measurements, we would like to build a model which predicts the expected y for a given x.
Figure: A linear model
A linear model for the above data is
$$\hat{y} = -37 + 5.1x$$
The hat on $\hat{y}$ indicates that $\hat{y}$ is estimated from the data. The figure on the right shows a plot of this function: a line giving the predicted $\hat{y}$ versus x, with the original values of y shown as red dots.
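As a small illustration, the linear model can be evaluated at the rows listed in the table. This is only a sketch: the coefficients are the ones quoted in the text, the function name `linear_predict` is a hypothetical helper, and only the four rows actually shown are used (the elided rows are not reconstructed).

```python
# Coefficients taken from the linear model quoted in the text.
INTERCEPT = -37.0
SLOPE = 5.1


def linear_predict(x):
    """Return the linear-model prediction y_hat = -37 + 5.1 * x."""
    return INTERCEPT + SLOPE * x


# Only the rows shown in the table; the elided rows are left out.
rows = [(10.5, 16.4), (10.7, 18.8), (10.8, 19.7), (20.6, 77.0)]

for x, y in rows:
    y_hat = linear_predict(x)
    # The residual is the observed response minus the prediction.
    print(f"x={x:5.1f}  y={y:5.1f}  y_hat={y_hat:6.2f}  residual={y - y_hat:6.2f}")
```

The residuals printed for the lowest and highest x values give a rough numerical view of what the plot shows at the extremes.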
The data at the extremes of x indicate that the relationship between y and x may be non-linear (compare the red dots with the regression line at low and high values of x). We therefore turn to MARS to automatically build a model that takes the non-linearities into account. MARS software constructs the following model from the given x and y:
$$
\begin{aligned}
\hat{y} = {} & 25 \\
& + 6.1 \max(0, x - 13) \\
& - 3.1 \max(0, 13 - x)
\end{aligned}
$$
Figure: A simple MARS model of the same data
The figure on the right shows a plot of this function: the predicted $\hat{y}$ versus x, with the original values of y once again shown as red dots. The predicted response is now a better fit to the original y values.
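The MARS model above is built from two max(0, ·) terms that meet at x = 13. The following sketch evaluates it, assuming the coefficients quoted in the text; the names `hinge` and `mars_predict` are hypothetical helpers and not part of any MARS library. It only evaluates the already-fitted model and does not perform the MARS fitting procedure.

```python
def hinge(z):
    """Return max(0, z), the building block used in the model above."""
    return max(0.0, z)


def mars_predict(x):
    """Evaluate y_hat = 25 + 6.1*max(0, x - 13) - 3.1*max(0, 13 - x)."""
    return 25.0 + 6.1 * hinge(x - 13.0) - 3.1 * hinge(13.0 - x)


# Below x = 13 only the second term is active, so the slope is 3.1;
# above x = 13 only the first term is active, so the slope is 6.1.
# At x = 13 both terms vanish and the prediction equals the constant 25.

# Only the rows shown in the table; the elided rows are left out.
rows = [(10.5, 16.4), (10.7, 18.8), (10.8, 19.7), (20.6, 77.0)]

for x, y in rows:
    y_hat = mars_predict(x)
    print(f"x={x:5.1f}  y={y:5.1f}  y_hat={y_hat:6.2f}  residual={y - y_hat:6.2f}")
```

Written this way, the model is a piecewise-linear function with a single kink, which is what lets it bend where a single straight line cannot.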