In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables. The data are fitted by a method of successive approximations.
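The assumption underlying this procedure is that the model can be approximated by a linear function, namely a first-order Taylor series: $f(x_i, \boldsymbol\beta) \approx f(x_i, 0) + \sum_j J_{ij} \beta_j$, where $J_{ij} = \partial f(x_i, \boldsymbol\beta) / \partial \beta_j$ are the elements of the Jacobian matrix. It follows from this that the least squares estimators are given by $\hat{\boldsymbol\beta} \approx (\mathbf{J}^\mathsf{T} \mathbf{J})^{-1} \mathbf{J}^\mathsf{T} \mathbf{y}$. The nonlinear regression statistics are computed and used as in linear regression, but with J in place of X in the formulas. The linear approximation introduces bias into the statistics. Therefore, more caution than usual is required in interpreting statistics derived from a nonlinear model.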
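The successive-approximation idea can be illustrated with a minimal Gauss-Newton sketch, in which the model is linearized at the current estimate and the parameters are updated by solving the normal equations built from the Jacobian. The two-parameter model y = a·exp(b·x), the function name `gauss_newton_exp`, the starting values, and the synthetic data in the usage note are assumptions made for illustration only. Practical software usually adds safeguards such as damping or step-size control, which are omitted here.

```python
import math


def gauss_newton_exp(xs, ys, a, b, iterations=20):
    """Fit y = a * exp(b * x) by plain Gauss-Newton iteration (illustrative sketch)."""
    for _ in range(iterations):
        # Residuals and Jacobian rows evaluated at the current estimate (a, b).
        r = [y - a * math.exp(b * x) for x, y in zip(xs, ys)]
        J = [(math.exp(b * x), a * x * math.exp(b * x)) for x in xs]

        # Normal equations (J^T J) delta = J^T r, written out for two parameters.
        s11 = sum(j[0] * j[0] for j in J)
        s12 = sum(j[0] * j[1] for j in J)
        s22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))

        # Solve the 2x2 system by Cramer's rule.
        det = s11 * s22 - s12 * s12
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det

        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:  # stop once the update is negligible
            break
    return a, b
```

For example, with noise-free synthetic data generated from a = 2 and b = 0.5, calling `gauss_newton_exp(xs, ys, a=1.5, b=0.4)` should return values close to the generating parameters. With noisy data or a poor starting point, the undamped iteration may converge slowly or diverge.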
The best-fit curve is often assumed to be that which minimizes the sum of squared residuals. This is the ordinary least squares approach. However, in cases where the dependent variable does not have constant variance, a sum of weighted squared residuals may be minimized; see weighted least squares. Each weight should ideally be equal to the reciprocal of the variance of the observation, but weights may be recomputed on each iteration, in an iteratively weighted least squares algorithm.
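The two objective functions can be compared in a short sketch. The function name `weighted_ssr` and its argument layout are assumptions for illustration. Each squared residual is weighted by the reciprocal of the variance of its observation, and with all variances equal to one the expression reduces to the ordinary sum of squared residuals.

```python
def weighted_ssr(ys, fitted, variances):
    """Sum of squared residuals, each weighted by 1 / variance of the observation."""
    return sum((y - f) ** 2 / v for y, f, v in zip(ys, fitted, variances))


# With unit variances this is the ordinary (unweighted) sum of squares.
ordinary = weighted_ssr([1.0, 2.0], [0.0, 0.0], [1.0, 1.0])

# A noisier second observation (variance 4) contributes less to the objective.
weighted = weighted_ssr([1.0, 2.0], [0.0, 0.0], [1.0, 4.0])
```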
Linearization
Transformation
Some nonlinear regression problems can be moved to a linear domain by a suitable transformation of the model formulation. For example, consider the nonlinear regression problem $y = a e^{bx} U$ with parameters a and b and with multiplicative error term U. If we take the logarithm of both sides, this becomes $\ln(y) = \ln(a) + bx + u$, where u = ln(U), suggesting estimation of the unknown parameters by a linear regression of ln(y) on x, a computation that does not require iterative optimization. However, use of a nonlinear transformation requires caution. The influences of the data values will change, as will the error structure of the model and the interpretation of any inferential results. These may not be desired effects. On the other hand, depending on what the largest source of error is, a nonlinear transformation may distribute the errors in a Gaussian fashion, so the choice to perform a nonlinear transformation must be informed by modeling considerations. For Michaelis–Menten kinetics, the linear Lineweaver–Burk plot of 1/v against 1/[S], $\frac{1}{v} = \frac{1}{V_{\max}} + \frac{K_m}{V_{\max}[S]}$, has been much used. However, since it is very sensitive to data error and is strongly biased toward fitting the data in a particular range of the independent variable, [S], its use is strongly discouraged. For error distributions that belong to the exponential family, a link function may be used to transform the parameters under the generalized linear model framework.
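The logarithmic transformation of an exponential model with multiplicative error can be sketched as follows. The function name `fit_exponential_by_log_transform` and the synthetic data in the usage note are assumptions for illustration. The sketch regresses ln(y) on x with the closed-form formulas for simple linear regression and then maps the intercept back to a, so no iteration is needed. It requires every y to be positive.

```python
import math


def fit_exponential_by_log_transform(xs, ys):
    """Estimate a and b in y = a * exp(b * x) * U by regressing ln(y) on x."""
    zs = [math.log(y) for y in ys]  # transformed response; every y must be > 0
    n = len(xs)
    x_mean = sum(xs) / n
    z_mean = sum(zs) / n

    # Closed-form simple linear regression of z = ln(y) on x.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxz = sum((x - x_mean) * (z - z_mean) for x, z in zip(xs, zs))
    b = sxz / sxx                      # slope estimates b directly
    a = math.exp(z_mean - b * x_mean)  # intercept estimates ln(a)
    return a, b
```

On noise-free data generated from a = 2 and b = 0.5 the function recovers both parameters up to rounding. With noisy data the estimates generally differ from those of a direct nonlinear least squares fit, because the transformation changes the error structure and the influence of each data value.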
Segmentation
The independent (explanatory) variable can be split into classes or segments, and a linear regression can be performed on each segment. Segmented regression with confidence analysis may show that the dependent (response) variable behaves differently in the various segments. The figure shows that soil salinity initially exerts no influence on the crop yield of mustard until a critical (threshold) value is reached, after which the yield is affected negatively.
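A two-segment fit can be sketched as a search over candidate split points, keeping the split whose two separately fitted lines give the smallest total sum of squared residuals. The function names `fit_line` and `segmented_fit`, the `min_points` parameter, and the synthetic plateau-then-decline data are assumptions for illustration; they are not the mustard data, and the sketch omits the confidence analysis and any constraint that the two lines meet at the breakpoint.

```python
def fit_line(xs, ys):
    """Ordinary least squares line; returns (intercept, slope, sse)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_mean - slope * x_mean
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def segmented_fit(xs, ys, min_points=3):
    """Try every split of the x-sorted data into two segments and keep the best."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for k in range(min_points, len(pairs) - min_points + 1):
        left = fit_line(sx[:k], sy[:k])
        right = fit_line(sx[k:], sy[k:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            # "breakpoint" is the first x value assigned to the right segment.
            best = {"sse": total, "breakpoint": sx[k], "left": left, "right": right}
    return best


# Synthetic data: flat response up to x = 4, then a linear decline from x = 5.
xs = list(range(10))
ys = [10, 10, 10, 10, 10, 9, 7, 5, 3, 1]
result = segmented_fit(xs, ys)
```

Here `result["left"]` and `result["right"]` hold the intercept, slope, and residual sum of squares of each segment. Whether the slopes differ significantly would still have to be assessed separately, for instance with confidence intervals for each slope.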