GROWTH

float array[*][1] growth(float array known_ys, float array known_xs = {1, 2, 3 [, ...]}, float array new_xs = known_xs, boolean fit_constant = TRUE)

Returns a column vector containing y values for the specified points (new_xs) on the exponential surface in N dimensions (independent variables)

    y(x_1, x_2, x_3, \ldots, x_N) = b \prod_{k=1}^{N} m_k^{x_k}

providing the best (least squares) fit to the known data pairs (known_ys, known_xs).
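For a single independent variable, the fit can be sketched in Python as follows. Taking logarithms turns the model into the linear relation ln y = ln b + x ln m, so this sketch assumes the least squares fit is performed on ln(known_ys); the reference text does not spell out that detail. The name growth_1d and its list-based arguments are illustrative only and are not part of the expression language.

```python
import math

def growth_1d(known_ys, known_xs=None, new_xs=None):
    """Fit y = b * m**x by least squares on ln(y) and evaluate it at new_xs."""
    if any(y <= 0 for y in known_ys):
        raise ValueError("all values in known_ys must be > 0")
    # Defaults mirror the documented ones: {1, 2, 3, ...} and known_xs.
    if known_xs is None:
        known_xs = list(range(1, len(known_ys) + 1))
    if new_xs is None:
        new_xs = known_xs
    n = len(known_ys)
    logs = [math.log(y) for y in known_ys]
    mean_x = sum(known_xs) / n
    mean_l = sum(logs) / n
    # Ordinary least squares slope and intercept in log space.
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(known_xs, logs))
    ln_m = sxy / sxx
    ln_b = mean_l - ln_m * mean_x
    # Return a column vector: one single-element row per new x value.
    return [[math.exp(ln_b + ln_m * x)] for x in new_xs]
```

Calling growth_1d([2.0, 4.0, 8.0]) uses the default x values 1, 2, 3 and returns one row per input point.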

known_ys is the set of known y values. If it contains a single column (row), each column (row) in known_xs is interpreted as containing the values of a separate independent variable. If known_ys contains multiple columns and rows, known_xs is interpreted as containing the values of a single independent variable; corresponding elements in the two arrays are then determined by order of appearance (reading from left to right, top to bottom).

All values in known_ys must be > 0.

known_xs is the set of known x values. It can contain one or more independent variables. If only one independent variable is used, known_xs and known_ys can have any shape(s) as long as they contain the same number of elements; corresponding elements in the two arrays are then determined by order of appearance (reading from left to right, top to bottom). If more than one independent variable is used, known_ys must be a column (row) vector, and known_xs must contain a column (row) for each independent variable.

known_xs may be omitted, in which case it defaults to the array {1, 2, 3, ...} with the same number of elements as known_ys.
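The pairing rule for a single independent variable can be illustrated with a short Python sketch. Arrays are modelled here as lists of rows, and the helper names flatten and pair_elements are hypothetical; the sketch only shows how elements of differently shaped arrays line up when both are read left to right, top to bottom.

```python
def flatten(array):
    """Return the elements of a 2-D array (list of rows) in reading order."""
    return [value for row in array for value in row]

def pair_elements(known_ys, known_xs):
    """Pair x and y values by order of appearance, whatever the array shapes."""
    ys, xs = flatten(known_ys), flatten(known_xs)
    if len(ys) != len(xs):
        raise ValueError("known_ys and known_xs must contain the same number of elements")
    return list(zip(xs, ys))

# A 2 x 2 array of y values paired with a 1 x 4 row of x values.
pairs = pair_elements([[10.0, 20.0], [40.0, 80.0]], [[1, 2, 3, 4]])
print(pairs)
```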

new_xs is the set of x values for which y values are to be computed using the best (least squares) fit of an exponential surface to the known data pairs. It must contain a column (row) for each independent variable, i.e. for each column (row) in known_xs if known_ys contains a single column (row).

new_xs may be omitted, in which case it defaults to known_xs.
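With several independent variables, the same idea extends to a multiple linear regression on ln y = ln b + x_1 ln m_1 + ... + x_N ln m_N. The Python sketch below is one possible way to do this, assuming known_ys is a column, each row of known_xs and new_xs holds one observation, and the fit is obtained from the normal equations. The function names are hypothetical, and the actual implementation may use a numerically more robust method.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations update both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the x columns are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def growth_multi(known_ys, known_xs, new_xs):
    """Fit y = b * prod(m_k ** x_k) on ln(y); one column of x per variable."""
    # Design matrix: a leading 1 for ln b, then the x values of each row.
    design = [[1.0] + [float(v) for v in row] for row in known_xs]
    logs = [math.log(y) for y in known_ys]
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T ln(y).
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * l for r, l in zip(design, logs)) for i in range(p)]
    beta = solve_linear(xtx, xty)  # beta[0] = ln b, beta[k] = ln m_k
    return [[math.exp(beta[0] + sum(bk * v for bk, v in zip(beta[1:], row)))]
            for row in new_xs]
```

Each row passed in new_xs must supply one value per independent variable, matching the columns of known_xs.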

fit_constant specifies how the factor b of the exponential surface fit is computed. If fit_constant = TRUE, the value of b returned by the normal least squares fit is used. If fit_constant = FALSE, b is forced to 1 and the m_k parameters are adjusted accordingly to still fit the known y values.

If fit_constant is omitted, it defaults to TRUE.
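The effect of fit_constant = FALSE can be sketched for a single independent variable. With b forced to 1, the model reduces to y = m^x, which in log space is a straight line through the origin. The Python below assumes the fit is a least squares fit on ln(known_ys) without an intercept; growth_no_constant is a hypothetical name used only for this illustration.

```python
import math

def growth_no_constant(known_ys, known_xs, new_xs):
    """Fit y = m**x (b fixed at 1) by least squares on ln(y), no intercept."""
    logs = [math.log(y) for y in known_ys]
    # Regression through the origin: ln m = sum(x * ln y) / sum(x * x).
    ln_m = sum(x * l for x, l in zip(known_xs, logs)) / sum(x * x for x in known_xs)
    return [[math.exp(ln_m * x)] for x in new_xs]

# Whatever the data, the fitted curve passes through y = 1 at x = 0.
print(growth_no_constant([3.0, 6.0, 12.0], [0, 1, 2], [0, 1, 2]))
```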

All array elements are converted to float; elements that cannot be converted are excluded.
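A minimal Python sketch of this conversion step is shown below. The text does not say whether the matching element of the other array is dropped along with an element that fails to convert; this sketch assumes the whole (y, x) pair is dropped, and the helper name to_float_pairs is hypothetical.

```python
def to_float_pairs(known_ys, known_xs):
    """Convert paired values to float, skipping pairs that fail to convert."""
    kept_ys, kept_xs = [], []
    for y, x in zip(known_ys, known_xs):
        try:
            fy, fx = float(y), float(x)
        except (TypeError, ValueError):
            # Assumption: a failed conversion excludes the whole pair.
            continue
        kept_ys.append(fy)
        kept_xs.append(fx)
    return kept_ys, kept_xs

ys, xs = to_float_pairs(["68.13", "n/a", 108.76, None], [1900, 1910, "1920", 1930])
print(ys, xs)
```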

Examples

The Dow Jones Industrial Average opened the first four decades of the twentieth century at {68.13, 98.34, 108.76, 244.2}. This accelerating series looks like a good candidate for an exponential fit. To check the validity of such a fit, we try

=growth({68.13, 98.34, 108.76, 244.2}, {1900, 1910, 1920, 1930})

which returns (with rounding to two decimals) {{64.05}, {94.89}, {140.58}, {208.26}}.
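One way to judge such a fit is to express each returned value as a percentage deviation from the corresponding known value. The short Python sketch below does this for the rounded figures quoted above; it is only a checking aid and not part of the expression language.

```python
years = [1900, 1910, 1920, 1930]
actual = [68.13, 98.34, 108.76, 244.2]
fitted = [64.05, 94.89, 140.58, 208.26]

# Signed deviation of the fit from the known value, in percent.
deviations = [100.0 * (f - a) / a for f, a in zip(fitted, actual)]

# Index of the point where the fit is furthest from the data.
worst = max(range(len(deviations)), key=lambda i: abs(deviations[i]))

for year, dev in zip(years, deviations):
    print(f"{year}: {dev:+.1f}%")
print("largest deviation in", years[worst])
```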

There is a large error (a 29% overshoot) in 1920. Since an interruption of exponential growth is to be expected in a war decade (World War I raged in the period 1914-1918) we exclude 1920 as an outlier and try again:

=growth({68.13, 98.34, 244.2}, {1900, 1910, 1930})

Again with rounding to two decimals, this returns {{66.44}, {102.11}, {241.16}}, which is excellent, with a maximum deviation of about 3.8% (in 1910). This is in itself remarkable: it suggests that even the disruption of WWI wasn't enough to derail the long-term trend of the decades preceding it, and that the market was able not only to recover but to actually catch up with the trend once the disturbance was over.

Encouraged by this success, we decide to use our exponential fit to forecast the first closes in 1940 and 1950:

=growth({68.13, 98.34, 244.2}, {1900, 1910, 1930}, {1940, 1950})

Again rounding to two decimals, the result is {{370.61}, {569.55}}. Alas, this is way off: the index actually stood at roughly 150 at the start of 1940 and roughly 200 at the start of 1950. The reason is of course the bursting of the 1929 bubble, the subsequent Great Depression, and World War II.

On a much shorter time scale, a glance at a logarithmic chart of the DJIA in the 90s shows that something (some would say "the Internet") happened at the end of 1994 which caused the growth rate to suddenly triple compared to the first half of the decade. The closing quotes of 1994, 1995, 1996, and 1997 give us a very neat exponential fit: rounded to two decimals,

=growth({3834.4, 5117.1, 6448.3, 7908.3}, {1994, 1995, 1996, 1997})

returns {{3922.16}, {4987.51}, {6342.22}, {8064.90}}, with a max deviation of less than 2.6%. But trying to forecast the market of the following years using this fit would have led us increasingly astray: with the usual rounding,

=growth({3834.4, 5117.1, 6448.3, 7908.3}, {1994, 1995, 1996, 1997}, {1998, 1999, 2000, 2001})

returns {{10255.50}, {13041.12}, {16583.37}, {21087.77}}, versus actual end of year closes of 9181.4 (89.5% of projection), 11497.1 (88.2%), 10788 (65.1%) and 10021.5 (47.5%). Although the index kept climbing two more years after 1997, the pattern established in 1995 was broken in 1998.
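The percentages quoted above are simply the actual closes divided by the projected values. A small Python sketch of that arithmetic, using the figures from this example:

```python
years = [1998, 1999, 2000, 2001]
projected = [10255.50, 13041.12, 16583.37, 21087.77]
actual = [9181.4, 11497.1, 10788.0, 10021.5]

# Actual close as a percentage of the projection for each year.
ratios = [100.0 * a / p for a, p in zip(actual, projected)]

for year, ratio in zip(years, ratios):
    print(f"{year}: {ratio:.1f}% of projection")
```

The ratio falls every year, which is the widening gap between the exponential projection and the market described above.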

No tree grows all the way to heaven.

See also forecast, linest, logest, trend.