Complexity International      ISSN 1320-0682     
Volume 02 April 1995

Time Series Forecasting with Neural Networks

Feng Lin, Xing Huo Yu, Shirley Gregor and Richard Irons
Department of Mathematics and Computing
Central Queensland University

Rockhampton, QLD 4702, Australia

Faculty of Business
Central Queensland University
Rockhampton, QLD 4702, Australia

Email: linf@jasper.cqu.edu.au, x.yu@cqu.edu.au

Abstract:

A scheme for time series forecasting with a neural network is discussed in this paper. This scheme consists of three phases: detection of input patterns, determination of the number of neurons in hidden layer(s), and construction of a neural network forecaster. In the detection phase, autocorrelation analysis is used to identify input patterns of time series for training. Determination of the number of neurons in the hidden layer is done with Baum-Haussler rules. The calculated number of neurons for the hidden layers and the determined input patterns are then used to construct the neural network forecaster. Computer simulations are presented to show the effectiveness of the scheme.


Introduction

Detecting trends and patterns in financial data is of great interest to the business world as a support for decision making. So far, the primary means of detecting trends and patterns has been statistical methods such as statistical clustering and regression analysis [1, 7]. The mathematical models associated with these methods for economic forecasting, however, are linear and may fail to forecast the turning points in economic cycles [8], because in many cases the data they model are highly nonlinear.

A new generation of methodologies, including neural networks, knowledge-based systems and genetic algorithms, has attracted attention for the analysis of trends and patterns. In particular, neural networks are being used extensively for financial forecasting in stock markets, foreign exchange trading, commodity futures trading and bond yields.

Stockmarket prediction is an area of financial forecasting which attracts a great deal of attention [11]. In financial theory, the efficient market hypothesis (EMH), in its weak form, predicts that analysis of time series data alone will provide no excess return over a simple buy-and-hold strategy. However, it does not deny that such prediction is possible from inside information. Predictive success with neural networks and univariate time series would be contrary to this form of the EMH. Research has been carried out on using neural networks to retrieve trends and patterns of stock markets [9, 10, 12, 13]. It should be pointed out that much work in this area remains confidential, possibly because its owners fear losing competitive advantage.

Application of neural networks in time series forecasting [9, 10, 12, 13] is based on the ability of neural networks to approximate nonlinear functions. The most popular treatment of input data is to feed the neural network with either the data at each observation or the data from several successive observations. Denoting the data at instant k as y(k), where y may be a vector, these treatments can be described as $y(k+1) = NN(y(k))$ or $y(k+1) = NN(y(k), y(k-1), \ldots, y(k-l+1))$ respectively, where NN() stands for the neural network forecaster and l is the number of successive observations. This treatment regards the series as a nonlinear time series and tends to generate a nonlinear "auto-regression" model to fit it. So far, few papers have described how to choose inputs for the neural network forecaster in order to achieve better forecasting performance. It is our belief that the performance of a neural network forecaster is strongly affected by the input data patterns.
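The "several successive observations" treatment can be illustrated with a short Python sketch that turns a series into input/target pairs. The function name `make_windows` and the parameter `n_lags` (playing the role of l) are illustrative choices, not names from the paper.

```python
import numpy as np

def make_windows(series, n_lags):
    """Build training pairs from a univariate series.

    Each row of X holds n_lags successive observations (oldest first);
    the matching entry of y is the observation that follows them.
    """
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must satisfy 1 <= n_lags < len(series)")
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Illustrative data only.
X, y = make_windows([1.0, 2.0, 3.0, 4.0, 5.0], n_lags=3)
print(X)  # rows [1, 2, 3] and [2, 3, 4]
print(y)  # targets 4 and 5
```

Setting `n_lags=1` gives the first treatment, in which the forecaster sees only the current observation.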

Autocorrelation analysis has often been used in time series forecasting with statistical approaches such as ARMA models. It is mainly used to detect the autocorrelations between successive observations of a time series, and it underlies the well-known ARIMA models with Box-Jenkins methods, which are very efficient in forecasting linear time series [7].

Autocorrelation analysis can be used to determine the correct input patterns for nonlinear time series forecasting with a neural network. This paper presents a scheme for time series forecasting with a neural network. The scheme contains three phases: detection of input patterns, determination of the number of neurons in hidden layer(s), and construction of the neural network forecaster. In the detection phase, autocorrelation analysis is used to identify input patterns of time series for training. Determination of the number of neurons in hidden layer(s) is done with Baum-Haussler rules [5]. The neural network forecaster is then constructed with the determined input patterns and the number of neurons in hidden layer(s). The Hang Seng Index is used to illustrate an application of the scheme.

This paper is organised as follows. In Section 2 the time series forecasting scheme is proposed. In Section 3 simulation results are presented to show the effectiveness of the proposed forecaster. These results are compared with other forecasters. The neural network forecaster is examined in two situations: short-term forecasting and long-term forecasting. Section 4 presents our conclusions.


A scheme for time series forecasting with neural networks

Feedforward neural networks are composed of layers of neurons in which the input layer of neurons is connected to the output layer of neurons through one or more layers of intermediate neurons. The training process of the neural network involves adjusting the weights until a desired input/output relationship is obtained. Most adaptive learning algorithms are based on the Widrow-Hoff rule and its multilayer generalisation, the back-propagation algorithm [15, 16].

The neural network forecaster can be described as follows:

$$ y(k+1) = NN(z(k), z(k-1), \ldots, z(k-p+1), e(k), e(k-1), \ldots, e(k-q+1)) \qquad (1) $$

where z is either original observations or processed data, e are residuals, and p and q are the numbers of data inputs and residual inputs. How the input data are processed and how many z's and e's are used determine the performance of the forecaster.

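In this paper, we propose a scheme for time series forecasting with a feedforward neural network. The scheme includes three phases: detection of input patterns, determination of the number of neurons in the hidden layer(s), and construction of the neural network forecaster.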

Detailed discussions on each phase are given in the following sections.


Detection of input patterns

Input pattern detection is done using autocorrelation analysis. The Appendix gives a brief description of autocorrelation analysis.

The detection involves two steps:

Step 1
For a given time series, calculate the autocorrelation coefficients. If a trend is detected, differencing should be used to remove it. This step should be repeated until the trend is removed to a reasonable degree. This step is important because the autocorrelation coefficients may be used to determine the lags of residuals for short-term forecasting.

Step 2
Calculate the partial autocorrelation coefficients. These tell us how $y_t$ is autocorrelated with $y_{t-k}$. Choose those partial autocorrelation coefficients that are significantly different from the rest of the coefficients. The largest lag between any two of the chosen coefficients is the number of inputs needed for the neural network forecaster.
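Step 1 can be sketched in Python as follows. This is a minimal illustration rather than the procedure used in the paper, which relied on a statistics package. The function names, the synthetic series and the conventional 1.96/sqrt(n) significance limit are all assumptions made here.

```python
import numpy as np

def difference(series, lag=1):
    """Return the differenced series y(t) - y(t - lag)."""
    y = np.asarray(series, dtype=float)
    return y[lag:] - y[:-lag]

def autocorrelation(series, max_lag):
    """Sample autocorrelation coefficients r_1 .. r_max_lag."""
    y = np.asarray(series, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[k:] * dev[:-k]) / denom
                     for k in range(1, max_lag + 1)])

def significant_lags(coeffs, n, z=1.96):
    """Lags whose coefficient lies outside the band +/- z / sqrt(n)."""
    limit = z / np.sqrt(n)
    return [k for k, r in enumerate(coeffs, start=1) if abs(r) > limit]

# Synthetic trending series (illustrative data, not the Hang Seng Index).
rng = np.random.default_rng(0)
series = np.cumsum(1.0 + rng.normal(size=300))

r_raw = autocorrelation(series, max_lag=10)
r_diff = autocorrelation(difference(series), max_lag=10)
print("before differencing:", significant_lags(r_raw, len(series)))
print("after differencing: ", significant_lags(r_diff, len(series) - 1))
```

A slowly decaying set of large coefficients before differencing is the usual sign of a trend. If the coefficients of the differenced series still decay slowly, the differencing step can be applied again.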

Determination of the number of neurons in the hidden layer(s)

The number of neurons in the hidden layer is a concern in the application of neural networks to time series forecasting. A rule of thumb [5], known as the Baum-Haussler rule, is used to determine the number of hidden neurons to be used:

$$ N_{hidden} \le \frac{N_{train} \, E_{tolerance}}{N_{pts} + N_{output}} \qquad (2) $$

where $N_{hidden}$ is the number of hidden neurons, $N_{train}$ is the number of training examples, $E_{tolerance}$ is the error tolerance, $N_{pts}$ is the number of data points per training example, and $N_{output}$ is the number of output neurons.

This rule generally ensures that neural networks generalise, rather than memorise.
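As a worked illustration of the rule of thumb, the sketch below computes the largest whole number of hidden neurons allowed by the ratio of (training examples × error tolerance) to (data points per example + output neurons). The function name and the numerical values, in particular the error tolerance of 0.1, are assumptions chosen for illustration and are not taken from the simulations in this paper.

```python
import math

def max_hidden_neurons(n_train, error_tolerance, n_points, n_outputs):
    """Upper bound on hidden neurons under the Baum-Haussler rule of thumb."""
    if n_points + n_outputs <= 0:
        raise ValueError("n_points + n_outputs must be positive")
    bound = n_train * error_tolerance / (n_points + n_outputs)
    return math.floor(bound)

# Hypothetical values: 500 training examples, tolerance 0.1, one output.
for n_points in (1, 4):
    print(n_points, max_hidden_neurons(500, 0.1, n_points, 1))
```

With these assumed values the bound is 25 hidden neurons for one input and 10 for four inputs, which shows how the permitted network size shrinks as more inputs are used.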


Construction of the neural network forecaster

Based on the input patterns determined in Section 2.1, and the number of neurons in the hidden layer determined in Section 2.2, we can construct the neural network forecaster.

There are two cases that should be considered in time series forecasting: short-term forecasting and long-term forecasting. By short-term forecasting we mean that the neural network forecaster is actually a one-step-ahead predictor.

With the determined input patterns and the number of neurons in the hidden layer(s), we propose the neural network forecaster for short-term forecasting shown in Figure 1, which uses the delay operator, that is, the operator that maps a signal at instant k to its value at instant k-1. This structure is distinguished from other neural network forecasters in that residuals are considered as inputs as well. It is inspired by the mechanism of conventional statistical forecasting models such as ARIMA models, which treat forecasting as a decision based on several previous successive actual observations and on residuals, that is, the differences between actual observations and their predictions.

Long-term prediction is important in determining the future trend of a time series, which requires prediction several steps ahead. Because actual future data are not known, residuals are no longer available, so the "feedback" loops from the output in Figure 1 should be removed. Training long-term forecasters therefore does not involve residual terms, which clearly differs from training short-term forecasters.
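The difference between the two cases can be sketched in Python. Here `predict` stands for any trained forecaster taking recent observations and recent residuals, and `toy_predict` is a hypothetical stand-in rather than a neural network. Feeding forecasts back as inputs in the long-term loop is an assumption of this sketch, as the text does not spell out how multi-step inputs are formed.

```python
from collections import deque

def short_term_forecast(predict, history, actuals, n_inputs, n_residuals):
    """One-step-ahead forecasts; past residuals are fed back as inputs."""
    obs = deque(history[-n_inputs:], maxlen=n_inputs)
    res = deque([0.0] * n_residuals, maxlen=n_residuals)
    forecasts = []
    for actual in actuals:
        y_hat = predict(list(obs), list(res))
        forecasts.append(y_hat)
        res.append(actual - y_hat)  # residual = actual observation - prediction
        obs.append(actual)          # the true value becomes known before the next step
    return forecasts

def long_term_forecast(predict, history, steps, n_inputs):
    """Multi-step forecasts; no residuals exist, so none are used."""
    obs = deque(history[-n_inputs:], maxlen=n_inputs)
    forecasts = []
    for _ in range(steps):
        y_hat = predict(list(obs), [])
        forecasts.append(y_hat)
        obs.append(y_hat)           # assumption: the forecast replaces the unknown value
    return forecasts

def toy_predict(obs, res):
    # Hypothetical forecaster: linear extrapolation of the last two
    # observations, corrected by the most recent residual if there is one.
    correction = res[-1] if res else 0.0
    return 2 * obs[-1] - obs[-2] + correction

history = [1.0, 2.0, 3.0]
print(short_term_forecast(toy_predict, history, [5.0, 6.0], n_inputs=2, n_residuals=1))
print(long_term_forecast(toy_predict, history, steps=3, n_inputs=2))
```

In the short-term loop each new actual observation yields a residual that corrects the next forecast. In the long-term loop errors cannot be corrected in this way, which is one reason long-term forecasts are harder.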

Figure 1: Neural network forecaster for short-term forecasting


Computer simulations

To show the effectiveness of the proposed scheme, we considered average weekly data of the Hang Seng Index from the Hong Kong stock market for 1980 to 1990. We used the index for the first 500 weeks to train the neural network forecaster, and forecast the next 20 to 50 weeks in order to make comparisons. We chose a three-layered feedforward neural network trained with the QuickProp algorithm [14]. All simulations were run under ULTRIX 4.2. The performance of the neural network forecaster was measured in terms of Mean Absolute Percentage Error (MAPE) [6]; that is,

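$$ MAPE = \frac{1}{n} \sum_{k=1}^{n} \left| \frac{z(k) - \hat{z}(k)}{z(k)} \right| \times 100\% \qquad (3) $$

where $z(k)$ is the time series or processed time series, $\hat{z}(k)$ is its forecast, and n is the number of observations in the series.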
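The MAPE measure can be computed with a few lines of Python. This sketch assumes the usual definition, the mean of the absolute errors relative to the actual values, expressed as a percentage. The function name and sample values are illustrative.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if actual.shape != forecast.shape:
        raise ValueError("actual and forecast must have the same shape")
    if np.any(actual == 0):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

# Illustrative values: errors of 10% and 5% average to 7.5%.
print(mape([100.0, 200.0], [110.0, 190.0]))
```

Because each error is divided by the actual value, the measure is scale-free, but it cannot be used on a processed series that contains zeros.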

Autocorrelation analysis was first used to check the autocorrelations between successive observations. SYSTAT Intelligence Software [4] was used to calculate the autocorrelation coefficients and partial autocorrelation coefficients. The results are shown in Figure 2.

Figure 2: Autocorrelation Coefficients and Partial Autocorrelation Coefficients.

Figure 2(a) shows that the observations in the index are autocorrelated, and Figure 2(b) depicts that the autocorrelation coefficients between tex2html_wrap_inline460 and tex2html_wrap_inline462 , tex2html_wrap_inline460 and tex2html_wrap_inline466 , tex2html_wrap_inline460 and tex2html_wrap_inline470 , and tex2html_wrap_inline460 and tex2html_wrap_inline474 , are significantly different from others. Thus, the neural network forecaster without differencing should use four inputs; that is, tex2html_wrap_inline476 .

Differencing was then used to remove the trend in the time series. Figure 2(d) shows that after differencing with lag 1, the items of the differenced index are autocorrelated only at lags 1, 2, and 3. Figure 2(c) shows that two residual terms, with lags 1 and 2, should be used in the forecaster.

The number of neurons in the hidden layer was determined as follows: Let tex2html_wrap_inline478 , tex2html_wrap_inline480 , and tex2html_wrap_inline482 , from (2)

equation114

The number of neurons in the hidden layer should be tex2html_wrap_inline484 at most for the input numbers tex2html_wrap_inline486 respectively.

For simplicity of description, denote NN[i,j,k,l] as a neural network forecaster with i inputs, difference with j lag, k outputs and l residual terms. The numbers of neurons in the hidden layer for NN[1,0,1,0], NN[3,0,1,0], NN[4,0,1,0], NN[3,1,1,0], NN[3,1,1,2] were chosen to be tex2html_wrap_inline508 respectively.


Short-term forecasting

We trained several neural network forecasters for short-term forecasting.

Table 1 shows the MAPE values for different neural network forecasters, when forecasting a further 60 weeks.

Table 1: MAPE values of neural network forecasters

It was seen that performance improved as the number of inputs increased. A substantial improvement was obtained when NN[3,1,1,2] was used: compared with NN[1,0,1,0], the MAPE improved by about 0.22%. Bearing in mind that the Hang Seng Index reaches values above 2000 points, this improvement means that NN[3,1,1,2] is about 4.4 points more accurate than NN[1,0,1,0] at those levels (0.22% of 2000 points).

Figure 3 shows the performance of these neural network forecasters.

Figure 3: Performance of neural network forecasters for short-term forecasting


Long-term forecasting

Figure 4: Performance of neural network forecasters for long-term forecasting

Since residuals were no longer available for forecasting more than one step ahead, we removed the "feedback" loops in Figure 1. We trained NN[1,0,1,0], NN[3,0,1,0], NN[4,0,1,0] and NN[3,1,1,0] and compared their performance by running each of them for a further 20 weeks. The performance of these forecasters is shown in Figure 4. NN[3,1,1,0] was better than the other forecasters in that it detected the variations in the first four weeks of the Hang Seng Index and made some effort to track them. The other three forecasters simply increased their values monotonically regardless of the ups and downs of the Hang Seng Index. The simulation results indicated that appropriate input patterns improve forecasting performance.


Conclusion

A scheme for time series forecasting with a neural network has been proposed in this paper. It consists of three phases: detection of input patterns, determination of the number of neurons in the hidden layer(s), and construction of the neural network forecaster. Autocorrelation analysis has proved effective in identifying input patterns. The Baum-Haussler rule has been used to determine the number of neurons in the hidden layers in order to avoid memorisation problems. A short-term neural network forecaster has been constructed that takes input patterns into account and achieves more accurate results. A modification of the short-term neural network forecaster was used for long-term forecasting and showed some improvement, indicating that appropriate input patterns may improve forecasting performance. However, it still fails to give reliable long-term predictions. This may be due to a lack of consideration of other external influences on the stockmarket indexes. It is intended to extend the present work to multivariate time series.


Appendix : Autocorrelation analysis

In time series forecasting using statistical approaches, the autocorrelation function is extremely useful in obtaining a partial description of a time series for forecasting [7]. Autocorrelation coefficients measure the degree of correlation between neighbouring data observations in a time series. Assuming the time series is $y_1, y_2, \ldots, y_n$, the autocorrelation coefficient is estimated from sample observations as follows:

$$ r_k = \frac{\sum_{t=k+1}^{n} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{n} (y_t - \bar{y})^2} $$

where $\bar{y}$ is the sample mean. $r_k$ describes the autocorrelation of $y_t$ and $y_{t-k}$. The autocorrelations for different samples of observations would form a distribution of values that is called the sampling distribution of autocorrelations. For a random series, the sampling distribution of the autocorrelation coefficients $r_k$ is approximately normal with

$$ \mu_{r_k} = 0, \qquad \sigma^2_{r_k} = \frac{1}{n} $$

where $\mu_{r_k}$ and $\sigma^2_{r_k}$ stand for the mean and variance of $r_k$. To determine at what risk level we are willing to conclude that the data is not random when in reality it is, we can use the following limits:

$$ -z_{\alpha/2} \, \sigma_{r_k} \le r_k \le z_{\alpha/2} \, \sigma_{r_k} $$

where $z_{\alpha/2}$ is the standard normal value for risk level $\alpha$ (for example, 1.96 for $\alpha = 0.05$).

Partial autocorrelation coefficients measure the degree of association between $y_t$ and $y_{t-k}$ when the effects of other time lags on y are held constant. Partial autocorrelation coefficients are defined in terms of the last autoregressive term of an AR model of m lags. Denote $\phi$ as the partial autocorrelation coefficient, and $\hat{\phi}$ as the estimated partial autocorrelation coefficient, then $\hat{\phi}_1, \hat{\phi}_2, \ldots, \hat{\phi}_m$ are the m partial autocorrelations of the AR(m) model as defined in the following equations:

$$
\begin{aligned}
r_1 &= \hat{\phi}_1 + \hat{\phi}_2 r_1 + \cdots + \hat{\phi}_m r_{m-1} \\
r_2 &= \hat{\phi}_1 r_1 + \hat{\phi}_2 + \cdots + \hat{\phi}_m r_{m-2} \\
&\ \ \vdots \\
r_m &= \hat{\phi}_1 r_{m-1} + \hat{\phi}_2 r_{m-2} + \cdots + \hat{\phi}_m
\end{aligned}
$$

Solving the above set of equations will determine $\hat{\phi}_1, \hat{\phi}_2, \ldots, \hat{\phi}_m$.
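A possible numerical treatment of the partial autocorrelations is sketched below in Python. For each m it solves the linear system relating the AR(m) coefficients to the sample autocorrelations, and takes the last coefficient as the partial autocorrelation at lag m, following the definition in terms of the last autoregressive term. The function name and the synthetic AR(1) test series are assumptions made for illustration.

```python
import numpy as np

def partial_autocorrelations(series, max_lag):
    """Partial autocorrelations at lags 1 .. max_lag from sample autocorrelations."""
    y = np.asarray(series, dtype=float)
    dev = y - y.mean()
    denom = np.sum(dev ** 2)
    # r[0] = 1 by definition; r[k] is the sample autocorrelation at lag k.
    r = np.array([1.0] + [np.sum(dev[k:] * dev[:-k]) / denom
                          for k in range(1, max_lag + 1)])
    pacf = []
    for m in range(1, max_lag + 1):
        # System R * phi = (r_1, ..., r_m), where R[i, j] = r_|i - j|.
        R = np.array([[r[abs(i - j)] for j in range(m)] for i in range(m)])
        phi = np.linalg.solve(R, r[1:m + 1])
        pacf.append(phi[-1])  # last coefficient of the AR(m) fit
    return np.array(pacf)

# Synthetic AR(1) series with coefficient 0.6 (illustrative data only).
rng = np.random.default_rng(1)
noise = rng.normal(size=2000)
ar1 = np.zeros(2000)
for t in range(1, 2000):
    ar1[t] = 0.6 * ar1[t - 1] + noise[t]
print(partial_autocorrelations(ar1, max_lag=4))
```

For a series of this kind the first partial autocorrelation should be close to the AR coefficient and the later ones close to zero, which is the pattern used in the detection phase to choose the number of inputs.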


References

1
Hanke J. E. & Reitsch A. G. (1984), Business Forecasting, 2nd Edition, Allyn and Bacon.

2
Yong T. P. (1993), "Using Neuro Forecaster for time series forecasting", Proceedings of First Symposium on Intelligent Systems Applications, Singapore, pp. 182-189.

3
Eberhart R. C. & Dobbins R. W. (1990), Neural Network PC Tools - A Practical Guide, San Diego: Academic Press.

4
SYSTAT (1990) Intelligent software, Systat Inc., IL 60201-3793, US.

5
Baum E. B. & Haussler D. (1988), "What size net gives valid generalization?", Neural Computation, 1, pp. 151-160.

6
Makridakis et al. (1983), Forecasting: Methods and Applications, New York: Wiley.

7
Jarrett J. (1991), Business Forecasting Methods, Basil Blackwell.

8
Ling C. S. (1993), "Choosing the right neural network model for trading", Proceedings of First Symposium on Intelligent Systems Applications, Singapore, pp. 26-33.

9
Wang H. A. & Chan A. K.-H. (1993), "A feedforward neural network model for Hang Seng Index", Proceedings of 4th Australian Conference on Information Systems, Brisbane, pp. 575-585.

10
Windsor C. G. & Harker A. H. (1990), "Multi-variate financial index prediction - a neural network study", Proceedings of International Neural Network Conference, Paris, France, pp. 357-360.

11
Jagielska I. (1993), "The application of neural networks to business information systems", Proceedings of 4th Australian Conference on Information Systems, Brisbane, pp. 565-574.

12
White H. (1988), "Economic prediction using Neural Networks: The case of the IBM daily stock returns", Proceedings of IEEE International Conference on Neural Networks, pp. 451-458.

13
Rao V. B. & Rao H. V. (1993), C++ Neural Networks and Fuzzy Logic, MIS Press.

14
Goodman P., Rosen D. & Plummer A. (1993), NevProp Version 1, University of Nevada.

15
Widrow B. & Winter R. (1988), "Neural nets for adaptive filtering and adaptive pattern recognition", Computer, 21, pp. 25-39.

16
Rumelhart D. E., McClelland J. L. & the PDP Research Group (1986), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, Cambridge: MIT Press.
