Friday, April 5, 2019
A Study On Business Forecasting Statistics Essay
The aim of this report is to show my understanding of business forecasting using data drawn from the UK national statistics. It is a quarterly series of total consumer credit gross lending in the UK from the second quarter of 1993 to the second quarter of 2009. The report answers four key questions that are relevant to the coursework.

Question 1

In this part the data will be examined, looking for seasonal patterns, trends and cycles. Each time point represents a single piece of data, which must be split into a trend-cycle and a seasonal effect. The line graph in Figure 1 identifies a clear trend-cycle, which must be removed so that the seasonal effect can be estimated. Figure 1 displays long-term credit lending in the UK, which has recently been hit by an economic crisis. Figure 2 also shows evidence of a trend, because the ACF values do not come down to zero. Even though the trend is clear in Figures 1 and 2, the seasonal pattern is not. Therefore, it is important that the trend-cycle is removed so the seasonal effect can be estimated clearly. A process called differencing will remove the trend whilst keeping the pattern. Drawing scatter plots and calculating correlation coefficients on the differenced data will then reveal the pattern.

Scatter plot correlation

The diagrams in Figure 3 show the correlation between the actual credit lending data and four lags (quarters). A strong correlation is indicated by a straight-line relationship. As depicted in Figure 3, the scatter plot of the credit lending data against lag 4 produces the best straight line. Even though this last diagram shows the straightest line, the seasonal pattern is still unclear, so differencing must be used to resolve the issue.

Differencing

Differencing is used to remove a trend-cycle component. Figure 4 displays an ACF graph, which indicates a four-point pattern repeat. Figure 5 shows a line graph of the first difference: the graph displays a four-point repeat, but the trend is still clearly apparent. To remove the trend completely the data must be differenced a second time. First differencing is a useful tool for removing non-stationarity; however, it does not always eliminate it, and the data may have to be differenced a second time. In practice it is rarely necessary to go beyond second differencing, because real data generally involve non-stationarity of only the first or second order.

Figures 6 and 7 display the second-differenced data. Figure 6 shows an ACF graph of the second difference, which reinforces the idea of a four-point repeat. Figure 7 confirms that the trend-cycle component has been completely removed and that there is in fact a four-point pattern repeat.
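As an illustration of these steps outside SPSS, the short Python sketch below differences the series and plots the ACF of the original, first-differenced and second-differenced data. The file name credit_lending.csv and the column name "lending" are assumptions for the example, not part of the original analysis.

```python
# Sketch of the differencing and ACF checks described above (assumed file/column names).
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

series = pd.read_csv("credit_lending.csv")["lending"]  # quarterly lending, 1993 Q2 to 2009 Q2

first_diff = series.diff().dropna()       # removes most of the trend
second_diff = first_diff.diff().dropna()  # removes any remaining trend

fig, axes = plt.subplots(3, 1, figsize=(8, 9))
plot_acf(series, lags=20, ax=axes[0], title="ACF of original series")
plot_acf(first_diff, lags=20, ax=axes[1], title="ACF of first difference")
plot_acf(second_diff, lags=20, ax=axes[2], title="ACF of second difference")
plt.tight_layout()
plt.show()
```

A slowly decaying ACF in the first panel points to the trend, while spikes at lag 4 in the differenced panels point to the quarterly pattern discussed above.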
Question 2

Multiple regression involves fitting a linear expression by minimising the sum of squared deviations between the sample data and the fitted model. There are several models that regression can fit, and multiple regression can be implemented in both linear and non-linear forms. The following section explains multiple regression using dummy variables.

Dummy variables are used in a multiple regression to fit trends and pattern repeats in a single model. As the credit lending data is seasonal, a common way to handle the seasonality in a regression model is to use dummy variables. The dummy variables indicate the quarters and are used to test whether there are any quarterly influences on lending. Three new variables can be defined:

Q1 = first quarter
Q2 = second quarter
Q3 = third quarter

Trend and seasonal models using dummy variables

The following equations are used by SPSS to create the different outputs; each model is judged in terms of its adjusted R squared.

Linear trend + seasonal model:    Data = a + c x time + b1 x Q1 + b2 x Q2 + b3 x Q3 + error
Quadratic trend + seasonal model: Data = a + c1 x time + c2 x time2 + b1 x Q1 + b2 x Q2 + b3 x Q3 + error
Cubic trend + seasonal model:     Data = a + c1 x time + c2 x time2 + c3 x time3 + b1 x Q1 + b2 x Q2 + b3 x Q3 + error

Initially, data and time columns were entered to capture the trend, and the lending data was regressed against time and the dummy variables. Because of multicollinearity (i.e. at least one of the variables being completely determined by the others) there was no need for all four quarterly variables, just Q1, Q2 and Q3.

Linear regression

Linear regression is used to define a line that comes closest to the original credit lending data: it finds values for the slope and intercept that minimise the sum of the squares of the vertical distances between the points and the line.

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .971   .943       .939                3236.90933
Figure 8. SPSS output displaying the adjusted coefficient of determination, R squared

Coefficients
Model          B           Std. Error   Beta    t        Sig.
1 (Constant)   17115.816   1149.166             14.894   .000
  time         767.068     26.084       .972    29.408   .000
  Q1           -1627.354   1223.715     -.054   -1.330   .189
  Q2           -838.519    1202.873     -.028   -.697    .489
  Q3           163.782     1223.715     .005    .134     .894
Figure 9

The adjusted coefficient of determination R squared is 0.939, which is an excellent fit (Figure 8). The coefficient of the time variable, 767.068, is positive, indicating an upward trend. Not all of the coefficients are significant at the 5% level (0.05), so variables must be removed. Q3 is removed first because it is the least significant variable (Figure 9). Once Q3 is removed, Q2 becomes the least significant, and even after Q3 and Q2 are removed, Q1 is still not significant. All the quarterly variables must therefore be removed, leaving time as the only significant variable.

Coefficients
Model          B           Std. Error   Beta    t        Sig.
1 (Constant)   16582.815   866.879              19.129   .000
  time         765.443     26.000       .970    29.440   .000
Figure 10

Table 1 compares the original holdback data with forecasts from the model in Figure 10. The predicted values are computed from the equation:

Predicted value = 16582.815 + 765.443 x time

Original Data   Predicted Values
50878.00        60978.51
52199.00        61743.95
50261.00        62509.40
49615.00        63274.84
47995.00        64040.28
45273.00        64805.72
42836.00        65571.17
43321.00        66336.61
Table 1

Suffice to say, this model is ineffective at predicting future values: the original holdback data decreases each quarter, while the predicted values keep increasing over time.
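For readers who want to reproduce this step in code, the sketch below fits the same linear trend plus quarterly-dummy regression with statsmodels and forecasts the holdback quarters. The file and column names, the eight-quarter holdback and the quarter alignment are assumptions for illustration; the coefficients are estimated from whatever data is supplied rather than copied from the SPSS tables above.

```python
# Sketch of the linear trend + seasonal-dummy regression (assumed file/column names).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("credit_lending.csv")          # one row per quarter, column "lending"
df["time"] = range(1, len(df) + 1)
df["quarter"] = (df["time"] % 4) + 1            # time = 1 is 1993 Q2, so quarters cycle 2, 3, 4, 1, ...
for q in (1, 2, 3):                             # Q4 is omitted to avoid multicollinearity
    df[f"Q{q}"] = (df["quarter"] == q).astype(int)

train, holdback = df.iloc[:-8], df.iloc[-8:]    # last eight quarters held back, as in Table 1

full = smf.ols("lending ~ time + Q1 + Q2 + Q3", data=train).fit()
print(full.summary())                           # t and Sig. values, analogous to Figure 9

# After backward elimination of the insignificant dummies, only time remains:
reduced = smf.ols("lending ~ time", data=train).fit()
print(reduced.predict(holdback).round(2))       # compare with the holdback column of Table 1
```

Because the coefficients depend on the exact sample supplied, the printed predictions will differ somewhat from Table 1, but the pattern of a steadily rising forecast is the same.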
Non-linear regression

Non-linear regression aims to find a relationship between a response variable and one or more explanatory variables in a non-linear fashion.

Quadratic trend + seasonal model

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .986   .972       .969                2305.35222
Figure 11

Coefficients
Model          B           Std. Error   Beta    t        Sig.
1 (Constant)   11840.996   1099.980             10.765   .000
  time         1293.642    75.681       1.639   17.093   .000
  time2        -9.079      1.265        -.688   -7.177   .000
  Q1           -1618.275   871.540      -.054   -1.857   .069
  Q2           -487.470    858.091      -.017   -.568    .572
  Q3           172.861     871.540      .006    .198     .844
Figure 12

The quadratic model's adjusted coefficient of determination R squared is 0.969 (Figure 11), a slight improvement on the linear model (Figure 8). The coefficient of the time variable, 1293.642, is positive, indicating an upward trend, whereas the coefficient of time2, -9.079, is negative; together they indicate a curve in the trend.

Not all the coefficients are significant at the 5% level, so variables must again be removed. Q3 is removed first because it is the least significant variable (Figure 12). Once Q3 is removed, Q2 is still the least significant. Once Q2 and Q3 have been removed, Q1 falls below the 5% level, meaning it is significant (Figure 13).

Coefficients
Model          B           Std. Error   Beta    t        Sig.
1 (Constant)   11698.512   946.957              12.354   .000
  time         1297.080    74.568       1.643   17.395   .000
  time2        -9.143      1.246        -.693   -7.338   .000
  Q1           -1504.980   700.832      -.050   -2.147   .036
Figure 13

Table 2 compares the original holdback data with forecasts from the model in Figure 13. The predicted values are computed from the equation:

Predicted value = 11698.512 + 1297.080 x time - 9.143 x time2 - 1504.980 x Q1

Original Data   Predicted Values
50878.00        56172.10
52199.00        56399.45
50261.00        55103.53
49615.00        56799.29
47995.00        56971.78
45273.00        57125.98
42836.00        55756.92
43321.00        57379.54
Table 2

Compared with Table 1, Table 2 gives predicted values that are closer in range to the original data, but they are still not accurate enough.

Cubic trend + seasonal model

Model Summary
Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .997   .993       .992                1151.70013
Figure 14

Coefficients
Model          B           Std. Error   Beta     t         Sig.
1 (Constant)   17430.277   710.197               24.543    .000
  time         186.531     96.802       .236     1.927     .060
  time2        38.217      3.859        2.897    9.903     .000
  time3        -.544       .044         -2.257   -12.424   .000
  Q1           -1458.158   435.592      -.048    -3.348    .002
  Q2           -487.470    428.682      -.017    -1.137    .261
  Q3           12.745      435.592      .000     .029      .977
Figure 15

The adjusted coefficient of determination R squared is 0.992, which is the best fit so far (Figure 14). The coefficients of time, 186.531, and time2, 38.217, are positive, indicating an upward trend, while the coefficient of time3, -0.544, indicates a curve in the trend. Not all the coefficients are significant at the 5% level, so variables must be removed. Q3 is removed first because it is the least significant variable (Figure 15). Once Q3 is removed, Q2 is still the least significant. Once Q3 and Q2 have been removed, Q1 is significant but the time variable is not, so time must also be removed.

Coefficients
Model          B           Std. Error   Beta     t         Sig.
1 (Constant)   18354.735   327.059               56.120    .000
  time2        45.502      .956         3.449    47.572    .000
  time3        -.623       .017         -2.586   -35.661   .000
  Q1           -1253.682   362.939      -.042    -3.454    .001
Figure 16

Table 3 compares the original holdback data with forecasts from the model in Figure 16. The predicted values are computed from the equation:

Predicted value = 18354.735 + 45.502 x time2 - 0.623 x time3 - 1253.682 x Q1

Original Data   Predicted Values
50878.00        49868.69
52199.00        48796.08
50261.00        46340.25
49615.00        46258.51
47995.00        44786.08
45273.00        43172.89
42836.00        40161.53
43321.00        39509.31
Table 3

Suffice to say, the cubic model gives the most accurate predictions of the three: Table 3 shows that, like the original data, the predicted values gradually decrease.
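The same backward-elimination exercise for the quadratic and cubic models can be sketched in Python as below. Again the file and column names and the eight-quarter holdback are assumptions; the retained terms (time, time2 and Q1 for the quadratic model, time2, time3 and Q1 for the cubic model) follow the discussion above.

```python
# Sketch of the reduced quadratic and cubic trend + seasonal models (assumed file/column names).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("credit_lending.csv")
df["time"] = np.arange(1, len(df) + 1)
df["time2"] = df["time"] ** 2
df["time3"] = df["time"] ** 3
df["Q1"] = ((df["time"] % 4) + 1 == 1).astype(int)   # time = 1 is 1993 Q2

train, holdback = df.iloc[:-8], df.iloc[-8:]

quadratic = smf.ols("lending ~ time + time2 + Q1", data=train).fit()   # model of Figure 13
cubic = smf.ols("lending ~ time2 + time3 + Q1", data=train).fit()      # model of Figure 16

for name, model in (("quadratic", quadratic), ("cubic", cubic)):
    print(name, "adjusted R squared:", round(model.rsquared_adj, 3))
    print(model.predict(holdback).round(2))   # compare with Tables 2 and 3
```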
Question 3

The Box-Jenkins approach is used to find a formula whose residuals are as small as possible and exhibit no pattern. The model is built in a few steps, which may be repeated as necessary, resulting in a specific formula that replicates the patterns in the series as closely as possible and produces accurate forecasts. The following section combines the decomposition and Box-Jenkins ARIMA approaches.

For each of the original variables analysed, the Seasonal Decomposition procedure creates four new variables for the modelling data:

SAF: seasonal adjustment factors
SAS: seasonally adjusted series, i.e. de-seasonalised data, representing the original series with the seasonal variation removed
STC: smoothed trend-cycle component, a smoothed version of the seasonally adjusted series that shows both the trend and cyclical components
ERR: the residual component of the series for a particular observation

Autoregressive (AR) models can be effectively coupled with moving average (MA) models to form a general and useful class of time series models called autoregressive moving average (ARMA) models. However, these can only be used when the data are stationary. The class can be extended to non-stationary series by allowing differencing of the data series; the resulting models are called autoregressive integrated moving average (ARIMA) models.

The SAS variable is used in the ARIMA models because it is the de-seasonalised version of the original credit lending data. As the data in Figure 19 are de-seasonalised, the trend still has to be removed, so, as mentioned before, the data must be differenced to make the series stationary.

Model Statistics
Model: Seasonally adjusted series for credit lending (SEASON, MOD_2, MUL EQU 4) - Model_1
Number of Predictors        0
Stationary R-squared        .485
Normalized BIC              14.040
Ljung-Box Q(18) Statistic   18.693
Ljung-Box Q(18) DF          15
Ljung-Box Q(18) Sig.        .228
Number of Outliers          0

Model Statistics
Model: Seasonally adjusted series for credit lending (SEASON, MOD_2, MUL EQU 4) - Model_1
Number of Predictors        0
Stationary R-squared        .476
Normalized BIC              13.872
Ljung-Box Q(18) Statistic   16.572
Ljung-Box Q(18) DF          17
Ljung-Box Q(18) Sig.        .484
Number of Outliers          0

ARIMA (3,2,0)
Original Data   Predicted Values
50878.00        50335.29843
52199.00        50252.00595
50261.00        50310.44277
49615.00        49629.75233
47995.00
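A rough Python equivalent of this decomposition-plus-ARIMA workflow is sketched below. The multiplicative decomposition with period 4 is intended as an analogue of the SEASON / MUL EQU 4 setup referred to above, and the (3, 2, 0) order follows the text; the file and column names and the eight-quarter holdback are assumptions, so this is a simplified sketch rather than a reproduction of the SPSS procedure.

```python
# Sketch of seasonal decomposition followed by an ARIMA fit on the
# seasonally adjusted series (assumed file/column names).
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.arima.model import ARIMA

lending = pd.read_csv("credit_lending.csv")["lending"]
lending.index = pd.period_range("1993Q2", periods=len(lending), freq="Q").to_timestamp()

# Multiplicative decomposition with a quarterly period, analogous to SPSS Seasonal Decomposition
decomp = seasonal_decompose(lending, model="multiplicative", period=4)
sas = lending / decomp.seasonal               # seasonally adjusted series (the SAS variable)

train, holdback = sas.iloc[:-8], sas.iloc[-8:]

model = ARIMA(train, order=(3, 2, 0)).fit()   # p = 3, d = 2 (second differencing), q = 0
print(model.summary())                        # includes Ljung-Box and information-criterion diagnostics
print(model.forecast(steps=len(holdback)).round(2))   # compare with the holdback values above
```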