Over the last few years, a large number of Fuzzy Time Series (FTS) forecasting models have been proposed in the literature to handle complex and incomplete data. However, the accuracy of a model is problem specific and varies among data sets. Although numerous models claim superiority over statistical and single machine learning-based models, achieving improved forecasting accuracy is still a challenging task. In an FTS model, the lengths of the intervals and the fuzzy relationship groups are considered important factors that influence forecasting accuracy. This research therefore presents an FTS forecasting model based on the Graph-Based Clustering (GBC) technique. Graph-based clustering, an algorithm for data clustering, is utilized at the fuzzification stage to obtain intervals of unequal length. The proposed model is applied to two numerical datasets: the enrolment data of the University of Alabama and the number of new confirmed cases of COVID-19 in Vietnam. The forecasting results obtained from the proposed model are compared with those produced by other models for forecasting enrollments at the University of Alabama. The proposed model achieves higher forecasting accuracy than its counterparts for all orders of fuzzy relationship.
Forecasting everyday phenomena such as temperature, the stock market, population growth, car accidents, economic growth and crop production is a major scientific problem in the field of forecasting. Forecasting such problems with 100% accuracy may not be possible, but the aim is to obtain results with the smallest possible forecasting error. Many traditional forecasting models have been proposed to address these problems, such as the Autoregressive (AR), Moving Average (MA), Autoregressive Moving Average (ARMA) and ARIMA models. Nevertheless, these approaches require linearity assumptions and substantial amounts of historical data. The FTS forecasting models proposed by Song and Chissom [1,2], by contrast, do not require linearity assumptions or a large number of observations. Song and Chissom introduced two fuzzy time series models that use the max-min operator when setting up a fuzzy relational matrix to handle uncertain and ambiguous data, and applied them to forecasting the enrollments of the University of Alabama. However, these models offer no convincing way to determine the interval length in the universe of discourse, and they are time-consuming to compute when the fuzzy relational matrix is large. To overcome these disadvantages and provide better prediction accuracy, the first-order FTS model proposed by Chen [3] uses simple arithmetic operations on fuzzy relationship groups instead of the max-min operator on fuzzy relationships [1] to forecast the enrolments of the University of Alabama. Since then, the fuzzy time series model has attracted the interest of many researchers. They have derived various innovations from Chen's model in terms of determining the lengths of intervals, including equal interval lengths [4-8] and different interval lengths [11-16], building groups of fuzzy relationships [10,14-16] and the defuzzification process [14,15,17-19].
Specifically, Huarng [5] proposed an efficient computational method to determine the appropriate interval length in the universe of discourse. He showed that the results of the forecasting model are greatly influenced by the interval lengths. Other studies [4,9,20-23] have offered diverse forecasting approaches based on high-order FTS models to overcome the disadvantages of first-order forecasting models [1-5]. Moreover, Singh [18] introduced a new predictive model that aims to reduce the computational cost of fuzzy relational matrices or to find a suitable defuzzification procedure for predicting university enrollments and crop production.
Recently, many authors have hybridized computational intelligence techniques with different FTS models to solve complex forecasting problems. For example, Lee and his co-workers [20] considered a high-order FTS model based on a genetic algorithm for temperature prediction and TAIFEX forecasting. They also applied the simulated annealing technique [21] to determine the length of each interval and achieve better forecasting accuracy. In addition, using a Genetic Algorithm (GA) to optimize the intervals in the universe of discourse, Chen and Chung introduced a first-order forecasting model [4] and a high-order forecasting model [8] to predict the enrollments of the University of Alabama. Furthermore, to obtain optimal intervals and avoid unexpected results from the mutation stage of the GA, Eren Bas et al. proposed a new GA called MGA to predict car accidents in Belgium and student enrollments at the University of Alabama.
Currently, the application of particle swarm optimization (PSO) to select appropriate intervals in FTS forecasting models has attracted the attention of many researchers. They demonstrate that choosing the intervals using PSO significantly increases the performance of the forecasting model, as can be seen in research works [13,14,24,25]. Specifically, Kuo et al. [13] proposed a new forecasting model that combines the PSO technique with the FTS model to improve prediction accuracy. They also used PSO in a new model for TAIFEX prediction based on a new defuzzification rule. The works in [23,26] provided two-factor high-order FTS models to forecast temperature and the Taiwan stock market (TAIFEX), likewise using PSO to choose optimal intervals. Besides, Park and his collaborators [27] proposed a two-factor high-order FTS model combined with PSO to achieve more appropriate forecasting results. Huang et al. [19] introduced a hybrid predictive model combined with PSO that corrects the output prediction rule for the university enrollment problem. Chen et al. [16] used the PSO technique not only to yield optimal intervals but also to obtain optimal weight vectors. They proposed a forecasting model that uses the optimal interval partition and optimal weight vectors to predict TAIFEX and the NTD/USD exchange rate. Cheng et al. [28] proposed an FTS model to predict TAIFEX that has two advantages: it uses PSO to obtain appropriate interval lengths and the K-means algorithm to classify the indexes of fuzzy sets into clusters. In addition, clustering methods offer another way to determine interval lengths in FTS models so as to minimize forecasting error; examples introduced in recent works include raw clustering [9], automatic clustering [10] and fuzzy C-means clustering [25]. Another FTS model uses artificial neural networks to forecast the daily average temperature of Taipei based on two-factor high-order FTS [17].
As the analysis of the works mentioned above shows, determining appropriate interval lengths, establishing the fuzzy relationships and creating the output prediction rules are challenging tasks that have a great influence on the predictive accuracy of the model. Another factor that plays an essential role in improving the predictive efficiency of the model is the set of observed factors other than the main forecasted factor. Although there have been remarkable achievements in choosing the length of each interval and in exploiting the output prediction rules, these problems still attract the attention of many researchers. With a view to improving the predictive efficiency of the FTS forecasting model, this study presents a new forecasting model that uses a graph-based clustering technique to determine intervals of different lengths on the dataset of the number of new confirmed cases of COVID-19 in Vietnam. In this approach, we first propose a new algorithm for finding the best interval lengths based on a graph-based clustering algorithm. Then, we define fuzzy sets based on these intervals and fuzzify the historical data into fuzzy sets. Based on these fuzzified values, we derive the fuzzy relationships (FRs). Then, from the FRs, we obtain fuzzy relationship groups (FRGs) weighted according to their chronological order. Finally, all these FRGs are used to obtain the forecasting results based on the weighted defuzzification method.
The rest of this paper is organized as follows. Basic definitions of fuzzy time series and the algorithms used are given in Section 2. Section 3 presents a forecasting model that combines FTS with the graph-based clustering algorithm. Section 4 evaluates the model's performance and compares the obtained results with those of other models. Finally, Section 5 provides some conclusions.
The Fundamental Theories
In this section, we briefly introduce the general concepts of FTS as proposed by Song and Chissom [2] and Chen [3] and improved by Chen [4].
Basic Concepts of Fuzzy Time Series
The concepts of FTS were defined by Song and Chissom [2] and Chen [3], in which the historical data are given in the form of fuzzy sets [1]. Assume that Y(t) (t = …, 0, 1, 2, …), a subset of the real numbers R (Y(t) ⊆ R), is the universe of discourse (UoD) on which the fuzzy sets fi(t) (i = 1, 2, …) are defined. If F(t) is the collection of f1(t), f2(t), …, then F(t) is called an FTS defined on Y(t).
If there exists a fuzzy relationship (FR) R(t − 1, t) between F(t − 1) and F(t) such that F(t) = F(t − 1) * R(t − 1, t), this is written F(t − 1) → F(t), where R(t − 1, t) is the first-order fuzzy relationship between F(t − 1) and F(t) and "*" represents the max-min composition operator. Here F(t) and F(t − 1) are fuzzy sets. Letting Ai = F(t − 1) and Aj = F(t), the relationship between F(t − 1) and F(t) is written Ai → Aj, where Ai and Aj are called the current state and the next state of the fuzzy relationship, respectively.
Let F(t) be a fuzzy time series. If F(t) is caused by the fuzzy sets F(t − 1), F(t − 2), …, F(t − m + 1), F(t − m), then the fuzzy relationship between them can be represented as F(t − m), …, F(t − 2), F(t − 1) → F(t). This relationship is called the mth-order FTS model [7].
To create the forecasting rules, fuzzy relationships are grouped by the time-variant method [25]: the fuzzy relationships obtained from the training dataset that share the same left-hand side are grouped together, and the right-hand sides of a group are collected only from relationships that occur at or before the forecasting time.
Graph-Based Clustering Algorithm
Graph-based clustering algorithms are powerful in giving results close to human intuition. The common characteristic of the graph-based clustering methods developed in recent years is that they build a graph on the set of data and then use the constructed graph during the clustering process. In graph-based clustering methods, data objects are represented as a graph, where each node is an object and objects are associated by links. A cluster is formed when the objects of a group are linked to one another but have no connectivity to objects outside the group. Based on these viewpoints, this research proposes a data clustering method that represents the dataset in the form of a tree and generates clusters automatically, rather than requiring the number of clusters to be pre-selected by the user. The graph-based clustering method consists of four procedures:
The Procedure of Finding Root Node (PFRN). Based on the input data, this procedure determines the root node.
Node Insertion Procedure (NIP). This procedure takes one element of the dataset and the root node as input and places the element at its proper position in the tree.
Tree Creation Procedure (TCP). From the input dataset and the root node, this procedure builds the tree.
Node Clustering Procedure (NCP). This procedure takes the tree generated by the TCP as input and groups its nodes into logical clusters.
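The tree-building part of these procedures (TCP and NIP) can be illustrated with a minimal sketch. The text does not state the insertion rule, so the binary-search-tree ordering used here (smaller values to the left, larger or equal values to the right) is an assumption, and the names `Node`, `insert_node`, `create_tree` and `in_order` are hypothetical. The NCP is not sketched because its clustering conditions are not given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert_node(root: Node, value: float) -> None:
    """NIP under an ASSUMED rule: descend from the root, going left for
    smaller values and right for larger-or-equal ones, then attach a leaf."""
    current = root
    while True:
        if value < current.value:
            if current.left is None:
                current.left = Node(value)
                return
            current = current.left
        else:
            if current.right is None:
                current.right = Node(value)
                return
            current = current.right


def create_tree(data: list[float], root_value: float) -> Node:
    """TCP: insert every observation under the given root node."""
    root = Node(root_value)
    for x in data:
        insert_node(root, x)
    return root


def in_order(node: Optional[Node]) -> list[float]:
    """Return the stored values in ascending order."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


# A few of the COVID-19 values quoted in the text, with the root 1544.
tree = create_tree([398, 414, 259, 471, 2296, 2924, 1922], root_value=1544)
print(in_order(tree))
```

Under this assumption, values below the root collect in the left subtree and values above it in the right subtree, which gives the clustering step a structure in which nearby values sit close together.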
A Proposed Forecasting Model Based on FTS and Graph-Based Clustering
The objective of this section is to propose a hybrid forecasting model that combines FTS with graph-based clustering. The framework of the proposed forecasting model includes six steps, which are presented in Figure 1. To illustrate these steps, we use the dataset of the number of new confirmed cases of COVID-19 in Vietnam from June 15, 2021 to July 15, 2021, which is depicted in Figure 2. The details of the steps of the proposed model are explained as follows.
Step 1: Partition the Historical Dataset into Intervals Using Graph-Based Clustering
This step applies the graph-based clustering algorithm of Section 2.2 to partition the historical dataset into clusters and then adjusts the clusters into intervals of unequal size. The calculation is described in the following sub-steps.

Figure 1: Flowchart of the FTS Proposed Forecasting Model Using Graph-Based Clustering

Figure 2: The Dataset of Number of Confirmed Cases of COVID-19 in Vietnam
Source: https://ncov.moh.gov.vn
Step 1.1: Apply the Graph-Based Clustering Algorithm to Partition Data into C Clusters
To partition time series X(t) into C clusters, four procedures of the graph-based clustering algorithm in Section 2.2 are used in this step. The brief results of these four procedures are explained as below:
The Procedure of Finding Root Node (PFRN)
Input the COVID-19 dataset as S:
X(t) = (398, 414, 259, 471, ..., 2296, 2924, 1922); with (15/06/2021 ≤ t ≤ 15/07/2021)
Calculate range Rg = MAXvalue − MINvalue = 2760
Calculate standard deviation of the time series as SD = 762.76:

Define universe of discourse (U) of the S:
U = [MINvalue - w, MAXvalue + w] = [163.88, 2924.12]

Figure 3: The Tree Represents the Input Data of COVID-19 Time Series Based on Two Procedures TCP and NIP with Root Node of 1544
Calculate the midpoint of U as:
Midu = (MINvalue + MAXvalue) / 2 = 1544
Assign Midu as the root node: Root = Midu = 1544
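The PFRN quantities can be reproduced with a short sketch. The function name `find_root` is hypothetical, the margin w = 0.12 is inferred from the stated bounds [163.88, 2924.12], and the use of the population standard deviation is an assumption. Only a subset of the series is quoted in the text, so the standard deviation computed here is not expected to equal the reported 762.76.

```python
import statistics


def find_root(data, w):
    """Return the range, standard deviation, universe of discourse and root."""
    lo, hi = min(data), max(data)
    rg = hi - lo                      # Rg = MAXvalue - MINvalue
    sd = statistics.pstdev(data)      # ASSUMPTION: population standard deviation
    universe = (lo - w, hi + w)       # U = [MINvalue - w, MAXvalue + w]
    root = (lo + hi) / 2              # Midu, used as the root node
    return rg, sd, universe, root


# Partial data: values quoted in the text and in Table 1 (not the full series).
sample = [398, 414, 259, 471, 164, 2296, 2924, 1922]
rg, sd, universe, root = find_root(sample, w=0.12)
print(rg, universe, root)
```

Because the sample contains the minimum (164) and maximum (2924) of the series, the range, the universe of discourse and the root node agree with the values reported in this step.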
Tree Creation Procedure (TCP) and Node Insertion Procedure (NIP)
To build the tree from the input dataset S and the root node Root, we use the two procedures TCP and NIP to create the tree and insert nodes into it. The results of these two procedures are shown in Figure 3.
Make the Clusters Based on Procedure 4 (NCP)
After creating the data tree shown in Figure 3, the procedure for making clusters is briefly explained according to the following conditions:
From the procedures above, we obtain 15 clusters and their corresponding cluster centers. These clusters are then sorted in ascending order of their cluster centers; the final results are listed in Table 1.
Step 1.2: Adjust the Clusters into Intervals
In this step, lower (Interval_LBi+1) and upper bounds (Interval_UBi) of the intervals are formed from the minimum and maximum values of the corresponding clusters, respectively.
The upper bound value of interval ui (Interval_UBi) and the lower bound value of interval ui+1 (Interval_LBi+1) can be computed as follows:

Table 1: The Completed Clusters from the Covid-19 Dataset
| No. | Clusters |
| 1 | (217,164) |
| 2 | (259,230) |
| - | - |
| 13 | (1844, 1945, 1922) |
| 14 | (2367, 2296) |
| 15 | (2924, 2924.12) |
Table 2: The Resulting Intervals and Their Midpoints
| No. | Intervals | Midpoint |
| 1 | [164, 217.5] | 190.8 |
| 2 | [230, 259] | 240.4 |
| - | - | - |
| 13 | [1844, 1945] | 1894.5 |
| 14 | [2296, 2367] | 2372.7 |
| 15 | [2924, 2924.12] | 2924.06 |

Compute the midpoint value of interval ui as follows:
After applying the conditions above, we obtain 15 intervals corresponding to the clusters in Table 1, called ui (1 ≤ i ≤ 15) and the midpoint values of these intervals are shown in Table 2.
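Step 1.2 can be illustrated with a simplified sketch. It takes each interval directly from the minimum and maximum of its cluster and uses the arithmetic midpoint; the paper's bound-adjustment equations are not reproduced in the text, so adjusted bounds such as 217.5 in Table 2 are not recovered here. Taking the cluster center as the cluster mean is also an assumption, and `clusters_to_intervals` is a hypothetical name.

```python
def clusters_to_intervals(clusters):
    """Sort clusters by center and turn each into an interval and a midpoint."""
    # ASSUMPTION: the cluster center is the arithmetic mean of its members.
    ordered = sorted(clusters, key=lambda c: sum(c) / len(c))
    intervals = [(min(c), max(c)) for c in ordered]
    midpoints = [(lo + hi) / 2 for lo, hi in intervals]
    return intervals, midpoints


# Four of the clusters listed in Table 1, given in arbitrary order.
clusters = [(2367, 2296), (217, 164), (1844, 1945, 1922), (259, 230)]
intervals, midpoints = clusters_to_intervals(clusters)
for interval, mid in zip(intervals, midpoints):
    print(interval, mid)
```

For cluster 13 this simplified rule gives the interval [1844, 1945] with midpoint 1894.5, matching Table 2; other rows of Table 2 reflect adjustments that this sketch does not model.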
Step 2: Determine Linguistic Terms for Each of Interval Obtained in Step 1
Each linguistic term is defined by an interval, and the historical time series data are distributed among these intervals. For the fifteen intervals obtained in Step 1, we obtain 15 linguistic values of the linguistic variable “confirmed cases”, e.g., {{very very very few}, {very very few}, {very few}, {few}, …, {very many}, {too many}, {too many many}, {too many many many}}, which are represented by the fuzzy sets {A1, A2, A3, …, A14, A15}, respectively, and are calculated as follows:

where the value aij ∈ [0,1] indicates the grade of membership of uj in the fuzzy set Ai. The degree of each data value is determined according to its membership grade in the fuzzy sets, as defined in equation 5. Here, the symbol ‘+’ denotes the set union operator and the symbol ‘/’ denotes the membership of uj in Ai. The value of aij is defined as follows:

Step 3: Fuzzify All Historical Time Series Data
Each interval obtained in Step 1 can cover one or more historical data values of the time series. To fuzzify all historical time series data, the common way is to convert each historical value that belongs to the universe of discourse U into a fuzzy set. If the maximum membership value of fuzzy set Ai occurs at ui, then a historical value that falls in ui is fuzzified as Ai. For example, the COVID-19 value of 398 on 15/6/2021 belongs to the interval u5, and the highest membership value of fuzzy set A5 occurs at u5, so it is fuzzified into A5. Proceeding in the same way for the following days, we complete the fuzzification of the COVID-19 data for all days, as listed in Table 3.
Table 3: The Complete Fuzzified Results
| Day | Actual data | Fuzzy sets | Linguistic value |
| 15/6/2021 | 398 | A5 | “somewhat few” |
| 16/6/2021 | 414 | A5 | “somewhat few” |
| - | - | - | - |
| 14/7/2021 | 2924 | A15 | “too many many many” |
| 15/7/2021 | 1922 | A13 | “too many” |
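The fuzzification rule of Step 3 can be sketched as an interval lookup: a value is labelled Ai when it falls in ui. Only the intervals listed in Table 2 are used, so the dictionary is partial. The fallback to the nearest midpoint for values that fall between intervals is an assumption, and `fuzzify` is a hypothetical name.

```python
def fuzzify(value, intervals):
    """Return the label Ai of the interval ui that contains the value."""
    for index, (lo, hi) in intervals.items():
        if lo <= value <= hi:
            return f"A{index}"
    # ASSUMPTION: a value outside every interval takes the nearest midpoint.
    nearest = min(
        intervals,
        key=lambda i: abs(value - (intervals[i][0] + intervals[i][1]) / 2),
    )
    return f"A{nearest}"


# Intervals as listed in Table 2 (rows 3 to 12 are elided in the table).
intervals = {
    1: (164, 217.5),
    2: (230, 259),
    13: (1844, 1945),
    14: (2296, 2367),
    15: (2924, 2924.12),
}
labels = [fuzzify(v, intervals) for v in (259, 1922, 2296, 2924)]
print(labels)
```

The labels obtained for 1922 and 2924 agree with the fuzzy sets A13 and A15 reported in Table 3.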
Table 4: The Complete 1st-Order Fuzzy Relationships
| Day | No. | Fuzzy set | 1st-order FRs |
| 15/6/2021 | - | A5 | - |
| 16/6/2021 | 1 | A5 | A5 → A5 |
| 17/6/2021 | 2 | A6 | A5 → A6 |
| - | - | - | - |
| 14/7/2021 | 29 | A15 | A14 → A15 |
| 15/7/2021 | 30 | A13 | A15 → A13 |
| 16/7/2021 | 31 | N/A | A13 → # |
Step 4: Create all mth-Order FRs between the Fuzzified Data Values (m ≥ 1)
After converting the data values of the time series into fuzzy sets, the mth-order FRs are created between two or more consecutive fuzzified values in the time series based on Definition 3. To establish these relationships, we find every relationship of the form F(t − m), F(t − m + 1), …, F(t − 1) → F(t), where the left-hand side of the FR is called the current state and the right-hand side is called the next state. Then, the mth-order FR is rewritten in terms of the corresponding fuzzy sets as:

For example, take m = 1. From Table 4, the fuzzified historical data of the time series on day t − 1 (16/6/2021) and day t (17/6/2021) are the fuzzy sets A5 and A6, respectively. The first-order FR is formed from these two consecutive fuzzy sets as A5 → A6.
In this way, we obtain the 1st-order FRs for all fuzzified data values, which are presented in column 4 of Table 4.
The linguistic value of F(16/7/2021) on the right-hand side of the last relationship is denoted by the symbol ‘#’, which represents an unknown linguistic value.
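The construction of mth-order FRs in Step 4 can be sketched as a sliding window over the sequence of fuzzy labels. The function name `build_frs` and the tuple representation of an FR are hypothetical; the final relationship, whose next state is unknown, is marked with ‘#’ as in Table 4.

```python
def build_frs(labels, order=1):
    """Return mth-order FRs as (current_state, next_state) pairs."""
    if len(labels) < order:
        raise ValueError("need at least `order` fuzzified values")
    frs = []
    for t in range(order, len(labels)):
        # Current state: the `order` labels before time t; next state: label at t.
        frs.append((tuple(labels[t - order:t]), labels[t]))
    # The relationship used for the out-of-sample forecast has an unknown next state.
    frs.append((tuple(labels[-order:]), "#"))
    return frs


# Last three fuzzified values of the series (13/7, 14/7 and 15/7/2021).
labels = ["A14", "A15", "A13"]
first_order = build_frs(labels, order=1)
second_order = build_frs(labels, order=2)
print(first_order)
print(second_order)
```

With order 1, the sketch yields A14 → A15, A15 → A13 and A13 → #, which correspond to the last three rows of Table 4.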
Step 5: Generate all mth-Order Time-Variant FRGs
In this paper, we apply the concept of the time-variant fuzzy relationship group [25] to create FRGs. FRs that share the same current state in Table 4 are grouped into an FRG by considering the history of appearance of the fuzzy sets on the next state of the FRs; such groups are called time-variant FRGs. In this way, we obtain all 1st-order time-variant FRGs, which are shown in Table 5. There are 30 groups in the training phase and one group in the testing phase.
Table 5: The Complete 1st-Order Time-Variant FRGs
| No. | 1st-order FRs | 1st-order TV-FRGs |
| G1 | ||
| G2 | ||
| - | - | - |
| G29 | ||
| G30 | ||
| G31 |
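The time-variant grouping of Step 5 can be sketched as follows: each FR produces one group, which contains the next states of all FRs with the same current state that have appeared up to that time. Keeping repeated next states in the group (rather than removing duplicates) is an assumption about the method of [25], and `time_variant_frgs` is a hypothetical name.

```python
def time_variant_frgs(frs):
    """Build one time-variant FRG per FR, in chronological order."""
    history = {}   # current state -> next states seen so far
    groups = []
    for lhs, rhs in frs:
        seen = history.setdefault(lhs, [])
        if rhs == "#":
            # Testing phase: the next state is unknown.
            groups.append((lhs, ["#"]))
        else:
            # ASSUMPTION: repeated next states are kept, in order of appearance.
            seen.append(rhs)
            groups.append((lhs, list(seen)))
    return groups


# Hypothetical short sequence of 1st-order FRs for illustration.
frs = [
    (("A5",), "A5"),
    (("A5",), "A6"),
    (("A6",), "A5"),
    (("A5",), "#"),
]
groups = time_variant_frgs(frs)
for number, (lhs, rhs) in enumerate(groups, start=1):
    print(f"G{number}: {', '.join(lhs)} -> {', '.join(rhs)}")
```

Because one group is produced per FR, a series with 30 training FRs and one testing FR yields 30 training groups and one testing group, consistent with the counts stated for Table 5.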
Step 6: Defuzzify and Compute the Forecasting Output Values
The last step is to defuzzify the fuzzy forecasts into crisp output values using fuzzy forecasting rules. In particular, our defuzzification principle from [25] is used to compute the forecasted values for all 1st-order and high-order time-variant FRGs in the training phase. Next, we use the defuzzification principle of [26] for groups with an unknown linguistic value in the testing phase. The forecasting principles are presented as follows.
Rule 1
Calculate the forecasting value with known linguistic values.
To obtain the forecasting output of the proposed model from the time-variant FRGs, we divide each interval corresponding to a linguistic value in the next state into three sub-intervals of equal length and calculate the forecasted output value for each group according to equation 6:

where FV is the forecasted value at time t and n is the total number of fuzzy sets on the next state of the FRG.
Rule 2
Calculate the forecasting value with unknown linguistic values.
In the testing phase, we calculate the forecasting value for the fuzzy relationship group whose next state is an unknown linguistic value. Assume that there is an mth-order fuzzy relationship group whose next state is #, written as Aim, Ai(m−1), …, Ai1 → #,
where the symbol “#” denotes an unknown value. The forecasted value of day i is then computed according to [26] as follows:

where mi1, mi2, …, mik are the midpoints of ui1, ui2, …, uik (2 ≤ k ≤ m), respectively.
Based on two forecasting principles above, we complete forecasting results for COVID-19 confirmed cases prediction from 15/6/2021 to 15/7/2021 based on 1st-order time variant-FRGs under fifteen intervals, which are listed in Table 6.
The present study demonstrates the application of the proposed method with two experiments. Experiment 1 compares the accuracy of its forecasted results with some conventional models mentioned above and experiment 2 illustrates the improvements in the proposed model.
Datasets and Experimental Method
Since FTS forecasting models have long been used to predict enrolments, we investigate the enrollments at the University of Alabama [3] from 1971 to 1992 as well as the number of new confirmed cases of COVID-19 in Vietnam from June 15, 2021 to July 15, 2021, which can be obtained from https://vnexpress.net/covid-19/covid-19-viet-nam.
To confirm the effectiveness of the proposed forecasting model on these two datasets, the Mean Square Error (MSE) and the Mean Absolute Percentage Error (MAPE) are employed as evaluation criteria for forecasting accuracy. The MSE and MAPE are calculated as follows:


where Ri and Fi represent the real value and the forecasted value at time i, respectively; n is the total number of forecasted data; and m is the order of the FR.
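The two criteria can be sketched with their standard definitions, the mean of squared errors and the mean of absolute relative errors expressed as a percentage. These standard forms are assumed here; the paper's own equations may start the summation at an index that depends on the order m. The three data points are the 1972 to 1974 rows of Table 7 for the proposed model, so the printed values cover only that subset.

```python
def mse(actual, forecast):
    """Mean square error over paired actual and forecasted values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((f - r) ** 2 for r, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100 * sum(abs(f - r) / r for r, f in zip(actual, forecast)) / len(actual)


# Rows 1972-1974 of Table 7: actual enrolments and the proposed model's forecasts.
actual = [13563, 13867, 14696]
forecast = [13610.2, 13741.2, 14640.9]
print(round(mse(actual, forecast), 2), round(mape(actual, forecast), 3))
```

MSE is scale dependent, which is why it is large for the enrolment and COVID-19 series, whereas MAPE is a relative measure that can be compared across datasets.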
Experiment 1
In this section, we evaluate the proposed forecasting model in the education domain on the enrolment data of the University of Alabama and compare the obtained results with previous prediction models [3,30-33] to demonstrate the performance of our method. The forecasting results obtained from the proposed model are shown in Table 7. The results in Table 7 show that, with the number of intervals equal to 10, the proposed model has a MAPE (9) value of 1.07%, which is the smallest among all the models compared. This indicates that the proposed model performs well on the enrollment problem of the University of Alabama. From the forecasted values in Table 7, it is also found that integrating the GBC technique with the fuzzy time series model significantly reduces the MAPE value for the historical university dataset.
In addition, the forecasted results of the proposed model are compared with the models in works [6,18,4,20] based on various high-order FRs with different numbers of intervals. A comparison of these models according to the MSE value is shown in Figure 4, where the CC06b model in work [20] and the C02 model in work [4] use 7th-order and 5th-order FRs, respectively. The remaining models use fuzzy relationships of order less than or equal to 4. The results in Figure 4 confirm that our proposed model has the smallest error in comparison with the four other models in terms of MSE.
For the proposed model, the MSE value is 637.9, which is the smallest forecasting error among the compared models. The proposed model thus forecasts more accurately than the existing high-order models under different numbers of intervals.
Table 6: The Complete Forecasted Results Based on the 1st-Order Time-Variant FRGs
| Day | COVID-19 data | Fuzzy sets | Forecasted values |
| 15/06/2021 | 398 | A5 | Not forecasted |
| 16/06/2021 | 414 | A5 | 414.6 |
| - | - | - | - |
| 14/07/2021 | 2924 | A15 | 2614.7 |
| 15/07/2021 | 1922 | A13 | 1898.4 |
Table 7: A Comparison of the Existing Models in Enrollments Dataset with the Proposed Model
Year | Actual Data | Model [3] | Model [30] | Model [31] (MEPA) | Model [31] (TFA) | Model [32] | Model [33] | Proposed model |
1971 | 13055 | | | | | | | |
1972 | 13563 | 14000 | 14025 | 15430 | 14230 | 14195 | 14242.0 | 13610.2 |
1973 | 13867 | 14000 | 14568 | 15430 | 14230 | 14424 | 14242.0 | 13741.2 |
1974 | 14696 | 14000 | 14568 | 15430 | 14230 | 14593 | 14242.0 | 14640.9 |
1975 | 15460 | 15500 | 15654 | 15430 | 15541 | 15589 | 15474.3 | 15054.9 |
1976 | 15311 | 16000 | 15654 | 15430 | 15541 | 15645 | 15474.3 | 15308.8 |
1977 | 15603 | 16000 | 15654 | 15430 | 15541 | 15634 | 15474.3 | 15606.6 |
1978 | 15861 | 16000 | 15654 | 15430 | 16196 | 16100 | 15474.3 | 15574 |
1979 | 16807 | 16000 | 16197 | 16889 | 16196 | 16188 | 16146.5 | 16768.1 |
1980 | 16919 | 16833 | 17283 | 16871 | 16196 | 17077 | 16988.3 | 16951.6 |
1981 | 16388 | 16833 | 17283 | 16871 | 17507 | 17105 | 16988.3 | 16673 |
1982 | 15433 | 16833 | 16197 | 15447 | 16196 | 16369 | 16146.5 | 15430.8 |
1983 | 15497 | 16000 | 15654 | 15430 | 15541 | 15643 | 15474.3 | 15560.7 |
1984 | 15145 | 16000 | 15654 | 15430 | 15541 | 15648 | 15474.3 | 15424.9 |
1985 | 15163 | 16000 | 15654 | 15430 | 15541 | 15622 | 15474.3 | 15126.7 |
1986 | 15984 | 16000 | 15654 | 15430 | 15541 | 15623 | 15474.3 | 15561.6 |
1987 | 16859 | 16000 | 16197 | 16889 | 16196 | 16231 | 16146.5 | 16768.1 |
1988 | 18150 | 16833 | 17283 | 16871 | 17507 | 17090 | 16988.3 | 17142.9 |
1989 | 18970 | 19000 | 18369 | 19333 | 18872 | 18325 | 19144.0 | 18897.1 |
1990 | 19328 | 19000 | 19454 | 19333 | 18872 | 19000 | 19144.0 | 19103 |
1991 | 19337 | 19000 | 19454 | 19333 | 18872 | 19000 | 19144.0 | 19308.8 |
1992 | 18876 | 19000 | | 19333 | 18872 | 19000 | 19144.0 | 19103 |
MAPE | | 3.11% | 2.67% | 2.75% | 2.66% | 2.66% | 2.40% | 1.07% |
Table 8: The Forecasting Results and Accuracy of the Proposed Model
Day | Actual data of Covid-19 | Fuzzy sets | Forecasted values |
15/06/2021 | 398 | A5 | Not forecasted |
16/06/2021 | 414 | A5 | 413 |
18/06/2021 | 259 | A2 | 336.2 |
19/06/2021 | 471 | A6 | 473.5 |
20/06/2021 | 300 | A3 | 299.8 |
- | - | - | - |
13/07/2021 | 2296 | A14 | 2330.2 |
14/07/2021 | 2924 | A15 | 2614.7 |
15/07/2021 | 1922 | A13 | 1898.4 |
MSE | | | 25070 |

Figure 4: A Comparison of the MSE Value Between the Proposed Model and Various High-Order FTS Models

Figure 5: A Comparison of Actual Versus Forecasted COVID-19 Cases for the Proposed Model
Experiment 2
In this section, the proposed model is applied to forecast the number of new confirmed cases of COVID-19 in Vietnam between June 15, 2021 and July 15, 2021. The performance of the proposed model is evaluated using the MSE (8). The results and accuracy of the proposed model, based on different orders with 15 intervals, are shown in Table 8. Furthermore, the forecast trend of the proposed method is illustrated in Figure 5, which shows that the forecasted values are in close accordance with the actual values.
In this study, we proposed an FTS forecasting model using a graph-based clustering technique, with the aim of attaining better forecasting accuracy. The proposed method overcomes a drawback of existing fuzzy time series forecasting methods in determining the lengths of intervals in the universe of discourse. In this approach, we apply the graph-based clustering technique to determine intervals of different lengths automatically, instead of relying on a number of intervals pre-selected by the user. Further, we define time-variant FRGs and obtain forecasted values by simple computation rather than by the max-min composition operator on fuzzy sets. The suitability of the method is examined on student enrollments at the University of Alabama and on the number of new confirmed cases of COVID-19 in Vietnam. Based on our simulation and application results, we believe that the development and use of this approach makes a meaningful contribution to the literature on the clustering of time series. In future work, the proposed model can be extended to two-factor FTS forecasting, and a new approach can be developed by applying particle swarm optimization and neural networks to further improve the forecasting accuracy.
Acknowledgment
This work was supported by Thai Nguyen University of Technology (TNUT).
Song, Q. and B.S. Chissom. “Forecasting Enrolments with Fuzzy Time Series - Part I.” Fuzzy Sets and Systems, vol. 54, no. 1, 1993, pp. 1-9.
Song, Q. and B.S. Chissom. “Fuzzy Time Series and Its Models.” Fuzzy Sets and Systems, vol. 54, no. 3, 1993, pp. 269-277.
Chen, S.M. “Forecasting Enrolments Based on Fuzzy Time Series.” Fuzzy Sets and Systems, vol. 81, 1996, pp. 311-319.
Chen, S.M. “Forecasting Enrolments Based on High-Order Fuzzy Time Series.” Cybernetics and Systems, vol. 33, no. 1, 2002, pp. 1-16.
Huarng, K. “Effective Lengths of Intervals to Improve Forecasting in Fuzzy Time Series.” Fuzzy Sets and Systems, vol. 123, no. 3, 2001, pp. 387-394.
Hwang, J.R. et al. “Handling Forecasting Problems Using Fuzzy Time Series.” Fuzzy Sets and Systems, vol. 100, 1998, pp. 217-228.
Yu, H.K. “A Refined Fuzzy Time-Series Model for Forecasting.” Physica A: Statistical Mechanics and Its Applications, vol. 346, no. 3-4, 2005, pp. 657-681.
Yu, H.K. “Weighted Fuzzy Time Series Models for TAIEX Forecasting.” Physica A: Statistical Mechanics and Its Applications, vol. 349, no. 3-4, 2005, pp. 609-624.
Bosel, M. and K. Mali. “A Novel Data Partitioning and Rule Selection Technique for Modelling High-Order Fuzzy Time Series.” Applied Soft Computing, 2017. https://doi.org/10.1016/j.asoc.2017.11.01.
Chen, S.M. and K. Tanuwijaya. “Fuzzy Forecasting Based on High-Order Fuzzy Logical Relationships and Automatic Clustering Techniques.” Expert Systems with Applications, vol. 38, 2011, pp. 15425-15437.
Loc, V.M. and P.T.H. Nghia. “Context-Aware Approach to Improve Result of Forecasting Enrollment in Fuzzy Time Series.” International Journal of Emerging Technologies in Engineering Research, vol. 5, no. 7, 2017, pp. 28-33.
Lu, W. et al. “Using Interval Information Granules to Improve Forecasting in Fuzzy Time Series.” International Journal of Approximate Reasoning, vol. 57, 2015, pp. 1-18.
Kuo, I.H. et al. “An Improved Method for Forecasting Enrolments Based on Fuzzy Time Series and Particle Swarm Optimization.” Expert Systems with Applications, vol. 36, 2009, pp. 6108-6117.
Kuo, I.H. et al. “Forecasting TAIFEX Based on Fuzzy Time Series and Particle Swarm Optimization.” Expert Systems with Applications, vol. 37, 2010, pp. 1494-1502.
Tian, Z.H. et al. “Fuzzy Time Series Based on K-Means and Particle Swarm Optimization Algorithm.” Man-Machine-Environment System Engineering, vol. 406, 2017, pp. 181-189.
Chen, S.M. et al. “Fuzzy Time Series Forecasting Based on Optimal Partitions of Intervals and Optimal Weighting Vectors.” Knowledge-Based Systems, vol. 118, 2017, pp. 204-216.
Singh, P. and B. Borah. “An Effective Neural Network and Fuzzy Time Series Based Hybridized Model to Handle Forecasting Problems of Two Factors.” Knowledge and Information Systems, vol. 38, 2014, pp. 669-690.
Singh, S.R. “A Simple Method of Forecasting Based on Fuzzy Time Series.” Applied Mathematics and Computation, vol. 186, 2007, pp. 330-339.
Huang, Y.L. et al. “A Hybrid Forecasting Model for Enrolments Based on Aggregated Fuzzy Time Series and Particle Swarm Optimization.” Expert Systems with Applications, vol. 38, 2011, pp. 8014-8023.
Chen, S.M. and N.Y. Chung. “Forecasting Enrolments Using High-Order Fuzzy Time Series and Genetic Algorithms.” International Journal of Intelligent Systems, vol. 21, 2006, pp. 485-501.
Lee, L.W. et al. “Temperature Prediction and TAIFEX Forecasting Based on High-Order Fuzzy Logical Relationship and Genetic Simulated Annealing Techniques.” Expert Systems with Applications, vol. 34, 2008, pp. 328-336.
Lee, L.W. et al. “Handling Forecasting Problems Based on Two-Factors High-Order Fuzzy Time Series.” IEEE Transactions on Fuzzy Systems, vol. 14, 2006, pp. 468-477.
Wang, N.Y. and S.M. Chen. “Temperature Prediction and TAIFEX Forecasting Based on Automatic Clustering Techniques and Two-Factors High-Order Fuzzy Time Series.” Expert Systems with Applications, vol. 36, 2009, pp. 2143-2154.
Chen, S.M. et al. “TAIEX Forecasting Based on Fuzzy Time Series, Particle Swarm Optimization Techniques and Support Vector Machines.” Information Sciences, vol. 247, 2013, pp. 62-71.
Tinh, N.V. “Enhanced Forecasting Accuracy of Fuzzy Time Series Model Based on Combined Fuzzy C-Mean Clustering with Particle Swarm Optimization.” International Journal of Computational Intelligence Applications, vol. 19, 2020, article 2050017.
Hsu, L.Y. et al. “Temperature Prediction and TAIFEX Forecasting Based on Fuzzy Relationships and MTPSO Techniques.” Expert Systems with Applications, vol. 37, 2010, pp. 2756-2770.
Park, J.I. et al. “TAIFEX and KOSPI 200 Forecasting Based on Two-Factors High-Order Fuzzy Time Series and Particle Swarm Optimization.” Expert Systems with Applications, vol. 37, 2010, pp. 959-967.
Cheng, S.H. et al. “Fuzzy Time Series Forecasting Based on Fuzzy Logical Relationships and Similarity Measures.” Information Sciences, vol. 327, 2016, pp. 272-287.
Chen, S.M. and N.Y. Chung. “Forecasting Enrollments of Students by Using Fuzzy Time Series and Genetic Algorithm.” International Journal of Information and Management Sciences, vol. 17, 2006, pp. 1-17.
Lee, H.S. and M.T. Chou. “Fuzzy Forecasting Based on Fuzzy Time Series.” International Journal of Computer Mathematics, vol. 81, no. 7, 2004, pp. 781-789.
Cheng, C. et al. “Entropy-Based and Trapezoid Fuzzification-Based Fuzzy Time Series Approaches for Forecasting IT Project Cost.” Technological Forecasting and Social Change, vol. 73, 2006, pp. 524-542.
Qiu, W. et al. “A Generalized Method for Forecasting Based on Fuzzy Time Series.” Expert Systems with Applications, vol. 38, no. 8, 2011, pp. 10446-10453.
Cheng, C.H. et al. “Multi-Attribute Fuzzy Time Series Method Based on Fuzzy Clustering.” Expert Systems with Applications, vol. 34, 2008, pp. 1235-1242.