An
Autoregressive Moving Average Model for Short Term Prediction of Non-Insulin
Dependent Diabetes Among Farmers in Benue State
John Agada1, David Adugh Kuhe 2 and Ojochegbe Noah Anthony 3*
1Department of
Mathematics and Computer Science, Rev. Fr. Moses Orshio Adasu University
Makurdi, Benue State, Nigeria
2Department of
Statistics, Joseph Sarwuan Tarka University, Makurdi, Benue State, Nigeria
3Department of
Mathematics and Computer Science, Rev. Fr. Moses Orshio Adasu University
Makurdi, Benue State, Nigeria
Corresponding Author:
Email: davidkuhe@gmail.com; Tel: 2348064842229
ABSTRACT
This study employs an Autoregressive
Moving Average (ARMA) time series model to forecast the short-term incidence of
non-insulin-dependent diabetes mellitus (Type 2 Diabetes) among farmers in
Benue State, Nigeria. The data were collected from the Benue State
Epidemiological Unit, Makurdi, and cover the monthly period from January 2005
to June 2025. The principal analytical techniques were descriptive statistics
and normality measures, the Augmented Dickey-Fuller (ADF) unit root test and
the ARMA(p,q) model. The descriptive statistics
indicated moderate variability in diabetes cases over the years, while the
Augmented Dickey-Fuller (ADF) test confirmed the stationarity of the series in
level. Model choice based on Akaike Information Criterion (AIC), Schwarz
Information Criterion (SIC), and Hannan–Quinn Criterion (HQC) identified the
ARMA(3,3) model as the best fit for forecasting diabetic cases in the study
area. The model’s high coefficient of determination (R² = 0.8905) and
statistically significant parameters (p < 0.05) demonstrated its robustness
and predictive accuracy. Diagnostic checks using autocorrelation, partial
autocorrelation, and the Ljung–Box Q-statistics showed that the residuals
behaved like white noise, indicating a well-specified model. Forecast
evaluations using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and
Mean Absolute Percentage Error (MAPE) confirmed that the model predicts
out-of-sample values accurately. The forecast for July 2025 to June 2027
revealed a potential average of approximately 6,420 diabetes cases per month
among farmers, with expected fluctuations over time. The study underscored the
growing public health concern of diabetes among the farming population in Benue
State and its implications for agricultural productivity and postharvest
losses. The study concluded that predictive modeling can serve as a vital tool
for health planners to design early intervention strategies, integrate health
management with agricultural development, and enhance the overall well-being of
rural farmers.
Keywords: Diabetes, ARIMA, Time
Series Forecasting, Non-Insulin Dependent Diabetes, Farmers, Benue State,
Public Health, Postharvest Losses
1.0 INTRODUCTION
Diabetes mellitus, often simply
referred to as diabetes, is a group of metabolic disorders characterized by
high blood sugar levels over a prolonged period. The two main types of diabetes
are type-1 diabetes, which results from the body’s inability to produce
insulin, and type-2 diabetes, which develops when the body either
becomes resistant to insulin or produces insufficient insulin to control blood
sugar levels effectively. Diabetes mellitus is a multifaceted metabolic
condition marked by high concentrations of glucose (sugar) in the bloodstream. Glucose is a crucial source of
energy for cells, and insulin, a hormone produced by the pancreas, plays a
central role in regulating its uptake into cells. In diabetes mellitus, this
regulation is disrupted, leading to persistent hyperglycemia (high blood sugar)
(American Diabetes Association, 2022).
Diabetes mellitus is a significant
public health concern worldwide, with its prevalence increasing steadily over
the past few decades. According to the International Diabetes Federation (IDF,
2019), an estimated 537 million adults aged 20-79 years were living with
diabetes globally in 2021 and this number is projected to rise to 783 million
by 2045. The prevalence of diabetes varies by region, with higher rates
observed in low- and middle-income countries, particularly in urban areas
undergoing rapid socioeconomic development and lifestyle changes (ADA, 2022).
In Nigeria, the prevalence is
estimated at 7%, and at 11.35% in the South-South zone. The Diabetes Association of
Nigeria (DAN) reported that the mortality rate of diabetes from insufficient
management far outweighs that of HIV/AIDS, malaria and cancer (Olamoyegun et al.,
2024).
Diabetes mellitus is significantly
affecting farmers in Benue State, with the prevalence rate among the yam farming
population estimated at 24.9% and the mortality rate at 8.61%, and has led to reduced
labor productivity, economic impact and health complications (Teran, 2017).
Diabetes is associated with numerous
complications that can affect nearly every organ system in the body. These
complications include microvascular conditions, namely retinopathy (vision loss),
neuropathy (nerve damage) and nephropathy (kidney damage), and macrovascular
conditions such as cardiovascular disease (for example heart attack and stroke),
as well as foot ulcers and
amputations. The burden of diabetes-related complications is substantial,
leading to increased medical costs, reduced quality of life, and higher risk of
premature mortality (ADA, 2022).
Type-2 diabetes, also known as non-insulin dependent
diabetes, is a long-term condition that affects how the body processes sugar
(glucose), which is an important source of energy. In this condition, the body
either becomes resistant to insulin, a hormone that helps move sugar into
cells, or doesn’t produce enough insulin to keep blood sugar levels normal (Sun
et al., 2021). Unlike type-1 diabetes, where the immune system attacks and
destroys insulin-producing cells in the pancreas, type-2 diabetes usually develops
slowly over time. While it was once mostly seen in adults, more children and
teenagers are now being diagnosed, largely due to increasing obesity and less
active lifestyles (Sun et al., 2021).
A major characteristic of type-2 diabetes is insulin
resistance, which means the body's cells don't respond to insulin as they
should. When this happens, the pancreas tries to make more insulin to help move
sugar into the cells. However, over time, the pancreas may struggle to keep up
with this increased demand. As a result, sugar starts to accumulate in the
blood, causing high blood sugar levels (Cloete, 2022).
Several determinants contribute to
the risk of developing type-2 diabetes, including obesity, particularly excess
fat around the abdomen (central obesity), a sedentary
lifestyle, unhealthy eating habits such as eating too many sugary and processed
foods, a family history of diabetes, older age (especially after 45),
and belonging to certain ethnic groups
(ADA, 2022).
In addition to insulin resistance, type-2 diabetes can also
involve problems with the pancreas, the organ that makes insulin. Sometimes,
the pancreas doesn't produce enough insulin to keep blood sugar levels in
check, making high blood sugar worse (Desai & Deshmukh, 2020).
Symptoms of type-2 diabetes often develop slowly and can
include increased thirst, frequent urination, fatigue, blurred vision, slow
wound healing, and repeated infections. In the early stages, some people may
not notice any symptoms at all,
which is why regular screenings are essential (IDF, 2019).
Treatment for type-2 diabetes aims
to maintain blood sugar levels within a target range to prevent serious health
problems and complications. This typically involves lifestyle modifications
such as regular exercise, healthy eating habits (including portion control and
selecting nutrient-rich foods), weight management, and monitoring blood sugar
levels (Desai & Deshmukh, 2020).
The management and treatment of
type-2 diabetes can impose financial burdens on individuals, families, and
healthcare systems. In regions where healthcare costs are primarily borne by
the individual or are not adequately covered by insurance, the expenses
associated with diabetes care can divert resources away from agricultural
investments and productivity-enhancing measures. This can directly impact
agricultural communities through reduced investment in agricultural production,
reduced income and crop loss thereby affecting their livelihood (Huang et al., 2016).
Diabetes Mellitus is diagnosed when certain blood sugar
levels are met or exceeded. Specifically, a person may be diagnosed if their
A1C is 6.5% or higher, which reflects average blood glucose over the past few
months. Alternatively, if fasting blood sugar is 126 mg/dL or higher, or if a
2-hour blood sugar reading during an oral glucose tolerance test reaches 200
mg/dL or more, a diagnosis may be made. Additionally, if an individual has a
random blood sugar of 200 mg/dL or higher along with symptoms like excessive
thirst, frequent urination, or unexplained weight loss, they may also be
diagnosed with diabetes (Jaeger et al., 2025).
Agricultural activities, like applying chemical
fertilizers and pesticides, can have environmental consequences that can
indirectly impact diabetes risk factors. For instance, exposure to chemicals
such as glyphosate or organophosphates used in farming has been associated with
a higher likelihood of developing metabolic disorders. Additionally, environmental
factors such as air pollution and climate change may exacerbate diabetes risk
factors and health outcomes, potentially affecting agricultural productivity
and crop yields (Whiting et al.,
2011). Overall, while the direct impact of type-2 diabetes on agricultural
productivity and postharvest losses may be limited, the interplay between
diabetes, dietary patterns, healthcare access, and environmental factors can
have broader implications for agricultural communities and food systems.
Addressing the complex relationship between health, agriculture, and the
environment requires a holistic approach that considers socioeconomic factors,
public health interventions, and sustainable agricultural practices (Whiting et al., 2011).
This study therefore attempts to
extend the existing literature and contribute to the existing body of knowledge
by modeling and forecasting non-insulin-dependent diabetes among farmers in
Benue State using an autoregressive moving average (ARMA) time series model with
more recent data.
2.0 MATERIALS AND METHODS
2.1 Method of Data Collection
The data utilized in this research work are monthly secondary time series data
on the morbidity incidence of type-2 diabetes in Benue State for the period
January 2005 to June 2025, making a total of 246 observations. The data were
collected from the Benue State Epidemiological Unit, Makurdi. The data were
transformed to natural logarithms using the following formula:
$$Y_t = \ln(X_t)$$
where $X_t$ is the confirmed type-2 diabetes series observation indexed by time
$t$, while $\ln$ is the natural logarithm. Henceforth, $Y_t$ will be referred to
as the series.
2.2 Methods of Data Analysis
The statistical tools employed in the analysis of data in this work are
described below.
2.2.1 Descriptive statistics and normality measures
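The mean of any given set of data can be computed as follows:
$$\bar{Y} = \frac{1}{T}\sum_{t=1}^{T} Y_t$$
The sample standard deviation of any given set of data over a given period of
time is computed using the formula:
$$s = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^2}$$
where $\bar{Y}$ is the sample mean and $T$ is the sample size.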
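The Jarque-Bera test is a normality test of whether a given sample has skewness
and kurtosis similar to those of a normal distribution. The test was proposed by
Jarque and Bera (1980, 1987) and tests the null hypothesis that the series is
normally distributed. Given any data set, the test statistic JB is defined as:
$$JB = \frac{T}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$$
where $S$ is the sample skewness, given by:
$$S = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^3}{\left(\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^2\right)^{3/2}}$$
and $K$ is the sample kurtosis, given by:
$$K = \frac{\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^4}{\left(\frac{1}{T}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^2\right)^{2}}$$
where $T$ is the total number of observations. The JB normality test checks the
following pair of hypotheses:
$H_0: S = 0$ and $K = 3$ (i.e., $Y_t$ follows a normal distribution)
$H_1: S \neq 0$ or $K \neq 3$ (i.e., $Y_t$ does not follow a normal distribution).
The test rejects the null hypothesis if the p-value of the JB test statistic is
less than the chosen level of significance.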
2.2.2 Augmented Dickey-Fuller (ADF) unit root
test
The Augmented Dickey-Fuller (ADF) test helps to
identify if a time series is stationary or has a unit root, indicating a
persistent trend over time (Dickey and Fuller, 1979).
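It accounts for higher-order correlations by assuming the series follows an
AR(p) process and incorporates lagged differences of the series into the
regression to enhance the test's precision:
$$\Delta Y_t = \alpha Y_{t-1} + x_t'\delta + \sum_{i=1}^{p}\beta_i \Delta Y_{t-i} + \varepsilon_t$$
where $x_t$ are optional exogenous regressors which may consist of a constant,
or a constant and trend, $\alpha$ and $\delta$ are parameters to be estimated,
$\beta_i$ are the coefficients of the lagged difference terms, and the
$\varepsilon_t$ are assumed to be white noise. The null and alternative
hypotheses are written as:
$$H_0: \alpha = 0 \quad \text{against} \quad H_1: \alpha < 0 \qquad (8)$$
and evaluated using the conventional $t$-ratio for $\alpha$:
$$t_\alpha = \frac{\hat{\alpha}}{se(\hat{\alpha})}$$
where $\hat{\alpha}$ is the estimate of $\alpha$ and the coefficient standard
error is denoted as $se(\hat{\alpha})$.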
2.2.3 Portmanteau test
A Portmanteau test, also called the Ljung-Box Q-statistic test, is used to
determine whether there is any remaining serial correlation or autocorrelation
in the residuals of a time series. The test checks the following pair
of hypotheses:
$H_0: \rho_1 = \rho_2 = \dots = \rho_m = 0$ (all lag correlations are zero)
$H_1: \rho_k \neq 0$ for some $k$ (there is at least one lag with non-zero
correlation). The test statistic is given by:
$$Q = T(T+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^2}{T-k}$$
where $\hat{\rho}_k$ denotes the sample autocorrelation at lag $k$, $T$ is the
sample size and $m$ is the number of lags tested. We reject $H_0$ if the
p-value is less than the chosen
level of significance (Ljung and Box, 1979).
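The Q-statistic described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `sample_acf` and `ljung_box_q` and the toy series are assumptions made for the example, not the software or data used in the study.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations rho_1..rho_max_lag of a sequence of numbers."""
    T = len(series)
    mean = sum(series) / T
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    return [
        sum(dev[t] * dev[t - k] for t in range(k, T)) / c0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(rho, T):
    """Cumulative Ljung-Box Q statistics for lags 1..len(rho)."""
    q_values, running = [], 0.0
    for k, r in enumerate(rho, start=1):
        running += r * r / (T - k)
        q_values.append(T * (T + 2) * running)
    return q_values


# Toy series (hypothetical numbers, not the study data).
toy = [1.0, 2.0, 3.0, 4.0, 5.0]
rho = sample_acf(toy, 2)
q = ljung_box_q(rho, len(toy))
print(rho, q)
```

Each Q value would then be compared with a chi-square distribution whose degrees of freedom equal the number of lags included.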
2.3 Time Series Models Specification
To specify the ARIMA
model, which is the model framework used in this study, we first specify the
autoregressive (AR) model, the moving average (MA) model and the autoregressive
moving average (ARMA) model before specifying the autoregressive integrated
moving average (ARIMA) model. These models are specified as follows.
2.3.1 The autoregressive (AR) model
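A stochastic time series process $\{Y_t\}$ is an autoregressive process of
order p, denoted AR($p$), if it satisfies the difference equation
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$
where $\varepsilon_t$ is a white noise and $\phi_1, \phi_2, \dots, \phi_p$
are constants to be determined.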
2.3.2 Moving average (MA) model
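A time series $\{Y_t\}$ which satisfies the difference equation
$$Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$
where $\theta_1, \theta_2, \dots, \theta_q$ are fixed constants with
$\varepsilon_t$ as white noise is called a moving average process of order q,
denoted MA($q$).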
2.3.3 Autoregressive moving average
(ARMA) model
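A stochastic time series process $\{Y_t\}$ which results from a linear
combination of autoregressive and moving average processes is called an
Autoregressive Moving Average (ARMA) process of order p, q, denoted
ARMA($p, q$), if it satisfies the following difference equation:
$$Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
where $\phi_1, \dots, \phi_p$ are fixed constants associated with the
AR terms and $\theta_1, \dots, \theta_q$ are fixed constants associated with
the MA terms, with $\varepsilon_t$ being a white noise. The stationarity of an
ARMA($p, q$) process is guaranteed if the roots of the polynomial
$$\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \dots - \phi_p z^p$$
lie outside the unit circle.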
An
ARMA () model is specified as:
2.3.4 Autoregressive integrated moving average
(ARIMA) model
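Autoregressive (AR), Moving Average (MA) or Autoregressive Moving Average
(ARMA) models in which differences have been taken are collectively called
Autoregressive Integrated Moving Average or ARIMA models. A time series
$\{Y_t\}$ is said to follow an integrated autoregressive moving average model
if the $d$th difference $W_t = \nabla^d Y_t$ is a stationary ARMA process. If
$W_t$ follows an ARMA(p, q) model, we say that $\{Y_t\}$ is an ARIMA(p, d, q)
process. For practical purposes, we can usually take $d = 1$ or at most 2.
Consider then an ARIMA(p, 1, q) process. With $W_t = Y_t - Y_{t-1}$, we have
$$W_t = \phi_1 W_{t-1} + \dots + \phi_p W_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$
In terms of the observed series,
$$Y_t - Y_{t-1} = \phi_1 (Y_{t-1} - Y_{t-2}) + \dots + \phi_p (Y_{t-p} - Y_{t-p-1}) + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$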
2.4 Model Order Selection
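We use the following information criteria
for model order selection in conjunction with the log likelihood function: the
Akaike information criterion (AIC) due to Akaike (1978), the Schwarz information
criterion (SIC) due to Schwarz (1978) and the Hannan-Quinn information criterion
(HQC) due to Hannan (1980). The formulae for the information criteria are:
$$AIC = -\frac{2\ln L}{T} + \frac{2k}{T}$$
$$SIC = -\frac{2\ln L}{T} + \frac{k\ln T}{T}$$
$$HQC = -\frac{2\ln L}{T} + \frac{2k\ln(\ln T)}{T}$$
where $k$ is the number of free parameters to be estimated in the model, $T$ is
the number of observations and $L$ is the likelihood function, whose logarithm
is defined as:
$$\ln L = -\frac{T}{2}\left[1 + \ln(2\pi) + \ln\left(\frac{1}{T}\sum_{t=1}^{T}\hat{\varepsilon}_t^2\right)\right]$$
with $\hat{\varepsilon}_t$ denoting the model residuals. Thus,
given a set of estimated ARMA models for a given set of data, the preferred
model is the one with the minimum information criteria and maximum log
likelihood.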
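The selection rule can be illustrated with a short Python sketch. It assumes the per-observation form of the criteria (each term divided by T), which matches the scale of the values reported later in the paper; the function name `information_criteria` and the two candidate models are hypothetical and chosen only for illustration.

```python
import math


def information_criteria(loglik, k, T):
    """Per-observation AIC, SIC and HQC from a log likelihood.

    loglik: maximised log likelihood, k: number of estimated parameters,
    T: number of observations used in estimation.
    """
    base = -2.0 * loglik / T
    return {
        "AIC": base + 2.0 * k / T,
        "SIC": base + k * math.log(T) / T,
        "HQC": base + 2.0 * k * math.log(math.log(T)) / T,
    }


# Hypothetical candidates: (label, log likelihood, number of parameters).
candidates = [("A", -100.0, 3), ("B", -98.5, 5)]
T = 200
scores = {name: information_criteria(ll, k, T) for name, ll, k in candidates}

# The preferred model minimises the chosen criterion.
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
print(scores, best_by_aic)
```

In this toy comparison model B has the higher log likelihood, but its two extra parameters cost more than the fit they buy, so model A is preferred by AIC.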
2.5 Model Forecast Evaluation
We employed the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
accuracy measures to select an optimal model that is both parsimonious and
forecasts the data accurately, based on minimum values of the accuracy measures.
2.5.1 Root Mean Square Error (RMSE)
The Root Mean Square Error is
a statistical tool for measuring the accuracy of a forecast method. It is
computed as:
$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{Y}_t - Y_t\right)^2}$$
where $\hat{Y}_t$ is the forecast value of the series, $Y_t$ is the actual
series and $n$ is the number of forecast observations.
2.5.2 Mean Absolute Error (MAE)
The mean absolute
error (MAE) is a statistical tool for measuring the average size of the errors
in a collection of predictions, without taking their directions into account.
It is measured as the average absolute difference between the predicted values and
the actual values and is used to assess the effectiveness of a model. It is
given as:
$$MAE = \frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right|$$
where $Y_t$ is the actual value of the series at time $t$, $\hat{Y}_t$ is
the forecasted value of the series and $n$ is
the number of observations. The lower the value of RMSE and MAE, the better the
model is able to forecast future values.
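Both accuracy measures are simple to compute by hand or in code. The sketch below is illustrative only: the function name `forecast_accuracy` and the three data points are assumptions for the example, not values from the study.

```python
import math


def forecast_accuracy(actual, forecast):
    """Return (RMSE, MAE) for paired actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return rmse, mae


# Hypothetical actual and forecast values.
rmse, mae = forecast_accuracy([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(rmse, mae)
```

Because RMSE squares the errors before averaging, it penalises the single error of size 2 more heavily than MAE does, which is why RMSE is never smaller than MAE.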
3.0 RESULTS AND DISCUSSION
3.1 Summary Statistics and Normality Measures
This
study seeks to provide a short-term prediction of non-insulin-dependent
diabetes (Type-2 diabetes mellitus) among farmers in Benue State using the
Autoregressive Moving Average (ARMA) time series model. Before model
estimation, a preliminary analysis of the dataset was conducted to summarize
its key characteristics and assess the normality of the distribution. Table 1
below presents the descriptive statistics and normality test results for the
observed monthly diabetes cases.
Table 1: Summary Statistics and Normality Measures
| Variable | Statistic |
|---|---|
| Mean | 5571.321 |
| Maximum | 9661.000 |
| Minimum | 3624.000 |
| Standard Deviation | 1769.088 |
| Skewness | 0.010212 |
| Kurtosis | 1.767498 |
| Jarque-Bera Statistic | 15.57465 |
| p-value | 0.000415 |
| Number of Observations | 246 |
From the summary
statistics and normality measures reported in Table 1 above, the mean value of
approximately 5571 cases is the average monthly number of recorded
non-insulin-dependent diabetes cases among farmers during the study period,
while the maximum and minimum values (9661 and 3624, respectively) show the
range of variation in the data. The standard deviation (1769), about 32% of
the mean, implies moderate
variability in the monthly incidence of diabetes cases.
The skewness value (0.010212),
being close to zero, indicates that the distribution of the series is
approximately symmetric. However, the kurtosis value (1.767498) is less than 3,
signifying a platykurtic distribution, that is, the data are relatively flatter
than a normal distribution with lighter tails.
The Jarque–Bera statistic
(15.57465) with an associated p-value of 0.000415 is statistically significant
at the 1% level, leading to the rejection of the null hypothesis of normality.
This implies that the series does not follow a perfectly normal distribution,
which is a common characteristic of real-world time series data.
Overall, the results suggest that while the data are
fairly symmetric, they deviate slightly from normality, a factor to be
considered when fitting and diagnosing the ARMA model for accurate short-term
forecasting.
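The Jarque-Bera figures in Table 1 can be reproduced from the reported skewness, kurtosis and sample size. The sketch below uses the standard formula JB = (T/6)(S² + (K − 3)²/4) and the fact that the survival function of a chi-square distribution with 2 degrees of freedom is exp(−x/2); the function name `jarque_bera` is an assumption for the example.

```python
import math


def jarque_bera(skewness, kurtosis, T):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    jb = T / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return jb, p_value


# Skewness, kurtosis and sample size as reported in Table 1.
jb, p = jarque_bera(0.010212, 1.767498, 246)
print(round(jb, 4), round(p, 6))
```

Almost all of the statistic comes from the kurtosis term, so the rejection of normality here reflects the flat, light-tailed shape of the distribution rather than any asymmetry.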
4.2 Graphical Examination of Diabetes Mellitus Series
Examining the morbidity cases of diabetes mellitus is
essential for identifying trends and patterns over time, which can provide
insights into the progression and fluctuations of the disease within a
population. By analyzing these visual representations, healthcare providers and
policymakers can better understand peak periods, seasonal variations, and the
impact of interventions. This information is crucial for planning targeted
healthcare responses, optimizing resource allocation, and developing strategies
to reduce disease incidence and manage complications, ultimately improving
health outcomes for affected populations. The time plots of the level and
log-transformed series of diabetes mellitus are shown in Figures 1 and 2,
respectively.
The time plots of the level series and the log-transformed
series in Figures 1 and 2 suggest that both series are
covariance (weakly) stationary, which implies the absence of a unit root in the
level series. This is suggested by the absence of any persistent upward or
downward drift in either series.
Figure 1: Time Series Plot of Diabetes Mellitus in Benue State from 2005 to 2025
Figure 2: Time Series Plot of Natural Log of Diabetes Mellitus in Benue State from 2005 to 2025
4.3 Augmented Dickey-Fuller (ADF) Unit Root Test Result
To ensure the
appropriateness of applying an Autoregressive Moving Average (ARMA) model for
short-term prediction of non–insulin-dependent diabetes cases among farmers in
Benue State, it is necessary to examine the time series properties of the data.
A key requirement for ARMA modeling is that the underlying series must be
stationary. Therefore, the Augmented Dickey–Fuller (ADF) unit root test was
conducted to determine whether the series is
stationary. Table 2 below presents the results of the ADF test under two
specifications: with an intercept only, and with both intercept and trend.
The ADF statistics reported in
Table 2 below for both model specifications (intercept only and intercept with
trend) are -15.3344 and -15.4304, respectively. These values are far more
negative than their corresponding 5% critical values (-2.8731 and -3.4283). In
addition, the associated p-values are 0.0000, indicating strong statistical
significance. Because the ADF test statistics are well below the critical
values and the p-values are less than 0.05, the null hypothesis of a unit root
is rejected under both model specifications. This confirms that the series is stationary in its level
form. Stationarity implies that the mean and variance of the diabetes case
series remain stable over time, making it suitable for direct ARMA modeling
without differencing. The strong evidence of stationarity enhances the reliability
of subsequent short-term forecasts produced by the ARMA model.
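To make the mechanics of the test statistic concrete, the sketch below computes the t-ratio for a simplified Dickey-Fuller regression with an intercept and no lagged difference terms (that is, p = 0). This is a deliberate simplification of the augmented test, and the simulated series, the seed and the function name `dickey_fuller_t` are assumptions for illustration; it is not the study's data or software.

```python
import math
import random


def dickey_fuller_t(y):
    """t-ratio for alpha in: dy_t = c + alpha * y_{t-1} + e_t (no lagged differences)."""
    lagged = y[:-1]
    diffs = [curr - prev for prev, curr in zip(y[:-1], y[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    alpha = sxy / sxx                      # OLS slope on the lagged level
    intercept = mean_d - alpha * mean_x
    rss = sum((d - intercept - alpha * x) ** 2 for x, d in zip(lagged, diffs))
    se_alpha = math.sqrt(rss / (n - 2) / sxx)
    return alpha / se_alpha


# Simulated stationary series: a constant level plus white noise (hypothetical).
rng = random.Random(0)
simulated = [8.7 + rng.gauss(0.0, 0.3) for _ in range(246)]
t_stat = dickey_fuller_t(simulated)

# Compare with the 5% critical value quoted for the intercept-only case.
print(t_stat < -2.8731)
```

For a series that is essentially noise around a fixed level, the estimated alpha is close to -1 and the t-ratio is far below the critical value, which is the same qualitative pattern as the statistics reported for the diabetes series.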
Table 2: Augmented Dickey-Fuller (ADF) Unit Root Test Result
| Variable | Option | ADF Test Statistic | p-value | 5% Critical Value |
|---|---|---|---|---|
| | Intercept only | -15.3344 | 0.0000 | -2.8731 |
| | Intercept & Trend | -15.4304 | 0.0000 | -3.4283 |
4.4 Autocorrelations and Partial Autocorrelations Functions of the Series
After confirming that the
series of non–insulin-dependent diabetes cases among farmers in Benue State is
stationary, the next step in the ARMA modeling process involves examining the
autocorrelation structure of the series. The Autocorrelation Function (ACF) and
Partial Autocorrelation Function (PACF) are used to identify the dependence
pattern between current and past observations, which guides the selection of
appropriate autoregressive (AR) and moving-average (MA) orders.
Furthermore, the Ljung-Box
Q-statistics were computed to test for the joint significance of
autocorrelations up to various lags. This test determines whether the residuals
are independently distributed — a key requirement for model adequacy. Table 3
below presents the ACF, PACF, and Ljung-Box Q-statistics results for the series,
while Figure 3 presents the ACF and PACF plots of the series.
The results of ACF and PACF
reported in Table 3 below and Figure 3 show that the autocorrelation (ACF) and
partial autocorrelation (PACF) coefficients for all lags are small in
magnitude, fluctuating around zero. This indicates the absence of significant
serial correlation in the data. None of the autocorrelations exceed the
approximate 95% confidence bounds (about ±0.125, that is ±1.96/√246, for the sample size of 246),
suggesting that the time series behaves like a white-noise process.
The Ljung-Box Q-statistics and
their corresponding p-values across all lags (p > 0.05) further confirm that
there is no significant autocorrelation remaining in the residuals. This means
that the null hypothesis of no autocorrelation cannot be rejected at any lag,
implying that the series is adequately described by a stationary stochastic
process (Ljung & Box, 1979).
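The approximate 95% bounds for the sample autocorrelations of a white-noise series are ±1.96/√T. The short sketch below evaluates this bound for T = 246 and compares it with the four largest ACF magnitudes listed in Table 3; the variable names are assumptions for the example.

```python
import math

T = 246
bound = 1.96 / math.sqrt(T)  # approximate 95% bound under white noise

# The four ACF values of largest magnitude in Table 3, keyed by lag.
largest_acf = {10: -0.110, 20: -0.092, 22: -0.115, 34: -0.109}
outside = [lag for lag, r in largest_acf.items() if abs(r) > bound]

print(round(bound, 3), outside)
```

Since even the largest coefficients fall inside the bound, none of the 36 reported autocorrelations is individually significant at the 5% level.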
Table 3: Autocorrelations and Ljung-Box Q-Statistics Test Results
| Lag | ACF | PACF | Q-Statistics | p-value |
|---|---|---|---|---|
| 1 | 0.014 | 0.014 | 0.0458 | 0.831 |
| 2 | -0.019 | -0.019 | 0.1338 | 0.935 |
| 3 | 0.004 | 0.005 | 0.1380 | 0.987 |
| 4 | -0.049 | -0.050 | 0.7497 | 0.945 |
| 5 | 0.022 | 0.024 | 0.8747 | 0.972 |
| 6 | 0.037 | 0.034 | 1.2165 | 0.976 |
| 7 | 0.022 | 0.023 | 1.3420 | 0.987 |
| 8 | 0.017 | 0.015 | 1.4126 | 0.994 |
| 9 | -0.007 | -0.005 | 1.4260 | 0.998 |
| 10 | -0.110 | -0.107 | 4.5659 | 0.918 |
| 11 | -0.025 | -0.022 | 4.7227 | 0.944 |
| 12 | 0.078 | 0.075 | 6.2944 | 0.901 |
| 13 | -0.008 | -0.012 | 6.3115 | 0.934 |
| 14 | -0.017 | -0.027 | 6.3907 | 0.956 |
| 15 | 0.052 | 0.055 | 7.0970 | 0.955 |
| 16 | -0.035 | -0.022 | 7.4226 | 0.964 |
| 17 | -0.012 | -0.008 | 7.4599 | 0.977 |
| 18 | -0.088 | -0.093 | 9.5213 | 0.946 |
| 19 | -0.054 | -0.050 | 10.302 | 0.945 |
| 20 | -0.092 | -0.114 | 12.567 | 0.895 |
| 21 | -0.026 | -0.032 | 12.750 | 0.917 |
| 22 | -0.115 | -0.115 | 16.369 | 0.797 |
| 23 | 0.007 | 0.008 | 16.381 | 0.838 |
| 24 | -0.053 | -0.074 | 17.165 | 0.842 |
| 25 | -0.056 | -0.036 | 18.032 | 0.841 |
| 26 | -0.047 | -0.056 | 18.643 | 0.851 |
| 27 | 0.055 | 0.057 | 19.482 | 0.852 |
| 28 | -0.011 | -0.032 | 19.514 | 0.882 |
| 29 | 0.060 | 0.057 | 20.511 | 0.876 |
| 30 | 0.056 | 0.042 | 21.381 | 0.876 |
| 31 | 0.040 | 0.061 | 21.828 | 0.888 |
| 32 | -0.001 | -0.015 | 21.828 | 0.912 |
| 33 | -0.027 | -0.007 | 22.036 | 0.927 |
| 34 | -0.109 | -0.121 | 25.432 | 0.855 |
| 35 | -0.056 | -0.074 | 26.342 | 0.854 |
| 36 | 0.066 | 0.025 | 27.604 | 0.841 |
Figure 3: Plots of ACF and PACF of Log Transformed Series
Collectively, these findings
suggest that the series is not driven by persistent temporal dependence, and
any ARMA model fitted to the data should yield uncorrelated and well-behaved
residuals. Therefore, the dataset is suitable for ARMA model identification and
estimation, and the absence of significant autocorrelation validates the
appropriateness of proceeding with short-term forecasting using the ARMA
framework.
4.5 Model Order Selection
Following the establishment of
stationarity and the absence of significant autocorrelation in the diabetes
time series, various ARMA model orders were estimated to determine the most
parsimonious and best-fitting specification for short-term prediction. Model
selection was based on several statistical criteria, including the Log
Likelihood (LogL), Akaike Information Criterion (AIC), Schwarz Information
Criterion (SIC), and Hannan–Quinn Criterion (HQC). Generally, the preferred
model is the one with the highest Log Likelihood and the lowest values of AIC,
SIC, and HQC. Table 4 below presents the results of the model order selection
process.
Among the twenty-four ARMA
model specifications estimated, the ARMA(3,3) model exhibits the highest Log
Likelihood value (-24.0103) and the lowest AIC (0.2552), SIC (0.3159), and HQC
(0.2958) values. These results indicate that the ARMA(3,3) model provides the
best balance between goodness-of-fit and parsimony.
Table 4: Model Order Selection using Log Likelihood and Information Criteria
| S/n | Model | LogL | AIC | SIC | HQC |
|---|---|---|---|---|---|
| 1. | ARMA(0,1) | -34.4597 | 0.2964 | 0.3349 | 0.3079 |
| 2. | ARMA(1,0) | -34.8194 | 0.3006 | 0.3391 | 0.3121 |
| 3. | ARMA(1,1) | -32.9444 | 0.2934 | 0.3363 | 0.3107 |
| 4. | ARMA(0,2) | -34.4107 | 0.3042 | 0.3469 | 0.3214 |
| 5. | ARMA(2,0) | -35.1256 | 0.3125 | 0.3555 | 0.3298 |
| 6. | ARMA(1,2) | -32.9256 | 0.3014 | 0.3586 | 0.3245 |
| 7. | ARMA(2,1) | -33.2988 | 0.3057 | 0.3631 | 0.3288 |
| 8. | ARMA(2,2) | -30.3771 | 0.2899 | 0.3616 | 0.3188 |
| 9. | ARMA(0,3) | -34.4060 | 0.3122 | 0.3692 | 0.3352 |
| 10. | ARMA(3,0) | -35.4688 | 0.3248 | 0.3823 | 0.3480 |
| 11. | ARMA(1,3) | -28.0912 | 0.2701 | 0.3616 | 0.3089 |
| 12. | ARMA(3,1) | -32.9028 | 0.3119 | 0.3838 | 0.3409 |
| 13. | ARMA(2,3) | -30.3708 | 0.2981 | 0.3841 | 0.3328 |
| 14. | ARMA(3,2) | -30.5304 | 0.3007 | 0.3859 | 0.3354 |
| 15. | ARMA(3,3)** | -24.0103 | 0.2552 | 0.3159 | 0.2958 |
| 16. | ARMA(0,4) | -34.1157 | 0.3180 | 0.3893 | 0.3467 |
| 17. | ARMA(4,0) | -35.3492 | 0.3335 | 0.4056 | 0.3625 |
| 18. | ARMA(1,4) | -34.4466 | 0.3302 | 0.4159 | 0.3647 |
| 19. | ARMA(4,1) | -35.3432 | 0.3417 | 0.4282 | 0.3765 |
| 20. | ARMA(2,4) | -32.0099 | 0.3198 | 0.4201 | 0.3602 |
| 21. | ARMA(4,2) | -26.7027 | 0.2785 | 0.3795 | 0.3192 |
| 22. | ARMA(3,4) | -25.4065 | 0.2799 | 0.3899 | 0.3213 |
| 23. | ARMA(4,3) | -33.4797 | 0.3428 | 0.4581 | 0.3893 |
| 24. | ARMA(4,4) | -31.4253 | 0.2962 | 0.4060 | 0.3285 |
Therefore, based on the
information criteria, the ARMA(3,3) model is selected as the optimal model for
forecasting short-term variations in non–insulin-dependent diabetes cases among
farmers in Benue State. This suggests that both autoregressive and moving
average components up to the third order significantly contribute to capturing
the dynamic structure of the series.
4.6 Parameter Estimates of ARMA(3,3) Model
After selecting
the ARMA(3,3) model as the optimal specification based on the information
criteria, the model parameters were estimated to evaluate the dynamic
relationship between past observations and random disturbances in the series of
non–insulin-dependent diabetes cases among farmers in Benue State. Table 5
below presents the estimated coefficients of the ARMA(3,3) model, along with
their corresponding standard errors, t-statistics, and p-values.
Goodness-of-fit measures such as the R-squared, Adjusted R-squared,
F-statistic, and Durbin–Watson statistic are also reported to assess the
adequacy of the fitted model.
Table 5: Parameter Estimates of ARMA(3,3) Model
| Variable | Coefficient | Std. Error | t-Statistic | p-value |
|---|---|---|---|---|
| C | 8.768664 | 0.017218 | 509.2761 | 0.0000 |
| AR(1) | 0.366096 | 0.024641 | 14.85713 | 0.0000 |
| AR(2) | 0.311203 | 0.029382 | 10.59171 | 0.0000 |
| AR(3) | -0.912359 | 0.024212 | -37.68166 | 0.0000 |
| MA(1) | -0.372828 | 0.009593 | -38.86277 | 0.0000 |
| MA(2) | -0.386923 | 0.009312 | -41.55086 | 0.0000 |
| MA(3) | 0.982389 | 0.007644 | 128.5160 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.890511 | AIC | 0.255229 |
| Adjusted R2 | 0.867389 | SIC | 0.315852 |
| F-statistic | 6.914400 | HQC | 0.295759 |
| Prob(F-stat.) | 0.000951 | Durbin-Watson stat. | 2.011502 |
The model estimation results
reported in Table 5 show that all autoregressive (AR) and moving average (MA)
coefficients are statistically significant at the 1% level, as indicated by
their very low p-values (p < 0.01). This implies that past values and past
error terms up to the third lag significantly influence the current level of
non–insulin-dependent diabetes cases among farmers.
Specifically, the positive
coefficients of AR(1) and AR(2) suggest a direct persistence effect, meaning
that increases in diabetes cases in the immediate past periods tend to raise
current cases. Conversely, the negative AR(3) coefficient indicates a corrective
mechanism, implying that after about three periods, the series tends to revert
toward its mean. The MA(1) and MA(2) coefficients are negative while the MA(3)
coefficient is positive, suggesting that short-term shocks have both dampening
and amplifying effects over time before dissipating.
The high R-squared (0.8905) and
adjusted R-squared (0.8674) values indicate that approximately 89% of the
variation in diabetes cases is explained by the model, signifying a very good
fit. The F-statistic (6.9144) with a significant probability value (0.000951)
confirms the overall significance of the model. The Durbin–Watson statistic
(2.0115) is close to 2, suggesting the absence of serial correlation in the
residuals, while the information criteria (AIC = 0.2552, SIC = 0.3159, HQC =
0.2958) reaffirm that the ARMA(3,3) model remains the most parsimonious and
efficient choice.
Overall, the ARMA(3,3) model adequately captures
the temporal dynamics and short-term fluctuations in non–insulin-dependent
diabetes cases among farmers in Benue State, making it suitable for reliable
short-term forecasting.
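A fitted ARMA model is only usable for forecasting when its autoregressive part is stationary and its moving-average part is invertible, that is, when all inverse roots of both lag polynomials lie inside the unit circle. The Python sketch below applies the Jury stability conditions for a cubic polynomial to the Table 5 estimates. It assumes the plus-sign convention for MA terms, which the source does not state, and the function name is illustrative.

```python
def roots_inside_unit_circle(a2, a1, a0):
    """Jury conditions for z^3 + a2*z^2 + a1*z + a0.

    Returns True when every root of the cubic has modulus below 1.
    """
    p_at_1 = 1 + a2 + a1 + a0            # polynomial evaluated at z = 1
    p_at_minus_1 = -1 + a2 - a1 + a0     # polynomial evaluated at z = -1
    return (
        p_at_1 > 0
        and p_at_minus_1 < 0
        and abs(a0) < 1
        and abs(a0 * a0 - 1) > abs(a0 * a2 - a1)
    )


# Estimates copied from Table 5.
phi = (0.366096, 0.311203, -0.912359)     # AR(1), AR(2), AR(3)
theta = (-0.372828, -0.386923, 0.982389)  # MA(1), MA(2), MA(3)

# Inverse AR roots solve z^3 - phi1*z^2 - phi2*z - phi3 = 0.
ar_stationary = roots_inside_unit_circle(-phi[0], -phi[1], -phi[2])

# Inverse MA roots solve z^3 + theta1*z^2 + theta2*z + theta3 = 0
# (assumed plus-sign MA convention).
ma_invertible = roots_inside_unit_circle(theta[0], theta[1], theta[2])

print("AR part stationary:", ar_stationary)
print("MA part invertible:", ma_invertible)
```

Under these assumptions both checks pass, although the MA condition at z = -1 is met only narrowly (the polynomial evaluates to roughly -0.0035 there), which suggests an inverse MA root close to the unit circle.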
4.7 Model Diagnostic Checks
Following the
estimation of the ARMA(3,3) model for predicting non–insulin-dependent diabetes
cases among farmers in Benue State, diagnostic checks, namely a multicollinearity
test and the Ljung–Box Q-statistic test, were conducted to verify the adequacy of
the fitted model. This assessment ensures that the residuals behave like white
noise: uncorrelated, homoscedastic, and pattern-free over time. The tests are
presented in the following subsections.
4.7.1 Multicollinearity test result
Multicollinearity
diagnostics were performed to check that the AR and MA terms in the ARMA(3,3) model
were not excessively correlated with one another. Using
the Variance Inflation Factor (VIF) for each autoregressive (AR) and moving
average (MA) term, the test assessed how multicollinearity might affect the
stability and reliability of parameter estimates. Generally, VIF values above
10 indicate severe multicollinearity, values between 5 and 10 suggest moderate
correlation, and values below 5 imply no serious concern. The results presented
in Table 6 show both uncentered and centered VIF statistics for the ARMA(3,3)
model parameters.
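To make the VIF calculation concrete, the following is a minimal Python sketch of the centered VIF for a generic set of regressors. It relies on the identity that the centered VIF of regressor j equals the j-th diagonal element of the inverse correlation matrix. The data are synthetic and the function names are illustrative; how the estimation software forms VIFs for ARMA terms is not described in the source, so this does not reproduce Table 6.

```python
import random


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def centered_vif(columns):
    """Centered VIFs: diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]


# Synthetic illustration: x3 is almost a copy of x1, x2 is unrelated.
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(500)]
x2 = [rng.gauss(0, 1) for _ in range(500)]
x3 = [a + 0.1 * rng.gauss(0, 1) for a in x1]
vifs = centered_vif([x1, x2, x3])
print([round(v, 2) for v in vifs])
```

In this synthetic example the nearly duplicated regressors receive VIFs well above 10, while the unrelated regressor stays close to 1, which mirrors the thresholds quoted above.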
The results of the multicollinearity
test reported in Table 6 below reveal that all centered VIF values are
low, ranging between 1.11 and 2.55, far below the
critical threshold of 10. This indicates that there is no serious multicollinearity
among the explanatory variables (AR and MA terms) in the estimated ARMA(3,3)
model.
Therefore, the
estimated parameters are statistically reliable, and the standard errors are
not inflated by multicollinearity. This implies that the ARMA(3,3) model is
well-conditioned, and the coefficients can be interpreted with confidence.
Table 6: Test for Multicollinearity (Variance Inflation Factors)
| Variable | Coefficient Variance | Uncentered VIF | Centered VIF |
| C | 0.000296 | 1.018813 | NA |
| AR(1) | 0.000607 | 1.779456 | 1.779044 |
| AR(2) | 0.000863 | 2.552345 | 2.552344 |
| AR(3) | 0.000586 | 1.768375 | 1.768101 |
| MA(1) | 9.20E-05 | 1.257613 | 1.255458 |
| MA(2) | 8.67E-05 | 1.213557 | 1.203709 |
| MA(3) | 5.84E-05 | 1.121942 | 1.111356 |
4.7.2 Ljung–Box Q-statistic test result for serial correlation
The
Autocorrelation Function (ACF), Partial Autocorrelation Function (PACF), and
Ljung–Box Q-statistics were used to test for serial correlation. High p-values
(greater than 0.05) for the Q-statistics indicate no significant
autocorrelation, suggesting that the residuals are random and the model is well
specified. Table 7 presents these diagnostic test results for the ARMA(3,3)
model residuals.
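The Q-statistic and its p-value can be sketched in a few lines of Python using only the standard library. The effective sample size n = 243 is an assumption (it is not stated in this section), and the autocorrelations are the rounded lag 1 to 3 values from Table 7, so the output only approximates the tabulated Q-statistics. The p-values use degrees of freedom equal to the lag, which appears consistent with the p-values shown in Table 7.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    chi = math.sqrt(x)
    z = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # standard normal pdf at chi
    if df % 2 == 1:
        total = math.erfc(chi / math.sqrt(2.0))  # two-sided normal tail
        term = chi
        for r in range(1, (df - 1) // 2 + 1):
            total += 2.0 * z * term
            term *= x / (2 * r + 1)
        return total
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)
        total += term
    return math.sqrt(2.0 * math.pi) * z * total


def ljung_box(acf, n):
    """Cumulative Ljung-Box Q and p-value for lags 1..len(acf).

    Q(m) = n (n + 2) * sum_{k=1..m} r_k^2 / (n - k)
    """
    q, results = 0.0, []
    for k, r in enumerate(acf, start=1):
        q += n * (n + 2) * r * r / (n - k)
        results.append((q, chi2_sf(q, k)))
    return results


# Rounded residual autocorrelations for lags 1-3 (Table 7); n is assumed.
for lag, (q, p) in enumerate(ljung_box([-0.024, -0.012, -0.069], n=243), start=1):
    print(f"lag {lag}: Q = {q:.4f}, p = {p:.3f}")
```

Because the inputs are rounded to three decimals and n is assumed, small differences from the tabulated values are expected.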
The Q-statistics
reported in Table 7 and the ACF and PACF plots in Figure 4 show
that all residual autocorrelations (ACF and PACF) are very small and fluctuate
closely around zero across all 36 lags. None of the autocorrelation coefficients
appear significant, suggesting that the residuals from the ARMA(3,3) model are
approximately white noise.
Furthermore, the Ljung–Box
Q-statistics have p-values consistently greater than 0.05, indicating that the
null hypothesis of no autocorrelation cannot be rejected at any lag. This
confirms that there is no statistically significant serial correlation remaining
in the residuals. In addition, the Durbin–Watson statistic from the model
estimation (2.0115) supports this conclusion by indicating near-zero
autocorrelation in the residuals.
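As a brief illustration of how the Durbin–Watson statistic relates to first-order autocorrelation, the Python sketch below computes it from a residual series. The residual values are hypothetical, since the model residuals are not available in the source, and the function name is illustrative.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
constant = [1.0, 1.0, 1.0]             # strong positive autocorrelation
print(durbin_watson(alternating), durbin_watson(constant))
```

For long series the statistic is approximately 2(1 - r1), where r1 is the lag-1 residual autocorrelation. With the lag-1 value of -0.024 from Table 7 this approximation gives about 2.05, broadly in line with the reported 2.0115.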
Overall, these diagnostic results confirm that
the ARMA(3,3) model is well specified, the residuals are independently and
randomly distributed, and the model provides a statistically adequate fit to
the data. Therefore, the model is suitable for reliable short-term forecasting
of non–insulin-dependent diabetes cases among farmers in Benue State.
Table 7: Autocorrelations and Ljung–Box Q-Statistic Test Results of Residuals
| Lag | ACF | PACF | Q-Statistic | p-value |
| 1 | -0.024 | -0.024 | 0.1415 | 0.707 |
| 2 | -0.012 | -0.012 | 0.1760 | 0.916 |
| 3 | -0.069 | -0.070 | 1.3558 | 0.716 |
| 4 | 0.007 | 0.003 | 1.3669 | 0.850 |
| 5 | -0.126 | -0.128 | 5.3247 | 0.378 |
| 6 | -0.036 | -0.048 | 5.6541 | 0.463 |
| 7 | -0.017 | -0.024 | 5.7294 | 0.572 |
| 8 | 0.142 | 0.124 | 10.812 | 0.213 |
| 9 | -0.042 | -0.042 | 11.254 | 0.259 |
| 10 | 0.046 | 0.032 | 11.802 | 0.299 |
| 11 | -0.021 | -0.015 | 11.918 | 0.370 |
| 12 | 0.052 | 0.044 | 12.628 | 0.397 |
| 13 | -0.025 | 0.012 | 12.794 | 0.464 |
| 14 | -0.009 | -0.008 | 12.815 | 0.541 |
| 15 | 0.062 | 0.080 | 13.804 | 0.540 |
| 16 | 0.068 | 0.053 | 15.019 | 0.523 |
| 17 | 0.112 | 0.147 | 18.316 | 0.369 |
| 18 | 0.109 | 0.127 | 21.475 | 0.256 |
| 19 | -0.008 | 0.027 | 21.493 | 0.310 |
| 20 | -0.087 | -0.066 | 23.529 | 0.264 |
| 21 | -0.066 | -0.032 | 24.707 | 0.260 |
| 22 | -0.020 | 0.010 | 24.810 | 0.306 |
| 23 | -0.062 | -0.057 | 25.855 | 0.308 |
| 24 | -0.048 | -0.064 | 26.480 | 0.329 |
| 25 | 0.021 | -0.044 | 26.599 | 0.376 |
| 26 | 0.020 | -0.037 | 26.704 | 0.425 |
| 27 | -0.033 | -0.069 | 27.003 | 0.464 |
| 28 | 0.065 | 0.050 | 28.156 | 0.456 |
| 29 | 0.052 | 0.030 | 28.898 | 0.470 |
| 30 | 0.062 | 0.046 | 29.969 | 0.467 |
| 31 | 0.014 | 0.040 | 30.023 | 0.516 |
| 32 | 0.010 | 0.016 | 30.053 | 0.565 |
| 33 | 0.042 | 0.050 | 30.555 | 0.589 |
| 34 | 0.003 | 0.004 | 30.558 | 0.637 |
| 35 | -0.039 | -0.013 | 30.994 | 0.662 |
| 36 | -0.008 | -0.001 | 31.014 | 0.705 |
Figure 4: Correlogram of Residuals of the Estimated ARMA(3,3) Model
4.8 Forecast and Forecast Evaluation
To
evaluate the predictive performance of the ARMA(3,3) model in forecasting
non–insulin-dependent diabetes cases among farmers in Benue State, forecast
accuracy measures were computed. The Root Mean Squared Error (RMSE), Mean
Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) were used to
assess both in-sample and out-of-sample forecast accuracy. Lower values of
these statistics indicate better model performance and predictive reliability.
The results are presented in Table 8.
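The three accuracy measures can be sketched in Python as follows. The sample values are hypothetical, because the underlying actual and fitted series are not available in the source, and the function name is illustrative.

```python
import math


def forecast_accuracy(actual, predicted):
    """Return (RMSE, MAE, MAPE in percent) for paired actual and predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return rmse, mae, mape


# Hypothetical log-scale values, for illustration only.
actual = [8.70, 8.85, 8.60, 8.90]
predicted = [8.77, 8.73, 8.78, 8.77]
rmse, mae, mape = forecast_accuracy(actual, predicted)
print(f"RMSE = {rmse:.4f}, MAE = {mae:.4f}, MAPE = {mape:.2f}%")
```

MAPE divides each error by the actual value, so it is scale-free but undefined when an actual value is zero. The magnitudes in Table 8 (an MAE of about 0.24 against a series level near 8.77) suggest the measures were computed on the log-transformed series.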
The
forecast comparison reported in Table 8 below shows that the out-of-sample
forecast achieved slightly lower RMSE (0.2671), MAE (0.2310), and MAPE (2.6490)
values than the in-sample forecast (RMSE = 0.2715, MAE = 0.2446, MAPE =
2.6781). This suggests that the ARMA(3,3) model demonstrates strong predictive
capability, with minimal forecast error and good generalization performance.
The out-of-sample forecast mode, selected on the basis of these accuracy measures,
provides reliable short-term predictions of non–insulin-dependent
diabetes cases.
Table 8: Forecast Comparison using Accuracy Measures
| Forecast mode | RMSE | MAE | MAPE |
| In-Sample | 0.271510 | 0.244615 | 2.678116 |
| Out-of-Sample** | 0.267100 | 0.231048 | 2.649005 |
Note: ** denotes forecast mode selected by accuracy measures.
4.8.1 Forecast of Diabetes Mellitus in Benue State from July 2025 to June 2027
To evaluate the
short-term predictive performance of the ARMA(3,3) model, forecasts of
non–insulin-dependent diabetes (Type-2 Diabetes Mellitus) cases among farmers
in Benue State were generated for the period July 2025 to June 2027. The
forecasts were computed in natural logarithmic form and then converted to
actual case counts by exponentiation. For each forecast, the standard error, lower
confidence limit (LCL), and upper confidence limit (UCL) were calculated at a
95% confidence level, using the log-scale forecast ± 1.96 × standard error. These
limits provide a range within which the true number of diabetes cases is expected
to fall with high probability, thereby indicating the reliability and uncertainty
of the forecasts. The forecast results are reported in Table 9 below, and the
forecast graph is presented in Figure 5.
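The conversion from log-scale forecasts to case counts can be sketched in Python as follows. The 1.96 multiplier and the exponentiation of the log-scale limits are inferred from the tabulated values rather than quoted from the source, whose formula is missing from the extracted text, and the function name is illustrative.

```python
import math


def back_transform(log_forecast, std_error, z=1.96):
    """Convert a log-scale forecast and its standard error into
    (LCL, point forecast, UCL) on the original scale, rounded to whole persons."""
    lower = math.exp(log_forecast - z * std_error)
    point = math.exp(log_forecast)
    upper = math.exp(log_forecast + z * std_error)
    return round(lower), round(point), round(upper)


# July 2025 entry of Table 9: log forecast 8.77405, standard error 0.271243.
print(back_transform(8.77405, 0.271243))
```

Applied to the July 2025 entry, the function returns limits that round to the tabulated 3,799 and 11,000 around a point forecast of 6,464. Because the interval is symmetric on the log scale, it is asymmetric around the point forecast once exponentiated.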
Table 9: Forecast of Diabetes Mellitus Cases in Benue State from July 2025 to June 2027
| Year:Month | Forecast (natural log) | Std. error (natural log) | LCL (No. of Persons) | Forecast (No. of Persons) | UCL (No. of Persons) |
| 2025:06 | 6.9967 | --- | --- | 8896 | --- |
| 2025:07 | 8.77405 | 0.271243 | 3799 | 6464 | 11000 |
| 2025:08 | 8.72655 | 0.271669 | 3619 | 6165 | 10499 |
| 2025:09 | 8.78204 | 0.271670 | 3826 | 6516 | 11098 |
| 2025:10 | 8.77132 | 0.272065 | 3782 | 6447 | 10988 |
| 2025:11 | 8.80141 | 0.272672 | 3893 | 6644 | 11337 |
| 2025:12 | 8.74519 | 0.272672 | 3680 | 6281 | 10717 |
| 2026:01 | 8.76088 | 0.272790 | 3738 | 6380 | 10889 |
| 2026:02 | 8.74585 | 0.273455 | 3677 | 6285 | 10741 |
| 2026:03 | 8.79725 | 0.273466 | 3871 | 6616 | 11308 |
| 2026:04 | 8.77366 | 0.273476 | 3781 | 6462 | 11044 |
| 2026:05 | 8.77825 | 0.274040 | 3794 | 6492 | 11107 |
| 2026:06 | 8.73648 | 0.274110 | 3638 | 6226 | 10654 |
| 2026:07 | 8.76803 | 0.274114 | 3755 | 6426 | 10996 |
| 2026:08 | 8.76810 | 0.274473 | 3752 | 6426 | 11005 |
| 2026:09 | 8.79729 | 0.274652 | 3862 | 6616 | 11335 |
| 2026:10 | 8.76026 | 0.274669 | 3722 | 6376 | 10923 |
| 2026:11 | 8.76113 | 0.274824 | 3724 | 6381 | 10936 |
| 2026:12 | 8.74504 | 0.275111 | 3662 | 6279 | 10767 |
| 2027:01 | 8.78341 | 0.275121 | 3805 | 6525 | 11188 |
| 2027:02 | 8.77734 | 0.275152 | 3782 | 6486 | 11121 |
| 2027:03 | 8.78223 | 0.275481 | 3798 | 6517 | 11183 |
| 2027:04 | 8.74716 | 0.275481 | 3667 | 6293 | 10798 |
| 2027:05 | 8.76058 | 0.275481 | 3717 | 6378 | 10944 |
| 2027:06 | 8.76313 | 0.275759 | 3724 | 6394 | 10978 |
| Total | 210.40663 | | | 154075 | |
| Average | 8.766942917 | | | 6419.7917 | |
Note: The 95% confidence limits are computed as the log-scale forecast ± 1.96 × standard error and then exponentiated. LCL and UCL denote lower and upper
confidence limits respectively.
Figure 5: Forecast Graph of Diabetes Mellitus in Benue State from July 2025 to June 2027
The forecast results reported
in Table 9 and Figure 5 above reveal that the predicted number of
non–insulin-dependent diabetes cases among farmers in Benue State is expected
to fluctuate moderately over the two-year forecast horizon (July 2025–June 2027).
The monthly point forecasts range between approximately 6,165 and 6,644 cases, with
95% confidence limits spanning roughly 3,600 to 11,300 cases, an overall average
of about 6,420 cases per month, and a total forecast of 154,075 cases over the
forecast period. The forecast standard errors remain nearly constant across months
(about 0.271 to 0.276 on the log scale), suggesting that forecast uncertainty is
stable over the horizon.
Overall, the ARMA(3,3) model
demonstrates strong forecasting capability, indicating that diabetes prevalence
among farmers in Benue State is likely to remain fairly stable with mild
month-to-month variations over the forecast period.
4.9 Implications of the Study to Farmers and Postharvest Losses in
Benue State
The implications of this study for farmers and
postharvest losses in Benue State are significant from both public health and
socio-economic perspectives. The findings, which forecast the prevalence of
non–insulin-dependent diabetes (Type-2 Diabetes Mellitus) among farmers,
suggest that a substantial portion of the agricultural workforce may experience
declining health and productivity over time. Poor health conditions such as
diabetes can reduce farmers’ physical capacity to engage in strenuous agricultural
activities, particularly during critical periods like harvesting and
processing. This in turn increases the
likelihood of postharvest losses, as crops may remain unharvested or
inadequately stored due to reduced labour efficiency and absenteeism resulting
from illness.
Moreover, higher
diabetes prevalence among farmers implies increased medical expenditures and a
diversion of household income away from agricultural investment, further compounding the problem of low
productivity and waste. The study underscores the urgent need for integrated
health and agricultural policies—including improved rural healthcare services,
regular medical screening, health education on diet and lifestyle, and the
promotion of labour-saving technologies—to mitigate the dual burden of disease and
postharvest losses. Ultimately, addressing the health challenges of farmers is
crucial for achieving food security, sustaining agricultural livelihoods, and
enhancing overall economic resilience in Benue State.
5.0 CONCLUSION
The study demonstrates that the ARMA(3,3) model effectively
forecasts the incidence of non-insulin-dependent diabetes among farmers in
Benue State, Nigeria. The analysis revealed that the
ARMA(3,3) model provided the best fit based on information criteria and
diagnostic tests, with residuals behaving like white noise, indicating a
well-specified and reliable model. The forecasts from July 2025 to June 2027 suggest
a steady and relatively high incidence of diabetes cases among farmers,
implying that the disease poses an ongoing public health concern within the
agricultural population. This condition could adversely affect farmers’
productivity, increase medical costs, and indirectly contribute to higher
postharvest losses due to reduced labour availability and inefficiencies in
farm management. These findings highlight the interconnectedness between health
and agricultural output, emphasizing that the burden of chronic diseases like
diabetes extends beyond healthcare into the realm of food security and economic
stability. Therefore, proactive health interventions and policy integration
between the health and agricultural sectors are vital. Ensuring farmers’
wellness through preventive care, early detection, and education can
significantly reduce the impact of diabetes and its broader economic
consequences. The study provides empirical evidence to guide policymakers,
healthcare providers, and agricultural development agencies in formulating
context-specific strategies to improve both health outcomes and agricultural
sustainability in Benue State.
REFERENCES
Al Zahrani, S., Al
Rahman Al Sameeh, F., Musa, A. C. M., & Shokeralla, A. A. A. (2020). Forecasting diabetes patients
attendance at Al-Baha hospitals using autoregressive fractional integrated moving average (ARFIMA) models. Journal of Data Analysis and Information
Processing, 8, 183-194.
American College of
Obstetricians and Gynecologists. (2018). ACOG Practice Bulletin No. 190:
Gestational Diabetes Mellitus. Obstetrics
& Gynecology, 131(2), e49–e64.
American Diabetes
Association (ADA). (2022). Diagnosis and Classification of Diabetes Mellitus. Diabetes Care, 45(1), S17-S38.
Atkinson, M. A.,
Eisenbarth, G. S., & Michels, A. W. (2014). Type 1 diabetes. The Lancet, 383(9911), 69–82.
Benue State Epidemiological Unit, Makurdi,
Nigeria.
(Unpublished secondary data on type-2 diabetes incidence, 2005–2025).
Box, G. E. P., Jenkins,
G. M., & Reinsel, G. C. (2015). Time series analysis: Forecasting and
control (5th ed.). John Wiley & Sons.
Jarque, C. M., & Bera, A. K. (1980). Efficient tests for
normality, homoscedasticity and serial independence of regression residuals.
Economics Letters, 6(3), 255–259.
Jarque, C. M., & Bera, A. K. (1987). A test for normality
of observations and regression residuals. International Statistical Review,
55(2), 163–172.
Cloete, L. (2022).
Diabetes mellitus: an overview of the types, symptoms, complications and
management. Nursing Standard,
37(1), 61-66.
Deberneh,
H. M. & Kim, I. (2021). Prediction of type 2 diabetes based on machine
learning algorithm. International Journal
of Environmental Research and Public Health, 18, 3317-3329.
Desai, S. &
Deshmukh, A. (2020). Mapping of type-1 diabetes mellitus. Current Diabetes
Reviews, 16(5), 438-441.
Dickey, D. A., & Fuller, W. A. (1979). Distribution
of the Estimators for Autoregressive Time Series with a Unit Root. Journal
of the American Statistical Association, 74(366), 427–431.
https://doi.org/10.2307/2286348
Diogo,
M. V., Nunopombo, F., & Brandão, P. (2022). Hypoglycemia prediction models
with auto explanation. IEEE Access, 10,
57930-57941.
Donath, M. Y., & Shoelson,
S. E. (2011). Type-2 diabetes as an inflammatory disease. Nature Reviews
Immunology, 11(2), 98–107.
Hannan, E. J., & Quinn, B. G. (1979). The determination
of the order of an autoregression. Journal of the Royal Statistical Society:
Series B, 41(2), 190–195.
Casella, G., & Berger, R. L. (2002). Statistical
Inference (2nd ed.). Duxbury.
Ljung, G. M., & Box, G. E. P. (1978). On a measure
of lack of fit in time series models. Biometrika, 65(2), 297–303.
Akaike, H. (1974). A new look at the
statistical model identification. IEEE Transactions on Automatic Control,
19(6), 716–723.
Huang, Y., Vemer, P.,
Zhu, J., & Postma, M. J. (2016). The economic burden of diabetes mellitus
in rural southwest China. International
Journal of Environmental Research and Public Health, 13(9), 875-889.
International Diabetes
Federation (IDF). (2019). IDF Diabetes Atlas, 9th Edition. Brussels, Belgium:
International Diabetes Federation. https://www.diabetesatlas.org/en/
Jaeger, B., Casanova, R., Demesie,
Y., Stafford, J., Wells, B., & Bancks, M. P. (2025). Development and
Validation of a Diabetes Risk Prediction Model With Individualized Preventive
Intervention Effects. The Journal of Clinical Endocrinology and Metabolism,
110(12), e4023–e4029. https://doi.org/10.1210/clinem/dgaf250
Kahn, S. E., Cooper, M. E., &
Del Prato, S. (2014). Pathophysiology and treatment of type-2 diabetes:
perspectives on the past, present, and future. The Lancet, 383(9922),
1068–1083.
Katsarou, D. N., Georga, E. I.,
Christou, M., Tigas, S., Papaloukas, C., & Fotiadis, D. I. (2022). Short
term glucose prediction in patients with type-1 diabetes mellitus. Annual International Conference of IEEE
Engineering, Medical & Biological Society, 2022, 329-332.
Ljung, G. M., & Box, G. E. P. (1979). The
Likelihood Function of Stationary Autoregressive-Moving Average Models. Biometrika,
66(2), 265. https://doi.org/10.2307/2335657
Ma, N., Zhao, Y.,
Wen, S., Yang, T., Wu, R., Tao, R., Yu, X., & Li, H. (2020). Online blood
glucose prediction using autoregressive moving average model with residual
compensation network. Journal of science,
12(2), 115-128.
Matthew,
P. K., Timothy, K. N., Ajia, R., & Antyev, S. (2022). Time series modelling
of diabetes disease in Taraba state, Nigeria. Science World Journal, 17(3), 406-412.
Olamoyegun, M. A., Alare, K., Afolabi, S. A., Aderinto,
N., & Adeyemi, T. (2024). A systematic review
and meta-analysis of the prevalence and risk factors of type 2 diabetes
mellitus in Nigeria. Clinical Diabetes and Endocrinology, 10(1). https://doi.org/10.1186/s40842-024-00209-1
Olivares-Vera,
D. A., Gutiérrez-Hernández, D. A., Escobar-Acevedo, M. A., Lara-Rendón, C.,
& Velázquez-Vázquez, D. A. (2021). Comparison of algorithms for the
prediction of glucose levels in patients with diabetes. Nova Scientia, 13(2), 1-19.
Powers, A. C.,
D'Alessio, D., & Endocrine Society. (2016). Diabetes Mellitus: Diagnosis,
Classification, and Pathophysiology. In Endotext. MDText.com, Inc.
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd
ed.). OTexts. Available online: https://otexts.com/fpp3/
Robertson, R. P. (2004).
Chronic oxidative stress as a central mechanism for glucose toxicity in
pancreatic islet beta cells in diabetes. Journal
of Biological Chemistry, 279(41), 42351-42354.
Rodríguez-Rodríguez,
I., Chatzigiannakis, L., Rodríguez, J., Maranghi, M., Gentili, M., &
Zamora-Izquierdo, M. (2019). Utility of big data in predicting short-term blood
glucose levels in type 1 diabetes mellitus through machine learning techniques.
Sensors, 19, 4482-4498.
Schwarz, G. (1978). Estimating the Dimension of a
Model. The Annals of Statistics, 6(2), 461–464.
https://doi.org/10.1214/aos/1176344136
Ross, S. M. (2014). Introduction to Probability and Statistics for Engineers and Scientists
(5th ed.). Academic Press.
Singye,
T. & Unhapipat, S. (2018). Time series analysis of diabetes patients: A case
study of Jigme Dorji Wangchuk National Referral Hospital in Bhutan. Journal of physics: conference series, 1039,
1-11.
Smith, S. M., Boppana,
A., Traupman, J. A., Unson, E., Maddock, D. A., Chao, K., Dobesh, D. P.,
Brufsky, A., & Connor, R. I. (2021). Impaired glucose metabolism in
patients with diabetes, prediabetes, and obesity is associated with severe
COVID-19. Journal of Medical Virology, 93(1),
409-415.
Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods
and Applications (3rd ed.). Wiley.
Sun, Y., Tao, Q., Wu,
X., Zhang, L., Liu, Q., & Wang, L. (2021). The utility of exosomes in
diagnosis and therapy of diabetes mellitus and associated complications. Frontiers in Endocrinology (Lausanne), 12,
75-88.
Teran, A. D. (2017). Effects of
diabetic prevalence and mortality on households farm labour productivity in Benue State. IOSR Journal
of Agriculture and Veterinary Science, 10(7), 63-72.
Villani, M., Nanayakkara, N., Ranasinha, S., Earnest, A., Smith, K., Soldatos, G., Teede, H. &Zoungas, S. (2017). Utilisation of prehospital emergency medical services for hyperglycemia: a community-based observational sAn Autoregressive Moving Average Model for Short Term Prediction of Non-Insulin Dependent Diabetes Among Farmers in Benue State
John Agada1, David Adugh Kuhe 2 and Ojochegbe Noah Anthony 3*
1Department of
Mathematics and Computer Science, Rev, Fr. Moses Orshio Adasu University
Makurdi, Benue State, Nigeria
2Department of
Statistics, Joseph Sarwuan Tarka University, Makurdi, Benue State, Nigeria
3Department of
Mathematics and Computer Science, Rev, Fr. Moses Orshio Adasu University
Makurdi, Benue State, Nigeria
Corresponding Author:
Email: davidkuhe@gmail.com; Tel: 2348064842229
ABSTRACT
This study employs an Autoregressive
Moving Average (ARMA) time series model to forecast the short-term incidence of
non-insulin-dependent diabetes mellitus (Type 2 Diabetes) among farmers in
Benue State, Nigeria. The data was collected from the Benue State
Epidemiological Unit, Makurdi, and covered a 20-year period from January 2005
to June 2025. The study employed descriptive statistics and normality measures,
Augmented Dickey-Fuller (ADF) unit root test and ARMA (p,q) model as the
principal analytical techniques
and procedures used to examine the data. The descriptive statistics
indicated moderate variability in diabetes cases over the years, while the
Augmented Dickey-Fuller (ADF) test confirmed the stationarity of the series in
level. Model choice based on Akaike Information Criterion (AIC), Schwarz
Information Criterion (SIC), and Hannan–Quinn Criterion (HQC) identified the
ARMA(3,3) model as the best fit for forecasting diabetic cases in the study
area. The model’s high coefficient of determination (R² = 0.8905) and
statistically significant parameters (p < 0.05) demonstrated its robustness
and predictive accuracy. Diagnostic checks using autocorrelation, partial
autocorrelation, and the Ljung–Box Q-statistics showed that the residuals
behaved like white noise, indicating a well-specified model. Forecast
evaluations using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and
Mean Absolute Percentage Error (MAPE) confirmed that the model accurately good
for predicting out-of-sample values. The forecast for July 2025 to June 2027
revealed a potential average of approximately 6,420 diabetes cases per month
among farmers, with expected fluctuations over time. The study underscored the
growing public health concern of diabetes among the farming population in Benue
State and its implications for agricultural productivity and postharvest
losses. The study concluded that predictive modeling can serve as a vital tool
for health planners to design early intervention strategies, integrate health
management with agricultural development, and enhance the overall well-being of
rural farmers.
Keywords: Diabetes, ARIMA, Time
Series Forecasting, Non-Insulin Dependent Diabetes, Farmers, Benue State,
Public Health, Postharvest Losses
1.0 INTRODUCTION
Diabetes mellitus, often simply
referred to as diabetes, is a group of metabolic disorders characterized by
high blood sugar levels over a prolonged period. The two main types of diabetes
are type-1 diabetes, which results from the body’s inability to produce
insulin, and Type-2 diabetes develops when the body either
becomes resistant to insulin or produces insufficient insulin to control blood
sugar levels effectively. Diabetes mellitus is a multifaceted metabolic
condition marked by high concentrations of glucose (sugar) in the bloodstream Glucose is a crucial source of
energy for cells, and insulin, a hormone produced by the pancreas, plays a
central role in regulating its uptake into cells. In diabetes mellitus, this
regulation is disrupted, leading to persistent hyperglycemia (high blood sugar)
(American Diabetes Association, 2022).
Diabetes mellitus is a significant
public health concern worldwide, with its prevalence increasing steadily over
the past few decades. According to the International Diabetes Federation (IDF,
2019), an estimated 537 million adults aged 20-79 years were living with
diabetes globally in 2021 and this number is projected to rise to 783 million
by 2045. The prevalence of diabetes varies by region, with higher rates
observed in low- and middle-income countries, particularly in urban areas
undergoing rapid socioeconomic development and lifestyle changes. (ADA, 2022).
In Nigeria, the prevalence is
estimated at 7% and 11.35% in South-south zone. The Diabetes Association of
Nigeria (DAN) reviewed that, mortality rate of diabetes from insufficient
management far outweighs that of HIV/AIDs, Malaria and Cancer (Olamoyegun et al.,
2024)
Diabetes mellitus is significantly
Impacting farmers in Benue State with prevalence rate among yam farming
population estimated at 24.9% and mortality rate of 8.61% and as led to reduced
labor productivity, economic impact and health complications (Teran, A.D.. 2017)
Diabetes is associated with numerous
complications that can affect nearly every organ system in the body. These
complications includes Microvascular: Retinopathy (vision loss) neuropathy
(nerve damage), nephropathy (kidney damage), and Microvascular: cardiovascular
disease (such as heart attack and stroke), others are foot ulcers and
amputations. The burden of diabetes-related complications is substantial,
leading to increased medical costs, reduced quality of life, and higher risk of
premature mortality (ADA, 2022).
Type-2 diabetes, also known as non-insulin dependent
diabetes, is a long-term condition that affects how the body processes sugar
(glucose), which is an important source of energy. In this condition, the body
either becomes resistant to insulin, a hormone that helps move sugar into
cells, or doesn’t produce enough insulin to keep blood sugar levels normal (Sun
et al., 2021). Unlike type-1 diabetes, where the immune system attacks and
destroys insulin-producing cells in the pancreas, type-2 diabetes usually develops
slowly over time. While it was once mostly seen in adults, more children and
teenagers are now being diagnosed, largely due to increasing obesity and less
active lifestyles (Sun et al., 2021).
A major characteristic of type-2 diabetes is insulin
resistance, which means the body's cells don't respond to insulin as they
should. When this happens, the pancreas tries to make more insulin to help move
sugar into the cells. However, over time, the pancreas may struggle to keep up
with this increased demand. As a result, sugar starts to accumulate in the
blood, causing high blood sugar levels (Cloete, 2022).
Several determinants contributes to
the risk of developing type-2 diabetes, including obesity, particularly excess
fat around the abdomen (central obesity), A sedentary
lifestyle, unhealthy eating habits—like eating too many sugary and processed
foods—having a family history of diabetes, getting older (especially after 45),
and belonging to certain ethnic groups are all factors that can increase the
risk of developing diabetes
(ADA, 2022).
In Addition to insulin resistance, type-2 diabetes can also
involve problems with the pancreas, the organ that makes insulin. Sometimes,
the pancreas doesn't produce enough insulin to keep blood sugar levels in
check, making high blood sugar worse (Desai & Deshmukh, 2020).
Symptoms of type-2 diabetes often develop slowly and can
include increased thirst, frequent urination, fatigue, blurred vision, slow
wound healing, and repeated infections. In the early stages, some people may
not notice any symptoms at all,
which is why regular screenings are essential (IDF, 2019).
Treatment for type-2 diabetes aims
to maintain blood sugar levels within a target range to prevent serious health
problems and complications. This typically involves lifestyle modifications
such as regular exercise, healthy eating habits (including portion control and
selecting nutrient-rich foods), weight management, and monitoring blood sugar
levels. (Desai & Deshmukh, 2020).
The management and treatment of
type-2 diabetes can impose financial burdens on individuals, families, and
healthcare systems. In regions where healthcare costs are primarily borne by
the individual or are not adequately covered by insurance, the expenses
associated with diabetes care can divert resources away from agricultural
investments and productivity-enhancing measures. This can directly impact
agricultural communities with reduced investment into agricultural produces,
reduced income and crop loss thereby affecting their livelihood (Huang et al., 2016).
Diabetes Mellitus is diagnosed when certain blood sugar
levels are met or exceeded. Specifically, a person may be diagnosed if their
A1C is 6.5% or higher, which reflects average blood glucose over the past few
months. Alternatively, if fasting blood sugar is 126 mg/dL or higher, or if a
2-hour blood sugar reading during an oral glucose tolerance test reaches 200
mg/dL or more, a diagnosis may be made. Additionally, if an individual has a
random blood sugar of 200 mg/dL or higher along with symptoms like excessive
thirst, frequent urination, or unexplained weight loss, they may also be
diagnosed with diabetes (Jaeger et al., 2025).
Agricultural activities, like applying chemical
fertilizers and pesticides, can have environmental consequences that can
indirectly impact diabetes risk factors. For instance, exposure to chemicals
such as glyphosate or organophosphates used in farming has been associated with
a higher likelihood of developing metabolic disorders. Additionally, environmental
factors such as air pollution and climate change may exacerbate diabetes risk
factors and health outcomes, potentially affecting agricultural productivity
and crop yields (Whiting et al., 2011).
Overall, while the direct impact of type-2 diabetes on agricultural
productivity and postharvest losses may be limited, the interplay between
diabetes, dietary patterns, healthcare access, and environmental factors can
have broader implications for agricultural communities and food systems.
Addressing the complex relationship between health, agriculture, and the
environment requires a holistic approach that considers socioeconomic factors,
public health interventions, and sustainable agricultural practices
(Huang et al., 2016; Whiting et al., 2011).
This study therefore attempts to
extend the existing literature and contribute to the existing body of knowledge
by modeling and forecasting non-insulin-dependent diabetes among farmers in
Benue State using an autoregressive moving average (ARMA) time series model with
more recent data.
2.0 MATERIALS AND METHODS
2.1 Method of Data Collection
The data utilized in this research are monthly secondary time series data on
the morbidity incidence of type-2 diabetes in Benue State for the period
January 2005 to June 2025, making a total of 246 observations. The data were
collected from the Benue State Epidemiological Unit, Makurdi. The data were
transformed to natural logarithms using the following formula:
$$Y_t = \ln(X_t)$$
where $X_t$ is the confirmed type-2 diabetes series observation indexed by time
$t$, while $\ln$ is the natural logarithm. Henceforth, $Y_t$ will be regarded
as the series under study.
2.2 Methods of Data Analysis
The statistical tools employed in the analysis of data in this work are
described below.
2.2.1 Descriptive statistics and normality measures
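The mean of any given set of data can be computed as follows:
$$\bar{Y} = \frac{1}{T}\sum_{t=1}^{T} Y_t$$
The sample standard deviation of any given set of data over a given period of
time is computed using the formula:
$$s = \sqrt{\frac{1}{T-1}\sum_{t=1}^{T}\left(Y_t - \bar{Y}\right)^2}$$
where $\bar{Y}$ is the sample mean and $T$ is the sample size.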
The Jarque-Bera test is a normality test of whether a given sample has
skewness and kurtosis similar to those of a normal distribution. The test was
proposed by Jarque and Bera (1980, 1987) and tests the null hypothesis that the
series is normally distributed. Given any data set, the test statistic JB is
defined as:
$$JB = \frac{T}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$$
where $S$ is the sample skewness, given by
$$S = \frac{1}{T}\sum_{t=1}^{T}\left(\frac{Y_t - \bar{Y}}{\hat{\sigma}}\right)^3$$
and $K$ is the sample kurtosis, given by
$$K = \frac{1}{T}\sum_{t=1}^{T}\left(\frac{Y_t - \bar{Y}}{\hat{\sigma}}\right)^4$$
where $T$ is the total number of observations and
$\hat{\sigma}^2 = \frac{1}{T}\sum_{t=1}^{T}(Y_t - \bar{Y})^2$. The JB normality
test checks the following pair of hypotheses: $H_0$ (i.e., $Y_t$ follows a
normal distribution) and $H_1$ (i.e., $Y_t$ does not follow a normal
distribution).
The test rejects the null hypothesis if the p-value of the JB test statistic is
less than the chosen level of significance.
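The descriptive and normality measures can be sketched in a few lines of Python. This is only an illustration, not the software used in the study: the function names are hypothetical, the skewness and kurtosis use the divisor $T$ (an assumption about the convention), and the p-value relies on the fact that, under the null hypothesis, JB is asymptotically chi-square with 2 degrees of freedom, whose upper-tail probability is $e^{-JB/2}$.

```python
import math

def jarque_bera(T, S, K):
    """JB statistic and p-value from sample size, skewness and kurtosis."""
    jb = T / 6.0 * (S ** 2 + (K - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) upper-tail probability
    return jb, p_value

def describe(series):
    """Mean, sample standard deviation, skewness, kurtosis, JB and p-value."""
    T = len(series)
    mean = sum(series) / T
    dev = [y - mean for y in series]
    ss = sum(d ** 2 for d in dev)
    sd = math.sqrt(ss / (T - 1))       # sample standard deviation (divisor T-1)
    m2 = ss / T                        # second central moment (divisor T)
    S = (sum(d ** 3 for d in dev) / T) / m2 ** 1.5
    K = (sum(d ** 4 for d in dev) / T) / m2 ** 2
    jb, p_value = jarque_bera(T, S, K)
    return {"mean": mean, "sd": sd, "skewness": S, "kurtosis": K,
            "jb": jb, "p_value": p_value}

# Check against the skewness, kurtosis and sample size reported in Table 1.
jb, p = jarque_bera(246, 0.010212, 1.767498)
print(round(jb, 4), round(p, 6))
```

With the skewness (0.010212), kurtosis (1.767498) and 246 observations reported in Table 1, the function returns a JB value of roughly 15.57 and a p-value of roughly 0.0004, in line with the tabulated figures.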
2.2.2 Augmented Dickey-Fuller (ADF) unit root test
The Augmented Dickey-Fuller (ADF) test helps to
identify if a time series is stationary or has a unit root, indicating a
persistent trend over time (Dickey and Fuller, 1979).
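It accounts for higher-order correlations by assuming the series follows an
AR($p$) process and incorporates $p$ lagged differences of the series into the
test regression:
$$\Delta Y_t = \alpha Y_{t-1} + x_t'\delta + \beta_1 \Delta Y_{t-1} + \beta_2 \Delta Y_{t-2} + \cdots + \beta_p \Delta Y_{t-p} + \varepsilon_t$$
where $x_t$ are optional exogenous regressors which may consist of a constant,
or a constant and trend, $\alpha$ and $\delta$ are parameters to be estimated,
the $\beta_i$ are the coefficients of the lagged difference terms and the
$\varepsilon_t$ are assumed to be white noise. The null and alternative
hypotheses are written as:
$$H_0: \alpha = 0 \quad \text{against} \quad H_1: \alpha < 0 \qquad (8)$$
and evaluated using the conventional $t$-ratio for $\alpha$:
$$t_{\alpha} = \frac{\hat{\alpha}}{se(\hat{\alpha})}$$
where $\hat{\alpha}$ is the estimate of $\alpha$ and the coefficient standard
error is denoted as $se(\hat{\alpha})$.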
2.2.3 Portmanteau test
A Portmanteau test, also called the Ljung-Box Q-statistic test, is used to
determine whether there is any remaining serial correlation or autocorrelation
in the residuals of a time series. The test checks the following pair
of hypotheses:
$H_0: \rho_1 = \rho_2 = \cdots = \rho_m = 0$ (all lag correlations are zero)
$H_1: \rho_k \neq 0$ for some $k \le m$ (there is at least one lag with non-zero
correlation). The test statistic is given by:
$$Q = T(T+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^2}{T-k}$$
where $\hat{\rho}_k$ denotes the sample autocorrelation of the residuals at
lag $k$, $T$ is the sample size and $m$ is the number of lags tested. We reject
$H_0$ if the p-value is less than the chosen
level of significance (Ljung and Box, 1979).
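A minimal sketch of the Ljung-Box statistic follows, assuming the standard form $Q = T(T+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(T-k)$. The function name is hypothetical and the p-value step is left out, since it needs a chi-square distribution function that the Python standard library does not provide; in practice $Q$ is compared with a chi-square critical value for $m$ degrees of freedom (less the number of estimated ARMA parameters when residuals are tested).

```python
def ljung_box_q(values, m):
    """Ljung-Box Q statistic for lags 1..m of a series or residual vector."""
    T = len(values)
    mean = sum(values) / T
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, m + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, T)) / denom
        total += rho_k ** 2 / (T - k)
    return T * (T + 2) * total

# Toy example: a perfectly alternating series has lag-1 autocorrelation -0.9.
print(ljung_box_q([1, -1] * 5, 1))
```

For the alternating toy series of length 10, the lag-1 autocorrelation is -0.9, so $Q = 10 \times 12 \times 0.81 / 9 = 10.8$, far above the 5% chi-square critical value of 3.84 for one degree of freedom.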
2.3 Time Series Models Specification
To specify an ARIMA
model, which is the model framework used in this study, we first specify the
autoregressive (AR) model, the moving average (MA) model and the autoregressive
moving average (ARMA) model before specifying the autoregressive integrated
moving average (ARIMA) model. These models are specified as follows.
2.3.1 The autoregressive (AR) model
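A stochastic time series process $\{Y_t\}$ is an autoregressive process of
order $p$, denoted AR($p$), if it satisfies the difference equation
$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t$$
where $\varepsilon_t$ is a white noise and $\phi_1, \phi_2, \ldots, \phi_p$
are constants to be determined.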
2.3.2 Moving average (MA) model
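A time series $\{Y_t\}$ which satisfies the difference equation
$$Y_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q}$$
where $\theta_1, \theta_2, \ldots, \theta_q$ are fixed constants with
$\varepsilon_t$ as white noise is called a moving average process of order $q$,
denoted MA($q$).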
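2.3.3 Autoregressive moving average (ARMA) model
A stochastic time series process $\{Y_t\}$ which results from a linear
combination of autoregressive and moving average processes is called an
Autoregressive Moving Average (ARMA) process of order $p, q$, denoted
ARMA($p, q$), if it satisfies the following difference equation:
$$Y_t = \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$
where $\phi_1, \ldots, \phi_p$ are fixed constants associated with the
AR terms and $\theta_1, \ldots, \theta_q$ are fixed constants associated with
the MA terms, with $\varepsilon_t$ being a white noise. The stationarity of an
ARMA($p, q$) process is guaranteed if the roots of the polynomial
$\phi(z) = 1 - \phi_1 z - \phi_2 z^2 - \cdots - \phi_p z^p$
lie outside the unit circle.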
An
ARMA () model is specified as:
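2.3.4 Autoregressive integrated moving average (ARIMA) model
Autoregressive (AR), Moving Average (MA) or Autoregressive Moving Average
(ARMA) models in which differences have been taken are collectively called
Autoregressive Integrated Moving Average or ARIMA models. A time series
$\{Y_t\}$ is said to follow an integrated autoregressive moving average model
if the $d$th difference $W_t = \nabla^d Y_t$ is a stationary ARMA process. If
$W_t$ follows an ARMA($p, q$) model, we say that $\{Y_t\}$ is an
ARIMA($p, d, q$) process. For practical purposes, we can usually take $d = 1$
or at most 2.
Consider then an ARIMA($p, 1, q$) process; with $W_t = Y_t - Y_{t-1}$, we have
$$W_t = \phi_1 W_{t-1} + \cdots + \phi_p W_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$
In terms of the observed series,
$$Y_t - Y_{t-1} = \phi_1 (Y_{t-1} - Y_{t-2}) + \cdots + \phi_p (Y_{t-p} - Y_{t-p-1}) + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$$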
2.4 Model Order Selection
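We use the following information criteria
for model order selection in conjunction with the log-likelihood function: the
Akaike information criterion (AIC) due to Akaike (1978), the Schwarz
information criterion (SIC) due to Schwarz (1978) and the Hannan-Quinn
information criterion (HQC) due to Hannan (1980). The
formulae for the information criteria are:
$$AIC = -\frac{2L}{T} + \frac{2k}{T}$$
$$SIC = -\frac{2L}{T} + \frac{k \ln T}{T}$$
$$HQC = -\frac{2L}{T} + \frac{2k \ln(\ln T)}{T}$$
where $k$ is the number of free parameters to be estimated in the model, $T$ is
the number of observations and $L$ is the log-likelihood function defined as:
$$L = -\frac{T}{2}\left[1 + \ln(2\pi) + \ln\left(\frac{\hat{\varepsilon}'\hat{\varepsilon}}{T}\right)\right]$$
with $\hat{\varepsilon}$ the vector of model residuals. Thus,
given a set of estimated ARMA models for a given set of data, the preferred
model is the one with the minimum information criteria and maximum log
likelihood.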
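The sketch below shows how the three criteria can be computed from a model's log-likelihood. It assumes the per-observation forms $-2L/T$ plus a penalty of $2k/T$ (AIC), $k\ln T/T$ (SIC) or $2k\ln(\ln T)/T$ (HQC), which is the scaling used by several econometric packages; the function name and the choice of inputs in the example are assumptions, not details given in the study.

```python
import math

def information_criteria(loglik, k, T):
    """Return (AIC, SIC, HQC) in per-observation form.

    loglik : maximised log-likelihood of the fitted model
    k      : number of estimated parameters (including the constant)
    T      : number of observations actually used in estimation
    """
    base = -2.0 * loglik / T
    aic = base + 2.0 * k / T
    sic = base + k * math.log(T) / T
    hqc = base + 2.0 * k * math.log(math.log(T)) / T
    return aic, sic, hqc

# Assumed inputs for the ARMA(3,3) row of Table 4: log-likelihood -24.0103,
# k = 7 (constant, three AR and three MA terms) and T = 243, i.e. the
# 246 observations less three lost to the AR lags (an assumption).
aic, sic, hqc = information_criteria(-24.0103, 7, 243)
print(round(aic, 4), round(hqc, 4))
```

Under these assumed inputs the function reproduces the AIC (0.2552) and HQC (0.2958) reported for ARMA(3,3). The SIC it returns is about 0.3559, which differs in the second decimal from the tabulated 0.3159, so that entry may be worth checking against the original estimation output.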
2.5 Model Forecast Evaluation
We employed the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE)
accuracy measures to select an optimal model that is both parsimonious and
forecasts the data accurately, based on the minimum values of the accuracy
measures.
2.5.1 Root Mean Square Error (RMSE)
The Root Mean Square Error is a statistical tool for measuring the accuracy of
a forecast method. It is computed as:
$$RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{Y}_t - Y_t\right)^2}$$
where $\hat{Y}_t$ is the forecast value of the series, $Y_t$ is the actual
series and $n$ is the number of forecast observations.
2.5.2 Mean Absolute Error (MAE)
The mean absolute
error (MAE) is a statistical tool for measuring the average size of the errors
in a collection of predictions, without taking their directions into account.
It is measured as the average absolute difference between the predicted values and
the actual values and is used to assess the effectiveness of a model. It is
given as:
$$MAE = \frac{1}{n}\sum_{t=1}^{n}\left|Y_t - \hat{Y}_t\right|$$
where $Y_t$ is the actual value of the series at time $t$, $\hat{Y}_t$ is
the forecasted value of the series and $n$ is
the number of observations. The lower the values of RMSE and MAE, the better the
model is able to forecast future values.
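The accuracy measures are straightforward to compute. The following is a small illustrative sketch with hypothetical function names; the MAPE function is included only because that measure is also reported in the results, and its formula (mean of absolute percentage errors, times 100) is the usual textbook definition rather than one stated in this section.

```python
import math

def rmse(actual, forecast):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / n)

def mae(actual, forecast):
    """Mean absolute error between two equal-length sequences."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n

def mape(actual, forecast):
    """Mean absolute percentage error (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

# Toy values for illustration only (not data from the study).
actual = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 5.0]
print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

In the toy example only the third forecast is wrong (by 2 units), so MAE is 2/3 while RMSE is the square root of 4/3; RMSE penalises the single large error more heavily than MAE does.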
3.0 RESULTS AND DISCUSSION
3.1 Summary Statistics and Normality Measures
This
study seeks to provide a short-term prediction of non-insulin-dependent
diabetes (Type-2 diabetes mellitus) among farmers in Benue State using the
Autoregressive Moving Average (ARMA) time series model. Before model
estimation, a preliminary analysis of the dataset was conducted to summarize
its key characteristics and assess the normality of the distribution. Table 1
below presents the descriptive statistics and normality test results for the
observed monthly diabetes cases.
Table 1: Summary Statistics and Normality Measures
| Variable | Statistic |
|---|---|
| Mean | 5571.321 |
| Maximum | 9661.00 |
| Minimum | 3624.000 |
| Standard Deviation | 1769.088 |
| Skewness | 0.010212 |
| Kurtosis | 1.767498 |
| Jarque-Bera Statistic | 15.57465 |
| p-value | 0.000415 |
| Number of Observations | 246 |
From the summary statistics and normality measures reported in Table 1 above,
the mean value of approximately 5571 cases indicates the average monthly number
of recorded non-insulin-dependent diabetes cases among farmers during the study
period, while the maximum and minimum values (9661 and 3624, respectively) show
the range of variation in the data. The standard deviation (1769), roughly 32%
of the mean, implies moderate variability in the monthly incidence of diabetes
cases.
The skewness value (0.010212),
being close to zero, indicates that the distribution of the series is
approximately symmetric. However, the kurtosis value (1.767498) is less than 3,
signifying a platykurtic distribution, that is, the data are relatively flatter
than a normal distribution with lighter tails.
The Jarque–Bera statistic
(15.57465) with an associated p-value of 0.000415 is statistically significant
at the 1% level, leading to the rejection of the null hypothesis of normality.
This implies that the series does not follow a perfectly normal distribution,
which is a common characteristic of real-world time series data.
Overall, the results suggest that while the data are
fairly symmetric, they deviate slightly from normality, a factor to be
considered when fitting and diagnosing the ARMA model for accurate short-term
forecasting.
3.2 Graphical Examination of Diabetes Mellitus Series
Examining the morbidity cases of diabetes mellitus is
essential for identifying trends and patterns over time, which can provide
insights into the progression and fluctuations of the disease within a
population. By analyzing these visual representations, healthcare providers and
policymakers can better understand peak periods, seasonal variations, and the
impact of interventions. This information is crucial for planning targeted
healthcare responses, optimizing resource allocation, and developing strategies
to reduce disease incidence and manage complications, ultimately improving
health outcomes for affected populations. The time plots of the level and
log-transformed series of diabetes mellitus are shown in Figures 1 and 2,
respectively.
The time plots of the level series and log-transformed series in Figures 1 and
2 suggest that both series are covariance (weakly) stationary, which points to
the absence of a unit root in the series in level. This is indicated by the
smooth trend of both series.
Figure 1: Time Series Plot of Diabetes Mellitus in Benue State from 2005 to 2025
Figure 2: Time Series Plot of Natural Log of Diabetes Mellitus in Benue State from 2005 to 2025
3.3 Augmented Dickey-Fuller (ADF) Unit Root Test Result
To ensure the
appropriateness of applying an Autoregressive Moving Average (ARMA) model for
short-term prediction of non–insulin-dependent diabetes cases among farmers in
Benue State, it is necessary to examine the time series properties of the data.
A key requirement for ARMA modeling is that the underlying series must be
stationary. Therefore, the Augmented Dickey–Fuller (ADF) unit root test was
conducted to determine whether the series is
stationary. Table 2 below presents the results of the ADF test under two
specifications: with an intercept only, and with both intercept and trend.
The ADF statistics reported in
Table 2 below for both model specifications (intercept only and intercept with
trend) are -15.3344 and -15.4304, respectively. These values are far more
negative than their corresponding 5% critical values (-2.8731 and -3.4283). In
addition, the associated p-values are 0.0000, indicating strong statistical
significance. Because the ADF test statistics are well below the critical
values and the p-values are less than 0.05, the null hypothesis of a unit root
is rejected under both model specifications. This confirms that the series is stationary in its level
form. Stationarity implies that the mean and variance of the diabetes case
series remain stable over time, making it suitable for direct ARMA modeling
without differencing. The strong evidence of stationarity enhances the reliability
of subsequent short-term forecasts produced by the ARMA model.
Table 2: Augmented Dickey-Fuller (ADF) Unit Root Test Result
| Variable | Option | ADF Test Statistic | p-value | 5% Critical Value |
|---|---|---|---|---|
|  | Intercept only | -15.3344 | 0.0000 | -2.8731 |
|  | Intercept & Trend | -15.4304 | 0.0000 | -3.4283 |
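To illustrate how a unit root $t$-ratio of this kind arises, the sketch below computes the simplest Dickey-Fuller statistic, from the regression $\Delta Y_t = c + \alpha Y_{t-1} + \varepsilon_t$ with an intercept and no lagged differences. It is a simplified stand-in, not the augmented test reported in Table 2: the function name is hypothetical, the example series is simulated white noise rather than the study's data, and the resulting statistic must be compared with Dickey-Fuller critical values (about -2.87 at the 5% level for this sample size), not with Student-t tables.

```python
import math
import random

def dickey_fuller_t(y):
    """t-ratio of alpha in  dy_t = c + alpha * y_{t-1} + e_t  (OLS)."""
    x = y[:-1]                                    # lagged level y_{t-1}
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    x_bar = sum(x) / n
    dy_bar = sum(dy) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - dy_bar) for xi, di in zip(x, dy))
    alpha = sxy / sxx                             # slope estimate
    c = dy_bar - alpha * x_bar                    # intercept estimate
    rss = sum((di - c - alpha * xi) ** 2 for xi, di in zip(x, dy))
    se_alpha = math.sqrt(rss / (n - 2) / sxx)     # standard error of alpha
    return alpha / se_alpha

# Simulated stationary series (white noise) of the same length as the data.
random.seed(0)
white_noise = [random.gauss(0.0, 1.0) for _ in range(246)]
t_stat = dickey_fuller_t(white_noise)
print(t_stat)
```

For a serially uncorrelated series, $\hat{\alpha}$ is close to -1 and its standard error is close to $1/\sqrt{T}$, so the $t$-ratio is strongly negative and the unit root null is rejected.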
3.4 Autocorrelation and Partial Autocorrelation Functions of the Series
After confirming that the
series of non–insulin-dependent diabetes cases among farmers in Benue State is
stationary, the next step in the ARMA modeling process involves examining the
autocorrelation structure of the series. The Autocorrelation Function (ACF) and
Partial Autocorrelation Function (PACF) are used to identify the dependence
pattern between current and past observations, which guides the selection of
appropriate autoregressive (AR) and moving-average (MA) orders.
Furthermore, the Ljung-Box
Q-statistics were computed to test for the joint significance of
autocorrelations up to various lags. This test determines whether the values
are independently distributed, a key requirement for model adequacy. Table 3
below presents the ACF, PACF, and Ljung-Box Q-statistics results for the series,
while Figure 3 below presents the ACF and PACF plots of the series.
The results of ACF and PACF
reported in Table 3 below and Figure 3 show that the autocorrelation (ACF) and
partial autocorrelation (PACF) coefficients for all lags are small in
magnitude, fluctuating around zero. This indicates the absence of significant
serial correlation in the data. None of the autocorrelations exceed the
approximate 95% confidence bounds ($\pm 1.96/\sqrt{246} \approx \pm 0.125$),
suggesting that the time series behaves like a white-noise process.
The Ljung-Box Q-statistics and
their corresponding p-values across all lags (p > 0.05) further confirm that
there is no significant autocorrelation remaining in the residuals. This means
that the null hypothesis of no autocorrelation cannot be rejected at any lag,
implying that the series is adequately described by a stationary stochastic
process (Ljung & Box, 1979).
Table 3: Autocorrelations and Ljung-Box Q-Statistics Test Results
| Lag | ACF | PACF | Q-Statistics | p-value |
|---|---|---|---|---|
| 1 | 0.014 | 0.014 | 0.0458 | 0.831 |
| 2 | -0.019 | -0.019 | 0.1338 | 0.935 |
| 3 | 0.004 | 0.005 | 0.1380 | 0.987 |
| 4 | -0.049 | -0.050 | 0.7497 | 0.945 |
| 5 | 0.022 | 0.024 | 0.8747 | 0.972 |
| 6 | 0.037 | 0.034 | 1.2165 | 0.976 |
| 7 | 0.022 | 0.023 | 1.3420 | 0.987 |
| 8 | 0.017 | 0.015 | 1.4126 | 0.994 |
| 9 | -0.007 | -0.005 | 1.4260 | 0.998 |
| 10 | -0.110 | -0.107 | 4.5659 | 0.918 |
| 11 | -0.025 | -0.022 | 4.7227 | 0.944 |
| 12 | 0.078 | 0.075 | 6.2944 | 0.901 |
| 13 | -0.008 | -0.012 | 6.3115 | 0.934 |
| 14 | -0.017 | -0.027 | 6.3907 | 0.956 |
| 15 | 0.052 | 0.055 | 7.0970 | 0.955 |
| 16 | -0.035 | -0.022 | 7.4226 | 0.964 |
| 17 | -0.012 | -0.008 | 7.4599 | 0.977 |
| 18 | -0.088 | -0.093 | 9.5213 | 0.946 |
| 19 | -0.054 | -0.050 | 10.302 | 0.945 |
| 20 | -0.092 | -0.114 | 12.567 | 0.895 |
| 21 | -0.026 | -0.032 | 12.750 | 0.917 |
| 22 | -0.115 | -0.115 | 16.369 | 0.797 |
| 23 | 0.007 | 0.008 | 16.381 | 0.838 |
| 24 | -0.053 | -0.074 | 17.165 | 0.842 |
| 25 | -0.056 | -0.036 | 18.032 | 0.841 |
| 26 | -0.047 | -0.056 | 18.643 | 0.851 |
| 27 | 0.055 | 0.057 | 19.482 | 0.852 |
| 28 | -0.011 | -0.032 | 19.514 | 0.882 |
| 29 | 0.060 | 0.057 | 20.511 | 0.876 |
| 30 | 0.056 | 0.042 | 21.381 | 0.876 |
| 31 | 0.040 | 0.061 | 21.828 | 0.888 |
| 32 | -0.001 | -0.015 | 21.828 | 0.912 |
| 33 | -0.027 | -0.007 | 22.036 | 0.927 |
| 34 | -0.109 | -0.121 | 25.432 | 0.855 |
| 35 | -0.056 | -0.074 | 26.342 | 0.854 |
| 36 | 0.066 | 0.025 | 27.604 | 0.841 |
Figure 3: Plots of ACF and PACF of Log Transformed Series
Collectively, these findings
suggest that the series is not driven by persistent temporal dependence, and
any ARMA model fitted to the data should yield uncorrelated and well-behaved
residuals. Therefore, the dataset is suitable for ARMA model identification and
estimation, and the absence of significant autocorrelation validates the
appropriateness of proceeding with short-term forecasting using the ARMA
framework.
3.5 Model Order Selection
Following the establishment of
stationarity and the absence of significant autocorrelation in the diabetes
time series, various ARMA model orders were estimated to determine the most
parsimonious and best-fitting specification for short-term prediction. Model
selection was based on several statistical criteria, including the Log
Likelihood (LogL), Akaike Information Criterion (AIC), Schwarz Information
Criterion (SIC), and Hannan–Quinn Criterion (HQC). Generally, the preferred
model is the one with the highest Log Likelihood and the lowest values of AIC,
SIC, and HQC. Table 4 below presents the results of the model order selection
process.
Among the twenty-four ARMA
model specifications estimated, the ARMA(3,3) model exhibits the highest Log
Likelihood value (-24.0103) and the lowest AIC (0.2552), SIC (0.3159), and HQC
(0.2958) values. These results indicate that the ARMA(3,3) model provides the
best balance between goodness-of-fit and parsimony.
Table 4: Model Order Selection using Log Likelihood and Information Criteria
| S/n | Model | LogL | AIC | SIC | HQC |
|---|---|---|---|---|---|
| 1. | ARMA(0,1) | -34.4597 | 0.2964 | 0.3349 | 0.3079 |
| 2. | ARMA(1,0) | -34.8194 | 0.3006 | 0.3391 | 0.3121 |
| 3. | ARMA(1,1) | -32.9444 | 0.2934 | 0.3363 | 0.3107 |
| 4. | ARMA(0,2) | -34.4107 | 0.3042 | 0.3469 | 0.3214 |
| 5. | ARMA(2,0) | -35.1256 | 0.3125 | 0.3555 | 0.3298 |
| 6. | ARMA(1,2) | -32.9256 | 0.3014 | 0.3586 | 0.3245 |
| 7. | ARMA(2,1) | -33.2988 | 0.3057 | 0.3631 | 0.3288 |
| 8. | ARMA(2,2) | -30.3771 | 0.2899 | 0.3616 | 0.3188 |
| 9. | ARMA(0,3) | -34.4060 | 0.3122 | 0.3692 | 0.3352 |
| 10. | ARMA(3,0) | -35.4688 | 0.3248 | 0.3823 | 0.3480 |
| 11. | ARMA(1,3) | -28.0912 | 0.2701 | 0.3616 | 0.3089 |
| 12. | ARMA(3,1) | -32.9028 | 0.3119 | 0.3838 | 0.3409 |
| 13. | ARMA(2,3) | -30.3708 | 0.2981 | 0.3841 | 0.3328 |
| 14. | ARMA(3,2) | -30.5304 | 0.3007 | 0.3859 | 0.3354 |
| 15. | ARMA(3,3)** | -24.0103 | 0.2552 | 0.3159 | 0.2958 |
| 16. | ARMA(0,4) | -34.1157 | 0.3180 | 0.3893 | 0.3467 |
| 17. | ARMA(4,0) | -35.3492 | 0.3335 | 0.4056 | 0.3625 |
| 18. | ARMA(1,4) | -34.4466 | 0.3302 | 0.4159 | 0.3647 |
| 19. | ARMA(4,1) | -35.3432 | 0.3417 | 0.4282 | 0.3765 |
| 20. | ARMA(2,4) | -32.0099 | 0.3198 | 0.4201 | 0.3602 |
| 21. | ARMA(4,2) | -26.7027 | 0.2785 | 0.3795 | 0.3192 |
| 22. | ARMA(3,4) | -25.4065 | 0.2799 | 0.3899 | 0.3213 |
| 23. | ARMA(4,3) | -33.4797 | 0.3428 | 0.4581 | 0.3893 |
| 24. | ARMA(4,4) | -31.4253 | 0.2962 | 0.4060 | 0.3285 |
Therefore, based on the
information criteria, the ARMA(3,3) model is selected as the optimal model for
forecasting short-term variations in non–insulin-dependent diabetes cases among
farmers in Benue State. This suggests that both autoregressive and moving
average components up to the third order significantly contribute to capturing
the dynamic structure of the series.
3.6 Parameter Estimates of ARMA(3,3) Model
After selecting
the ARMA(3,3) model as the optimal specification based on the information
criteria, the model parameters were estimated to evaluate the dynamic
relationship between past observations and random disturbances in the series of
non–insulin-dependent diabetes cases among farmers in Benue State. Table 5
below presents the estimated coefficients of the ARMA(3,3) model, along with
their corresponding standard errors, t-statistics, and p-values.
Goodness-of-fit measures such as the R-squared, Adjusted R-squared,
F-statistic, and Durbin–Watson statistic are also reported to assess the
adequacy of the fitted model.
Table 5: Parameter Estimates of ARMA(3,3) Model
| Variable | Coefficient | Std. Error | t-Statistic | p-value |
|---|---|---|---|---|
| C | 8.768664 | 0.017218 | 509.2761 | 0.0000 |
| AR(1) | 0.366096 | 0.024641 | 14.85713 | 0.0000 |
| AR(2) | 0.311203 | 0.029382 | 10.59171 | 0.0000 |
| AR(3) | -0.912359 | 0.024212 | -37.68166 | 0.0000 |
| MA(1) | -0.372828 | 0.009593 | -38.86277 | 0.0000 |
| MA(2) | -0.386923 | 0.009312 | -41.55086 | 0.0000 |
| MA(3) | 0.982389 | 0.007644 | 128.5160 | 0.0000 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.890511 | AIC | 0.255229 |
| Adjusted R2 | 0.867389 | SIC | 0.315852 |
| F-statistic | 6.914400 | HQC | 0.295759 |
| Prob(F-stat.) | 0.000951 | Durbin-Watson stat. | 2.011502 |
The model estimation results
reported in Table 5 show that all autoregressive (AR) and moving average (MA)
coefficients are statistically significant at the 1% level, as indicated by
their very low p-values (p < 0.01). This implies that past values and past
error terms up to the third lag significantly influence the current level of
non–insulin-dependent diabetes cases among farmers.
Specifically, the positive
coefficients of AR(1) and AR(2) suggest a direct persistence effect, meaning
that increases in diabetes cases in the immediate past periods tend to raise
current cases. Conversely, the negative AR(3) coefficient indicates a corrective
mechanism, implying that after about three periods, the series tends to revert
toward its mean. The MA(1) and MA(2) coefficients are negative while the MA(3)
coefficient is positive, suggesting that short-term shocks have both dampening
and amplifying effects over time before dissipating.
The high R-squared (0.8905) and
adjusted R-squared (0.8674) values indicate that approximately 89% of the
variation in diabetes cases is explained by the model, signifying a very good
fit. The F-statistic (6.9144) with a significant probability value (0.000951)
confirms the overall significance of the model. The Durbin–Watson statistic
(2.0115) is close to 2, suggesting the absence of serial correlation in the
residuals, while the information criteria (AIC = 0.2552, SIC = 0.3159, HQC =
0.2958) reaffirm that the ARMA(3,3) model remains the most parsimonious and
efficient choice.
Overall, the ARMA(3,3) model adequately captures
the temporal dynamics and short-term fluctuations in non–insulin-dependent
diabetes cases among farmers in Benue State, making it suitable for reliable
short-term forecasting.
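The way the estimated coefficients generate forecasts can be sketched as a simple recursion. The code assumes the mean-adjusted form $Y_t = C + u_t$, with $u_t = \phi_1 u_{t-1} + \phi_2 u_{t-2} + \phi_3 u_{t-3} + \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \theta_3\varepsilon_{t-3}$, which is an assumption about how the estimation software parameterises the constant. The coefficient values are taken from Table 5; the function name and the input values in the example are hypothetical.

```python
# Coefficients from Table 5 (log scale).
C = 8.768664
PHI = [0.366096, 0.311203, -0.912359]      # AR(1), AR(2), AR(3)
THETA = [-0.372828, -0.386923, 0.982389]   # MA(1), MA(2), MA(3)

def forecast_arma(last_y, last_eps, steps, c=C, phi=PHI, theta=THETA):
    """Multi-step forecasts from an ARMA model in mean-adjusted form.

    last_y   : most recent observations, oldest first (at least len(phi))
    last_eps : most recent residuals, oldest first (at least len(theta))
    """
    u = [y - c for y in last_y]            # deviations from the mean
    eps = list(last_eps)
    forecasts = []
    for _ in range(steps):
        ar_part = sum(p * u[-i] for i, p in enumerate(phi, start=1))
        ma_part = sum(t * eps[-j] for j, t in enumerate(theta, start=1))
        u_next = ar_part + ma_part
        u.append(u_next)
        eps.append(0.0)                    # future shocks have expectation zero
        forecasts.append(c + u_next)
    return forecasts

# Hypothetical starting values: series at its mean, no recent shocks.
print(forecast_arma([C, C, C], [0.0, 0.0, 0.0], steps=3))
```

When the series starts at its mean with no recent shocks, every forecast equals the constant $C = 8.7687$, which corresponds to about 6,430 cases after exponentiation. Beyond three steps the MA terms no longer contribute, and the forecasts are driven by the AR recursion alone.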
3.7 Model Diagnostic Checks
Following the
estimation of the ARMA(3,3) model for predicting non-insulin-dependent diabetes
cases among farmers in Benue State, diagnostic checks, namely a
multicollinearity test and Ljung-Box Q-statistic tests, were conducted to verify
the adequacy of the fitted model. This assessment ensures that the residuals
behave like white noise, that is, they are uncorrelated, homoscedastic, and
pattern-free over time. The tests are presented in the following subsections.
3.7.1 Multicollinearity test result
Multicollinearity
diagnostics were performed to check that the regressors in the ARMA(3,3) model
were not highly correlated with one another. Using
the Variance Inflation Factor (VIF) for each autoregressive (AR) and moving
average (MA) term, the test assessed how multicollinearity might affect the
stability and reliability of parameter estimates. Generally, VIF values above
10 indicate severe multicollinearity, values between 5 and 10 suggest moderate
correlation, and values below 5 imply no serious concern. The results presented
in Table 6 show both uncentered and centered VIF statistics for the ARMA(3,3)
model parameters.
The results of the multicollinearity
test reported in Table 6 below reveal that all centered VIF values are
considerably low, ranging between 1.11 and 2.55, which are far below the
critical threshold of 10. This indicates that there is no serious multicollinearity
among the explanatory variables (AR and MA terms) in the estimated ARMA(3,3)
model.
Therefore, the
estimated parameters are statistically reliable, and the standard errors are
not inflated by multicollinearity. This implies that the ARMA (3,3) model is
well-conditioned, and the coefficients can be interpreted with confidence.
Table 6: Test for Multicollinearity (Variance Inflation Factors)
| Variable | Coefficient Variance | Uncentered VIF | Centered VIF |
|---|---|---|---|
| C | 0.000296 | 1.018813 | NA |
| AR(1) | 0.000607 | 1.779456 | 1.779044 |
| AR(2) | 0.000863 | 2.552345 | 2.552344 |
| AR(3) | 0.000586 | 1.768375 | 1.768101 |
| MA(1) | 9.20E-05 | 1.257613 | 1.255458 |
| MA(2) | 8.67E-05 | 1.213557 | 1.203709 |
| MA(3) | 5.84E-05 | 1.121942 | 1.111356 |
3.7.2 Ljung-Box Q-statistic test result for serial correlation
The
Autocorrelation Function (ACF), Partial Autocorrelation Function (PACF), and
Ljung–Box Q-statistics were used to test for serial correlation. High p-values
(greater than 0.05) for the Q-statistics indicate no significant
autocorrelation, suggesting that the residuals are random and the model is well
specified. Table 7 presents these diagnostic test results for the ARMA(3,3)
model residuals.
The results of the Q-statistics
reported in Table 7 and the ACF and PACF plots reported in Figure 4 show
that all residual autocorrelations (ACF and PACF) are very small and fluctuate
closely around zero across all 36 lags. None of the autocorrelation coefficients
appear significant, suggesting that the residuals from the ARMA(3,3) model are
approximately white noise.
Furthermore, the Ljung–Box
Q-statistics have p-values consistently greater than 0.05, indicating that the
null hypothesis of no autocorrelation cannot be rejected at any lag. This
confirms that there is no statistically significant serial correlation remaining
in the residuals. In addition, the Durbin–Watson statistic from the model
estimation (2.0115) supports this conclusion by indicating near-zero
autocorrelation in the residuals.
Overall, these diagnostic results confirm that
the ARMA(3,3) model is well specified, the residuals are independently and
randomly distributed, and the model provides a statistically adequate fit to
the data. Therefore, the model is suitable for reliable short-term forecasting
of non-insulin-dependent diabetes cases among farmers in Benue State.
Table 7: Autocorrelations and Ljung-Box Q-Statistic Test Results of Residuals
| Lag | ACF | PACF | Q-Statistics | p-value |
|---|---|---|---|---|
| 1 | -0.024 | -0.024 | 0.1415 | 0.707 |
| 2 | -0.012 | -0.012 | 0.1760 | 0.916 |
| 3 | -0.069 | -0.070 | 1.3558 | 0.716 |
| 4 | 0.007 | 0.003 | 1.3669 | 0.850 |
| 5 | -0.126 | -0.128 | 5.3247 | 0.378 |
| 6 | -0.036 | -0.048 | 5.6541 | 0.463 |
| 7 | -0.017 | -0.024 | 5.7294 | 0.572 |
| 8 | 0.142 | 0.124 | 10.812 | 0.213 |
| 9 | -0.042 | -0.042 | 11.254 | 0.259 |
| 10 | 0.046 | 0.032 | 11.802 | 0.299 |
| 11 | -0.021 | -0.015 | 11.918 | 0.370 |
| 12 | 0.052 | 0.044 | 12.628 | 0.397 |
| 13 | -0.025 | 0.012 | 12.794 | 0.464 |
| 14 | -0.009 | -0.008 | 12.815 | 0.541 |
| 15 | 0.062 | 0.080 | 13.804 | 0.540 |
| 16 | 0.068 | 0.053 | 15.019 | 0.523 |
| 17 | 0.112 | 0.147 | 18.316 | 0.369 |
| 18 | 0.109 | 0.127 | 21.475 | 0.256 |
| 19 | -0.008 | 0.027 | 21.493 | 0.310 |
| 20 | -0.087 | -0.066 | 23.529 | 0.264 |
| 21 | -0.066 | -0.032 | 24.707 | 0.260 |
| 22 | -0.020 | 0.010 | 24.810 | 0.306 |
| 23 | -0.062 | -0.057 | 25.855 | 0.308 |
| 24 | -0.048 | -0.064 | 26.480 | 0.329 |
| 25 | 0.021 | -0.044 | 26.599 | 0.376 |
| 26 | 0.020 | -0.037 | 26.704 | 0.425 |
| 27 | -0.033 | -0.069 | 27.003 | 0.464 |
| 28 | 0.065 | 0.050 | 28.156 | 0.456 |
| 29 | 0.052 | 0.030 | 28.898 | 0.470 |
| 30 | 0.062 | 0.046 | 29.969 | 0.467 |
| 31 | 0.014 | 0.040 | 30.023 | 0.516 |
| 32 | 0.010 | 0.016 | 30.053 | 0.565 |
| 33 | 0.042 | 0.050 | 30.555 | 0.589 |
| 34 | 0.003 | 0.004 | 30.558 | 0.637 |
| 35 | -0.039 | -0.013 | 30.994 | 0.662 |
| 36 | -0.008 | -0.001 | 31.014 | 0.705 |
Figure 4: Plot of Correlogram of Residuals of Estimated ARMA(3,3) Model
3.8 Forecast and Forecast Evaluation
To
evaluate the predictive performance of the ARMA(3,3) model in forecasting
non–insulin-dependent diabetes cases among farmers in Benue State, forecast
accuracy measures were computed. The Root Mean Squared Error (RMSE), Mean
Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) were used to
assess both in-sample and out-of-sample forecast accuracy. Lower values of
these statistics indicate better model performance and predictive reliability.
The result is presented in Table 8.
The results of the
forecast comparison reported in Table 8 below show that the out-of-sample
forecast achieved slightly lower RMSE (0.2671), MAE (0.2310), and MAPE (2.6490)
values than the in-sample forecast (RMSE = 0.2715, MAE = 0.2446, MAPE =
2.6781). This suggests that the ARMA(3,3) model demonstrates strong predictive
capability, with minimal forecast error and good generalization performance.
The out-of-sample forecast mode, which the accuracy measures favour, therefore
provides reliable short-term predictions of non-insulin-dependent
diabetes cases.
Table 8: Forecast Comparison using Accuracy Measures
| Forecast mode | RMSE | MAE | MAPE |
|---|---|---|---|
| In-Sample | 0.271510 | 0.244615 | 2.678116 |
| Out-of-Sample** | 0.267100 | 0.231048 | 2.649005 |
Note: ** denotes forecast mode selected by accuracy measures.
3.8.1 Forecast of Diabetes Mellitus in Benue State from July 2025 to June 2027
To evaluate the
short-term predictive performance of the ARMA(3,3) model, forecasts of
non–insulin-dependent diabetes (Type-2 Diabetes Mellitus) cases among farmers
in Benue State were generated for the period July 2025 to June 2027. The
forecasts were computed in natural logarithmic form and then converted to
numbers of persons by exponentiation. For each forecast, the standard error,
lower confidence limit (LCL), and upper confidence limit (UCL) were calculated
at a 95% confidence level, using forecast $\pm$ 1.96 $\times$ standard error on
the log scale. These values provide a range within
which the true number of diabetes cases is expected to fall with high
probability, thereby indicating the reliability and uncertainty of the
forecasts. The forecast results are reported in Table 9 below, while the
forecast graph is presented as Figure 5 below.
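The conversion from a log-scale forecast to a case count with 95% limits can be sketched as follows. The function name is hypothetical, and the multiplier of 1.96 is an assumption that can be checked against the tabulated limits; the input values are the July 2025 log forecast and standard error from Table 9.

```python
import math

def back_transform(log_forecast, std_error, z=1.96):
    """Convert a log-scale forecast and its standard error to
    (LCL, point forecast, UCL) on the original scale."""
    lcl = math.exp(log_forecast - z * std_error)
    point = math.exp(log_forecast)
    ucl = math.exp(log_forecast + z * std_error)
    return lcl, point, ucl

# July 2025 row of Table 9: log forecast 8.77405, standard error 0.271243.
lcl, point, ucl = back_transform(8.77405, 0.271243)
print(round(lcl), round(point), round(ucl))
```

For the July 2025 row this gives approximately 3,799, 6,464 and 11,000 persons, matching the LCL, forecast and UCL columns of Table 9. Because the limits are symmetric on the log scale, they are asymmetric around the point forecast on the original scale.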
Table 9: "Forecast
of Diabetes Miletus Infection Cases in Benue State from July 2025-
June, 2027"
| Year:Month | Forecast (natural log form) | Std. error (natural log form) | LCL (No. of Persons) | Forecast (No. of Persons) | UCL (No. of Persons) |
|---|---|---|---|---|---|
| 2025:06 | 6.9967 | --- | --- | 8896 | --- |
| 2025:07 | 8.77405 | 0.271243 | 3799 | 6464 | 11000 |
| 2025:08 | 8.72655 | 0.271669 | 3619 | 6165 | 10499 |
| 2025:09 | 8.78204 | 0.271670 | 3826 | 6516 | 11098 |
| 2025:10 | 8.77132 | 0.272065 | 3782 | 6447 | 10988 |
| 2025:11 | 8.80141 | 0.272672 | 3893 | 6644 | 11337 |
| 2025:12 | 8.74519 | 0.272672 | 3680 | 6281 | 10717 |
| 2026:01 | 8.76088 | 0.272790 | 3738 | 6380 | 10889 |
| 2026:02 | 8.74585 | 0.273455 | 3677 | 6285 | 10741 |
| 2026:03 | 8.79725 | 0.273466 | 3871 | 6616 | 11308 |
| 2026:04 | 8.77366 | 0.273476 | 3781 | 6462 | 11044 |
| 2026:05 | 8.77825 | 0.274040 | 3794 | 6492 | 11107 |
| 2026:06 | 8.73648 | 0.274110 | 3638 | 6226 | 10654 |
| 2026:07 | 8.76803 | 0.274114 | 3755 | 6426 | 10996 |
| 2026:08 | 8.76810 | 0.274473 | 3752 | 6426 | 11005 |
| 2026:09 | 8.79729 | 0.274652 | 3862 | 6616 | 11335 |
| 2026:10 | 8.76026 | 0.274669 | 3722 | 6376 | 10923 |
| 2026:11 | 8.76113 | 0.274824 | 3724 | 6381 | 10936 |
| 2026:12 | 8.74504 | 0.275111 | 3662 | 6279 | 10767 |
| 2027:01 | 8.78341 | 0.275121 | 3805 | 6525 | 11188 |
| 2027:02 | 8.77734 | 0.275152 | 3782 | 6486 | 11121 |
| 2027:03 | 8.78223 | 0.275481 | 3798 | 6517 | 11183 |
| 2027:04 | 8.74716 | 0.275481 | 3667 | 6293 | 10798 |
| 2027:05 | 8.76058 | 0.275481 | 3717 | 6378 | 10944 |
| 2027:06 | 8.76313 | 0.275759 | 3724 | 6394 | 10978 |
| Total | 210.40663 | | | 154075 | |
| Average | 8.766942917 | | | 6419.7917 | |
Note: For 95% confidence intervals, $z_{0.025} = 1.96$. LCL and UCL denote the
lower and upper confidence limits respectively.
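The conversion from the log-form columns of Table 9 to case counts can be sketched in a few lines. This is a minimal illustration, assuming the limits are obtained by exponentiating the log forecast plus or minus 1.96 standard errors; the function name `back_transform` and the rounding to whole persons are illustrative choices, not taken from the study's software. Under that assumption the sketch reproduces the July 2025 row of the table.

```python
import math

Z_95 = 1.96  # assumed two-sided 95% normal critical value


def back_transform(log_forecast, log_std_error, z=Z_95):
    """Convert a log-scale forecast and its standard error to case counts.

    Returns (lcl, forecast, ucl), each rounded to whole persons.
    """
    half_width = z * log_std_error           # interval half-width on the log scale
    lcl = math.exp(log_forecast - half_width)
    point = math.exp(log_forecast)
    ucl = math.exp(log_forecast + half_width)
    return round(lcl), round(point), round(ucl)


# July 2025 row of Table 9: log forecast 8.77405, std. error 0.271243
lcl, point, ucl = back_transform(8.77405, 0.271243)
print(lcl, point, ucl)
```

Because the interval is symmetric on the log scale, it is asymmetric in persons: the upper limit lies further from the point forecast than the lower limit does.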
Figure 5: Forecast Graph of Diabetes Mellitus in Benue State, July 2025 to June 2027
The forecast results reported
in Table 9 and Figure 5 reveal that the predicted number of
non-insulin-dependent diabetes cases among farmers in Benue State is expected
to fluctuate moderately over the two-year forecast horizon (July 2025 to June 2027).
The monthly point forecasts range between 6,165 and 6,644 cases, with 95%
confidence limits spanning approximately 3,600 to 11,300 cases, an overall
average of about 6,420 cases per month and a total forecast of 154,075 cases
over the forecast period. The forecast standard errors rise only marginally
across months (from 0.2712 to 0.2758 in log form), so the width of the
confidence intervals is nearly constant over the horizon.
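The summary figures quoted above can be checked directly against Table 9. The sketch below uses the 24 monthly values transcribed from the table's Forecast (No. of Persons) column and computes their range, total and mean; the variable names are illustrative. It also makes visible that the point forecasts themselves vary within a much tighter band than the confidence limits of roughly 3,600 to 11,300.

```python
from statistics import mean

# Point forecasts (persons) for 2025:07 through 2027:06, transcribed from Table 9
point_forecasts = [
    6464, 6165, 6516, 6447, 6644, 6281,
    6380, 6285, 6616, 6462, 6492, 6226,
    6426, 6426, 6616, 6376, 6381, 6279,
    6525, 6486, 6517, 6293, 6378, 6394,
]

total = sum(point_forecasts)
average = mean(point_forecasts)
lowest, highest = min(point_forecasts), max(point_forecasts)

print(f"months={len(point_forecasts)} total={total} average={average:.4f}")
print(f"range of point forecasts: {lowest} to {highest}")
```

The total and average agree with the Total and Average rows of Table 9, which exclude the 2025:06 row.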
Overall, the ARMA(3,3) model
demonstrates strong forecasting capability, indicating that diabetes prevalence
among farmers in Benue State is likely to remain fairly stable with mild
month-to-month variations over the forecast period.
4.9 Implications of the Study for Farmers and Postharvest Losses in Benue State
The implications of this study for farmers and
postharvest losses in Benue State are significant from both public health and
socio-economic perspectives. The findings, which forecast the prevalence of
non-insulin-dependent diabetes (Type 2 Diabetes Mellitus) among farmers,
suggest that a substantial portion of the agricultural workforce may experience
declining health and productivity over time. Poor health conditions such as
diabetes can reduce farmers’ physical capacity to engage in strenuous agricultural
activities, particularly during critical periods like harvesting and
processing. This in turn increases the
likelihood of postharvest losses, as crops may remain unharvested or
inadequately stored due to reduced labour efficiency and absenteeism resulting
from illness.
Moreover, higher
diabetes prevalence among farmers implies increased medical expenditures and a
diversion of household income away from agricultural investment, further
compounding the problem of low productivity and waste. The study underscores
the urgent need for integrated health and agricultural policies, including
improved rural healthcare services, regular medical screening, health education
on diet and lifestyle, and the promotion of labour-saving technologies, to
mitigate the dual burden of disease and postharvest losses. Ultimately,
addressing the health challenges of farmers is
crucial for achieving food security, sustaining agricultural livelihoods, and
enhancing overall economic resilience in Benue State.
5.0 Conclusion
The study demonstrates that the ARMA(3,3) model effectively
forecasts the incidence of non-insulin-dependent diabetes among farmers in
Benue State, Nigeria. The analysis revealed that the
ARMA(3,3) model provided the best fit based on information criteria and
diagnostic tests, with residuals behaving like white noise, indicating a
well-specified and reliable model. The forecasts from July 2025 to June 2027 suggest
a steady and relatively high incidence of diabetes cases among farmers,
implying that the disease poses an ongoing public health concern within the
agricultural population. This condition could adversely affect farmers’
productivity, increase medical costs, and indirectly contribute to higher
postharvest losses due to reduced labour availability and inefficiencies in
farm management. These findings highlight the interconnectedness between health
and agricultural output, emphasizing that the burden of chronic diseases like
diabetes extends beyond healthcare into the realm of food security and economic
stability. Therefore, proactive health interventions and policy integration
between the health and agricultural sectors are vital. Ensuring farmers’
wellness through preventive care, early detection, and education can
significantly reduce the impact of diabetes and its broader economic
consequences. The study provides empirical evidence to guide policymakers,
healthcare providers, and agricultural development agencies in formulating
context-specific strategies to improve both health outcomes and agricultural
sustainability in Benue State.
REFERENCES
Al Zahrani, S., Al
Rahman Al Sameeh, F., Musa, A. C. M., & Shokeralla, A. A. A. (2020). Forecasting diabetes patients
attendance at Al-Baha hospitals using autoregressive fractional integrated moving average (ARFIMA) models. Journal of Data Analysis and Information
Processing, 8, 183-194.
American College of
Obstetricians and Gynecologists. (2018). ACOG Practice Bulletin No. 190:
Gestational Diabetes Mellitus. Obstetrics
& Gynecology, 131(2), e49–e64.
American Diabetes
Association. (2022). Diagnosis and Classification of Diabetes Mellitus. Diabetes Care, 45(1), S17-S38.
Atkinson, M. A.,
Eisenbarth, G. S., & Michels, A. W. (2014). Type 1 diabetes. The Lancet, 383(9911), 69–82.
Benue State Epidemiological Unit, Makurdi,
Nigeria.
(Unpublished secondary data on type-2 diabetes incidence, 2005–2025).
Box, G. E. P., Jenkins,
G. M., & Reinsel, G. C. (2015). Time series analysis: forecasting and
control (5th ed.). John Wiley & Sons.
Jarque, C. M., & Bera, A. K. (1980). Efficient tests for
normality, homoscedasticity and serial independence of regression residuals.
Economics Letters, 6(3), 255–259.
Jarque, C. M., & Bera, A. K. (1987). A test for normality
of observations and regression residuals. International Statistical Review,
55(2), 163–172.
Cloete, L. (2022).
Diabetes mellitus: an overview of the types, symptoms, complications and
management. Nursing Standard,
37(1), 61-66.
Deberneh,
H. M. & Kim, I. (2021). Prediction of type 2 diabetes based on machine
learning algorithm. International Journal
of Environmental Research and Public Health, 18, 3317-3329.
Desai, S. &
Deshmukh, A. (2020). Mapping of type-1 diabetes mellitus. Current Diabetes
Reviews, 16(5), 438-441.
Dickey, D. A., & Fuller, W. A. (1979). Distribution
of the Estimators for Autoregressive Time Series with a Unit Root. Journal
of the American Statistical Association, 74(366), 427–431.
https://doi.org/10.2307/2286348
Diogo,
M. V., Nunopombo, F., & Brandão, P. (2022). Hypoglycemia prediction models
with auto explanation. IEEE Access, 10,
57930-57941.
Donath, M. Y., & Shoelson,
S. E. (2011). Type-2 diabetes as an inflammatory disease. Nature Reviews
Immunology, 11(2), 98–107.
Hannan, E. J., & Quinn, B. G. (1979). The determination
of the order of an autoregression. Journal of the Royal Statistical Society:
Series B, 41(2), 190–195.
Casella, G., & Berger, R. L. (2002). Statistical
Inference (2nd ed.). Duxbury.
Ljung, G. M., & Box, G. E. P. (1978). On a measure
of lack of fit in time series models. Biometrika, 65(2), 297–303.
Akaike, H. (1974). A new look at the
statistical model identification. IEEE Transactions on Automatic Control,
19(6), 716–723.
Huang, Y., Vemer, P.,
Zhu, J., & Postma, M. J. (2016). The economic burden of diabetes mellitus
in rural southwest China. International
Journal of Environmental Research and Public Health, 13(9), 875-889.
International Diabetes
Federation. (2019). IDF Diabetes Atlas, 9th Edition. Brussels, Belgium:
International Diabetes Federation. https://www.diabetesatlas.org/en/
Jaeger, B., Casanova, R., Demesie,
Y., Stafford, J., Wells, B., & Bancks, M. P. (2025). Development and
Validation of a Diabetes Risk Prediction Model With Individualized Preventive
Intervention Effects. The Journal of Clinical Endocrinology and Metabolism,
110(12), e4023–e4029. https://doi.org/10.1210/clinem/dgaf250
Kahn, S. E., Cooper, M. E., &
Del Prato, S. (2014). Pathophysiology and treatment of type-2 diabetes:
perspectives on the past, present, and future. The Lancet, 383(9922),
1068–1083.
Katsarou, D. N., Georga, E. I.,
Christou, M., Tigas, S., Papaloukas, C., & Fotiadis, D. I. (2022). Short
term glucose prediction in patients with type-1 diabetes mellitus. Annual International Conference of IEEE
Engineering, Medical & Biological Society, 2022, 329-332.
Ljung, G. M., & Box, G. E. P. (1979). The
Likelihood Function of Stationary Autoregressive-Moving Average Models. Biometrika,
66(2), 265. https://doi.org/10.2307/2335657
Ma, N., Zhao, Y.,
Wen, S., Yang, T., Wu, R., Tao, R., Yu, X., & Li, H. (2020). Online blood
glucose prediction using autoregressive moving average model with residual
compensation network. Journal of science,
12(2), 115-128.
Matthew,
P. K., Timothy, K. N., Ajia, R., & Antyev, S. (2022). Time series modelling
of diabetes disease in Taraba state, Nigeria. Science World Journal, 17(3), 406-412.
Olamoyegun, M. A., Alare, K., Afolabi, S. A., Aderinto,
N., & Adeyemi, T. (2024). A systematic review
and meta-analysis of the prevalence and risk factors of type 2 diabetes
mellitus in Nigeria. Clinical Diabetes and Endocrinology, 10(1). https://doi.org/10.1186/s40842-024-00209-1
Olivares-Vera,
D. A., Gutiérrez-Hernández, D. A., Escobar-Acevedo, M. A., Lara-Rendón, C.,
& Velázquez-Vázquez, D. A. (2021). Comparison of algorithms for the
prediction of glucose levels in patients with diabetes. Nova Scientia, 13(2), 1-19.
Powers, A. C.,
D'Alessio, D., & Endocrine Society. (2016). Diabetes Mellitus: Diagnosis,
Classification, and Pathophysiology. In Endotext. MDText.com, Inc.
Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice (3rd
ed.). OTexts. Available online: https://otexts.com/fpp3/
Robertson, R. P. (2004).
Chronic oxidative stress as a central mechanism for glucose toxicity in
pancreatic islet beta cells in diabetes. Journal
of Biological Chemistry, 279(41), 42351-42354.
Rodríguez-Rodríguez,
I., Chatzigiannakis, L., Rodríguez, J., Maranghi, M., Gentili, M., &
Zamora-Izquierdo, M. (2019). Utility of big data in predicting short-term blood
glucose levels in type 1 diabetes mellitus through machine learning techniques.
Sensors, 19, 4482-4498.
Schwarz, G. (1978). Estimating the Dimension of a
Model. The Annals of Statistics, 6(2), 461–464.
https://doi.org/10.1214/aos/1176344136
Ross, S. M. (2014). Introduction to Probability and Statistics for Engineers and Scientists
(5th ed.). Academic Press.
Singye,
T. & Unhapipat, S. (2018). Time series analysis of diabetes patients: A case
study of Jigme Dorji Wangchuk National Referral Hospital in Bhutan. Journal of Physics: Conference Series, 1039,
1-11.
Smith, S. M., Boppana,
A., Traupman, J. A., Unson, E., Maddock, D. A., Chao, K., Dobesh, D. P.,
Brufsky, A., & Connor, R. I. (2021). Impaired glucose metabolism in
patients with diabetes, prediabetes, and obesity is associated with severe
COVID-19. Journal of Medical Virology, 93(1),
409-415.
Makridakis, S., Wheelwright, S. C., & Hyndman, R. J. (1998). Forecasting: Methods
and Applications (3rd ed.). Wiley.
Sun, Y., Tao, Q., Wu,
X., Zhang, L., Liu, Q., & Wang, L. (2021). The utility of exosomes in
diagnosis and therapy of diabetes mellitus and associated complications. Frontiers in Endocrinology (Lausanne), 12,
75-88.
Teran, A. D. (2017). Effects of
diabetic prevalence and mortality on households farm labour productivity in Benue State. IOSR Journal
of Agriculture and Veterinary Science, 10(7), 63-72.
Villani,
M., Nanayakkara, N., Ranasinha, S., Earnest, A., Smith, K., Soldatos, G.,
Teede, H. & Zoungas, S. (2017). Utilisation of prehospital emergency medical
services for hyperglycemia: a community-based observational study. PLoS ONE, 12, e0182413.
Wang, J., Zhang, T., Lu,
X., Zhang, H., Dong, Y., & Chen, X. (2019). Application of ARIMA model in
forecasting diabetes mellitus mortality in China from 2019 to 2023. Chinese Journal of Preventive Medicine, 53(11),
1121–1125.
Whiting, D. R.,
Guariguata, L., Weil, C., & Shaw, J. (2011). IDF diabetes atlas: global
estimates of the prevalence of diabetes for 2011 and 2030. Diabetes Research and Clinical Practice, 94(3), 311–321.
Zhu,
D., Zhou, D., Li, N., & Han, B. (2022). Predicting Diabetes and Estimating
Its Economic Burden in China Using Autoregressive Integrated Moving Average
Model. International Journal of Public
Health, 66, 1604449.
Zhu, H., Capistrant, B.
D., & Peng, Y. (2017). Investigating the impact of socioeconomic factors,
air quality, and built environment on diabetes mellitus in China. Journal of Environmental and Public Health,
2017, 1–12.