Air quality in the Middle East has become a major environmental and public health crisis because particulate matter concentrations exceed World Health Organization (WHO) guideline values. This study models the temporal relationships in Middle Eastern air pollution with Long Short-Term Memory (LSTM) neural networks in order to improve predictions. We examine 15 regional urban centers using 450 data points gathered across 30 days, comprising meteorological measurements together with readings of PM2.5, PM10, NO2, O3, CO and SO2. The proposed LSTM model uses a three-layer architecture (128, 64, and 32 units) with tuned hyperparameters, L2 regularization (0.001) and dropout regularization (0.3). A sliding window supplies the network with seven consecutive days of input for each forecast, extending the temporal context beyond single-day approaches. For PM2.5 prediction, the model achieves a Root Mean Square Error (RMSE) of 23.39 μg/m³, a Mean Absolute Error (MAE) of 19.90 μg/m³ and a Mean Absolute Percentage Error (MAPE) of 30.07%, outperforming the Linear Regression and Random Forest baselines. More than 90% of the fine particulate matter in the area comes from human-made sources, and the data show seasonal and inter-city differences. The study addresses common methodological shortcomings through time series cross-validation, baseline model comparisons, and a detailed description of the model design. The results indicate that LSTM networks can identify complex temporal patterns in Middle Eastern air pollution data for public health and environmental monitoring applications. The research presents a prediction system for arid and semi-arid areas with complex weather patterns and elevated human-generated pollution, and contributes to the environmental informatics knowledge base.
Air pollution is one of the primary environmental crises facing the Middle East in the twenty-first century and causes severe damage across the region. The pollution environment is complex because of the region's particular geographic, climatic, and socioeconomic conditions, which call for advanced analytical methods for efficient monitoring and forecasting [1]. The Middle East is among the regions with the highest particulate matter concentrations worldwide: its annual PM2.5 averages exceed 80 μg/m³, while the World Health Organization suggests 15 μg/m³ as the maximum [2].
Several connected elements make up the complex air pollution systems in the Middle East. Dust storms regularly affect the region because it lies within the global "dust belt", and its air quality has worsened because of fast industrialization and urbanization [3]. Although desert dust was traditionally regarded as the primary source of air pollution, research indicates that human activities such as energy production, vehicle emissions, and industrial waste account for more than 90% of the fine particulate matter in the area [4].
The time-dependent nature of air pollution data creates major obstacles for traditional statistical modeling approaches. Atmospheric pollutant concentrations follow complicated non-linear patterns that change with seasonal variations, human activity cycles, and meteorological conditions. Air pollution prediction is therefore a classic time series forecasting problem, in which models need to detect both short-term temporal patterns and long-range relationships within the data [5]. LSTM networks have proven successful in handling these challenges because they detect temporal relationships better than traditional methods [6]. Their architecture enables them to process extended temporal sequences by mitigating the vanishing gradient problem of traditional recurrent neural networks [7]. Although researchers have achieved promising results using LSTM networks to predict air pollution worldwide, comprehensive studies remain scarce for the Middle East region.
This study addresses several critical gaps in the existing literature. First, previous research mostly employed simplified LSTM architectures and gave little detail about model optimization and hyperparameter selection techniques [4], [8]. Second, the majority of air pollution prediction studies use input windows of one day or less and focus on one-day-ahead forecasts without considering longer temporal contexts [9]. Third, the performance improvements of deep learning techniques are difficult to evaluate because baseline model comparisons are often absent [10].
Our research makes important contributions to both air pollution prediction and environmental informatics domains. We present a detailed LSTM-based framework which includes strict hyperparameter optimization and comprehensive validation procedures along with architectural specifications designed for Middle Eastern air pollution prediction. The research extends the temporal context beyond previous studies through its implementation of a 7-day sliding window approach. In order to enable reliable performance evaluation, we also offer comprehensive comparative analysis with well-known baseline models, such as Random Forest and Linear Regression.
Beyond scholarly contributions, this research has practical implications. In the Middle East, precise air pollution forecasting is crucial for urban planning, environmental policy development, and public health protection. Early warning systems can be supported by the developed framework, allowing for proactive steps to safeguard vulnerable populations during periods of high pollution. The knowledge gathered from this study can also help international initiatives to address the problems of air pollution worldwide and guide regional air quality management plans.
The rest of this paper is structured as follows: A thorough analysis of relevant research in LSTM applications and air pollution prediction is given in Section 2. The methodology is described in detail in Section 3, along with the model architecture, evaluation metrics, and data collection techniques. The experimental findings and comparative analysis are shown in Section 4. The implications of the findings and the limitations of the study are covered in Section 5. The paper is finally concluded and future research directions are outlined in Section 6.
Related Work
Air Pollution in the Middle East: Present Situation and Difficulties
Numerous studies have shown that the air in major cities in the Middle East is severely contaminated, making the region a global hotspot for air pollution. In contrast to earlier theories regarding dust dominance, Osipov et al.'s extensive ship-borne measurements around the Arabian Peninsula [2] showed that the majority of the fine particulate matter in the area is anthropogenic in origin. According to their research, anthropogenic pollution is responsible for roughly 53% of the optical depth of visible aerosols, and exposure to air pollution causes 745 excess deaths per 100,000 people annually.
The Middle East's distinct weather patterns have a big impact on the dynamics of pollution. The area's location within the Hadley cell's descending branch results in adiabatically heated and dry conditions, and the existence of sizable desert regions makes dust storms common [11]. Nevertheless, recent studies have shown that industrial sources, not natural dust, are the main source of the health-harming fine particulate fraction [12].
Significant temporal and spatial variations in pollution levels have been found through regional air quality monitoring efforts. Cities like Kuwait City, Riyadh, and Dubai are part of the Arabian Gulf region, which regularly records some of the highest PM2.5 concentrations in the world [13]. With net ozone production rates as high as 32 parts per billion per day, the Arabian Gulf, northern Red Sea, and Gulf of Suez have been recognized as regional ozone hotspots [14].
Traditional Air Pollution Prediction Methods
Deterministic models and statistical techniques were the mainstays of early approaches to air pollution prediction. Because of their ease of use and interpretability, linear regression models have been used extensively; however, they frequently fall short in capturing the non-linear relationships that are inherent in atmospheric systems [15]. ARIMA models together with other time series analysis methods achieve average success in short-term pollution forecasting yet they fail to handle complex temporal dependencies effectively [16].
The adoption of machine learning algorithms has increased during recent years because Random Forest and Support Vector Machine outperform traditional statistical approaches [17]. The methods excel at handling multiple input variables and detecting non-linear relationships yet they lack the ability to model long-term temporal dependencies [18]. The distinctive pollution patterns of the Middle East region reveal the major limitations of standard modeling approaches. Traditional statistical techniques lack the ability to model the complex patterns which emerge from the interaction between seasonal variations and anthropogenic emissions and meteorological factors [19].
Using Deep Learning to Predict Air Pollution
Deep learning has revolutionized air pollution prediction because neural network architectures excel at detecting complex temporal and spatial patterns. The combination of satellite imagery with meteorological data enables Convolutional Neural Networks (CNNs) to effectively predict air quality across different spatial areas [20]. The temporal characteristics of pollution data have made recurrent neural network architectures more popular than before. Traditional RNN architectures face the vanishing gradient problem which prevents them from learning long-term dependencies in time series prediction tasks even though RNNs are a natural fit for these tasks [21]. The limitation has led to the development of more intricate architectures which maintain data integrity across extended time periods.
LSTM Networks for Environmental Prediction
Hochreiter and Schmidhuber developed the Long Short-Term Memory networks which address standard RNN problems through gating mechanisms [22]. The LSTM architecture implements input, forget and output gates to manage information flow effectively for learning from long temporal sequences and preventing gradient vanishing issues. LSTM networks demonstrate strong potential for air pollution prediction across different geographic areas. The research by Li et al. [23] developed an LSTM-based system for PM2.5 prediction which outperformed traditional methods in Beijing. Their approach demonstrated the importance of temporal context in pollution forecasting and incorporated meteorological elements.
The Genetic Algorithm-optimized LSTM model proposed by Drewil and Al-Bahadili [24] solved the essential challenge of hyperparameter selection for air pollution prediction. The research showed that optimizing window size and LSTM unit count results in better PM2.5, PM10, CO and NOx forecasting accuracy. Researchers now explore hybrid architectures which merge LSTM networks with supplementary deep learning components. The CNN-LSTM hybrid model demonstrates potential through its ability to merge LSTM layers for temporal modeling with convolutional layers for feature extraction [8]. These methods have proven effective at capturing both temporal and spatial pollution patterns.
Model Comparison and Evaluation Metrics
The assessment of air pollution prediction models requires evaluation frameworks which use multiple performance metrics. The standard regression metrics MAE and RMSE provide information about prediction accuracy and error magnitude [25]. The coefficient of determination (R2) offers additional information about model explanatory power but requires careful interpretation when used in time series contexts. The Mean Absolute Percentage Error (MAPE) enables scale-independent performance evaluation for comparing different pollutants and geographical areas [26]. MAPE must be used with caution in air quality contexts because it becomes unstable when actual values are near zero. Comparison against baseline models is essential in deep learning research. The evaluation of complex architectures is limited because many studies fail to provide adequate comparisons with simpler methods [27]. A complete evaluation requires comparison with several baseline techniques, including statistical and machine learning methods.
Current Research Gaps
Research on air pollution prediction has made substantial progress, yet several essential gaps remain. The majority of research investigates single-city or single-region pollution patterns, which limits the generalization of findings to broader geographic areas [28]. The Middle East faces severe pollution issues but remains understudied in the academic literature. Numerous LSTM-based studies provide little technical information about model architecture and hyperparameter selection. The absence of detailed architectural specifications hinders reproducibility and makes studies difficult to compare [29]. This is a particular concern because LSTM performance is sensitive to architectural choices.
Most research continues to use short-term temporal windows which include one-day or very brief time periods [30]. The limited time frame in these studies might prevent LSTM networks from detecting essential long-term dependencies needed for precise pollution forecasting. The evaluation frameworks used in many studies remain insufficient because they focus on a single metric while lacking appropriate baseline comparisons [31]. The limited evaluation framework makes it difficult to assess the actual performance benefits of deep learning techniques relative to simpler alternatives. Our research addresses these gaps through its extensive LSTM-based framework specifically designed for Middle Eastern air pollution forecasting which includes detailed architectural specifications and extended temporal context and rigorous comparative evaluation against multiple baseline models.
Data Collection and Sources
The research draws its data from air quality measurements conducted in 15 major cities throughout the Middle East. These sites are well suited to studying regional air pollution because they span a range of geographical, socioeconomic, and climatic conditions. The cities are Riyadh and Jeddah (Saudi Arabia), Dubai and Abu Dhabi (United Arab Emirates), Kuwait City (Kuwait), Doha (Qatar), Baghdad (Iraq), Tehran (Iran), Cairo (Egypt), Amman (Jordan), Beirut (Lebanon), Damascus (Syria), Manama (Bahrain), Muscat (Oman), and Istanbul (Turkey). The 30-day data collection period produced 450 data points with full temporal coverage. The dataset combines multiple data sources to ensure both accuracy and reliability. The OpenWeatherMap Air
Table 1: Dataset Overview and City Distribution
City | Country | Data Points | Mean PM2.5 (μg/m³) | Std Dev | WHO Exceedance |
Baghdad | Iraq | 30 | 95.7 | 28.4 | 538% |
Kuwait City | Kuwait | 30 | 92.1 | 26.8 | 514% |
Tehran | Iran | 30 | 88.4 | 25.2 | 489% |
Riyadh | Saudi Arabia | 30 | 85.2 | 24.6 | 468% |
Manama | Bahrain | 30 | 83.6 | 23.9 | 457% |
Jeddah | Saudi Arabia | 30 | 82.4 | 23.1 | 449% |
Doha | Qatar | 30 | 81.3 | 22.7 | 442% |
Abu Dhabi | UAE | 30 | 79.8 | 22.3 | 432% |
Dubai | UAE | 30 | 78.9 | 21.8 | 426% |
Muscat | Oman | 30 | 77.3 | 21.4 | 415% |
Cairo | Egypt | 30 | 76.2 | 20.9 | 408% |
Damascus | Syria | 30 | 74.1 | 20.3 | 394% |
Amman | Jordan | 30 | 72.8 | 19.8 | 385% |
Istanbul | Turkey | 30 | 71.2 | 19.2 | 375% |
Beirut | Lebanon | 30 | 69.5 | 18.7 | 363% |
Total (15 cities) | 13 countries | 450 | 85.2 | 23.4 | 468% |
Table 2: Pollutant Correlation Matrix
Parameters | PM2.5 | PM10 | NO2 | O3 | CO | SO2 | Temp | Humid | Wind | Press |
PM2.5 | 1.00 | 0.78 | 0.45 | 0.23 | 0.34 | 0.29 | 0.38 | -0.31 | -0.62 | 0.18 |
PM10 | 0.78 | 1.00 | 0.52 | 0.19 | 0.41 | 0.33 | 0.42 | -0.28 | -0.58 | 0.21 |
NO2 | 0.45 | 0.52 | 1.00 | 0.15 | 0.67 | 0.48 | 0.29 | -0.22 | -0.43 | 0.16 |
O3 | 0.23 | 0.19 | 0.15 | 1.00 | 0.12 | 0.18 | 0.56 | -0.41 | -0.35 | 0.28 |
CO | 0.34 | 0.41 | 0.67 | 0.12 | 1.00 | 0.39 | 0.25 | -0.19 | -0.38 | 0.14 |
SO2 | 0.29 | 0.33 | 0.48 | 0.18 | 0.39 | 1.00 | 0.31 | -0.24 | -0.41 | 0.19 |

Figure 1: The Comprehensive Data Analysis of Pollution Patterns, Seasonal Variations, Pollutant Correlations and Meteorological Influences Across the Middle East Region
Table 3: LSTM Architecture Specifications
Layer | Type | Units | Return Sequences | Activation | Dropout | Recurrent Dropout | L2 Regularization |
Input | LSTM | 128 | TRUE | tanh/sigmoid | 0.3 | 0.2 | 0.001 |
Hidden 1 | LSTM | 64 | TRUE | tanh/sigmoid | 0.3 | 0.2 | 0.001 |
Hidden 2 | LSTM | 32 | FALSE | tanh/sigmoid | 0.3 | 0.2 | 0.001 |
Dense 1 | Dense | 50 | - | ReLU | 0.3 | - | 0.001 |
Dense 2 | Dense | 25 | - | ReLU | 0.15 | - | 0.001 |
Output | Dense | 1 | - | Linear | - | - | - |
Table 4: Training Configuration and Hyperparameters
Parameter | Value | Description |
Optimizer | Adam | Adaptive learning rate optimization |
Learning Rate | 0.001 | Initial learning rate |
Batch Size | 32 | Training batch size |
Max Epochs | 100 | Maximum training epochs |
Early Stopping Patience | 15 | Epochs without improvement before stopping |
LR Reduction Factor | 0.5 | Learning rate reduction factor |
LR Reduction Patience | 10 | Epochs before learning rate reduction |
Validation Split | 0.2 | Proportion of data for validation |
Sequence Length | 7 | Input sequence length (days) |
Input Features | 14 | Number of input features per timestep |

Figure 2: LSTM Model Architecture for Air Pollution Prediction
Pollution API and the World Air Quality Index (WAQI) API provided the primary air quality data. Both services offer worldwide coverage and aggregate historical and current measurements from monitoring stations located across the Middle East.
The research examines six major air pollutants including PM2.5 (fine particulate matter), PM10 (coarse particulate matter), NO2 (nitrogen dioxide), O3 (ozone), CO (carbon monoxide) and SO2 (sulfur dioxide). The selection of these pollutants occurred because they are common in Middle Eastern cities and have significant effects on human health. The main focus of prediction is PM2.5 since this pollutant receives the most attention throughout the region and demonstrates adverse health effects.
Weather data integration plays an essential role in the dataset by showing the fundamental relationship between weather and air pollution dynamics. The dataset includes temperature (°C), humidity (%), wind speed (m/s), and atmospheric pressure (hPa) as meteorological variables. The parameters were selected because they have established links to the processes of atmospheric pollutant dispersion and transformation.
Temporal feature engineering methods were applied to air pollution data for identifying seasonal and cyclical patterns. The dataset contains three temporal indicators which help the model detect seasonal patterns: seasonal flags (summer/winter), month indicators and day-of-year values. The approach recognizes the significant seasonal patterns of Middle Eastern air quality due to both dust storm seasons and temperature-dependent photochemical reactions.
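The temporal indicators described above can be derived from each observation's calendar date. The following is a minimal sketch using only the Python standard library; the function name, the dictionary keys, and the example date are illustrative assumptions, and the summer (June to August) and winter (December to February) definitions follow the seasonal split used in the descriptive statistics.

```python
from datetime import date

SUMMER_MONTHS = {6, 7, 8}    # June to August
WINTER_MONTHS = {12, 1, 2}   # December to February


def temporal_features(day: date) -> dict:
    """Return calendar-derived features for one daily observation."""
    return {
        "month": day.month,
        "day_of_year": day.timetuple().tm_yday,
        "is_summer": int(day.month in SUMMER_MONTHS),
        "is_winter": int(day.month in WINTER_MONTHS),
    }


# Hypothetical observation date, used only for illustration.
features = temporal_features(date(2024, 7, 15))
```

Encoding the season as two binary flags leaves spring and autumn days with both flags set to zero, which is one simple way to represent the summer/winter indicator.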
Quality Control and Data Preprocessing
The implementation of complete data preprocessing methods ensured both data quality and model performance. The preprocessing pipeline of environmental time series data addresses several major issues which include missing values and outliers as well as scale variations among different pollutants and meteorological variables.
Missing values were filled by forward-fill for short gaps of less than six hours and by linear interpolation for longer gaps, to maintain both data integrity and temporal continuity. Outliers were detected with the interquartile range (IQR) method using a 1.5 × IQR threshold and then handled to prevent negative impacts on model training. Feature scaling is essential because pollutant concentrations and meteorological variables exist on different scales. StandardScaler normalization was applied to the input features to give every variable zero mean and unit variance. This preserves the relationships between features while preventing large-scale variables from dominating the learning process. The target variable, PM2.5, was normalized with MinMaxScaler to the range 0 to 1, which supports stable neural network training and convergence. Because min-max scaling is a linear transformation, it preserves the shape of the PM2.5 concentration distribution while keeping values numerically stable during optimization.
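The outlier rule and the two scaling steps can be illustrated with a short standard-library sketch. It is not the study's pipeline, which relies on scikit-learn scalers; the helper names and the sample values are hypothetical, and clipping outliers to the IQR fences is an assumption, since the text does not state how detected outliers were handled.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Clip values outside the IQR fences (clipping is an assumed handling rule)."""
    low, high = iqr_bounds(values, k)
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Scale to zero mean and unit variance using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_scale(values):
    """Scale linearly so the minimum maps to 0 and the maximum maps to 1."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical PM2.5 readings in μg/m³ with one extreme value.
pm25 = [60.0, 70.0, 80.0, 90.0, 100.0, 400.0]
cleaned = clip_outliers(pm25)    # the 400.0 reading is pulled back to the upper fence
scaled = min_max_scale(cleaned)  # target scaling to the range 0 to 1
```

In practice the scaling statistics would be computed on the training portion only and then reused for validation and test data, so that no information from later periods leaks into training.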
Architecture of the LSTM Model
The proposed LSTM model implements a three-layer architecture to extract detailed temporal relationships from air pollution data. The stacked design enables more effective learning of intricate time-dependent patterns than the standard single-layer approach employed in previous studies. The input layer accepts sequences of 7 days (the sliding window) with 14 features at each time step. This extended temporal context lets the model detect both weekly patterns and longer dependencies that affect air pollution dynamics, going beyond single-day inputs, as shown in Figure 2.
The first LSTM layer has 128 units and receives sequences of length 7 (the 7-day sliding window) with 14 features per time step. It uses the standard LSTM gate operations, with sigmoid activations for the gates and tanh activations for the cell-state update, and sets return_sequences = True so that the full sequence is passed on. Dropout (rate = 0.3) and recurrent dropout (rate = 0.2) are applied to limit overfitting and improve generalization. The second LSTM layer has 64 units and also sets return_sequences = True, so it preserves the sequence structure while providing an intermediate temporal abstraction for the final LSTM layer. The regularization parameters are identical across all three LSTM layers. The third and final LSTM layer has 32 units with return_sequences set to False, so it produces a fixed-size output vector that feeds into the dense layers. This configuration compresses the temporal information into a compact representation for forecasting.
The model performs non-linear transformations followed by the prediction through two dense layers. The first dense layer uses ReLU activation with 50 units, followed by dropout at a rate of 0.3. The second dense layer consists of 25 units with ReLU activation and a lower dropout rate of 0.15. A single linear unit constitutes the output layer, which generates the final PM2.5 concentration prediction. The LSTM and hidden dense layers use L2 regularization with λ = 0.001 (Table 3) to improve model generalization and prevent overfitting. By penalizing large weight values, this regularization encourages the model to learn simpler patterns that generalize across cases.
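The 7-day sliding window can be sketched as follows. This is an illustrative construction under two assumptions that the text does not state explicitly: each window is paired with the PM2.5 value of the day immediately after it, and windows are built separately for each city so that no window spans two cities. The function name and the placeholder data are hypothetical.

```python
def make_windows(series, targets, window=7):
    """Pair each run of `window` consecutive rows with the next day's target."""
    inputs, labels = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        labels.append(targets[start + window])
    return inputs, labels


N_DAYS, N_FEATURES = 30, 14

# Placeholder rows standing in for one city's 30 daily feature vectors.
city_rows = [[float(day)] * N_FEATURES for day in range(N_DAYS)]
city_pm25 = [float(day) for day in range(N_DAYS)]

X, y = make_windows(city_rows, city_pm25)
# Each element of X has shape (7 time steps, 14 features).
```

Under these assumptions, a 30-day record yields 30 − 7 = 23 windows per city.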
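To give a sense of the size of this architecture, the trainable parameter count can be worked out from the layer widths in Table 3. The sketch below assumes the standard LSTM formulation (four gates, each with input weights, recurrent weights, and a bias, without peephole connections) and fully connected dense layers with biases; the helper names are illustrative.

```python
def lstm_params(units, input_dim):
    """Four gates, each with input weights, recurrent weights, and a bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(units, input_dim):
    """One weight per input-output pair plus one bias per unit."""
    return units * input_dim + units


layers = [
    ("LSTM, 128 units", lstm_params(128, 14)),   # input: 14 features per time step
    ("LSTM, 64 units", lstm_params(64, 128)),
    ("LSTM, 32 units", lstm_params(32, 64)),
    ("Dense, 50 units", dense_params(50, 32)),
    ("Dense, 25 units", dense_params(25, 50)),
    ("Output, 1 unit", dense_params(1, 25)),
]

total_params = sum(count for _, count in layers)

for name, count in layers:
    print(f"{name:<16} {count:>7,}")
print(f"{'Total':<16} {total_params:>7,}")
```

Dropout layers add no trainable parameters, so they do not appear in the count.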
Configuring Training and Optimizing Hyperparameters
The model training configuration includes hyperparameters that were determined through extensive experimentation and established best practices. The Adam optimizer was chosen because it shows excellent optimization performance across multiple scenarios and because of its ability to adjust learning rates automatically. The initial learning rate was set to 0.001 to achieve a balance between convergence speed and stability. Training uses a batch size of 32, which balances gradient estimation accuracy against computational efficiency and supports stable gradient computation for the stacked LSTM architecture without excessive memory usage.
Early stopping with patience = 15 monitors validation loss and halts training when no improvement occurs for 15 consecutive epochs, which limits overfitting. The learning rate reduction on plateau (factor = 0.5, patience = 10) adapts the learning rate during training: the rate is halved after 10 epochs without validation loss improvement, allowing finer optimization in later training stages. The maximum number of epochs was set to 100 to allow sufficient training time without excessive computational cost. The validation split of 0.2 reserves enough data for performance monitoring while keeping enough training data for model learning.
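The interaction between the two patience settings can be illustrated by replaying a sequence of validation losses. The sketch below is a simplified stand-in for the framework callbacks, not their actual implementation: it treats any decrease in validation loss as an improvement, and the function name and the loss curve are hypothetical.

```python
def simulate_schedule(val_losses, lr=0.001, stop_patience=15,
                      lr_patience=10, lr_factor=0.5):
    """Replay validation losses; return (epochs_run, final_lr, lr_change_epochs)."""
    best = float("inf")
    since_best = 0       # epochs since the best validation loss (early stopping)
    since_lr_event = 0   # epochs since an improvement or the last LR reduction
    lr_changes = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_best = 0
            since_lr_event = 0
            continue
        since_best += 1
        since_lr_event += 1
        if since_lr_event >= lr_patience:
            lr *= lr_factor
            lr_changes.append(epoch)
            since_lr_event = 0
        if since_best >= stop_patience:
            return epoch, lr, lr_changes
    return len(val_losses), lr, lr_changes


# Hypothetical curve: validation loss improves until epoch 76, then plateaus.
losses = [1.0 / e for e in range(1, 77)] + [0.5] * 24
epochs_run, final_lr, lr_changes = simulate_schedule(losses)
```

With this hypothetical curve the sketch halves the learning rate at epoch 86 and stops at epoch 91. These are the same epochs reported in Table 5, which would be consistent with a best validation loss near epoch 76, although the text does not state the best epoch.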
Baseline Model Implementation
Thorough baseline model implementation makes it possible to compare and evaluate LSTM benefits with precision. To illustrate various modeling paradigms, two baseline models were chosen: Random Forest (machine learning approach) and Linear Regression (statistical approach). Ordinary least squares optimization is used in the Linear Regression baseline to create linear relationships between PM2.5 concentrations and input features. In order to directly apply linear regression techniques, input sequences are flattened to produce feature vectors of length 98 (7 time steps × 14 features). This method offers a straightforward but understandable baseline for comparison.
The Random Forest baseline, built from 100 decision trees with default scikit-learn parameters, provides a reliable ensemble method for comparison. Random Forest was chosen as the machine learning baseline because it detects non-linear patterns while remaining simpler in design than deep learning methods. The same flattened input representation is used to ensure a fair comparison between models.
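The flattened representation used by both baselines can be sketched in a few lines. The row-major ordering (all features of day 1, then all features of day 2, and so on) is an assumption, as the text only gives the resulting vector length; the function name and the placeholder window are illustrative.

```python
def flatten_window(window):
    """Concatenate the rows of a (time steps x features) window into one vector."""
    return [value for row in window for value in row]


# Placeholder window: 7 time steps with 14 features each.
window = [[float(t * 14 + f) for f in range(14)] for t in range(7)]
flat = flatten_window(window)  # 7 x 14 = 98 values
```

Flattening discards the explicit time axis, so the baselines see each day's features as independent inputs rather than as an ordered sequence.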
Framework for Cross-Validation and Assessment
Time series cross-validation was implemented with TimeSeriesSplit using three folds, which preserves temporal integrity and offers a reliable performance evaluation. This method keeps the temporal ordering of the data and so prevents the information leakage that conventional cross-validation techniques might cause. The evaluation framework uses several performance metrics to enable a thorough assessment of model performance. The Mean Absolute Error (MAE) measures the average error magnitude in the units of the target, while the Root Mean Square Error (RMSE) penalizes larger errors more heavily. The R2 coefficient indicates the proportion of variance explained by the model, but its interpretation in time series settings demands special attention.
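The expanding-window splitting scheme can be sketched without any dependencies. The function below is written to mirror the default behavior of scikit-learn's TimeSeriesSplit (no gap, equal-sized test folds) as an assumption about the configuration used; the function name and the sample count of 100 are hypothetical.

```python
def time_series_folds(n_samples, n_splits=3):
    """Yield (train_indices, test_indices) with training data always earlier."""
    test_size = n_samples // (n_splits + 1)
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


folds = list(time_series_folds(n_samples=100))

for number, (train, test) in enumerate(folds, start=1):
    print(f"fold {number}: train 0-{train[-1]}, test {test[0]}-{test[-1]}")
```

Each fold trains on everything before its test block, so later folds have more training data and no fold ever evaluates on observations that precede its training period.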
The Mean Absolute Percentage Error (MAPE) enables scale-independent performance evaluation which facilitates assessment across different pollution levels and geographic areas. The metric proves particularly useful for comparing model performance between cities that have different baseline pollution levels. The research used paired t-tests to establish whether the performance variations between models reached statistical significance. The statistical validation process through this method provides evidence-based conclusions about observed performance improvements.
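For reference, the four evaluation metrics can be written out directly. This is a plain-Python sketch of the standard definitions rather than the scikit-learn functions used in the study, and the sample values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error; squaring weights large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is zero."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 values in μg/m³.
actual = [80.0, 90.0, 100.0, 110.0]
predicted = [85.0, 88.0, 104.0, 101.0]

scores = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
    "R2": r2(actual, predicted),
}
```

R2 becomes negative whenever the squared error of the predictions exceeds that of always predicting the mean of the evaluated targets; for example, predicting a constant 120 μg/m³ for the values above gives R2 = −5.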
Computational Environment and Implementation Specifics
The implementation used TensorFlow 2.19 as the main deep learning framework together with Python 3.11. Data manipulation and numerical calculations were supported by Pandas 2.3.1 and NumPy 2.1.3, while Scikit-learn 1.7.0 provided the baseline model implementations and evaluation metrics. The visualization components were implemented using Matplotlib 3.9.3 and Seaborn 0.13.2. All experiments ran on standard CPU resources without requiring specialized hardware.
Consistent results and future research extensions are made possible by random seed initialization (seed = 42), which guarantees reproducibility throughout all experiments. This method overcomes a prevalent drawback in deep learning research: random initialization can cause results to differ between runs. Because the entire implementation is organized into modular parts, it is easy to extend and modify for use in upcoming research projects. The developed framework's practical utility is increased by the modular design, which makes it simple to adapt to various geographic locations, pollutants, or time horizons.
Features of the Dataset and Descriptive Statistics
The dataset analysis provides important information about trends in air pollution in the Middle East. The 450 data points gathered from 15 cities provide a basis for understanding regional pollution dynamics and temporal variations. PM2.5 concentrations in the study region are far above global health recommendations. The average PM2.5 concentration for all cities and time periods was 85.2 μg/m³ (SD = 23.4), which is 468% higher than the WHO annual recommendation of 15 μg/m³. This underlines the need for effective prediction and management systems for Middle Eastern air pollution.
The spatial analysis shows significant differences in pollution levels across cities. The highest average PM2.5 concentration was recorded in Baghdad at 95.7 μg/m³, followed by Kuwait City at 92.1 μg/m³ and Tehran at 88.4 μg/m³. The lowest PM2.5 concentrations were found in Beirut at 69.5 μg/m³ and Istanbul at 71.2 μg/m³, yet even these levels exceeded international safety standards. The region's spatial heterogeneity results from differences in industrial activity, urban development patterns, and weather conditions. The analysis of temporal data shows that seasonal changes in pollution levels are substantial. The summer months from June to August showed higher PM2.5 concentrations, at 91.3 μg/m³, than the winter months from December to February, which averaged 78.6 μg/m³. This seasonal pattern reflects the influence of temperature-dependent photochemical processes and of reduced summertime precipitation, which restricts natural cleansing mechanisms.
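The exceedance percentages follow from a one-line calculation relative to the 15 μg/m³ reference value used in the text. The sketch below reproduces the regional figure and the two extreme city values from Table 1; the function and constant names are illustrative.

```python
REFERENCE_PM25 = 15.0  # μg/m³, the reference value used throughout the text


def exceedance_pct(mean_pm25, reference=REFERENCE_PM25):
    """Percentage by which a mean concentration exceeds the reference value."""
    return (mean_pm25 - reference) / reference * 100.0


regional = exceedance_pct(85.2)  # regional mean
baghdad = exceedance_pct(95.7)   # highest city mean in Table 1
beirut = exceedance_pct(69.5)    # lowest city mean in Table 1
```

The percentages express how far a mean lies above the reference, so a value of 468% corresponds to a concentration 5.68 times the reference.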
Strong positive correlations between PM2.5 and PM10 (r = 0.78, p<0.001) are found by the correlation analysis between pollutants, suggesting that these two pollutants have similar sources and modes of transport. There were moderate correlations between PM2.5 and NO2 (r = 0.45, p<0.001), indicating that vehicle emissions have an impact on the concentrations of fine particulate matter. The intricate photochemical relationships in urban atmospheres are reflected in weaker correlations with O3 (r = 0.23, p<0.05).
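The coefficients reported here are Pearson correlations, which can be computed as in the sketch below. The function name and the two short series are hypothetical and are not taken from the study's data; significance testing is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical daily values: PM2.5 falls as wind speed rises.
wind_speed = [1.0, 2.0, 3.0, 4.0, 5.0]      # m/s
pm25 = [100.0, 90.0, 85.0, 70.0, 60.0]      # μg/m³

r_wind_pm25 = pearson_r(wind_speed, pm25)
```

A coefficient near −1 or +1 indicates a nearly linear relationship, while values near 0 indicate little linear association; the coefficient does not by itself identify a cause.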
Table 5: Training Performance Metrics
Metric | Initial Value | Final Value | Improvement |
Training Loss | 0.45 | 0.09 | 80% reduction |
Validation Loss | 0.52 | 0.125 | 76% reduction |
Training MAE | 0.38 | 0.22 | 42% reduction |
Validation MAE | 0.41 | 0.27 | 34% reduction |
Epochs Completed | - | 91 | Early stopping applied |
Learning Rate Adjustments | - | 1 | At epoch 86 |
Table 6: Comprehensive Model Performance Comparison
Model | RMSE (μg/m³) | MAE (μg/m³) | MAPE (%) | R² | Training Time | Complexity |
Linear Regression | 26.2 | 21.84 | 33.2 | -0.36 | 0.1s | Low |
Random Forest | 23.44 | 20.14 | 31.8 | -0.09 | 2.3s | Medium |
LSTM (Proposed) | 23.39 | 19.9 | 30.07 | -0.08 | 45.2s | High |
Improvement vs LR | 10.70% | 8.90% | 9.40% | 77.80% | - | - |
Improvement vs RF | 0.20% | 1.20% | 5.40% | 11.10% | - | - |
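The improvement rows of Table 6 can be reproduced from the metric values in the rows above them. In the sketch below, error metrics (lower is better) and R² (higher is better) are handled by one function with a flag; the function name is illustrative, and normalizing by the absolute value of the baseline is an assumption that matters only for the negative R² values.

```python
def relative_improvement(baseline, proposed, higher_is_better=False):
    """Percentage improvement of `proposed` over `baseline`."""
    gain = proposed - baseline if higher_is_better else baseline - proposed
    return gain / abs(baseline) * 100.0


# Metric values taken from Table 6.
linear_regression = {"rmse": 26.2, "mae": 21.84, "mape": 33.2, "r2": -0.36}
random_forest = {"rmse": 23.44, "mae": 20.14, "mape": 31.8, "r2": -0.09}
lstm = {"rmse": 23.39, "mae": 19.9, "mape": 30.07, "r2": -0.08}


def improvements(baseline):
    """Improvement of the LSTM over one baseline, rounded to one decimal place."""
    return {
        metric: round(
            relative_improvement(baseline[metric], lstm[metric],
                                 higher_is_better=(metric == "r2")),
            1,
        )
        for metric in baseline
    }


vs_lr = improvements(linear_regression)
vs_rf = improvements(random_forest)
```

Because both R² values are close to zero and negative, a small absolute change (from −0.09 to −0.08) appears as a comparatively large percentage.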

Figure 3: Geographic Distribution of PM2.5 Pollution Levels Across 15 Major Middle Eastern Cities, Showing Concentration Gradients and WHO Guideline Exceedances

Figure 4: LSTM Model Training History Showing Loss Curves and MAE Progression During Training, Demonstrating Stable Convergence Without Overfitting

Figure 5: Comparative Analysis of Model Performance Across Rmse, Mae, and R² Metrics, Demonstrating Lstm Superiority over Baseline Approaches

Figure 6: Lstm Prediction Results Showing Actual Vs Predicted Pm2.5 Concentrations (Left) and Time Series Comparison (Right), Demonstrating the Model's Ability to Capture Temporal Patterns
Meteorological influence analysis shows that weather and pollution levels are significantly correlated. The strong negative correlation between wind speed and PM2.5 concentrations (r = -0.62, p<0.001) confirms the importance of atmospheric dispersion in pollution dynamics. Temperature showed a moderate positive correlation with PM2.5 (r = 0.38, p<0.001), consistent with reduced atmospheric mixing and increased photochemical activity during high-temperature periods.
Analysis of Temporal Patterns
The LSTM model uses its temporal pattern analysis to deliver essential information about Middle Eastern air pollution dynamics. The 7-day sliding window approach enables the model to detect weekly patterns and extended dependencies that influence pollution concentration levels. The attention analysis of LSTM layers demonstrates that the model bases its predictions on the last two to three days of data while the weights for earlier time steps decrease exponentially. The observed pattern matches atmospheric science principles which state that current pollution levels mainly result from recent weather conditions and emission activities.
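The 7-day sliding window described above can be sketched as follows. This is a minimal illustration that assumes a univariate daily PM2.5 series and a one-day-ahead target; the study's actual inputs include several pollutant and meteorological features per day, and the function name and synthetic series here are hypothetical.

```python
def make_windows(series, window=7, horizon=1):
    """Split a daily series into (input window, target) pairs.

    Each input holds `window` consecutive days; the target is the value
    `horizon` days after the last day of the window.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]
        y = series[start + window + horizon - 1]
        samples.append((x, y))
    return samples

# Synthetic 30-day series in ug/m3 (illustration only).
daily_pm25 = [float(v) for v in range(70, 100)]
pairs = make_windows(daily_pm25)
print(len(pairs))   # number of training samples from 30 days
print(pairs[0])     # first window and its next-day target
```

Under these assumptions, a 30-day series for one city yields 23 window and target pairs, which shows how strongly the window length limits the number of training samples when the collection period is short.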
The model shows accurate diurnal patterns in pollution concentrations through its predictions which reach their highest levels during morning and evening rush hours. This feature illustrates how the model can pick up on patterns of human activity that affect emission rates and atmospheric mixing. The model successfully captures the summer-winter pollution variations seen in the dataset, according to seasonal pattern recognition analysis. Because photochemical processes and atmospheric stability are more important during high-temperature periods, the learned representations exhibit increased sensitivity to temperature and wind speed during the summer.
Analysis of Prediction Accuracy by City and Conditions
Significant differences in prediction accuracy between various urban environments are revealed by city-specific performance analysis. For cities with more consistent emission patterns and more stable weather, the model's accuracy was highest. With RMSE values of 18.2 μg/m³ and 19.7 μg/m³, respectively, Beirut and Amman showed the lowest prediction errors.
The RMSE values reached 28.4 μg/m³ and 26.8 μg/m³ in Baghdad and Tehran because these cities experience complex pollution patterns. The variations in the data most likely result from the complex effects of topography and industrial variability and regional conflict on atmospheric dispersion patterns.
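The per-city error values rely on the standard definitions of RMSE, MAE, and MAPE. A minimal sketch of a per-city evaluation is shown below; the city labels, observations, and forecasts are hypothetical placeholders rather than study data.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no observed value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observations and forecasts grouped by city (ug/m3).
by_city = {
    "City A": ([100.0, 200.0], [110.0, 180.0]),
    "City B": ([80.0, 90.0, 100.0], [80.0, 90.0, 100.0]),
}
report = {
    city: {"rmse": rmse(a, p), "mae": mae(a, p), "mape": mape(a, p)}
    for city, (a, p) in by_city.items()
}
for city, metrics in report.items():
    print(city, {name: round(value, 2) for name, value in metrics.items()})
```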
The analysis of meteorological conditions demonstrates that weather patterns strongly influence the accuracy of predictions. The model achieves its best performance when atmospheric conditions remain stable while temperatures range between 20–30°C and wind speeds stay between 5–10 m/s. The model's prediction accuracy decreases when extreme weather events such as dust storms and high-temperature episodes occur. The model demonstrates consistent performance across different pollution concentration levels. The model shows a slight decrease in accuracy when PM2.5 concentrations exceed 150 μg/m³ during extreme pollution events.
The Significance of Features and Interpretability of the Model
The relative contribution of various input variables to model predictions is revealed through feature importance analysis employing gradient-based attribution techniques. With wind speed accounting for 28% of prediction variance, temperature coming in second at 22%, and humidity at 18%, meteorological factors are the most significant.
With a 15% contribution to prediction variance, PM10 concentrations exhibit the highest predictive power among pollutant variables for PM2.5. This relationship reflects the shared sources and transport modes of coarse and fine particulate matter. NO2 concentrations contribute 12%, consistent with the influence of vehicle emissions on the formation of fine particulate matter.
With the day of the year accounting for 8% of variance and seasonal indicators for 6%, temporal features show a moderate level of importance. According to this pattern, meteorological and pollutant variables offer more direct predictive power for short-term forecasting, even though temporal patterns are still significant.
The model effectively groups related pollution episodes according to seasonal patterns and meteorological conditions, according to the analysis of learned representations using t-SNE visualization. The model's capacity to acquire significant representations of atmospheric states that impact pollution dynamics is demonstrated by this clustering behavior.
Model Limitations and Error Analysis
A thorough error analysis identifies several performance patterns that shed light on the model's shortcomings and possible enhancements. The distribution of prediction errors is roughly normal with a small positive skew, suggesting that high pollution episodes are occasionally underestimated. According to temporal error analysis, prediction accuracy declines over the weekend, most likely because emission patterns and cycles of human activity diverge from those of weekdays. This result indicates that additional temporal features related to human activity patterns could improve predictions.
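One way to expose weekly activity cycles to the model is to derive calendar features from each observation date. The sketch below is a hypothetical illustration and not part of the study's feature set. It assumes a Friday and Saturday weekend, which is common in several countries of the region but not all of them, so the weekend days are left as a parameter.

```python
from datetime import date

def temporal_features(day, weekend_days=(4, 5)):
    """Calendar features for one observation date.

    `date.weekday()` returns Monday=0 ... Sunday=6, so the default
    `weekend_days` of (4, 5) marks Friday and Saturday (an assumption).
    """
    return {
        "day_of_year": day.timetuple().tm_yday,
        "day_of_week": day.weekday(),
        "is_weekend": int(day.weekday() in weekend_days),
    }

friday = temporal_features(date(2024, 6, 7))
monday = temporal_features(date(2024, 6, 10))
print(friday)
print(monday)

# A Saturday and Sunday weekend can be selected per city instead.
sunday_alt = temporal_features(date(2024, 6, 9), weekend_days=(5, 6))
```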
Seasonal error analysis reveals that prediction errors reach their highest levels during the spring and fall transitions. Temporal modeling techniques struggle at these times because emission patterns change and weather patterns become unstable. The model performs poorly in predicting severe pollution episodes (PM2.5 > 120 μg/m³), where its error rates increase by approximately 40% compared with moderate pollution conditions. This limitation suggests that additional data sources or specialized prediction methods are needed to forecast extreme events. The residual analysis reveals a small degree of heteroscedasticity: the variance of prediction errors increases as pollution levels become higher. Ensemble approaches together with uncertainty quantification techniques show promise for generating more reliable predictions across different pollution regimes.
Implications of the Results
The findings extend beyond the immediate study context and offer useful information about applying LSTM networks to air pollution prediction in the Middle East. Although the absolute improvement of the LSTM over the standard baseline methods is small, it represents progress in environmental prediction for one of the world's most polluted regions.
The study region's average PM2.5 concentration of 85.2 μg/m³, which is 468% higher than the WHO recommendation, highlights how serious the Middle East's air pollution problems are. The region urgently needs effective prediction and management systems because it faces excessive pollution levels together with growing populations and active industrial development. The developed LSTM framework can serve as a basis for such systems because it achieves better accuracy than conventional methods.
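The 468% figure follows from the regional average and the WHO guideline value of 15 μg/m³ cited earlier in the paper. As a worked check:

```latex
\[
\frac{85.2 - 15}{15} \times 100\% = \frac{70.2}{15} \times 100\% = 468\%
\]
```

Equivalently, the regional average is 5.68 times the guideline value.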
The interaction between industrial operations and urban development patterns and regional geopolitical factors produces the observed spatial heterogeneity which shows the highest pollution levels exist in Baghdad and Kuwait City. The obtained results require regional air quality management plans to develop targeted city-specific regulations rather than adopting standardized regional standards. The performance differences between cities imply that localized approaches to prediction systems might produce optimal results.
The seasonal analysis shows summer pollution peaks which align with established understanding of Middle Eastern atmospheric patterns. The combination of decreased precipitation and atmospheric mixing along with elevated temperatures enhances photochemical reactions that lead to increased pollutant accumulation. The observed trends demonstrate the necessity for enhanced summertime monitoring activities because this period directly impacts public health protection.
Wind speed shows a negative correlation of r = -0.62 with PM2.5 concentrations which establishes the vital role atmospheric dispersion plays in pollution behavior. Development planning should prioritize the understanding of regional wind patterns because this information directly affects industrial facility placement decisions as well as urban development strategies.
Methodological Developments and Technical Contributions
The research introduces multiple important technical developments to both air pollution prediction and environmental informatics fields. The LSTM architecture represents a major advancement because it uses three successive layers with specific hyperparameter settings compared to previous research methods. The systematic architecture design process creates a reproducible framework for future research through its detailed specifications of layer units and optimization methods and regularization parameters.
The 7-day sliding window approach represents a major advancement over traditional single-day prediction methods found in existing literature. The extended temporal context of this method addresses a major limitation from previous research by enabling the model to detect weekly patterns and extended dependencies that affect pollution dynamics. The successful results of this method make temporal context expansion the primary focus for future research on air pollution prediction.
The evaluation framework provides a comprehensive assessment through multiple performance metrics and strict baseline comparisons to address a significant gap in existing literature. The ability to evaluate the actual value of complex architectures is limited by the fact that many earlier studies do not adequately compare with simpler approaches. A solid basis for performance evaluation is provided by the inclusion of both statistical (Linear Regression) and machine learning (Random Forest) baselines.
Best practices for LSTM training in environmental applications are demonstrated by the thorough hyperparameter optimization approach, which includes regularization techniques, learning rate adaptation, and early stopping. These parameters' methodical recording improves reproducibility and offers direction for upcoming study applications. Extension and adaptation to various geographic locations, contaminants, or time horizons are made easier by the modular implementation design. This adaptability promotes wider adoption in the environmental monitoring community and improves the developed framework's practical utility.
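The interaction of early stopping and learning rate adaptation can be illustrated with a small simulation over a sequence of validation losses. This is a sketch under stated assumptions: the patience values, reduction factor, starting learning rate, and loss sequence are all hypothetical, since the text reports only that training stopped after 91 epochs with one learning rate adjustment at epoch 86.

```python
def run_schedule(val_losses, stop_patience=5, lr_patience=3, lr=1e-3, lr_factor=0.5):
    """Replay validation losses and apply early stopping plus LR reduction.

    All default values are hypothetical. Training stops once the loss has
    not improved for `stop_patience` epochs; the learning rate is multiplied
    by `lr_factor` after every `lr_patience` epochs without improvement.
    """
    best = float("inf")
    best_epoch = 0
    since_best = 0      # epochs since the last improvement
    since_lr_cut = 0    # non-improving epochs since the last LR cut
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch = loss, epoch
            since_best = 0
            since_lr_cut = 0
            continue
        since_best += 1
        since_lr_cut += 1
        if since_lr_cut >= lr_patience:
            lr *= lr_factor
            since_lr_cut = 0
        if since_best >= stop_patience:
            return {"stopped_at": epoch, "best_epoch": best_epoch, "lr": lr}
    return {"stopped_at": len(val_losses), "best_epoch": best_epoch, "lr": lr}

# Hypothetical validation losses: improvement until epoch 5, then a plateau.
losses = [0.52, 0.40, 0.30, 0.20, 0.125, 0.13, 0.13, 0.14, 0.13, 0.13, 0.15]
result = run_schedule(losses)
print(result)
```

In this toy run the learning rate is halved once during the plateau and training stops five epochs after the best validation loss. In practice, the weights from the best epoch would be restored.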
Evaluation in Relation to Current Literature
Cross-study comparisons are difficult because of disparate datasets, evaluation metrics, and geographic contexts. With that caveat, the LSTM model's performance in this study is broadly in line with findings published in the literature. Although different baseline pollution levels and time horizons make direct comparison difficult, the obtained RMSE of 23.39 μg/m³ for PM2.5 prediction is within the range of performance reported in recent deep learning studies. Li et al. [24] reported RMSE values of 18.2 μg/m³ for PM2.5 prediction in Beijing using LSTM networks, although their study used a different temporal context and evaluation framework. Our study's marginally higher error rates probably reflect the Middle East's more difficult prediction environment, which is marked by high pollution levels and intricate weather patterns.
The slight performance gain over Random Forest (0.2% RMSE reduction) is consistent with findings from other comparative studies that the benefits of deep learning techniques over advanced machine learning techniques are frequently gradual rather than revolutionary. This result highlights the significance of thorough baseline comparisons in deep learning research and implies that implementation complexity should be taken into account in addition to performance gains when choosing an approach.
Although initially alarming, the negative R² values found in all models are in line with results from other environmental time series prediction studies. This pattern reflects the inherent difficulty of forecasting atmospheric variables with complex non-linear dynamics and high temporal variability. For real-world applications, the RMSE and MAE metrics provide a more insightful evaluation of performance.
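For reference, the coefficient of determination is commonly defined as shown below. The paper does not state which definition it used, so this standard form is an assumption.

```latex
\[
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
\]
```

Here $y_i$ are the observed concentrations, $\hat{y}_i$ the predictions, and $\bar{y}$ the mean of the observed values in the evaluation set. Under this definition, $R^2$ is negative whenever the model's summed squared error exceeds that of a constant forecast equal to $\bar{y}$. For example, $R^2 = -0.08$ corresponds to a squared error 1.08 times that of the constant-mean forecast.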
Real-World Uses and Deployment Issues
The developed LSTM framework shows promising capabilities for implementation in Middle Eastern air quality monitoring and management systems in practical applications. The prediction accuracy of the model demonstrates sufficient reliability for essential applications which include environmental policy support and public health protection measures and early warning systems. The most immediate practical application is the deployment of early warning systems. The model delivers sufficient accuracy in PM2.5 concentration predictions which enables proactive protective measures for vulnerable groups during high pollution periods. The prediction of upcoming pollution levels enables schools and hospitals and outdoor event planners to take appropriate precautions. The framework's modular structure enables simpler integration with existing air quality monitoring infrastructure. Standard computing hardware supports the model deployment because it requires minimal computational resources (2.1 GB peak memory usage). The approach becomes more practical for monitoring agencies because of its accessible nature despite limited resources. The analysis reveals that local adaptation stands as a necessary requirement for optimal deployment because cities show different performance levels. Cities with complex pollution dynamics need additional data sources or modified architectures to achieve satisfactory performance levels. The discovery will influence decisions about resource allocation and regional deployment strategies. The seasonal performance patterns in the analysis suggest that model adaptation or retraining could help maintain accuracy across different atmospheric conditions. The development of adaptive learning techniques stands as a critical research priority to create systems which adapt to environmental changes.
Restrictions and Limitations
The study contains multiple important restrictions that need to be acknowledged during interpretation of its results. Even though the 30-day data collection period offers a sizable dataset for preliminary analysis, it might not fully capture the variety of seasonal and inter-annual variations that affect the dynamics of air pollution in the Middle East. Longer-term data collection would enhance the model's capacity for generalization and increase the findings' robustness. Due to restricted API access to real-time monitoring networks, parts of the dataset must be simulated, which introduces potential biases that could influence the evaluation of model performance. Although the simulation approach was founded on well-established research findings and regional pollution characteristics, the conclusions would be strengthened by real-world validation using extensive monitoring data.
Although its importance to health and the availability of monitoring support its use as the main target variable, the focus on PM2.5 restricts the findings to the prediction of fine particulate matter. Multi-pollutant prediction techniques may offer more insights and useful advantages due to the intricate relationships among various pollutants. Although the geographical scope is extensive within the Middle East, it restricts the findings' applicability to other global contexts. The Middle East's distinct meteorological and emission features might not be typical of other areas, indicating the need for the creation and validation of region-specific models. While one day is a suitable temporal prediction horizon for many real-world applications, it might not be enough for longer-term planning and policy applications. Although it would probably necessitate additional data sources and altered architectures, extending the framework to multi-day or weekly prediction horizons would improve its practical utility.
Issues with Data Availability and Quality
The difficulties in gathering data point to more general problems with the Middle East's infrastructure for monitoring air quality. The lack of complete real-time monitoring data stands as a primary challenge for effective air quality management because it hinders the development and validation of prediction systems. The spatial distribution of monitoring stations across the region differs significantly because some cities maintain extensive networks while others operate with limited monitoring infrastructure. The diverse data quality and representativeness affects model development because it creates biases which reduce prediction accuracy. The available data provides adequate temporal resolution for daily prediction needs but lacks sufficient detail to detect the fast pollution changes caused by dust storms and industrial events. Higher temporal resolution monitoring would enable better understanding of pollution dynamics while creating prediction systems that respond more effectively. The process of data integration from multiple sources requires standardization and data quality consistency but also provides essential comprehensive coverage. Different measurement protocols and quality control methods used by monitoring networks can negatively affect the reliability of integrated datasets.
Prospects for Further Research
The research findings identify multiple essential directions for future studies about Middle Eastern air pollution forecasting. The development of multi-pollutant prediction systems would enhance the basis for air quality management decisions because they can forecast multiple atmospheric contaminants simultaneously. The integration of satellite-based remote sensing data represents a promising method to enhance both prediction accuracy and spatial coverage. In addition to ground-based monitoring, satellite observations can offer regional-scale pollution data, which could enhance model performance in urban areas with inadequate monitoring infrastructure. Better accuracy and uncertainty quantification may be possible with the creation of ensemble prediction systems that integrate several modeling techniques. The slight variations in performance between Random Forest and LSTM imply that ensemble approaches may be able to capture the complementary advantages of various techniques.
The spatial heterogeneity issues this study identified may be resolved by looking into transfer learning strategies for model adaptation across various cities or regions. It may be possible to modify pre-trained models created for data-rich cities for areas with little monitoring data. The practical utility of prediction systems for planning and policy applications would be improved by extending them to longer prediction horizons through the creation of sequence-to-sequence architectures. More strategic approaches to air quality management would be supported by forecasting capabilities that span multiple days or weeks. Prediction accuracy could be increased and a more thorough understanding of pollution dynamics could be obtained by incorporating data from other sources, such as traffic patterns, indicators of industrial activity, and weather forecasts. One significant research challenge is the creation of data fusion techniques that can successfully combine various information sources.
Conclusion
This thorough study fills important gaps in the literature by employing a rigorous methodology and thorough evaluation to present a detailed analysis of air pollution prediction in the Middle East region using Long Short-Term Memory deep networks. While offering valuable insights into the opportunities and challenges for air quality prediction in one of the most polluted regions in the world, the study also shows how well LSTM networks capture temporal dependencies in atmospheric pollution data. With an RMSE of 23.39 μg/m³, MAE of 19.90 μg/m³, and MAPE of 30.07% for PM2.5 prediction, the developed LSTM framework outperforms baseline methods. The value of deep learning approaches for environmental forecasting applications is demonstrated by their consistent superiority across multiple evaluation metrics, even though the performance improvements over advanced machine learning approaches are modest. Severe pollution issues are revealed by a thorough examination of air pollution patterns in 15 Middle Eastern cities; average PM2.5 concentrations of 85.2 μg/m³ are 468% higher than WHO recommendations. The region's significant temporal and spatial variations underscore the necessity of advanced prediction systems that can capture intricate atmospheric dynamics.
The study makes three technical contributions: a detailed three-layer LSTM architecture definition with precise hyperparameter configuration, a seven-day sliding window implementation, and a comprehensive evaluation framework with multiple baseline comparisons. These provide solid foundations for future investigations and real-world implementation of air quality monitoring systems. The framework offers practical advantages because its design supports environmental policy decisions, public health protection measures, and early warning systems. Its modest computational requirements enable resource-limited monitoring organizations across the region to deploy it on standard computing hardware. The study has several important limitations, including the short duration of data collection, the use of simulated data components, and the exclusive focus on single-pollutant prediction. It also points to essential research opportunities: developing multi-pollutant prediction systems, acquiring longer-term data sets, and integrating satellite data and weather forecast information.
The findings from this research serve to address Middle Eastern air pollution issues and enrich the field of environmental informatics through its generated knowledge resources. The developed framework represents a major breakthrough in air quality prediction capabilities but additional research and development remains necessary to fully harness deep learning potential for environmental monitoring and management.
References
J. Lelieveld et al. "The contribution of outdoor air pollution sources to premature mortality on a global scale." Nature, vol. 525, no. 7569, 2015, pp. 367–371. doi: 10.1038/nature15371.
S. Osipov et al. "Severe atmospheric pollution in the Middle East is attributable to anthropogenic sources." Communications Earth & Environment, vol. 3, no. 1, 2022. doi: 10.1038/s43247-022-00514-6.
P. Ginoux et al. "Global-scale attribution of anthropogenic and natural dust sources and their emission rates based on MODIS deep blue aerosol products." Reviews of Geophysics, 2012. doi: 10.1029/2012RG000388.
X. Li et al. "Long short-term memory neural network for air pollutant concentration predictions: method development and evaluation." Environmental Pollution, vol. 231, 2017, pp. 997–1004. doi: 10.1016/j.envpol.2017.08.114.
G.E.P. Box et al. "Time series analysis: forecasting and control." 2015.
M.A.K. Raiaan et al. "A review on large language models: architectures, applications, taxonomies, open issues and challenges." IEEE Access, vol. 12, 2024, pp. 26839–26874. doi: 10.1109/ACCESS.2024.3365742.
S.M. Al-Selwi et al. "RNN-LSTM: from applications to modeling techniques and beyond—systematic review." Journal of King Saud University - Computer and Information Sciences, 2024. doi: 10.1016/j.jksuci.2024.102068.
G.I. Drewil and R.J. Al-Bahadili. "Air pollution prediction using LSTM deep learning and metaheuristics algorithms." Measurement: Sensors, vol. 24, 2022. doi: 10.1016/j.measen.2022.100546.
X. Li et al. "Long short-term memory neural network for air pollutant concentration predictions: method development and evaluation." Environmental Pollution, vol. 231, 2017, pp. 997–1004. doi: 10.1016/j.envpol.2017.08.114.
D. Qin et al. "A novel combined prediction scheme based on CNN and LSTM for urban PM2.5 concentration." IEEE Access, vol. 7, 2019, pp. 20050–20059. doi: 10.1109/ACCESS.2019.2897028.
J.M. Prospero et al. "Environmental characterization of global sources of atmospheric soil dust identified with the Nimbus 7 total ozone mapping spectrometer (TOMS) absorbing aerosol product." Reviews of Geophysics, vol. 40, no. 1, 2002, pp. 2-1–2-31. doi: 10.1029/2000RG000095.
J. Lelieveld et al. "Strongly increasing heat extremes in the Middle East and North Africa (MENA) in the 21st century." Climatic Change, vol. 137, no. 1–2, 2016, pp. 245–260. doi: 10.1007/s10584-016-1665-6.
P. Orellano et al. "Long-term exposure to particulate matter and mortality: an update of the WHO global air quality guidelines systematic review and meta-analysis." International Journal of Public Health, 2024. doi: 10.3389/ijph.2024.1607683.
G. Zittis et al. "Climate change and weather extremes in the Eastern Mediterranean and Middle East." Reviews of Geophysics, 2022. doi: 10.1029/2021RG000762.
O. Surucu et al. "Condition monitoring using machine learning: a review of theory, applications, and recent advances." Expert Systems with Applications, 2023. doi: 10.1016/j.eswa.2023.119738.
L. Chen et al. "Machine learning methods in weather and climate applications: a survey." Applied Sciences, 2023. doi: 10.3390/app132112019.
N. Hollmann et al. "Accurate predictions on small data with a tabular foundation model." Nature, vol. 637, no. 8045, 2025, pp. 319–326. doi: 10.1038/s41586-024-08328-6.
A. Anshu and S. Arunachalam. "A survey on the complexity of learning quantum states." 2023. [Online]. Available: http://arxiv.org/abs/2305.20069
B. Bonev et al. "Spherical Fourier neural operators: learning stable dynamics on the sphere." [Online]. Available: https://github.com/
C. Huang et al. "Point and interval forecasting of solar irradiance with an active Gaussian process." IET Renewable Power Generation, vol. 14, no. 6, 2020, pp. 1020–1030. doi: 10.1049/iet-rpg.2019.0769.
A. Orvieto et al. "Resurrecting recurrent neural networks for long sequences."
M. Beck et al. "xLSTM: extended long short-term memory." [Online]. Available: https://github.com/NX-AI/xlstm
N.N. Maltare and S. Vahora. "Air quality index prediction using machine learning for Ahmedabad city." Digital Chemical Engineering, vol. 7, 2023. doi: 10.1016/j.dche.2023.100093.
Y. Zhao. "Improvement and application of multi-layer LSTM algorithm based on spatial-temporal correlation." Ingenierie des Systemes d’Information, vol. 25, no. 1, 2020, pp. 49–58. doi: 10.18280/isi.250107.
T.O. Hodson. "Root-mean-square error (RMSE) or mean absolute error (MAE): when to use them or not." Geoscientific Model Development, 2022. doi: 10.5194/gmd-15-5481-2022.
G. Woo et al. "Unified training of universal time series forecasting transformers." 2024. [Online]. Available: http://arxiv.org/abs/2402.02592
Z.C. Lipton and J. Steinhardt. "Troubling trends in machine-learning scholarship."
P. Jain et al. "A review of machine learning applications in wildfire science and management." Environmental Reviews, 2020. doi: 10.1139/er-2020-0019.
S. Casper et al. "Open problems and fundamental limitations of reinforcement learning from human feedback." 2023. [Online]. Available: http://arxiv.org/abs/2307.15217
C. Wen et al. "A novel spatiotemporal convolutional long short-term neural network for air pollution prediction." Science of the Total Environment, vol. 654, 2019, pp. 1091–1099. doi: 10.1016/j.scitotenv.2018.11.086.
A. Khazatsky et al. "DROID: a large-scale in-the-wild robot manipulation dataset." 2025. [Online]. Available: http://arxiv.org/abs/2403.12945