Master's Examination with Defense (Masterprüfung mit Defensio), Papazek Petrina Eveline

31.10.2018 10:30 - 12:00

Artificial Neural Network and Data Mining Approaches for Short-range Wind Speed Forecasts

Due to Austria’s complex terrain, providing the reliable wind forecasts required for a wide range of applications remains difficult. Current approaches involve classical statistical methods (e.g., based on regression) and numerical weather prediction (NWP) models, which involve high computation times due to their complexity. Some applications, however, require fast and more accurate predictions for nowcasting (1 to 6 hours ahead) and the short range (up to two days ahead). In this master’s thesis we aim to predict wind speed at a height of 10 meters. To this end, we develop a novel machine learning and data mining framework whose forecasts are easy to validate against measurements of the Austrian meteorological observation system (Teilautomatische Wetterstationen, TAWES). Several related approaches employ different types and hybrids of artificial neural networks (ANNs) in conjunction with metaheuristic techniques. However, most of them work with data from either numerical models or observations, or apply computationally expensive machine learning models to the whole data set. Some generate average wind speed forecasts for the whole day and do not consider variations within the day, which is not suitable for our nowcasting scenario. In contrast, we implement a data mining framework that employs various basic and fast machine learning models (ANNs, random forests, support vector machines) and ensemble learning methods, and that uses data mining methods to preprocess and select data for optimized results at a particular location or observation site. For a basic setup we use a station-based model and train on data from the same season. We then join location-based data that differ in source, quality, and resolution in order to derive new knowledge for this particular observation site.
Here, we investigate how to use a combination of data from the meteorological measurement systems and numerical weather prediction (NWP) models for the training process of data-driven learning models. Key objectives of this project are to properly process and join the data in order to improve the results of the current nowcasting system, INCA (Integrated Nowcasting through Comprehensive Analysis). For this reason, we need to pre-classify the data by the selected forecast hour. We refer to our model as the iANNe method, where i denotes the intervalization of data into disjoint intervals of the forecasting hour, e the ensemble of machine learning models for each interval, and ANN our best-scaling machine learning model within the computational analysis. In a more complex model version we extend the training period and employ spatial variants to model complex terrain. In particular, we use station groups of three to seven spatially related stations. The combination of these methods (i.e., iANNe, station groups, and temporal extension of training data) performs well in computational experiments of wind speed forecasts for several sites. To test our machine learning and data mining based forecasting setup, we design the Python-based prototype pywiNNd, which runs in a Linux environment. The computational analysis indicates how to improve the selection of data for the machine learning model used (e.g., the ANN’s architecture) to provide short-range forecasts for wind speeds. We intend to optimize the architecture and settings of the machine learning models and provide a good setup for most situations in practice. In particular, we also focus on methods that give good results for the nowcasting range while still providing a computationally efficient approach (i.e., forecasts available within minutes). We use one test scenario with 24 meteorological observation sites, for which we give hourly forecasts for each hour of 31 days with a time horizon of up to 40 hours into the future.
Another test scenario evaluates spatial, temporal, and spatio-temporal methods for different station groups for the station Wien Hohe Warte in Vienna. We work with data from the Austrian numerical weather forecast models ALARO and AROME, the European ECMWF model, and observation data from the Austrian TAWES network (Teilautomatische Wetterstationen). We use this test scenario to validate the performance of various approaches and to find reasonable settings for the architecture and input data. Different statistical metrics are used for validation purposes. They indicate that for a majority of observation sites the implemented framework significantly improves the forecasts of the NWP models. The developed data mining framework is able to improve on NWP forecasts for the whole forecasting range (0 to 40 forecasting hours), especially when compared to INCA (a statistical-physical nowcasting system) and META (a model output statistics method) forecasts. Generally, providing more data by seasonal extensions seems beneficial. Moreover, for some tested scenarios in complex terrain we observed improvements from applying spatial methods to station groups that consider weather type, location, and topography. The iANNe method and its spatial and temporal extensions clearly outperform rival methods by using long training episodes, thereby utilizing information from the past, observations of the current weather situation, and the output of available high-resolution NWP models.
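
The intervalization and per-interval ensemble idea behind iANNe can be illustrated with a minimal sketch. It is not the pywiNNd implementation: the interval boundaries, the bootstrap-based ensemble, and all function names are assumptions made for illustration, and a simple least-squares line stands in for each ANN so that the example stays dependency-free.

```python
import random
from statistics import mean

# Hypothetical disjoint intervals of the forecast hour (inclusive bounds).
# The actual boundaries used in the thesis are not given in the abstract.
INTERVALS = [(0, 6), (7, 24), (25, 40)]


def interval_of(hour):
    """Return the index of the interval that contains a forecast hour."""
    for idx, (lo, hi) in enumerate(INTERVALS):
        if lo <= hour <= hi:
            return idx
    raise ValueError(f"forecast hour {hour} is outside the range 0..40")


def fit_line(xs, ys):
    """Least-squares fit y = a*x + b; a stand-in for training one ANN."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0:  # degenerate sample: fall back to the mean observation
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def train(samples, n_members=5, seed=0):
    """Train one ensemble per forecast-hour interval.

    samples: iterable of (forecast_hour, nwp_wind_speed, observed_wind_speed).
    Ensemble members are fitted on bootstrap resamples (an assumption).
    """
    rng = random.Random(seed)
    by_interval = {idx: [] for idx in range(len(INTERVALS))}
    for hour, nwp, obs in samples:
        by_interval[interval_of(hour)].append((nwp, obs))

    models = {}
    for idx, rows in by_interval.items():
        if not rows:
            continue
        members = []
        for _ in range(n_members):
            boot = [rng.choice(rows) for _ in rows]
            members.append(fit_line([r[0] for r in boot], [r[1] for r in boot]))
        models[idx] = members
    return models


def predict(models, hour, nwp):
    """Average the predictions of the ensemble responsible for this hour."""
    members = models[interval_of(hour)]
    return mean(a * nwp + b for a, b in members)
```

Training on tuples of forecast hour, NWP wind speed, and TAWES observation and then calling `predict(models, hour, nwp)` routes each request to the ensemble of the matching interval, so that nowcasting hours and later lead times are corrected by separately trained models.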

Organiser:

SPL 5

Location:

Meeting Room (Besprechungsraum) 3.28

Währinger Straße 29
1090 Wien