Jeju Island Wind Power Plant Data

This post covers predicting wind power generation from wind turbine output data and weather forecast data.
Table of Contents
- Data Introduction
- Data Visualization with Machbase Neo
- Table Creation and Data Upload in Machbase Neo
- Experimental Methodology
- Experiment Code
- Experimental Results
1. Data Introduction
- DataHub Serial Number: 2024-4.
- Data Name: Jeju Island Wind Power Plant Dataset.
- Data Collection Methods: The weather data is collected from the Korea Meteorological Administration, and the power generation data is provided by the power plant.
- Data Source: Kyung Hee University Industrial Intelligence Research Laboratory
- Raw data size and format: 110MB, CSV & Excel.
- Number of tags: 38.
TAG | DESCRIPTION |
---|---|
h_temperature | Predicted temperature (°C) for the Hankyung area in Jeju. |
h_humidity | Predicted humidity (%) for the Hankyung area in Jeju. |
h_windspeed | Predicted wind speed (m/s) for the Hankyung area in Jeju. |
h_winddirection | Predicted wind direction (degrees) for the Hankyung area in Jeju. |
h_6hrain | Predicted rainfall (mm) for the Hankyung area in Jeju over 6 hours. |
h_6hsnow | Predicted snowfall (mm) for the Hankyung area in Jeju over 6 hours. |
h_rainprobability | Hourly predicted precipitation probability (%) for the Hankyung area in Jeju. |
h_raintype | Predicted type of precipitation for the Hankyung area in Jeju. |
h_seawave | Predicted sea wave height (m) for the Hankyung area in Jeju. |
h_skystatus | Predicted weather condition for the Hankyung area in Jeju. |
s_temperature | Predicted temperature (°C) for the Seongsan area in Jeju. |
s_humidity | Predicted humidity (%) for the Seongsan area in Jeju. |
s_windspeed | Predicted wind speed (m/s) for the Seongsan area in Jeju. |
s_winddirection | Predicted wind direction (degrees) for the Seongsan area in Jeju. |
s_6hrain | Predicted rainfall (mm) for the Seongsan area in Jeju over 6 hours. |
s_6hsnow | Predicted snowfall (mm) for the Seongsan area in Jeju over 6 hours. |
s_rainprobability | Hourly predicted precipitation probability (%) for the Seongsan area in Jeju. |
s_raintype | Predicted type of precipitation for the Seongsan area in Jeju. |
s_seawave | Predicted sea wave height (m) for the Seongsan area in Jeju. |
s_skystatus | Predicted weather condition for the Seongsan area in Jeju. |
hk1_1 | Power generation data (kW) for Turbine 1 at the first wind power plant in the Hankyung area, Jeju. |
hk1_2 | Power generation data (kW) for Turbine 2 at the first wind power plant in the Hankyung area, Jeju. |
hk1_3 | Power generation data (kW) for Turbine 3 at the first wind power plant in the Hankyung area, Jeju. |
hk2_1 | Power generation data (kW) for Turbine 1 at the second wind power plant in the Hankyung area, Jeju. |
hk2_2 | Power generation data (kW) for Turbine 2 at the second wind power plant in the Hankyung area, Jeju. |
hk2_3 | Power generation data (kW) for Turbine 3 at the second wind power plant in the Hankyung area, Jeju. |
hk2_4 | Power generation data (kW) for Turbine 4 at the second wind power plant in the Hankyung area, Jeju. |
hk2_5 | Power generation data (kW) for Turbine 5 at the second wind power plant in the Hankyung area, Jeju. |
ss_1 | Power generation data (kW) for Turbine 1 at the wind power plant in the Seongsan area, Jeju. |
ss_2 | Power generation data (kW) for Turbine 2 at the wind power plant in the Seongsan area, Jeju. |
ss_3 | Power generation data (kW) for Turbine 3 at the wind power plant in the Seongsan area, Jeju. |
ss_4 | Power generation data (kW) for Turbine 4 at the wind power plant in the Seongsan area, Jeju. |
ss_5 | Power generation data (kW) for Turbine 5 at the wind power plant in the Seongsan area, Jeju. |
ss_6 | Power generation data (kW) for Turbine 6 at the wind power plant in the Seongsan area, Jeju. |
ss_7 | Power generation data (kW) for Turbine 7 at the wind power plant in the Seongsan area, Jeju. |
ss_8 | Power generation data (kW) for Turbine 8 at the wind power plant in the Seongsan area, Jeju. |
ss_9 | Power generation data (kW) for Turbine 9 at the wind power plant in the Seongsan area, Jeju. |
ss_10 | Power generation data (kW) for Turbine 10 at the wind power plant in the Seongsan area, Jeju. |
- Data Time Range: 2014-01-01 06:00:00 to 2018-01-01 03:00:00.
- Number of data records collected: 3,916,438.
- CSV data URL 1: https://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_1.csv.gz
- CSV data URL 2: https://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_2.csv.gz
- Data Migration: Jeju Island Wind Power Plant Data Migration
2. Data Visualization with Machbase Neo
- Data visualization is possible through the Tag Analyzer in Machbase Neo.
- Select desired tag names and visualize them in various types of graphs.

- The viewer below connects to the 2024-4 DataHub in real time: select any of the 38 tag names to visualize them and preview the data patterns.
DataHub Viewer
3. Table Creation and Data Upload in Machbase Neo
- In the DataHub directory, use setup.wrk located in the Wind Electricity Generation Dataset folder to create tables and load data, as illustrated in the image below.

1) Table Creation
- The table is created immediately upon pressing the "Run" button in the menu.
- If the wind_elec_gen table exists, execute the first line and then the second. If it does not exist, start from the second line.

2) Data Upload
- The table can be loaded in two different ways.
Method 1) Loading the table using TQL in Machbase Neo (since machbase-neo v8.0.29-rc1)
- Pros
- Machbase Neo loads the data as soon as you press the launch button.
- Cons
- Slower table loading speed than the other method.

Method 2) Loading tables using commands
- Pros
- Fast table loading speed.
- Cons
- The table loading process is cumbersome.
- Steps: open a command prompt, change to the machbase-neo directory, then enter the command.
- Running the script below from the command shell loads the data into the wind_elec_gen table at high speed.
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_1.csv.gz | machbase-neo shell import --input - --compress gzip --header --method append --timeformat ns wind_elec_gen
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_2.csv.gz | machbase-neo shell import --input - --compress gzip --header --method append --timeformat ns wind_elec_gen
- If you use a username and password other than the default sys/manager, add the --user and --password options as shown below.
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_1.csv.gz | machbase-neo shell import --input - --compress gzip --header --method append --timeformat ns wind_elec_gen --user USERNAME --password PASSWORD
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_2.csv.gz | machbase-neo shell import --input - --compress gzip --header --method append --timeformat ns wind_elec_gen --user USERNAME --password PASSWORD
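Before importing, it can help to check what a compressed CSV file looks like. The sketch below is only an illustration and is not part of the original tutorial. `preview_gzip_csv` and `ns_to_datetime` are hypothetical helper names, and the sample row is made up, so the real column layout may differ. It relies only on what the import options above imply: the files are gzip-compressed, start with a header row (`--header`), and store time as epoch nanoseconds (`--timeformat ns`).

```python
import csv
import gzip
import io
from datetime import datetime, timedelta, timezone


def preview_gzip_csv(raw_bytes, n_rows=5):
    """Return (header, first n_rows data rows) of a gzip-compressed CSV."""
    with gzip.open(io.BytesIO(raw_bytes), mode="rt", encoding="utf-8", newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        # zip() stops after n_rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


def ns_to_datetime(ns):
    """Convert epoch nanoseconds to a UTC datetime (microsecond precision)."""
    seconds, remainder_ns = divmod(int(ns), 1_000_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(
        microseconds=remainder_ns // 1000
    )


# Made-up sample standing in for a downloaded file; column names are assumed.
sample = gzip.compress(b"NAME,TIME,VALUE\nhk1_1,1388556000000000000,12.5\n")
header, rows = preview_gzip_csv(sample)
first_time = ns_to_datetime(rows[0][1])
```

To try it on the real data, the bytes of one of the CSV URLs listed in the Data Introduction could be passed in place of `sample`, for example after downloading the file with `urllib.request.urlopen`.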
4. Experimental Methodology
- Model Objective: Wind Power Generation Prediction.
- Tags Used: Hankyung Power Plant No. 1.
- Model Configuration: LSTM AutoEncoder.
- Learning Method: Unsupervised Learning.
- Train: Model Training.
- Test: Model Performance Evaluation Based on Wind Power Generation Prediction.
- Model Optimizer: Adam.
- Model Loss Function: Mean Squared Error.
- Model Performance Metric: Mean Squared Error & R2 Score.
- Data Loading Method
- Loading the Entire Dataset.
- Loading the Batch Dataset.
- Data Preprocessing
- MinMax Scaling.
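As a rough illustration of the MinMax scaling step, the sketch below maps values to the range [0, 1] with x' = (x - min) / (max - min). It is not the notebook's code: the function names and the sample readings are made up, and fitting the minimum and maximum on the training split only is a common practice assumed here rather than something the post states. In practice, scikit-learn's `MinMaxScaler` performs the same transformation.

```python
def fit_min_max(values):
    """Return (min, max) learned from the training values."""
    return min(values), max(values)


def min_max_scale(values, lo, hi):
    """Map values to [0, 1] using x' = (x - lo) / (hi - lo)."""
    span = hi - lo
    if span == 0:
        # Constant series: avoid division by zero and map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def min_max_inverse(scaled, lo, hi):
    """Undo the scaling so predictions can be read in the original unit."""
    return [s * (hi - lo) + lo for s in scaled]


# Made-up power readings, used only to illustrate the arithmetic.
train = [0.0, 250.0, 500.0, 1000.0]
test = [125.0, 750.0]

lo, hi = fit_min_max(train)  # statistics come from the training split only
train_scaled = min_max_scale(train, lo, hi)
test_scaled = min_max_scale(test, lo, hi)
restored = min_max_inverse(test_scaled, lo, hi)
```

Keeping `lo` and `hi` around matters because model outputs are in the scaled range and need `min_max_inverse` before they can be compared with generation values in kW.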
5. Experiment Code
- Below is the code for each of the two ways to get data from the database.
- If all the data can be loaded and trained at once without causing memory errors, then method 1 is the fastest and simplest.
- If the data is too large, causing memory errors, then the batch loading method proposed in method 2 is the most efficient.
Method 1) Loading the Entire Dataset
- The code below loads all the data needed for training from the database at once.
- It works the same way as loading all the CSV files; the only difference is that the data comes from Machbase Neo.
- Pros
- Code that previously read CSV files can be reused, since only the loading step differs.
- Cons
- Training is not possible if the training data exceeds the available memory.
- The entire code can be run through 4.Wind_Elec_Gen_General.ipynb.
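One way the all-at-once loading could look is sketched below. This is not the notebook's code, and several things are assumptions: a Machbase Neo server running locally on port 5654, its HTTP `/db/query` endpoint accepting `q` and `format` parameters, a tag table with NAME, TIME, and VALUE columns, and `TO_DATE` string literals in the SQL. The helper names `build_query_url`, `pivot_rows`, and `load_all` are hypothetical.

```python
import csv
import io
from collections import defaultdict
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: local Machbase Neo server and its HTTP query endpoint.
NEO_QUERY_URL = "http://127.0.0.1:5654/db/query"


def build_query_url(table, tags, start, end, base_url=NEO_QUERY_URL):
    """Build a URL that selects every row of the given tags in [start, end]."""
    tag_list = ", ".join(f"'{t}'" for t in tags)
    sql = (
        f"SELECT NAME, TIME, VALUE FROM {table} "
        f"WHERE NAME IN ({tag_list}) "
        f"AND TIME BETWEEN TO_DATE('{start}') AND TO_DATE('{end}') "
        "ORDER BY TIME"
    )
    return f"{base_url}?{urlencode({'q': sql, 'format': 'csv'})}"


def pivot_rows(csv_text):
    """Turn long (name, time, value) rows into {time: {name: value}}."""
    wide = defaultdict(dict)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue
        name, ts, value = row[0], row[1], row[2]
        try:
            number = float(value)
        except ValueError:
            continue  # header line or non-numeric value
        wide[ts][name] = number
    return dict(wide)


def load_all(table, tags, start, end):
    """Fetch the whole time range in one request (needs a running server)."""
    with urlopen(build_query_url(table, tags, start, end)) as resp:
        return pivot_rows(resp.read().decode("utf-8"))
```

With a server running, a call such as `load_all("wind_elec_gen", ["hk1_1", "hk1_2", "hk1_3"], "2014-01-01 06:00:00", "2018-01-01 03:00:00")` would pull the three Hankyung plant 1 turbine tags for the full time range into memory, which is exactly where the memory limit mentioned in the cons comes from.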
Method 2) Loading the Batch Dataset
- This method loads data from Machbase Neo one batch at a time.
- The code below fetches consecutive time ranges, one batch-sized range at a time.
- Pros
- The model can be trained regardless of how large the dataset is.
- Cons
- Training takes longer than with method 1.
- The entire code can be run through 4.Wind_Elec_Gen_New_batch.ipynb.
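The sequential time-range fetching can be sketched as follows. This is a simplified illustration rather than the notebook's implementation: `time_windows` and `train_in_batches` are hypothetical names, and `fetch_batch` and `train_step` are placeholders for whatever database query and model update are actually used. Only one window of data is held in memory at a time.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def time_windows(start, end, step):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)  # the last window may be shorter
        yield cursor, window_end
        cursor = window_end


def train_in_batches(start, end, step, fetch_batch, train_step):
    """Fetch one window at a time and hand it to the training step."""
    n_batches = 0
    for window_start, window_end in time_windows(start, end, step):
        # fetch_batch should treat the end as exclusive so that rows on a
        # window boundary are not fetched twice.
        batch = fetch_batch(
            window_start.strftime(TIME_FORMAT), window_end.strftime(TIME_FORMAT)
        )
        if batch:  # skip windows that contain no rows
            train_step(batch)
            n_batches += 1
    return n_batches


# Stand-ins for the real query and model update, for illustration only.
seen = []
count = train_in_batches(
    datetime(2014, 1, 1, 6),
    datetime(2014, 1, 2, 3),
    timedelta(hours=6),
    fetch_batch=lambda s, e: [s, e],
    train_step=seen.append,
)
```

The window length plays the role of the batch size here. A shorter window lowers peak memory use but means more round trips to the database, which is consistent with the longer training time noted above.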
6. Experimental Results
Method 1) Loading the Entire Dataset Result


Method 2) Loading the Batch Dataset Result


- The R2 score was 0.944 when loading the entire dataset and 0.959 when loading the batch dataset.
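For readers unfamiliar with the two metrics, the sketch below shows how Mean Squared Error and the R2 score are computed, with R2 = 1 - SS_res / SS_tot. The numbers are made up and unrelated to the results reported here, and the functions are a plain-Python illustration. `sklearn.metrics` provides `mean_squared_error` and `r2_score` for real use.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all target values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
```

An R2 of 1.0 means the predictions match the targets exactly, while 0.0 means the model does no better than always predicting the mean of the targets.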
※ Various datasets and tutorial codes can be found in the GitHub repository below.
datahub/dataset/2024 at main · machbase/datahub