Jeju Island Wind Power Plant Data

This post walks through predicting wind power generation from wind turbine output data and weather forecast data.

Table of Contents

  1. Data Introduction
  2. Data Visualization with Machbase Neo
  3. Table Creation and Data Upload in Machbase Neo
  4. Experimental Methodology
  5. Experiment Code
  6. Experimental Results

1. Data Introduction


  • DataHub Serial Number: 2024-4.
  • Data Name: Jeju Island Wind Power Plant Dataset.
  • Data Collection Methods: The weather data is collected from the Korea Meteorological Administration; the power generation data is provided by the power plant.
  • Data Source: Kyung Hee University Industrial Intelligence Research Laboratory
  • Raw data size and format: 110MB, CSV & Excel.
  • Number of tags: 38.
TAG                DESCRIPTION
h_temperature      Predicted temperature (°C) for the Hankyung area in Jeju.
h_humidity         Predicted humidity (%) for the Hankyung area in Jeju.
h_windspeed        Predicted wind speed (m/s) for the Hankyung area in Jeju.
h_winddirection    Predicted wind direction (degrees) for the Hankyung area in Jeju.
h_6hrain           Predicted rainfall (mm) for the Hankyung area in Jeju over 6 hours.
h_6hsnow           Predicted snowfall (mm) for the Hankyung area in Jeju over 6 hours.
h_rainprobability  Hourly predicted precipitation probability (%) for the Hankyung area in Jeju.
h_raintype         Predicted type of precipitation for the Hankyung area in Jeju.
h_seawave          Predicted sea wave height (m) for the Hankyung area in Jeju.
h_skystatus        Predicted weather condition for the Hankyung area in Jeju.
s_temperature      Predicted temperature (°C) for the Seongsan area in Jeju.
s_humidity         Predicted humidity (%) for the Seongsan area in Jeju.
s_windspeed        Predicted wind speed (m/s) for the Seongsan area in Jeju.
s_winddirection    Predicted wind direction (degrees) for the Seongsan area in Jeju.
s_6hrain           Predicted rainfall (mm) for the Seongsan area in Jeju over 6 hours.
s_6hsnow           Predicted snowfall (mm) for the Seongsan area in Jeju over 6 hours.
s_rainprobability  Hourly predicted precipitation probability (%) for the Seongsan area in Jeju.
s_raintype         Predicted type of precipitation for the Seongsan area in Jeju.
s_seawave          Predicted sea wave height (m) for the Seongsan area in Jeju.
s_skystatus        Predicted weather condition for the Seongsan area in Jeju.
hk1_1              Power generation data (kW) for Turbine 1 at the first wind power plant in the Hankyung area, Jeju.
hk1_2              Power generation data (kW) for Turbine 2 at the first wind power plant in the Hankyung area, Jeju.
hk1_3              Power generation data (kW) for Turbine 3 at the first wind power plant in the Hankyung area, Jeju.
hk2_1              Power generation data (kW) for Turbine 1 at the second wind power plant in the Hankyung area, Jeju.
hk2_2              Power generation data (kW) for Turbine 2 at the second wind power plant in the Hankyung area, Jeju.
hk2_3              Power generation data (kW) for Turbine 3 at the second wind power plant in the Hankyung area, Jeju.
hk2_4              Power generation data (kW) for Turbine 4 at the second wind power plant in the Hankyung area, Jeju.
hk2_5              Power generation data (kW) for Turbine 5 at the second wind power plant in the Hankyung area, Jeju.
ss_1               Power generation data (kW) for Turbine 1 at the wind power plant in the Seongsan area, Jeju.
ss_2               Power generation data (kW) for Turbine 2 at the wind power plant in the Seongsan area, Jeju.
ss_3               Power generation data (kW) for Turbine 3 at the wind power plant in the Seongsan area, Jeju.
ss_4               Power generation data (kW) for Turbine 4 at the wind power plant in the Seongsan area, Jeju.
ss_5               Power generation data (kW) for Turbine 5 at the wind power plant in the Seongsan area, Jeju.
ss_6               Power generation data (kW) for Turbine 6 at the wind power plant in the Seongsan area, Jeju.
ss_7               Power generation data (kW) for Turbine 7 at the wind power plant in the Seongsan area, Jeju.
ss_8               Power generation data (kW) for Turbine 8 at the wind power plant in the Seongsan area, Jeju.
ss_9               Power generation data (kW) for Turbine 9 at the wind power plant in the Seongsan area, Jeju.
ss_10              Power generation data (kW) for Turbine 10 at the wind power plant in the Seongsan area, Jeju.
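
The tag names in the table follow a regular pattern (an area or plant prefix plus a field name or turbine number), so they can be generated instead of typed by hand. The snippet below is only an illustrative sketch: the helper name build_tag_names and the constants are hypothetical and are not part of the DataHub code.

```python
# Weather forecast fields shared by both areas (h_ = Hankyung, s_ = Seongsan).
WEATHER_FIELDS = [
    "temperature", "humidity", "windspeed", "winddirection", "6hrain",
    "6hsnow", "rainprobability", "raintype", "seawave", "skystatus",
]

# Number of turbines per plant prefix, as listed in the tag table.
TURBINES = {"hk1": 3, "hk2": 5, "ss": 10}


def build_tag_names():
    """Return the tag names as two lists: (weather_tags, power_tags)."""
    weather = [f"{area}_{field}" for area in ("h", "s") for field in WEATHER_FIELDS]
    power = [f"{plant}_{i}" for plant, n in TURBINES.items() for i in range(1, n + 1)]
    return weather, power


weather_tags, power_tags = build_tag_names()
print(len(weather_tags), len(power_tags))  # 20 weather tags and 18 power tags
```

The two lists together give the 38 tags stated above.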

2. Data Visualization with Machbase Neo


  • Data visualization is possible through the Tag Analyzer in Machbase Neo.
  • Select desired tag names and visualize them in various types of graphs.
  • Below, you can access the 2024-4 DataHub in real time, select tag names from the 38 available tags, visualize them, and preview the data patterns.
DataHub Viewer

3. Table Creation and Data Upload in Machbase Neo


  • In the DataHub directory, use setup.wrk located in the Wind Electricity Generation Dataset folder to create tables and load data, as illustrated in the image below.

1) Table Creation

  • The table is created immediately upon pressing the "Run" button in the menu.
  • If the wind_elec_gen table already exists, execute the first line and then the second. If it does not exist, start from the second line.

2) Data Upload


  • The data can be loaded in two different ways.
Method 1) Loading with TQL in Machbase Neo (since machbase-neo v8.0.29-rc1)

  • Pros

    • Machbase Neo starts loading the data as soon as you press the run button.
  • Cons

    • Slower table loading speed than Method 2.
Method 2) Loading tables using commands

  • Pros

    • Fast table loading speed.
  • Cons

    • The table loading process is more cumbersome.
    • Steps: open a command window, change to the machbase-neo directory, then enter the commands in that window.
  • Running the script below from a command shell loads the data into the wind_elec_gen table at high speed.
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_1.csv.gz | machbase-neo shell import --input -  --compress gzip --header --method append --timeformat ns wind_elec_gen

curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_2.csv.gz | machbase-neo shell import --input -  --compress gzip --header --method append --timeformat ns wind_elec_gen
  • If your username and password are not the default sys/manager, append the --user and --password options as shown below.
curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_1.csv.gz | machbase-neo shell import --input -  --compress gzip --header --method append --timeformat ns wind_elec_gen --user USERNAME --password PASSWORD

curl http://data.yotahub.com/2024-4/datahub-2024-4-wind_elec_gen_2.csv.gz | machbase-neo shell import --input -  --compress gzip --header --method append --timeformat ns wind_elec_gen --user USERNAME --password PASSWORD
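
Since the same pipeline is repeated for each file, with or without credentials, it can be convenient to generate the command lines from a script. The following is a minimal sketch that only builds the command strings, using the URLs and options shown above; the function name build_import_command is hypothetical.

```python
import shlex

BASE_URL = "http://data.yotahub.com/2024-4"
FILES = [
    "datahub-2024-4-wind_elec_gen_1.csv.gz",
    "datahub-2024-4-wind_elec_gen_2.csv.gz",
]


def build_import_command(filename, table="wind_elec_gen", user=None, password=None):
    """Build the 'curl | machbase-neo shell import' pipeline for one file."""
    importer = [
        "machbase-neo", "shell", "import",
        "--input", "-",          # read the data from standard input
        "--compress", "gzip",
        "--header",
        "--method", "append",
        "--timeformat", "ns",
        table,
    ]
    # Credentials are only needed when the account is not sys/manager.
    if user is not None and password is not None:
        importer += ["--user", user, "--password", password]
    curl = ["curl", f"{BASE_URL}/{filename}"]
    return f"{shlex.join(curl)} | {shlex.join(importer)}"


commands = [build_import_command(name) for name in FILES]
for command in commands:
    print(command)
```

To actually run a generated command from Python, one option is subprocess.run(command, shell=True, check=True), executed from the directory that contains machbase-neo.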

4. Experimental Methodology


  • Model Objective: Wind Power Generation Prediction.
  • Tags Used: Hankyung Power Plant No. 1.
  • Model Configuration: LSTM AutoEncoder.
  • Learning Method: Unsupervised Learning.
    • Train: Model Training.
    • Test: Model Performance Evaluation Based on Wind Power Generation Prediction.
  • Model Optimizer: Adam.
  • Model Loss Function: Mean Squared Error.
  • Model Performance Metric: Mean Squared Error & R2 Score.
  • Data Loading Method
    • Loading the Entire Dataset.
    • Loading the Batch Dataset.
  • Data Preprocessing
    • MinMax Scaling.
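
To make the preprocessing and metrics listed above concrete, here is a small dependency-free sketch of MinMax scaling, Mean Squared Error, and the R2 score. It is not the experiment code: the function names are hypothetical, the sample values are made up, and in practice a library such as scikit-learn would typically be used instead.

```python
def minmax_fit(values):
    """Return (min, max), computed on the training data only."""
    return min(values), max(values)


def minmax_transform(values, lo, hi):
    """Scale values into [0, 1] using the training min and max."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant series: avoid division by zero
    return [(v - lo) / span for v in values]


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """R2 = 1 - SS_res / SS_tot (undefined when y_true is constant)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


train = [0.0, 500.0, 1000.0, 1500.0]  # made-up power values in kW
lo, hi = minmax_fit(train)
scaled = minmax_transform(train, lo, hi)
print(scaled[0], scaled[-1])  # smallest value maps to 0.0, largest to 1.0
```

Fitting the scaler on the training split and reusing the same minimum and maximum for the test split keeps test information out of the preprocessing step.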

5. Experiment Code


  • Below is the code for each of the two ways to get data from the database.
  • If all the data can be loaded and trained on at once without causing memory errors, then Method 1 is the fastest and simplest.
  • If the data is too large and causes memory errors, then the batch loading approach of Method 2 is the most efficient.

Method 1) Loading the Entire Dataset


  • The code below is implemented in a way that loads all the data needed for training from the database all at once.
  • It is exactly the same as loading all CSV files (The only difference is that the data is loaded from Machbase Neo).
  • Pros
    • Can use the same code that was previously utilizing CSVs (Only the loading process is different).
  • Cons
    • Training is not possible if the training data does not fit in memory.
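
As a rough illustration of loading everything at once, the sketch below builds a single query URL and pivots the returned long-format CSV into one row per timestamp. Several things here are assumptions rather than facts from this post: that Machbase Neo is reachable at 127.0.0.1:5654 through its HTTP query endpoint, that the table columns are named name, time, and value, and that the response starts with a header row. The function names are hypothetical.

```python
import csv
import io
import urllib.parse
import urllib.request
from collections import defaultdict

# Assumed local Machbase Neo HTTP endpoint; adjust host and port as needed.
QUERY_URL = "http://127.0.0.1:5654/db/query"


def build_query_url(table, tags, base_url=QUERY_URL):
    """Build a URL that selects every row of the given tags as CSV."""
    tag_list = ", ".join(f"'{t}'" for t in tags)
    sql = (
        f"SELECT name, time, value FROM {table} "
        f"WHERE name IN ({tag_list}) ORDER BY time"
    )
    params = {"q": sql, "format": "csv", "timeformat": "ns"}
    return f"{base_url}?{urllib.parse.urlencode(params)}"


def pivot_csv(text):
    """Turn long-format CSV (name,time,value) into {time: {tag: value}}."""
    rows = defaultdict(dict)
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row (assumed to be present)
    for name, time, value in reader:
        rows[int(time)][name] = float(value)
    return dict(sorted(rows.items()))


def load_all(table, tags):
    """Fetch the whole selection in one request (needs a running server)."""
    with urllib.request.urlopen(build_query_url(table, tags)) as resp:
        return pivot_csv(resp.read().decode("utf-8"))


# Offline demonstration with made-up rows instead of a live server.
sample = "NAME,TIME,VALUE\nhk1_1,1000,12.5\nhk1_2,1000,8.0\nhk1_1,2000,13.0\n"
pivoted = pivot_csv(sample)
print(pivoted)
```

With a server running, load_all("wind_elec_gen", ["hk1_1", "hk1_2", "hk1_3"]) would return the full history in memory, which is exactly where the memory limit mentioned above comes from.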

Method 2) Loading the Batch Dataset


  • This method loads only a single batch of data from Machbase Neo at a time.
  • The code below fetches consecutive time ranges, one batch at a time.
  • Pros
    • The model can be trained regardless of how large the dataset is.
  • Cons
    • It takes longer to train compared to method 1.
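
The idea of fetching one time range per batch can be sketched as follows. This is a simplified illustration, not the experiment code: the column names, the use of TO_DATE in the WHERE clause, the date range, and all function names are assumptions, and the fetch argument stands in for whatever function actually sends the SQL to Machbase Neo.

```python
from datetime import datetime, timedelta

TIME_FMT = "%Y-%m-%d %H:%M:%S"


def time_windows(start, end, step):
    """Yield consecutive half-open [lo, hi) windows that cover [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def window_sql(table, tags, lo, hi):
    """SQL for one batch; column names and TO_DATE usage are assumptions."""
    tag_list = ", ".join(f"'{t}'" for t in tags)
    return (
        f"SELECT name, time, value FROM {table} "
        f"WHERE name IN ({tag_list}) "
        f"AND time >= TO_DATE('{lo.strftime(TIME_FMT)}') "
        f"AND time < TO_DATE('{hi.strftime(TIME_FMT)}') "
        "ORDER BY time"
    )


def iter_batches(fetch, table, tags, start, end, step):
    """Call fetch(sql) once per window so only one batch is held at a time."""
    for lo, hi in time_windows(start, end, step):
        yield fetch(window_sql(table, tags, lo, hi))


# Made-up date range; the fake fetch simply echoes the SQL it receives.
start, end = datetime(2020, 1, 1), datetime(2020, 1, 1, 5)
batches = list(
    iter_batches(lambda sql: sql, "wind_elec_gen", ["hk1_1"], start, end, timedelta(hours=2))
)
print(len(batches))  # 3 windows: 00:00-02:00, 02:00-04:00, 04:00-05:00
```

Because each window is requested separately, the extra round trips are what make this approach slower than loading the whole dataset in one query.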

6. Experimental Results


Method 1) Loading the Entire Dataset Result


Method 2) Loading the Batch Dataset Result


  • The R2 score was 0.944 when loading the entire dataset and 0.959 when loading the batch dataset.





※ Various datasets and tutorial codes can be found in the GitHub repository below.

datahub/dataset/2024 at main · machbase/datahub
All Industrial IoT DataHub with data visualization and AI source - machbase/datahub
