code0 | code | format | type | Lat | Lon | StartTime | EndTime |
---|---|---|---|---|---|---|---|
missing | ZB | 知本 | GM | 22.7398 | 121.065 | 20201006 | missing |
missing | XC | 新城 | GM | 24.0383 | 121.609 | 20201006 | missing |
missing | SM | 日月潭 | GM | 23.881 | 120.908 | 20191008 | missing |
missing | CN | 暨南 | GM | 23.9576 | 120.928 | 20221229 | missing |
em3 | KUOL | 過嶺 | GE | 24.9629 | 121.142 | 2011-09-22 | 9999-12-31 |
em4 | HUAL | 華陵 | GE | 24.6745 | 121.368 | 2012-01-17 | 9999-12-31 |
em5 | TOCH | 頭城 | GE | 24.8435 | 121.805 | 2012-02-10 | 9999-12-31 |
em6 | ENAN | 南澳 | GE | 24.4758 | 121.785 | 2012-02-15 | 9999-12-31 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
Tutorial
Requirements
Environment
GEMS-MagTIP relies on the following toolboxes; type license('inuse')
in matlab command line or ver
to list the toolboxes currently available on your machine.
Toolbox in-use:
- MATLAB Version 9.11 (R2021b)
- Mapping Toolbox Version 5.2 (R2021b)
- Parallel Computing Toolbox Version 7.5 (R2021b)
- Signal Processing Toolbox Version 8.7 (R2021b)
- Statistics and Machine Learning Toolbox Version 12.2 (R2021b)
Dependencies
GEMS-MagTIP depends on okatsn/toolbox and CGRG-lab/GEMS-MagTIP-insider.
You have to add these dependencies to path every time before running your scripts, as follows
CGRG-lab/GEMS-MagTIP-insider
Copy CGRG-lab/GEMS-MagTIP-insider to your local drive (e.g., GEMS-MagTIP-insider
) and add the source code inside to path:
dir_src = 'GEMS-MagTIP-insider/src';
addpath(genpath(dir_src));
okatsn/toolbox
Copy okatsn/toolbox to the main directory (e.g., GEMS-MagTIP-insider/toolbox
) and add the source code inside to path:
dir_toolbox = 'GEMS-MagTIP-insider/toolbox';
addpath(genpath(dir_toolbox));
Getting Started
The main functions of GEMS-MagTIP take directories that contains intermediate data (those .mat
files) as input arguments, and output data in the assigned directory.
Here is the minimal example of the chain of main functions:
Prepare your data
You have to prepare/update the following data:
- earthquake catalog
- geo-electric data of standard format
- geo-magnetic data of standard format
Earthquake catalog and station information
Please download or update catalog of events \(M_L \geq 5\) from GDMSN, and move it to spreadsheet/catalog.csv
. The spreadsheet/station_location.csv
specifies the location of every station, which is included in the copy of GEMS-MagTIP-insider.
Keypoints
- The headers (column names) for
catalog.csv
should be ‘time’, ‘Lon’, ‘Lat’, ‘Depth’, and ‘Mag’. - The headers for
station_location.csv
should be being ‘code’, ‘format’, ‘Lon’, ‘Lat’ forstation_location.csv
. - The order of column names can be arbitrarily arranged, but the strings have to be exactly the same as above.
- To update earthquake catalog, just put the updated table in csv to
spreadsheet/catalog.csv
with overwriting option and runcheckcatalog(dir_catalog)
; if bothcatalog.mat
andcatalog.csv
are in the directorydir_catalog
,catalog.mat
will be overwritten by the new one converted fromcatalog.csv
. - To update
spreadsheet/station_location.csv
, the procedure/workflow is the same as the previous point.
Supported catalog format
GDMSN form
date | time | lat | lon | depth | ML | nstn | dmin | gap | trms | ERH | ERZ | fixed | nph | quality |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
… | … | … | … | … | … | … | … | … | … | … | … | … | … | … |
Free form
time | Lon | Lat | Depth | Mag |
---|---|---|---|---|
2020/8/10 06:41 | 121.59 | 23.81 | 29.86 | 3.41 |
2020/8/10 06:29 | 120.57 | 22.18 | 43.54 | 3.02 |
2020/8/10 06:14 | 121.7 | 22.17 | 124.78 | 4.13 |
… | … | … | … | … |
here is a partial view on an exemplary station_location.csv
:
Data of the standard format
Converting raw data into a standard format is crucial. The conversion involves verification of date-time, conduct projection for geo-electric data to NS-EW, format the file name for indexing, and etc..
Here is the minimal script for converting raw data into the standard format:
dir_gems_raw = 'data-raw/GEMSdat'; % raw GEMS data
dir_mag_raw = 'data-raw/MAG'; % raw MAG data
dir_data = 'data-standard'; % output directory
conv_gemsdata(dir_gems_raw, dir_data, dir_catalog);
conv_geomagdata(dir_mag_raw, dir_data);
Feel free to organize files and folders under dir_data
, or convert the same raw data twice, because in GEMS-MagTIP data loading relies on the name of the standard format and the conversion functions will handle duplicated files.
See standarddataname
and write_data
For more information please refer the section Load raw data and convert it into a standard format.
Setting Up Directory Paths
Assigning directories for input/output data or variables is necessary before running any of the main function.
For example:
% For windows, use backslash `\`; for unix systems, use slash `/` in the path to directories.
dir_catalog = 'GEMS-MagTIP-insider/spreadsheet';
% directory of event catalog & station location
dir_data = 'standard-data';
% directory of geomagnetic timeseries of "standard format"
dir_stat = 'var-output/StatisticIndex';
% directory of statistic indices
dir_tsAIN = 'var-output/tsAIN';
% directory for storing anomaly index number (AIN)
dir_molchan = 'var-output/Molchan';
% directory for storing Molchan scores
dir_jointstation = 'var-output/JointStation';
% directory for the time series of EQK, TIP and probability
- You can use
mkdir_default
to automatically generate empty directories for main functions. - You can use
dirselectassign
to assign directories via file explorer interface.
dir_catalog
must containcatalog.csv
orcatalog.mat
, andstation_location.csv
orstation_location.mat
.dir_data
is the directory that contains GE or GM data of the standard format; seeconv_geomagdata
andconv_gemsdata
for the conversion of raw data to the standard format.
Executing Main Functions in the Correct Order
statind(dir_data,dir_stat);
anomalyind(dir_stat,dir_tsAIN);
molscore(dir_tsAIN,dir_catalog,dir_molchan);
molscore3(dir_tsAIN,dir_molchan,dir_catalog,dir_jointstation);
statind
, molscore
, and molscore3
have alternatives which run the same calculation in parallel. In most cases you can simply append _parfor
(e.g., molscore3_parfor(...)
) in the function name with no need to modify input arguments to do run in parallel. See statind_parfor
, molscore_parfor
, and molscore3_parfor
.
Comprehensive Sample Script
Here is a sample script “demo/demo_script.m” for the whole process.
You can run startup0.m
and follows the instruction in the command window to add dependencies and assign input/output directories as described above.
%% Convert Raw data to standard type
% Load original data and save them as matfiles of the standard format
conv_gemsdata(dir_gems_raw, dir_data, dir_catalog,'FilterByDatetime',datetime(2020,1,1)); % Convert raw GE data of timestamps after 2020-1-1 to standard format.
% Assign 'FilterByDatetime' to save time when standard geomagnetic data out of the specified date-time range already exist.
% Discard 'FilterByDatetime' to convert everything in the raw-data directory.
conv_geomagdata(dir_mag_raw, dir_data,'FilterByDatetime',datetime(2020,1,1)); % The same as above but for the conversion of GM data.
%% Calculate Statistic Index
statind_parfor(dir_data,dir_stat, ... % Load data in dir_data, save index in dir_stat
'Preprocess',{'ULF_A','ULF_B','BP_40','BP_35'}, ... % with 4 kinds of filtering
'SavePreprocessedData',false, ... % without saving filtered timeseries
'StatName', {'S', 'K', 'FI', 'SE'}, ... % the variable name for 'StatFunction'.
'StatFunction', {@skewness, @kurtosis, @fisherinformation, @shannonentropy}, ...
'FilterByDatetime',[datetime(2011,1,1), datetime(2022,12,31)]);
% Assign 'FilterByDatetime' to calculate statistical indices only between 2011-1-1 and 2022-12-31.
% Noted that standard format GE/GM data have to be available in this range, otherwise you will get NaN if data is missing (or not converted) in `dir_data`.
%% Data overview
% An overview of data avaliability/deficiency according to the results in dir_stat
plot_dataoverview(dir_stat, dir_catalog);
%% Calculate Anomaly Indices
anomalyind(dir_stat,dir_tsAIN);
%% Training
molscore_parfor(dir_tsAIN,dir_catalog,dir_molchan,...
'TrainingPhase', {calyears(3),datetime(2022,4,1);... % Use up to 3-years data before 2022-4-1 for model training.
calyears(5),datetime(2022,4,1);... % Use up to 5-years data before 2022-4-1 for model training.
calyears(7),datetime(2022,4,1);... % Use up to 7-years data before 2022-4-1 for model training.
calyears(9),datetime(2022,4,1)},...% Use up to 9-years data before 2022-4-1 for model training.
'modparam',{'Test', 5000}); % Remember to disable 'Test' in a real run.
%% Forecast and test
molscore3_parfor(dir_tsAIN,dir_molchan,dir_catalog,dir_jointstation,...
'OverwriteFile',true, ...
'ForecastingPhase', repmat([datetime(2022,4,2), datetime(2022,9,27)], 4,1));
% Manually assign forecasting phases. Typically the size of 'ForecastingPhase' should align with the size of 'TrainingPhase'.
Visualization of the results
Plot EQK-TIP Matching diagram
EQK-TIP of each station:
%% Plot EQK-TIP of each station
% the output directory for the images
dir_png = fullfile(dir_molchan,'png','EQKTIP'); % no need to mkdir
% Plot TIP of individual stations as heatmap,
% with target earthquakes (EQK) scattered on the top.
plotEQKTIP1(dir_tsAIN,dir_molchan,dir_catalog,dir_png); %,...
'ForecastingPhase',calyears(1),'ShowTrainingPhase',1,'Rank',1,...
'ForceMagnitude',false, ...
'scatter',1);
% remove the white space around the image.
cropimg(dir_png,'SaveInplace',true);
In this figure, the match diagram of EQK and TIP defined by the model of first rank is demonstrated. The intervals of black color denotes the days where there is no data in \(T_\text{obs}\) at all (i.e. this model is invalid at these time) and hence TIP cannot be calculated.
Plot 2D EQK-TIP Matching diagram
EQK-TIP in a 2D temporal-spatial coordinate system:
%% EQK-TIP in a 2D temporal-spatial coordinate system
dir_png = fullfile(dir_jointstation,'png', 'EQKTIP3'); mkdir(dir_png);
% Find the data of ID 'AMn6ei' and filter tag 'ULF_A'
jid = 'AMn6ei';
filter_tag = 'ULF_A';
jpathlist = datalist(sprintf('[JointStation]ID[%s]prp[%s]*.mat', jid, filter_tag), dir_jointstation).fullpath;
% Retrieve essential data from '[JointStation]' files in the jpathlist:
AlarmedRate, MissingRate,xlabels, EQKs, TIP3s, TIPv3s, TIPTimes, LatLons] = ...
[calcFittingDegree(jpathlist);
% Plot EQK-TIP on a 2D temporal-spatial coordinates
titletag = sprintf('EQK-TIP (trial ID: %s; filter: %s)', jid, filter_tag)
plotEQKTIP3(dir_png, titletag, xlabels, EQKs, TIP3s, TIPv3s,TIPTimes, LatLons);
Plot fitting degree
%% Plot fitting degree
dir_png = fullfile(dir_jointstation,'fitting_degree'); mkdir(dir_png);
plotFittingDegree(dir_jointstation,dir_catalog,dir_png,...
'ConfidenceLevel',0.95);
Plot probability map
%% Plot probability map
% the output directory for the images
dir_prob = fullfile(dir_jointstation,'png','prob');
% specific datetime to be plotted
dates2plot = [datetime(2020,12,1):caldays(30):datetime(2021,2,11)]';
% plot GEMS-MagTIP probability map
% (export individual image for each date)
dir_prob2 = plotProbability(dir_jointstation,dir_catalog,dir_prob, ...
'TimeRange',dates2plot,...
'PlotEpicenter','all');
; )