SXMERGE  Merge data from combined SeaExplorer glider and payload data sets into a single data set.

  Syntax:
    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD)
    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPTIONS)
    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPT1, VAL1, ...)

  Description:
    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD) merges the
    glider and payload data sets described by metadata structs META_GLI and
    META_PLD, and data arrays DATA_GLI and DATA_PLD into a single data set
    described by metadata struct META and data array or struct DATA
    (see format option described below). Input metadata and data should be
    in the format returned by the function SXCAT. Data rows from both
    data sets are merged based on the order of the respective timestamps.
    See note on merging process.

    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPTIONS) and
    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPT1, VAL1, ...)
    accept the following options given in key-value pairs OPT1, VAL1...
    or in a struct OPTIONS with field names as option keys and field values
    as option values:
      FORMAT: data output format.
        String setting the format of the output DATA. Valid values are:
          'array': DATA is a matrix with variable readings in the column order
            specified by the VARIABLES metadata field.
          'struct': DATA is a struct with variable names as field names
            and column vectors of variable readings as field values.
        Default value: 'array'
      TIMEGLI: glider timestamp.
        String setting the name of the time variable for merging and sorting
        data row readings from SeaExplorer .gli data set.
        Default value: 'Timestamp'
      TIMEPLD: payload timestamp.
        String setting the name of the time variable for merging and sorting
        data row readings from SeaExplorer payload data set.
        Default value: 'PLD_REALTIMECLOCK'
      VARIABLES: variable filtering list.
        String cell array with the names of the variables of interest.
        If given, only variables present in both the input data sets and this
        list will be present in output. The string 'all' may also be given,
        in which case variable filtering is not performed and all variables
        in the input data sets will be present in output.
        Default value: 'all' (do not perform variable filtering).
      PERIOD: time filtering boundaries.
        Two element numeric array with the start and the end of the period
        of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
        only row readings with timestamps within this period will be present
        in output. The string 'all' may also be given, in which case time
        filtering is not performed and all row readings in the input
        data sets will be present in output.
        Default value: 'all' (do not perform time filtering).

  Notes:
    This function should be used to merge data from SeaExplorer glider and
    payload data sets, not from data sets coming from the same type of files
    (use SXCAT instead).

    The merging process sorts row variable readings from glider and payload
    data sets comparing the respective timestamp values. Row variable readings
    coming from glider and payload data arrays with equal timestamp values are
    merged into a single row, otherwise the missing variable values are filled
    with invalid values (NaN). Variables in glider and payload data sets are
    all different, but if there were duplicated variables, the values from each
    data set would be merged in a common column according to the timestamp,
    and an error would be raised if there were inconsistent valid data entries
    (not NaN) for the same timestamp value.

    All values in timestamp columns should be valid (not NaN).
    In output, the .gli timestamp column contains the merged glider and payload
    timestamps to provide a consistent comprehensive timestamp variable
    for the merged data set. The payload timestamp contains only the timestamps
    of the payload data set.
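The merging process described in the notes can be sketched on toy data. This is a minimal illustration under stated assumptions, not the toolbox code path: the values and the variables Depth and Temperature are made up, and only the built-in UNIQUE is used.

```matlab
% Toy glider and payload readings (hypothetical values).
data_gli = [10 1.5; 20 2.5; 30 3.5];   % columns: Timestamp, Depth
data_pld = [20 7.0; 25 8.0];           % columns: PLD_REALTIMECLOCK, Temperature

% Sorted unique timestamps, and the output row of every input row.
[stamp_merged, ~, row_to] = unique([data_gli(:, 1); data_pld(:, 1)]);
rows_gli = row_to(1:size(data_gli, 1));
rows_pld = row_to(size(data_gli, 1) + 1:end);

% Merged columns: glider time, Depth, payload time, Temperature.
% Entries with no reading at a given timestamp stay NaN.
merged = nan(numel(stamp_merged), 4);
merged(rows_gli, 1:2) = data_gli;
merged(rows_pld, 3:4) = data_pld;

% The glider timestamp column also receives the payload timestamps,
% so that it covers every row of the merged data set.
merged(rows_pld, 1) = data_pld(:, 1);
disp(merged);
```

In this sketch the readings at timestamp 20 share a single row, while the payload-only reading at timestamp 25 gets a row of its own with NaN in the Depth column.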
  Examples:
    [meta, data] = sxmerge(meta_gli, data_gli, meta_pld, data_pld)

  See also:
    SX2MAT
    SXCAT

  Authors:
    Frederic Cyr  <Frederic.Cyr@mio.osupytheas.fr>
    Joan Pau Beltran  <joanpau.beltran@socib.cat>

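A slightly fuller usage sketch with options may help. It assumes that META_GLI, DATA_GLI, META_PLD and DATA_PLD were already produced by SXCAT; the dates and the variable name 'Depth' are hypothetical. The PERIOD bounds are computed with the built-in DATENUM as seconds since 1970-01-01 00:00:00 UTC.

```matlab
% Convert calendar dates (hypothetical) to seconds since the POSIX epoch,
% the unit expected by the PERIOD option.
posix_epoch = datenum(1970, 1, 1);
period_start = (datenum(2016, 3, 1) - posix_epoch) * 86400;
period_end = (datenum(2016, 3, 2) - posix_epoch) * 86400;

% Options as a struct: field names are option keys.
options = struct();
options.format = 'struct';
options.period = [period_start period_end];
options.variables = {'Timestamp'; 'PLD_REALTIMECLOCK'; 'Depth'};  % 'Depth' is assumed.

% Call SXMERGE only if the toolbox and the SXCAT outputs are available.
if exist('sxmerge', 'file') && exist('meta_gli', 'var') && exist('meta_pld', 'var')
  [meta, data] = sxmerge(meta_gli, data_gli, meta_pld, data_pld, options);
  % Equivalent call with key-value pairs:
  % [meta, data] = sxmerge(meta_gli, data_gli, meta_pld, data_pld, ...
  %   'format', 'struct', 'period', options.period, ...
  %   'variables', options.variables);
end
```

Option keys are lowercased by the function before they are matched, so 'FORMAT' and 'format' are treated alike.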
function [meta, data] = sxmerge(meta_gli, data_gli, meta_pld, data_pld, varargin)
%SXMERGE  Merge data from combined SeaExplorer glider and payload data sets into a single data set.
%
%  Syntax:
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD)
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPTIONS)
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPT1, VAL1, ...)
%
%  Description:
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD) merges the
%    glider and payload data sets described by metadata structs META_GLI and
%    META_PLD, and data arrays DATA_GLI and DATA_PLD into a single data set
%    described by metadata struct META and data array or struct DATA
%    (see format option described below). Input metadata and data should be
%    in the format returned by the function SXCAT. Data rows from both
%    data sets are merged based on the order of the respective timestamps.
%    See note on merging process.
%
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPTIONS) and
%    [META, DATA] = SXMERGE(META_GLI, DATA_GLI, META_PLD, DATA_PLD, OPT1, VAL1, ...)
%    accept the following options given in key-value pairs OPT1, VAL1...
%    or in a struct OPTIONS with field names as option keys and field values
%    as option values:
%      FORMAT: data output format.
%        String setting the format of the output DATA. Valid values are:
%          'array': DATA is a matrix with variable readings in the column order
%            specified by the VARIABLES metadata field.
%          'struct': DATA is a struct with variable names as field names
%            and column vectors of variable readings as field values.
%        Default value: 'array'
%      TIMEGLI: glider timestamp.
%        String setting the name of the time variable for merging and sorting
%        data row readings from SeaExplorer .gli data set.
%        Default value: 'Timestamp'
%      TIMEPLD: payload timestamp.
%        String setting the name of the time variable for merging and sorting
%        data row readings from SeaExplorer payload data set.
%        Default value: 'PLD_REALTIMECLOCK'
%      VARIABLES: variable filtering list.
%        String cell array with the names of the variables of interest.
%        If given, only variables present in both the input data sets and this
%        list will be present in output. The string 'all' may also be given,
%        in which case variable filtering is not performed and all variables
%        in the input data sets will be present in output.
%        Default value: 'all' (do not perform variable filtering).
%      PERIOD: time filtering boundaries.
%        Two element numeric array with the start and the end of the period
%        of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
%        only row readings with timestamps within this period will be present
%        in output. The string 'all' may also be given, in which case time
%        filtering is not performed and all row readings in the input
%        data sets will be present in output.
%        Default value: 'all' (do not perform time filtering).
%
%  Notes:
%    This function should be used to merge data from SeaExplorer glider and
%    payload data sets, not from data sets coming from the same type of files
%    (use SXCAT instead).
%
%    The merging process sorts row variable readings from glider and payload
%    data sets comparing the respective timestamp values. Row variable readings
%    coming from glider and payload data arrays with equal timestamp values are
%    merged into a single row, otherwise the missing variable values are filled
%    with invalid values (NaN). Variables in glider and payload data sets are
%    all different, but if there were duplicated variables, the values from each
%    data set would be merged in a common column according to the timestamp,
%    and an error would be raised if there were inconsistent valid data entries
%    (not NaN) for the same timestamp value.
%
%    All values in timestamp columns should be valid (not NaN).
%    In output, the .gli timestamp column contains the merged glider and payload
%    timestamps to provide a consistent comprehensive timestamp variable
%    for the merged data set. The payload timestamp contains only the timestamps
%    of the payload data set.
%
%  Examples:
%    [meta, data] = sxmerge(meta_gli, data_gli, meta_pld, data_pld)
%
%  See also:
%    SX2MAT
%    SXCAT
%
%  Authors:
%    Frederic Cyr  <Frederic.Cyr@mio.osupytheas.fr>
%    Joan Pau Beltran  <joanpau.beltran@socib.cat>

%  Copyright (C) 2016
%  ICTS SOCIB - Servei d'observacio i prediccio costaner de les Illes Balears
%  <http://www.socib.es>
%
%  This program is free software: you can redistribute it and/or modify
%  it under the terms of the GNU General Public License as published by
%  the Free Software Foundation, either version 3 of the License, or
%  (at your option) any later version.
%
%  This program is distributed in the hope that it will be useful,
%  but WITHOUT ANY WARRANTY; without even the implied warranty of
%  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%  GNU General Public License for more details.
%
%  You should have received a copy of the GNU General Public License
%  along with this program.  If not, see <http://www.gnu.org/licenses/>.

  error(nargchk(4, 14, nargin, 'struct'));


  %% Set options and default values.
  options.format = 'array';
  options.timegli = 'Timestamp';
  options.timepld = 'PLD_REALTIMECLOCK';
  options.variables = 'all';
  options.period = 'all';


  %% Parse optional arguments.
  % Get option key-value pairs in any accepted call signature.
  argopts = varargin;
  if isscalar(argopts) && isstruct(argopts{1})
    % Options passed as a single option struct argument:
    % field names are option keys and field values are option values.
    opt_key_list = fieldnames(argopts{1});
    opt_val_list = struct2cell(argopts{1});
  elseif mod(numel(argopts), 2) == 0
    % Options passed as key-value argument pairs.
    opt_key_list = argopts(1:2:end);
    opt_val_list = argopts(2:2:end);
  else
    error('glider_toolbox:sxmerge:InvalidOptions', ...
          'Invalid optional arguments (neither key-value pairs nor struct).');
  end
  % Overwrite default options with values given in extra arguments.
  for opt_idx = 1:numel(opt_key_list)
    opt = lower(opt_key_list{opt_idx});
    val = opt_val_list{opt_idx};
    if isfield(options, opt)
      options.(opt) = val;
    else
      error('glider_toolbox:sxmerge:InvalidOption', ...
            'Invalid option: %s.', opt);
    end
  end


  %% Set option flags and values.
  output_format = lower(options.format);
  time_variable_gli = options.timegli;
  time_variable_pld = options.timepld;
  variable_filtering = true;
  variable_list = cellstr(options.variables);
  time_filtering = true;
  time_range = options.period;
  if ischar(options.variables) && strcmp(options.variables, 'all')
    variable_filtering = false;
  end
  if ischar(options.period) && strcmp(options.period, 'all')
    time_filtering = false;
  end


  %% Merge data and metadata checking for empty input cases.
  if isempty(meta_gli.sources) && isempty(meta_pld.sources)
    % No input data.
    % Both META_GLI and DATA_GLI, and META_PLD and DATA_PLD
    % are equal to the trivial output of SXCAT.
    % Disable filtering.
    meta = meta_gli;
    data = data_gli;
    variable_filtering = false;
    time_filtering = false;
  elseif isempty(meta_pld.sources)
    % Only glider data.
    meta = meta_gli;
    data = data_gli;
    time_variable_merged = time_variable_gli; % Time variable for filtering.
  elseif isempty(meta_gli.sources)
    % Only payload data.
    meta = meta_pld;
    data = data_pld;
    time_variable_merged = time_variable_pld; % Time variable for filtering.
  else
    % Build list of sources and variables for merged data and metadata.
    sources_gli = meta_gli.sources;
    sources_pld = meta_pld.sources;
    sources_merged = vertcat(sources_gli, sources_pld);
    variables_gli = meta_gli.variables;
    variables_pld = meta_pld.variables;
    [variables_merged, ~, variables_merged_indices_to] = ...
      unique(vertcat(variables_gli, variables_pld));

    % Check that both data sets have their own timestamp variable.
    [time_variable_gli_present, time_variable_gli_col] = ...
      ismember(time_variable_gli, variables_gli);
    if ~time_variable_gli_present
      error('glider_toolbox:sxmerge:MissingTimestamp', ...
            'Missing timestamp variable in glider data set: %s.', ...
            time_variable_gli);
    end
    [time_variable_pld_present, time_variable_pld_col] = ...
      ismember(time_variable_pld, variables_pld);
    if ~time_variable_pld_present
      error('glider_toolbox:sxmerge:MissingTimestamp', ...
            'Missing timestamp variable in payload data set: %s.', ...
            time_variable_pld);
    end

    % Build list of unique timestamps and the output index of each data row.
    stamp_gli = data_gli(:, time_variable_gli_col);
    stamp_pld = data_pld(:, time_variable_pld_col);
    [stamp_merged, ~, stamp_merged_indices_to] = ...
      unique(vertcat(stamp_gli, stamp_pld));

    % Build indices of glider and payload entries in merged data output.
    row_num_gli = numel(stamp_gli);
    row_range_gli = 1:row_num_gli;
    row_indices_gli = stamp_merged_indices_to(row_range_gli);
    row_num_pld = numel(stamp_pld);
    row_range_pld = row_num_gli + (1:row_num_pld);
    row_indices_pld = stamp_merged_indices_to(row_range_pld);
    row_num_merged = numel(stamp_merged);
    col_num_gli = numel(variables_gli);
    col_range_gli = 1:col_num_gli;
    col_indices_gli = variables_merged_indices_to(col_range_gli);
    col_num_pld = numel(variables_pld);
    col_range_pld = col_num_gli + (1:col_num_pld);
    col_indices_pld = variables_merged_indices_to(col_range_pld);
    col_num_merged = numel(variables_merged);

    % Check for consistency of overlapped glider and payload data.
    [row_overlap_merged, row_overlap_gli, row_overlap_pld] = ...
      intersect(row_indices_gli, row_indices_pld);
    [col_overlap_merged, col_overlap_gli, col_overlap_pld] = ...
      intersect(col_indices_gli, col_indices_pld);
    data_overlap_gli = data_gli(row_overlap_gli, col_overlap_gli);
    data_overlap_pld = data_pld(row_overlap_pld, col_overlap_pld);
    data_overlap_gli_valid = ~isnan(data_overlap_gli);
    data_overlap_pld_valid = ~isnan(data_overlap_pld);
    data_overlap_inconsistent = (data_overlap_gli ~= data_overlap_pld) ...
                              & data_overlap_gli_valid ...
                              & data_overlap_pld_valid;
    if any(data_overlap_inconsistent(:))
      [row_inconsistent, col_inconsistent] = find(data_overlap_inconsistent);
      err_msg_arg_list = cell(4, numel(row_inconsistent));
      err_msg_arg_list(1, :) = ...
        variables_merged(col_overlap_merged(col_inconsistent));
      err_msg_arg_list(2, :) = cellstr( ...
        datestr(posixtime2utc(stamp_merged(row_overlap_merged(row_inconsistent))), ...
                'dd/mm/yyyy HH:MM:SS.FFF'));
      err_msg_arg_list(3, :) = ...
        num2cell(data_overlap_gli(data_overlap_inconsistent));
      err_msg_arg_list(4, :) = ...
        num2cell(data_overlap_pld(data_overlap_inconsistent));
      err_msg_fmt = '\nInconsistent glider and payload value of %s at %s: %12f %12f';
      error('glider_toolbox:sxmerge:InconsistentData', ...
            'Inconsistent data:%s', sprintf(err_msg_fmt, err_msg_arg_list{:}));
    end

    % Set output merged data.
    data = nan(row_num_merged, col_num_merged);
    data(row_indices_gli, col_indices_gli) = data_gli;
    data(row_indices_pld, col_indices_pld) = data_pld;
    data_overlap_merged = data_overlap_gli;
    data_overlap_merged(data_overlap_pld_valid) = ...
      data_overlap_pld(data_overlap_pld_valid);
    data(row_overlap_merged, col_overlap_merged) = data_overlap_merged;

    % Copy payload timestamp entries to glider timestamp entries.
    data(row_indices_pld, col_indices_gli(time_variable_gli_col)) = stamp_pld;
    time_variable_merged = time_variable_gli;

    % Set metadata fields.
    meta.sources = sources_merged;
    meta.variables = variables_merged;
  end


  %% Perform time filtering if needed.
  if time_filtering
    [time_variable_merged_present, time_variable_merged_col] = ...
      ismember(time_variable_merged, meta.variables);
    if ~time_variable_merged_present
      error('glider_toolbox:sxmerge:MissingTimestamp', ...
            'Missing timestamp variable in merged data set: %s.', ...
            time_variable_merged);
    end
    stamp_merged = data(:, time_variable_merged_col);
    stamp_select = ...
      ~(stamp_merged < time_range(1) | stamp_merged > time_range(2));
    data = data(stamp_select, :);
  end


  %% Perform variable filtering if needed.
  if variable_filtering
    [variable_select, ~] = ismember(meta.variables, variable_list);
    meta.variables = meta.variables(variable_select);
    data = data(:, variable_select);
  end


  %% Convert output data to struct format if needed.
  switch output_format
    case 'array'
      % Nothing to do: data is already a numeric array.
    case 'struct'
      data = cell2struct(num2cell(data, 1), meta.variables, 2);
    otherwise
      error('glider_toolbox:sxmerge:InvalidFormat', ...
            'Invalid output format: %s.', output_format)
  end

end
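
The final conversion to 'struct' format can be tried in isolation. The sketch below applies the same CELL2STRUCT and NUM2CELL calls to a toy array; the variable names and values are hypothetical and stand in for a real merged data set.

```matlab
% Toy output in 'array' format (hypothetical names and values).
% In the merged case the variable list comes from UNIQUE, so it is sorted.
variables = {'Depth'; 'Timestamp'};
data = [1.5 10; 2.5 20; NaN 25];

% One cell per column, then one struct field per variable name.
data_struct = cell2struct(num2cell(data, 1), variables, 2);
disp(data_struct.Timestamp');
```

Each field of the resulting scalar struct is a column vector with one entry per data row, which matches the description of the 'struct' format in the help text.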