DBACAT  Combine data from several dba data sets into a single data set.

Syntax:
  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP)
  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS)
  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)

Description:
  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP) combines data from
  arrays in cell array DATA_LIST and metadata from structs in cell array
  META_LIST into a single data set with data in array DATA and metadata in
  struct array META. Elements in META_LIST and DATA_LIST should have the
  format returned by function DBA2MAT, but they do not need to have the same
  set of sensors. Outputs META and DATA have the same format, too.

  META is a struct array combining the information in elements of META_LIST.
  It has the following fields:
    HEADERS: struct array built concatenating the HEADERS field of all
      elements in META_LIST.
    SENSORS: string cell array with the names of the sensors present in the
      returned data array (in the same column order), built merging
      the SENSORS field of all elements in META_LIST.
    UNITS: string cell array with the units of the sensors present in the
      returned data array (in the same column order), built merging
      the UNITS field of all elements in META_LIST.
    BYTES: array with the number of bytes of each sensor present in the
      returned data array (in the same column order), built merging
      the BYTES field of all elements in META_LIST.
    SOURCES: string cell array built concatenating the SOURCES field
      of all elements in META_LIST.

  DATA is a numeric array combining the rows of arrays in DATA_LIST,
  reordering the sensor columns if needed, and sorting the resulting rows
  according to a timestamp from the sensor named by string TIMESTAMP.

  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS) and
  [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)
  accept the following options, given in key-value pairs OPT1, VAL1...
  or in a struct OPTIONS with field names as option keys and field values
  as option values:
    FORMAT: data output format.
      String setting the format of the output DATA. Valid values are:
        'array': DATA is a matrix with sensor readings in the column order
          specified by the SENSORS metadata field.
        'struct': DATA is a struct with sensor names as field names
          and column vectors of sensor readings as field values.
      Default value: 'array'
    SENSORS: sensor filtering list.
      String cell array with the names of the sensors of interest.
      If given, only sensors present in both the input data sets and this
      list will be present in output. The string 'all' may also be given,
      in which case sensor filtering is not performed and all sensors
      in the input list will be present in output.
      Default value: 'all' (do not perform sensor filtering).
    PERIOD: time filtering boundaries.
      Two element numeric array with the start and the end of the period
      of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
      only sensor cycles with timestamps within this period will be present
      in output. The string 'all' may also be given, in which case time
      filtering is not performed and all sensor cycles in the input list
      will be present in output.
      Default value: 'all' (do not perform time filtering).

Notes:
  This function should be used to combine data from several navigation files,
  or from several science files, but not from both navigation and science
  files (use DBAMERGE instead).

  Since sensor cycles (data rows) with the same timestamp may be present
  in several data sets (e.g. when combining data from sbd and mbd files),
  the function checks that data in those sensor cycles are consistent.
  If the same sensor is present in sensor cycles from different data sets
  with the same timestamp and different valid values (not NaN), an error is
  thrown. Otherwise the values are merged into a single sensor cycle.

  However, note that sensor cycles with the same timestamp in the same data
  set are not merged, and the values in the latest sensor cycle will be used.
  This may be relevant when binary data files are converted to ascii format
  with the option -o (the initial sensor cycle values may be omitted if the
  timestamp in the following sensor cycle is the same).

  All values in the timestamp columns should be valid (not NaN).

Examples:
  [meta, data] = dbacat(meta_list, data_list, timestamp)

See also:
  XBD2DBA
  DBA2MAT
  DBAMERGE

Authors:
  Joan Pau Beltran  <joanpau.beltran@socib.cat>
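The merging rule described in the notes can be illustrated on two tiny synthetic data sets without calling DBACAT itself. This is only a minimal sketch under stated assumptions: the variable names and values are made up, a single sensor column is used, and, unlike DBACAT, the loop below also rejects conflicting values that come from the same data set.

```matlab
% Two hypothetical data sets with columns [timestamp, depth].
% Both contain a sensor cycle with timestamp 20.
data_a = [10 1.5; 20 NaN];
data_b = [20 2.5; 30 3.5];

% Sorted union of timestamps; idx maps every input row to its output row.
stamps = [data_a(:, 1); data_b(:, 1)];
[stamp_merged, ~, idx] = unique(stamps);

% Fill the merged column, rejecting conflicting valid (not NaN) values.
values = [data_a(:, 2); data_b(:, 2)];
depth_merged = nan(size(stamp_merged));
for k = 1:numel(values)
  old = depth_merged(idx(k));
  new = values(k);
  if ~isnan(old) && ~isnan(new) && old ~= new
    error('example:InconsistentData', ...
          'Inconsistent value at %f: %f %f', stamp_merged(idx(k)), old, new);
  end
  if ~isnan(new)
    depth_merged(idx(k)) = new;
  end
end
merged = [stamp_merged, depth_merged];
```

Here the NaN at timestamp 20 in the first data set is filled with the valid value from the second one. Changing that NaN to a valid value other than 2.5 would trigger the error instead.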
function [meta, data] = dbacat(meta_list, data_list, timestamp, varargin)
%DBACAT  Combine data from several dba data sets into a single data set.
%
%  Syntax:
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP)
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS)
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)
%
%  Description:
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP) combines data from
%    arrays in cell array DATA_LIST and metadata from structs in cell array
%    META_LIST into a single data set with data in array DATA and metadata in
%    struct array META. Elements in META_LIST and DATA_LIST should have the
%    format returned by function DBA2MAT, but they do not need to have the same
%    set of sensors. Outputs META and DATA have the same format, too.
%    META is a struct array combining the information in elements of META_LIST.
%    It has the following fields:
%      HEADERS: struct array built concatenating the HEADERS field of all
%        elements in META_LIST.
%      SENSORS: string cell array with the names of the sensors present in the
%        returned data array (in the same column order), built merging
%        the SENSORS field of all elements in META_LIST.
%      UNITS: string cell array with the units of the sensors present in the
%        returned data array (in the same column order), built merging
%        the UNITS field of all elements in META_LIST.
%      BYTES: array with the number of bytes of each sensor present in the
%        returned data array (in the same column order), built merging
%        the BYTES field of all elements in META_LIST.
%      SOURCES: string cell array built concatenating the SOURCES field
%        of all elements in META_LIST.
%    DATA is a numeric array combining the rows of arrays in DATA_LIST,
%    reordering the sensor columns if needed, and sorting the resulting rows
%    according to a timestamp from the sensor named by string TIMESTAMP.
%
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS) and
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)
%    accept the following options, given in key-value pairs OPT1, VAL1...
%    or in a struct OPTIONS with field names as option keys and field values
%    as option values:
%      FORMAT: data output format.
%        String setting the format of the output DATA. Valid values are:
%          'array': DATA is a matrix with sensor readings in the column order
%            specified by the SENSORS metadata field.
%          'struct': DATA is a struct with sensor names as field names
%            and column vectors of sensor readings as field values.
%        Default value: 'array'
%      SENSORS: sensor filtering list.
%        String cell array with the names of the sensors of interest.
%        If given, only sensors present in both the input data sets and this
%        list will be present in output. The string 'all' may also be given,
%        in which case sensor filtering is not performed and all sensors
%        in the input list will be present in output.
%        Default value: 'all' (do not perform sensor filtering).
%      PERIOD: time filtering boundaries.
%        Two element numeric array with the start and the end of the period
%        of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
%        only sensor cycles with timestamps within this period will be present
%        in output. The string 'all' may also be given, in which case time
%        filtering is not performed and all sensor cycles in the input list
%        will be present in output.
%        Default value: 'all' (do not perform time filtering).
%
%  Notes:
%    This function should be used to combine data from several navigation files,
%    or from several science files, but not from both navigation and science
%    files (use DBAMERGE instead).
%
%    Since sensor cycles (data rows) with the same timestamp may be present
%    in several data sets (e.g. when combining data from sbd and mbd files),
%    the function checks that data in those sensor cycles are consistent.
%    If the same sensor is present in sensor cycles from different data sets
%    with the same timestamp and different valid values (not NaN), an error is
%    thrown. Otherwise the values are merged into a single sensor cycle.
%
%    However, note that sensor cycles with the same timestamp in the same data
%    set are not merged, and the values in the latest sensor cycle will be used.
%    This may be relevant when binary data files are converted to ascii format
%    with the option -o (the initial sensor cycle values may be omitted if the
%    timestamp in the following sensor cycle is the same).
%
%    All values in the timestamp columns should be valid (not NaN).
%
%  Examples:
%    [meta, data] = dbacat(meta_list, data_list, timestamp)
%
%  See also:
%    XBD2DBA
%    DBA2MAT
%    DBAMERGE
%
%  Authors:
%    Joan Pau Beltran  <joanpau.beltran@socib.cat>

%  Copyright (C) 2013-2016
%  ICTS SOCIB - Servei d'observacio i prediccio costaner de les Illes Balears
%  <http://www.socib.es>
%
%  This program is free software: you can redistribute it and/or modify
%  it under the terms of the GNU General Public License as published by
%  the Free Software Foundation, either version 3 of the License, or
%  (at your option) any later version.
%
%  This program is distributed in the hope that it will be useful,
%  but WITHOUT ANY WARRANTY; without even the implied warranty of
%  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%  GNU General Public License for more details.
%
%  You should have received a copy of the GNU General Public License
%  along with this program.  If not, see <http://www.gnu.org/licenses/>.

  error(nargchk(3, 9, nargin, 'struct'));


  %% Set options and default values.
  options.format = 'array';
  options.sensors = 'all';
  options.period = 'all';


  %% Parse optional arguments.
  % Get option key-value pairs in any accepted call signature.
  argopts = varargin;
  if isscalar(argopts) && isstruct(argopts{1})
    % Options passed as a single option struct argument:
    % field names are option keys and field values are option values.
    opt_key_list = fieldnames(argopts{1});
    opt_val_list = struct2cell(argopts{1});
  elseif mod(numel(argopts), 2) == 0
    % Options passed as key-value argument pairs.
    opt_key_list = argopts(1:2:end);
    opt_val_list = argopts(2:2:end);
  else
    error('glider_toolbox:dbacat:InvalidOptions', ...
          'Invalid optional arguments (neither key-value pairs nor struct).');
  end
  % Overwrite default options with values given in extra arguments.
  for opt_idx = 1:numel(opt_key_list)
    opt = lower(opt_key_list{opt_idx});
    val = opt_val_list{opt_idx};
    if isfield(options, opt)
      options.(opt) = val;
    else
      error('glider_toolbox:dbacat:InvalidOption', ...
            'Invalid option: %s.', opt);
    end
  end


  %% Set option flags and values.
  output_format = lower(options.format);
  sensor_filtering = true;
  sensor_list = cellstr(options.sensors);
  time_filtering = true;
  time_range = options.period;
  if ischar(options.sensors) && strcmp(options.sensors, 'all')
    sensor_filtering = false;
  end
  if ischar(options.period) && strcmp(options.period, 'all')
    time_filtering = false;
  end


  %% Cat data and metadata checking for trivial empty input.
  % Check for trivial empty input.
  if isempty(meta_list)
    sources_cat = cell(0, 1);
    headers_cat = ...
      struct('dbd_label', cell(0, 1), 'encoding_ver', cell(0, 1), ...
             'num_ascii_tags', cell(0, 1), 'all_sensors', cell(0, 1), ...
             'filename', cell(0, 1), 'the8x3_filename', cell(0, 1), ...
             'filename_extension', cell(0, 1), 'filename_label', cell(0, 1), ...
             'mission_name', cell(0, 1), 'fileopen_time', cell(0, 1), ...
             'sensors_per_cycle', cell(0, 1), 'num_label_lines', cell(0, 1), ...
             'num_segments', cell(0, 1), 'segment_filenames', cell(0, 1));
    sensors_cat_list = cell(0, 1);
    units_cat_list = cell(0, 1);
    bytes_cat_list = cell(0, 1);
  else
    meta_struct = [meta_list{:}];
    sources_cat = vertcat(meta_struct.sources);
    headers_cat = vertcat(meta_struct.headers);
    sensors_cat_list = {meta_struct.sensors}';
    units_cat_list = {meta_struct.units}';
    bytes_cat_list = {meta_struct.bytes}';
  end

  % Build list of sensor information for concatenated data and metadata.
  [~, ~, sensors_cat_indices_to] = unique(vertcat(sensors_cat_list{:}));
  sensors_cat = cell(0, 1);
  units_cat = cell(0, 1);
  bytes_cat = zeros(0, 1, 'int32');
  sensors_cat(sensors_cat_indices_to) = vertcat(sensors_cat_list{:});
  units_cat(sensors_cat_indices_to) = vertcat(units_cat_list{:});
  bytes_cat(sensors_cat_indices_to) = vertcat(bytes_cat_list{:});

  % Build sorted list of unique timestamps and map each input row to it.
  stamp_cat_list = cellfun(@(d, m) d(:, strcmp(timestamp, m.sensors)), ...
                           data_list(:), meta_list(:), 'UniformOutput', false);
  [~, ~, stamp_cat_indices_to] = unique(vertcat(stamp_cat_list{:}));
  stamp_cat = zeros(0, 1);
  stamp_cat(stamp_cat_indices_to) = vertcat(stamp_cat_list{:});

  % Build list of indices of input data entries in concatenated data output.
  row_num_total = numel(stamp_cat);
  row_num_list = cellfun(@numel, stamp_cat_list(:));
  row_end_list = cumsum(row_num_list);
  row_start_list = 1 + [0; row_end_list(1:end-1)];
  col_num_total = numel(sensors_cat);
  col_num_list = cellfun(@numel, sensors_cat_list(:));
  col_end_list = cumsum(col_num_list);
  col_start_list = 1 + [0; col_end_list(1:end-1)];

  % Set output concatenated data checking for consistency of overlapped data.
  data = nan(row_num_total, col_num_total);
  for data_idx = 1:numel(data_list)
    row_range = row_start_list(data_idx):row_end_list(data_idx);
    row_indices = stamp_cat_indices_to(row_range);
    col_range = col_start_list(data_idx):col_end_list(data_idx);
    col_indices = sensors_cat_indices_to(col_range);
    data_old = data(row_indices, col_indices);
    data_new = data_list{data_idx};
    data_old_valid = ~isnan(data_old);
    data_new_valid = ~isnan(data_new);
    data_inconsistent = ...
      (data_old ~= data_new) & data_old_valid & data_new_valid;
    if any(data_inconsistent(:))
      [row_inconsistent, col_inconsistent] = find(data_inconsistent);
      err_msg_arg_list = cell(4, numel(row_inconsistent));
      err_msg_arg_list(1, :) = sensors_cat(col_indices(col_inconsistent));
      err_msg_arg_list(2, :) = num2cell(stamp_cat(row_indices(row_inconsistent)));
      err_msg_arg_list(3, :) = num2cell(data_old(data_inconsistent));
      err_msg_arg_list(4, :) = num2cell(data_new(data_inconsistent));
      err_msg_fmt = '\nInconsistent value of %s at %12f: %12f %12f';
      error('glider_toolbox:dbacat:InconsistentData', ...
            'Inconsistent data:%s', sprintf(err_msg_fmt, err_msg_arg_list{:}));
    end
    data_old(data_new_valid) = data_new(data_new_valid);
    data(row_indices, col_indices) = data_old;
  end

  % Set metadata fields.
  meta.sources = sources_cat;
  meta.headers = headers_cat;
  meta.sensors = sensors_cat;
  meta.units = units_cat;
  meta.bytes = bytes_cat;


  %% Perform time filtering if needed.
  if time_filtering
    stamp_select = ~(stamp_cat < time_range(1) | stamp_cat > time_range(2));
    data = data(stamp_select, :);
  end


  %% Perform sensor filtering if needed.
  if sensor_filtering
    [sensor_select, ~] = ismember(meta.sensors, sensor_list);
    meta.sensors = meta.sensors(sensor_select);
    meta.units = meta.units(sensor_select);
    meta.bytes = meta.bytes(sensor_select);
    data = data(:, sensor_select);
  end


  %% Convert output data to struct format if needed.
  switch output_format
    case 'array'
    case 'struct'
      data = cell2struct(num2cell(data, 1), meta.sensors, 2);
    otherwise
      error('glider_toolbox:dbacat:InvalidFormat', ...
            'Invalid output format: %s.', output_format);
  end

end
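The last two stages of the function (time filtering and conversion to the 'struct' output format) can be tried in isolation on a small matrix. The snippet below is a hedged sketch that reuses the same expressions as the listing above. The sensor names, data values and period are illustrative assumptions, not output of a real dba file.

```matlab
% Hypothetical merged data: one column per sensor, one row per sensor cycle.
sensors = {'m_present_time'; 'm_depth'};
data = [10 1.5; 20 2.5; 30 3.5];

% Time filtering: keep cycles whose timestamp lies in the closed period.
period = [15 30];
stamp = data(:, strcmp('m_present_time', sensors));
keep = ~(stamp < period(1) | stamp > period(2));
data = data(keep, :);

% 'struct' format: one field per sensor, each holding a column vector.
data_struct = cell2struct(num2cell(data, 1), sensors, 2);
```

Both period boundaries are inclusive, so the cycle at timestamp 30 is kept while the one at timestamp 10 is dropped. The resulting struct has fields m_present_time and m_depth with two values each.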