SGLOG2MAT  Load data and metadata from a Seaglider log file.

  Syntax:
    [META, DATA] = SGLOG2MAT(FILENAME)
    [META, DATA] = SGLOG2MAT(FILENAME, OPTIONS)
    [META, DATA] = SGLOG2MAT(FILENAME, OPT1, VAL1, ...)

  Description:
    [META, DATA] = SGLOG2MAT(FILENAME, VARARGIN) reads the Seaglider log file
    named by string FILENAME, loading its metadata in struct META and its data
    in struct DATA.

    [META, DATA] = SGLOG2MAT(FILENAME, OPTIONS) and
    [META, DATA] = SGLOG2MAT(FILENAME, OPT1, VAL1, ...) accept the following
    options given in key-value pairs OPT1, VAL1... or in a struct OPTIONS
    with field names as option keys and field values as option values:
      FORMAT: data output format.
        String setting the format of the output DATA. Valid values are:
          'array': DATA is a struct with a scalar field for each scalar
            parameter and an array or cell array field for each non-scalar
            parameter. Values of non-scalar parameters are in the column order
            given by the corresponding field of the PARAMS metadata field.
          'merged': DATA is a struct with a scalar or column vector field for
            each scalar parameter or entry of a non-scalar parameter.
            For scalar parameters, the field is named after the parameter,
            while for non-scalar parameters the field names are the parameter
            name and its field names, separated by underscore.
          'struct': DATA is a struct with a scalar field for each scalar
            parameter and a struct array for each non-scalar parameter.
            The fields of the non-scalar parameters are given by the
            corresponding field of the PARAMS metadata field.
        Default value: 'array'
      PARAMS: parameter filtering list.
        String cell array with the names of the parameters of interest.
        If given, only parameters present in both the input file and this list
        will be present in output. For non-scalar parameters, giving the name
        of the identifier as it appears in the log line selects all of its
        fields. Individual parameter fields are selected with the identifier
        and the name of the field separated by underscore
        (e.g. 'GC_st_secs'). The string 'all' may also be given, in which case
        parameter filtering is not performed and all parameters in input file
        will be present in output.
        Default value: 'all' (do not perform parameter filtering).

    META has the following fields based on the tags of the header and the
    content of some metaparameters:
      HEADERS: a struct with the initial tags in the log file:
        VERSION: string with the version tag in log header.
        GLIDER : string with the glider id tag in log header.
        MISSION: mission number tag in log header.
        DIVE   : dive number tag in log header.
        START  : start date and time tag in log header (month, day of month,
          year after 1900, hour, minute and second).
      START_SECS: dive start time from header tag in POSIX time
        (seconds since 1970 January 01 00:00:00 UTC).
      PARAMS: struct with the names of the fields of non-scalar parameters.
        There is one field named after each parameter in data output, whose
        value is a string cell array with the names for the parameter fields,
        empty for scalar parameters. See note below for some exceptions.
      GCHEAD: string cell array with the names of the fields for the GC lines
        (in the same column order as in the output data).
      DEVICES: string cell array with the names of the device fields
        for device lines (in the same column order as in the output data).
      SENSORS: string cell array with the names of the sensor fields
        for sensor lines (in the same column order as in the output data).
      SOURCES: string cell array containing FILENAME.

  Notes:
    This parsing is based on the information about the log files provided by
    the Seaglider User's Guide, the Seaglider File Formats Manual and the
    Parameter Reference Manual. The information there is not complete, so the
    final result might be suboptimal.

    The parsing is done according to the following general rules:
      - The log data is preceded by a header identifying the dive.
      - Each data line provides a parameter and is of the form:
          $PARAM,VAL1,VAL2,...\n
      - Some parameter names are not valid as identifiers and are renamed
        (e.g. '_CALLS' to 'CALLS' and '24V_AH' to 'x24V_AH').
      - GPS parameters GPS1, GPS2 and GPS are gathered together
        in the field GPSFIX.
      - Some parameters are scalar (single valued).
        Others are multi-valued and each value is treated as a parameter field.
      - Field names have been chosen to reflect their meaning according to
        the documentation cited above.
      - If field names are not known, they are generated sequentially as
        'fieldXX' where XX is the positional index of the corresponding value.
      - If field names are known, any missing values are treated as empty
        strings ('') or invalid number (NaN).
      - All parameter values are numeric except STATE, TGT_NAME, and the GPS
        date and time fields, which are strings.
      - There are some metaparameters that set the fields of other parameters:
          DEVICES line sets the fields of DEVICE_SECS and DEVICE_MAMPS.
          SENSORS line sets the fields of SENSOR_SECS and SENSOR_MAMPS.
          GCHEAD line sets the fields of GC.

  Examples:
    [meta, data] = sglog2mat(filename)
    [meta, data] = sglog2mat(filename, 'format', 'struct')
    options = struct( ...
      'params', {{'GPSFIX_ddmmyy' 'GPSFIX_hhmmss' ...
                  'GPSFIX_fixlat' 'GPSFIX_fixlon' 'CURRENT'}}, ...
      'format', {'merged'});
    [meta, data] = sglog2mat(filename, options)

  See also:
    SGENG2MAT
    SGLOGCAT
    SGENGCAT
    SGENGLOGMERGE

  Authors:
    Joan Pau Beltran  <joanpau.beltran@socib.cat>

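The parsing rules listed in the notes can be sketched for a single data line. This is only an illustration under stated assumptions, not the function's actual code path: the sample line and its values are made up, and STR2DOUBLE is used for the numeric conversion for simplicity. The parameter name and its two known field names are taken from the source listing.

```matlab
% Hypothetical log line (values made up for illustration).
logline = '$KALMAN_CONTROL,0.25,-0.1,7';
known_members = {'spd_east' 'spd_nrth'};  % known field names of the parameter

% Split the line into the parameter name and its values.
parts = regexp(logline, ',', 'split');
param = parts{1}(2:end);                  % drop the leading '$'
value_list = str2double(parts(2:end));    % non-numeric text would become NaN

% Pad missing values with NaN, or name extra values by their position.
member_list = known_members;
num_mems = numel(member_list);
num_vals = numel(value_list);
if num_vals < num_mems
  value_list(num_vals+1:num_mems) = nan;
else
  for idx = num_mems+1:num_vals
    member_list{idx} = sprintf('field%02d', idx);
  end
end

% Build a record similar to one element of the 'struct' output format.
record = cell2struct(num2cell(value_list), member_list, 2);
```

Here the third value has no known field name, so it ends up in a field named by its position (field03), as described in the rules above.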
function [meta, data] = sglog2mat(filename, varargin)
%SGLOG2MAT  Load data and metadata from a Seaglider log file.
%
%  Syntax:
%    [META, DATA] = SGLOG2MAT(FILENAME)
%    [META, DATA] = SGLOG2MAT(FILENAME, OPTIONS)
%    [META, DATA] = SGLOG2MAT(FILENAME, OPT1, VAL1, ...)
%
%  Description:
%    [META, DATA] = SGLOG2MAT(FILENAME, VARARGIN) reads the Seaglider log file
%    named by string FILENAME, loading its metadata in struct META and its data
%    in struct DATA.
%
%    [META, DATA] = SGLOG2MAT(FILENAME, OPTIONS) and
%    [META, DATA] = SGLOG2MAT(FILENAME, OPT1, VAL1, ...) accept the following
%    options given in key-value pairs OPT1, VAL1... or in a struct OPTIONS
%    with field names as option keys and field values as option values:
%      FORMAT: data output format.
%        String setting the format of the output DATA. Valid values are:
%          'array': DATA is a struct with a scalar field for each scalar
%            parameter and an array or cell array field for each non-scalar
%            parameter. Values of non-scalar parameters are in the column order
%            given by the corresponding field of the PARAMS metadata field.
%          'merged': DATA is a struct with a scalar or column vector field for
%            each scalar parameter or entry of a non-scalar parameter.
%            For scalar parameters, the field is named after the parameter,
%            while for non-scalar parameters the field names are the parameter
%            name and its field names, separated by underscore.
%          'struct': DATA is a struct with a scalar field for each scalar
%            parameter and a struct array for each non-scalar parameter.
%            The fields of the non-scalar parameters are given by the
%            corresponding field of the PARAMS metadata field.
%        Default value: 'array'
%      PARAMS: parameter filtering list.
%        String cell array with the names of the parameters of interest.
%        If given, only parameters present in both the input file and this list
%        will be present in output. For non-scalar parameters, giving the name
%        of the identifier as it appears in the log line selects all of its
%        fields. Individual parameter fields are selected with the identifier
%        and the name of the field separated by underscore
%        (e.g. 'GC_st_secs'). The string 'all' may also be given, in which case
%        parameter filtering is not performed and all parameters in input file
%        will be present in output.
%        Default value: 'all' (do not perform parameter filtering).
%
%    META has the following fields based on the tags of the header and the
%    content of some metaparameters:
%      HEADERS: a struct with the initial tags in the log file:
%        VERSION: string with the version tag in log header.
%        GLIDER : string with the glider id tag in log header.
%        MISSION: mission number tag in log header.
%        DIVE   : dive number tag in log header.
%        START  : start date and time tag in log header (month, day of month,
%          year after 1900, hour, minute and second).
%      START_SECS: dive start time from header tag in POSIX time
%        (seconds since 1970 January 01 00:00:00 UTC).
%      PARAMS: struct with the names of the fields of non-scalar parameters.
%        There is one field named after each parameter in data output, whose
%        value is a string cell array with the names for the parameter fields,
%        empty for scalar parameters. See note below for some exceptions.
%      GCHEAD: string cell array with the names of the fields for the GC lines
%        (in the same column order as in the output data).
%      DEVICES: string cell array with the names of the device fields
%        for device lines (in the same column order as in the output data).
%      SENSORS: string cell array with the names of the sensor fields
%        for sensor lines (in the same column order as in the output data).
%      SOURCES: string cell array containing FILENAME.
%
%  Notes:
%    This parsing is based on the information about the log files provided by
%    the Seaglider User's Guide, the Seaglider File Formats Manual and the
%    Parameter Reference Manual. The information there is not complete, so the
%    final result might be suboptimal.
%
%    The parsing is done according to the following general rules:
%      - The log data is preceded by a header identifying the dive.
%      - Each data line provides a parameter and is of the form:
%          $PARAM,VAL1,VAL2,...\n
%      - Some parameter names are not valid as identifiers and are renamed
%        (e.g. '_CALLS' to 'CALLS' and '24V_AH' to 'x24V_AH').
%      - GPS parameters GPS1, GPS2 and GPS are gathered together
%        in the field GPSFIX.
%      - Some parameters are scalar (single valued).
%        Others are multi-valued and each value is treated as a parameter field.
%      - Field names have been chosen to reflect their meaning according to
%        the documentation cited above.
%      - If field names are not known, they are generated sequentially as
%        'fieldXX' where XX is the positional index of the corresponding value.
%      - If field names are known, any missing values are treated as empty
%        strings ('') or invalid number (NaN).
%      - All parameter values are numeric except STATE, TGT_NAME, and the GPS
%        date and time fields, which are strings.
%      - There are some metaparameters that set the fields of other parameters:
%          DEVICES line sets the fields of DEVICE_SECS and DEVICE_MAMPS.
%          SENSORS line sets the fields of SENSOR_SECS and SENSOR_MAMPS.
%          GCHEAD line sets the fields of GC.
%
%  Examples:
%    [meta, data] = sglog2mat(filename)
%    [meta, data] = sglog2mat(filename, 'format', 'struct')
%    options = struct( ...
%      'params', {{'GPSFIX_ddmmyy' 'GPSFIX_hhmmss' ...
%                  'GPSFIX_fixlat' 'GPSFIX_fixlon' 'CURRENT'}}, ...
%      'format', {'merged'});
%    [meta, data] = sglog2mat(filename, options)
%
%  See also:
%    SGENG2MAT
%    SGLOGCAT
%    SGENGCAT
%    SGENGLOGMERGE
%
%  Authors:
%    Joan Pau Beltran  <joanpau.beltran@socib.cat>

%  Copyright (C) 2013-2016
%  ICTS SOCIB - Servei d'observacio i prediccio costaner de les Illes Balears
%  <http://www.socib.es>
%
%  This program is free software: you can redistribute it and/or modify
%  it under the terms of the GNU General Public License as published by
%  the Free Software Foundation, either version 3 of the License, or
%  (at your option) any later version.
%
%  This program is distributed in the hope that it will be useful,
%  but WITHOUT ANY WARRANTY; without even the implied warranty of
%  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%  GNU General Public License for more details.
%
%  You should have received a copy of the GNU General Public License
%  along with this program.  If not, see <http://www.gnu.org/licenses/>.

  error(nargchk(1, 5, nargin, 'struct'));


  %% Set options and default values.
  options.format = 'array';
  options.params = 'all';


  %% Parse optional arguments.
  % Get option key-value pairs in any accepted call signature.
  argopts = varargin;
  if isscalar(argopts) && isstruct(argopts{1})
    % Options passed as a single option struct argument:
    % field names are option keys and field values are option values.
    opt_key_list = fieldnames(argopts{1});
    opt_val_list = struct2cell(argopts{1});
  elseif mod(numel(argopts), 2) == 0
    % Options passed as key-value argument pairs.
    opt_key_list = argopts(1:2:end);
    opt_val_list = argopts(2:2:end);
  else
    error('glider_toolbox:sglog2mat:InvalidOptions', ...
          'Invalid optional arguments (neither key-value pairs nor struct).');
  end
  % Overwrite default options with values given in extra arguments.
  for opt_idx = 1:numel(opt_key_list)
    opt = lower(opt_key_list{opt_idx});
    val = opt_val_list{opt_idx};
    if isfield(options, opt)
      options.(opt) = val;
    else
      error('glider_toolbox:sglog2mat:InvalidOption', ...
            'Invalid option: %s.', opt);
    end
  end


  %% Set option flags and values.
  output_format = lower(options.format);
  all_params = false;
  if ischar(options.params) && strcmp(options.params, 'all')
    all_params = true;
  end
  param_list = cellstr(options.params);


  %% Open the file.
  [fid, fid_msg] = fopen(filename, 'r');
  if fid < 0
    error('glider_toolbox:sglog2mat:FileError', fid_msg);
  end


  %% Parse the file.
  try
    % Read header tags:
    header_map = { ...
      'version' 'version: %s\n'
      'glider'  'glider: %s\n'
      'mission' 'mission: %d\n'
      'dive'    'dive: %d\n'
      'start'   'start: %s\n' };
    header_fields = header_map(:,1);
    header_fmtstr = [header_map{:,2}];
    header_values = ...
      textscan(fid, header_fmtstr, 1, 'ReturnOnError', false, 'Delimiter', '');
    header_struct = {header_fields{:}; header_values{:}};
    header_struct = struct(header_struct{:});
    header_struct.start = sscanf(header_struct.start, '%d')';

    % Build metadata structure:
    %   - The filename (without base directory).
    %   - The log file header lines.
    %   - The dive start time as POSIX time.
    %   - Field names for composite parameters.
    %   - Non-constant metaparameters affecting other parameters.
    [~, name, ext] = fileparts(filename);
    meta.sources = {[name ext]};
    meta.headers = header_struct;
    meta.start_secs = ...
      utc2posixtime(datenum([header_struct.start(:, 3) + 1900 ...
                             header_struct.start(:, [1 2 4 5 6])]));
    meta.params = struct();
    meta.GCHEAD = cell(1,0);
    meta.DEVICES = cell(1,0);
    meta.SENSORS = cell(1,0);

    % Read log parameter data:
    % GPS parameters to store together and decompose:
    gps_field = 'GPSFIX';
    gps_param_list = {'GPS1' 'GPS2' 'GPS'};
    gps_member_list = {'ddmmyy' 'hhmmss' 'fixlat' 'fixlon' 'ttffix' 'hordop' 'ttafix' 'magvar'};

    % Parameters to rename (valid identifiers begin with a letter):
    rename_param_map = [ {
      '_CALLS'      'CALLS'
      '_XMS_NAKs'   'XMS_NAKs'
      '_XMS_TOUTs'  'XMS_TOUTs'
      '_SM_DEPTHo'  'SM_DEPTHo'
      '_SM_ANGLEo'  'SM_ANGLEo'
      '24V_AH'      'x24V_AH'
      '10V_AH'      'x10V_AH' };
      gps_param_list' repmat({gps_field}, size(gps_param_list))'
    ];
    rename_param_list = rename_param_map(:,1);
    rename_field_list = rename_param_map(:,2);

    % Multi-valued parameters to decompose:
    mulval_param_map = [ {
      'SPEED_LIMITS'        {'min_spd' 'max_spd'}
      'TGT_LATLONG'         {'tgt_lat' 'tgt_lon'}
      'KALMAN_CONTROL'      {'spd_east' 'spd_nrth'}
      'KALMAN_X'            {'cur_mean_east' 'cur_diur_east' 'cur_semi_east' 'gld_wspd_east' 'delta_x'}
      'KALMAN_Y'            {'cur_mean_nrth' 'cur_diur_nrth' 'cur_semi_nrth' 'gld_wspd_nrth' 'delta_y'}
      'MHEAD_RNG_PITCHd_Wd' {'mag_head' 'tgt_rnge' 'ptch_ang' 'vert_vel'}
      'GC'                  {} % from $GCHEAD log line.
      'FINISH'              {'dpth' 'dens'}
      'STATE'               {'st_secs' 'status' 'result'}
      'SM_CCo'              {'st_secs' 'pmp_secs' 'pmp_amps' 'pmp_rets' 'pmp_errs' 'pmp_cnts' 'pmp_ccss'}
      'ALTIM_BOTTOM_PING'   {'dpth' 'rnge'}
      '24V_AH'              {'volts_min' 'ampsh_tot'}
      '10V_AH'              {'volts_min' 'ampsh_tot'}
      'DEVICE_SECS'         {} % from $DEVICES log line.
      'DEVICE_MAMPS'        {} % from $DEVICES log line.
      'SENSOR_SECS'         {} % from $SENSORS line.
      'SENSOR_MAMPS'        {} % from $SENSORS line.
      'DATA_FILE_SIZE'      {'bytes' 'samples'}
      'CFSIZE'              {'bytes_total' 'bytes_free'}
      'ERRORS'              {'bufoverrun' 'interrupts' ...
                             'fopen_errs' 'fwrit_errs' 'fclos_errs' 'fopen_rets' 'fwrit_rets' 'fclos_rets' ...
                             'ptch_errs' 'roll_errs' 'vbd_errs' 'ptch_rets' 'roll_rets' 'vbd_rets' ...
                             'gps_mis' 'gps_pps'}
      'CURRENT'             {'cur_spd' 'cur_dir' 'cur_val'} }
      gps_param_list' repmat({gps_member_list}, size(gps_param_list))'
    ];
    mulval_param_list = mulval_param_map(:,1);
    mulval_member_list = mulval_param_map(:,2);

    % Non-numeric parameters:
    nonum_param_map = {
      'TGT_NAME'     @cellstr
      'GPS1'         @(v)([{''} v(1) num2cell(str2double(v(2:end)))])
      'GPS2'         @(v)([{''} v(1) num2cell(str2double(v(2:end)))])
      'GPS'          @(v)([v(1) v(2) num2cell(str2double(v(3:end)))])
      'STATE'        @(v)({str2double(v{1}) v{2:end}})
      'GCHEAD'       @cellstr
      'DEVICES'      @(v)(v(~strcmp(v, 'nil')))
      'SENSORS'      @(v)(v(~strcmp(v, 'nil')))
      'RECOV_CODE'   @cellstr
      'RESTART_TIME' @cellstr
    };
    nonum_param_list = nonum_param_map(:,1);
    nonum_funct_list = nonum_param_map(:,2);

    % Metadata parameters.
    meta_params_map = {
      'GCHEAD'  {'GC'}
      'DEVICES' {'DEVICE_SECS' 'DEVICE_MAMPS'}
      'SENSORS' {'SENSOR_SECS' 'SENSOR_MAMPS'}
    };
    meta_param_list = meta_params_map(:,1);
    meta_which_list = meta_params_map(:,2);

    % Read each line and parse its contents.
    textscan(fid, 'data:\n', 1, 'ReturnOnError', false);
    data = struct();
    while ~feof(fid)
      logline = fgetl(fid);
      parts = regexp(logline, ',', 'split');
      head = parts{1};
      rest = parts(2:end);
      if head(1) ~= '$'
        error('glider_toolbox:sglog2mat:BadDataLine', ...
              'Bad data line: %s.', logline);
      end
      param = head(2:end);
      % Do not use ISMEMBER, it is too slow (it is not designed for this use).
      i = find(strcmp(param, rename_param_list), 1);
      if i
        field = rename_field_list{i};
      else
        field = param;
      end
      i = find(strcmp(param, nonum_param_list), 1);
      if i
        value_list = nonum_funct_list{i}(rest);
      else
        % This is 2.5x faster than STR2DOUBLE provided that numbers are real.
        value_list = sscanf(sprintf(' %s', rest{:}), '%f')';
      end
      i = find(strcmp(param, mulval_param_list), 1);
      if i
        member_list = mulval_member_list{i};
      else
        member_list = cell(0,1);
      end
      i = find(strcmp(param, meta_param_list), 1);
      if i
        which_param_list = meta_which_list{i};
        [~, mulval_map_index] = ...
          intersect(mulval_param_list, which_param_list);
        mulval_member_list(mulval_map_index) = {genvarname(value_list)};
        meta.(field) = value_list;
      else
        num_mems = numel(member_list);
        num_vals = numel(value_list);
        if num_vals < num_mems
          if isnumeric(value_list)
            value_list(num_vals+1:num_mems) = nan;
          else
            value_list(num_vals+1:num_mems) = {''};
          end
        elseif num_vals > 1
          member_list(num_mems+1:num_vals) = ...
            cellstr(num2str((num_mems+1:num_vals)', 'field%02d'));
        end
        if all_params || any(strcmp(field, param_list))
          member_select = true;
        else
          % Avoid ISMEMBER and STRCAT, they decrease performance dramatically.
          % The alternative with CELLFUN is less efficient than the loop, too.
          %member_select = ismember(strcat(param, '_', member_list), param_list);
          member_select = false(size(member_list));
          for member_idx = 1:numel(member_list)
            member_select(member_idx) = ...
              any(strcmp([field '_' member_list{member_idx}], param_list));
          end
          member_list = member_list(member_select);
          value_list = value_list(member_select);
        end
        if any(member_select)
          if isfield(data, field)
            data.(field)(end+1, :) = value_list;
          else
            meta.params.(field) = member_list(:);
            data.(field) = value_list;
          end
        end
      end
    end
  catch exception
    % Close the file after a reading error.
    fclose(fid);
    rethrow(exception);
  end


  %% Close the file after successful reading.
  fclose(fid);


  %% Convert data to desired format:
  switch output_format
    case 'array'
    case 'merged'
      field_list = fieldnames(data);
      for field_idx = 1:numel(field_list)
        field = field_list{field_idx};
        member_list = meta.params.(field);
        value_list = data.(field);
        if ~isempty(member_list)
          for member_idx = 1:numel(member_list)
            member = member_list{member_idx};
            field_member = [field '_' member];
            value = value_list(:, member_idx);
            if iscell(value) && ~iscellstr(value)
              value = vertcat(value{:});
            end
            data.(field_member) = value;
          end
          data = rmfield(data, field);
        end
      end
    case 'struct'
      field_list = fieldnames(data);
      for field_idx = 1:numel(field_list)
        field = field_list{field_idx};
        member_list = meta.params.(field);
        value_list = data.(field);
        if ~isempty(member_list)
          if isnumeric(value_list)
            value_list = num2cell(value_list);
          end
          data.(field) = cell2struct(value_list, member_list, 2);
        end
      end
    otherwise
      error('glider_toolbox:sglog2mat:InvalidFormat', ...
            'Invalid output format: %s.', output_format)
  end

end
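
The relation between the three output formats can be sketched on a small synthetic example. This is an illustration only: the values are made up, D_TGT stands in for any scalar parameter, and the variable names (params, merged, current) are hypothetical. It assumes an 'array'-format result in which CURRENT appeared twice in the log, and derives the 'merged' and 'struct' representations in the same spirit as the conversion step at the end of the function.

```matlab
% Synthetic 'array'-format output (values made up for illustration).
data = struct('D_TGT', 12, 'CURRENT', [0.10 45 1; 0.12 50 1]);
params = struct('D_TGT', {{}}, ...
                'CURRENT', {{'cur_spd'; 'cur_dir'; 'cur_val'}});

% 'merged': one field per scalar parameter or per parameter member.
merged = struct();
names = fieldnames(data);
for n = 1:numel(names)
  name = names{n};
  members = params.(name);
  if isempty(members)
    merged.(name) = data.(name);
  else
    for m = 1:numel(members)
      merged.([name '_' members{m}]) = data.(name)(:, m);
    end
  end
end

% 'struct': one struct array element per occurrence of the parameter.
current = cell2struct(num2cell(data.CURRENT), params.CURRENT, 2);
```

With this input, merged has the fields D_TGT, CURRENT_cur_spd, CURRENT_cur_dir and CURRENT_cur_val, and current is a 2x1 struct array with the fields cur_spd, cur_dir and cur_val.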