dbacat

PURPOSE

DBACAT Combine data from several dba data sets into a single data set.

SYNOPSIS

function [meta, data] = dbacat(meta_list, data_list, timestamp, varargin)

DESCRIPTION

DBACAT  Combine data from several dba data sets into a single data set.

  Syntax:
    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP)
    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS)
    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)

  Description:
    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP) combines data from 
    arrays in cell array DATA_LIST and metadata from structs in cell array 
    META_LIST into a single data set with data in array DATA and metadata in 
    struct array META. Elements in META_LIST and DATA_LIST should have the
    format returned by function DBA2MAT, but they do not need to have the same
    set of sensors. Outputs META and DATA have the same format, too.
    META is a struct array combining the information in elements of META_LIST.
    It has the following fields:
      HEADERS: struct array built by concatenating the HEADERS field of all
        elements in META_LIST.
      SENSORS: string cell array with the names of the sensors present in the
        returned data array (in the same column order), built by merging
        the SENSORS field of all elements in META_LIST.
      UNITS: string cell array with the units of the sensors present in the
        returned data array (in the same column order), built by merging
        the UNITS field of all elements in META_LIST.
      BYTES: array with the number of bytes of each sensor present in the
        returned data array (in the same column order), built by merging
        the BYTES field of all elements in META_LIST.
      SOURCES: string cell array built by concatenating the SOURCES field
        of all elements in META_LIST.
    DATA is a numeric array combining the rows of the arrays in DATA_LIST,
    with the sensor columns reordered if needed and the resulting rows sorted
    by the timestamp of the sensor named by string TIMESTAMP.
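
    The layout of the merged array can be pictured with a small sketch. It
    does not call DBACAT; it only illustrates, for two hypothetical data sets
    with partly different sensors, how the union of sensor names and the union
    of timestamps define the output columns and rows. Sensor names and values
    are made up for illustration.

```matlab
% Two hypothetical data sets in the layout returned by DBA2MAT:
% one column per sensor, one row per sensor cycle.
sensors1 = {'m_present_time'; 'm_depth'};
data1 = [10 1.0; 20 2.0];
sensors2 = {'m_present_time'; 'm_pitch'};
data2 = [15 0.1; 30 0.3];

% Union of sensor names (sorted), with the output column of each input column.
[sensors, ~, col_to] = unique([sensors1; sensors2]);
cols1 = col_to(1:numel(sensors1));
cols2 = col_to(numel(sensors1)+1:end);

% Union of timestamps (sorted), with the output row of each input row.
[stamps, ~, row_to] = unique([data1(:, 1); data2(:, 1)]);
rows1 = row_to(1:size(data1, 1));
rows2 = row_to(size(data1, 1)+1:end);

% Fill the output; readings that a data set does not provide stay NaN.
data = nan(numel(stamps), numel(sensors));
data(rows1, cols1) = data1;
data(rows2, cols2) = data2;
```

    In this sketch the output columns are m_depth, m_pitch and m_present_time,
    and the rows are ordered by timestamp (10, 15, 20, 30).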

    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS) and 
    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...) 
    accept the following options given in key-value pairs OPT1, VAL1...
    or in a struct OPTIONS with field names as option keys and field values
    as option values:
      FORMAT: data output format.
        String setting the format of the output DATA. Valid values are:
          'array': DATA is a matrix with sensor readings in the column order
            specified by the SENSORS metadata field.
          'struct': DATA is a struct with sensor names as field names
            and column vectors of sensor readings as field values.
        Default value: 'array'
      SENSORS: sensor filtering list.
        String cell array with the names of the sensors of interest.
        If given, only sensors present in both the input data sets and this
        list will be present in output. The string 'all' may also be given,
        in which case sensor filtering is not performed and all sensors
        in the input list will be present in output.
        Default value: 'all' (do not perform sensor filtering).
      PERIOD: time filtering boundaries.
        Two element numeric array with the start and the end of the period
        of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
        only sensor cycles with timestamps within this period will be present
        in output. The string 'all' may also be given, in which case time
        filtering is not performed and all sensor cycles in the input list
        will be present in output.
        Default value: 'all' (do not perform time filtering).
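
    The effect of the three options can be sketched on a small array. This is
    an illustration on made-up data, not a call to DBACAT: the variable names
    and values are hypothetical, and the period boundaries are treated as
    inclusive, as the source listing suggests.

```matlab
% Hypothetical merged output: columns follow the order of the sensor names.
sensors = {'m_depth'; 'm_present_time'};
data = [1.0 10; 1.5 15; 2.0 20; 3.0 30];
stamps = data(:, strcmp('m_present_time', sensors));

% PERIOD: drop the cycles with timestamps outside [start, end].
period = [15 20];
in_period = ~(stamps < period(1) | stamps > period(2));
data = data(in_period, :);

% SENSORS: keep only the columns whose sensor name is in the requested list.
% Requested sensors that are missing from the data are silently ignored.
wanted = {'m_depth', 'sci_water_temp'};
keep = ismember(sensors, wanted);
sensors = sensors(keep);
data = data(:, keep);

% FORMAT 'struct': one field per sensor holding a column vector of readings.
data_struct = cell2struct(num2cell(data, 1), sensors, 2);
```

    Here data_struct ends up with the single field m_depth holding the two
    readings taken at timestamps 15 and 20.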

  Notes:
    This function should be used to combine data from several navigation files,
    or from several science files, but not from both navigation and science
    files (use DBAMERGE instead).

    Since sensor cycles (data rows) with the same timestamp may be present 
    in several data sets (e.g. when combining data from sbd and mbd files),
    the function checks that data in those sensor cycles are consistent.
    If the same sensor is present in sensor cycles from different data sets
    with the same timestamp and different valid values (not NaN), an error is
    thrown. Otherwise the values are merged into a single sensor cycle.
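
    A minimal sketch of this consistency rule, for the readings of a single
    hypothetical sensor seen at the same timestamps in two data sets. The
    values and the error identifier are made up for illustration.

```matlab
% Readings of one hypothetical sensor at the same three timestamps,
% as seen in two different data sets (NaN means no valid reading).
old = [1.0; NaN; 3.0];
new = [1.0; 2.0; NaN];

% A conflict needs two valid readings that differ.
conflict = (old ~= new) & ~isnan(old) & ~isnan(new);
if any(conflict)
  error('example:InconsistentData', 'Inconsistent overlapping readings.');
end

% No conflict: valid new readings fill in or confirm the old ones.
merged = old;
merged(~isnan(new)) = new(~isnan(new));
```

    With these values no error is raised and merged is [1.0; 2.0; 3.0].
    Changing new(1) to any other valid number would trigger the error.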

    However, note that sensor cycles with the same timestamp in the same data
    set are not merged, and the values in the latest sensor cycle will be used.
    This may be relevant when binary data files are converted to ASCII format
    with the option -o (the initial sensor cycle values may be omitted if the
    timestamp in the following sensor cycle is the same).
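
    The "latest cycle wins" behaviour can be reproduced with plain indexed
    assignment. The sketch below uses made-up values and assumes the usual
    MATLAB behaviour in which, for repeated indices in an assignment, the
    last assigned value is kept.

```matlab
% Hypothetical single data set with a repeated timestamp (20).
stamps = [10; 20; 20];
values = [1.0; 2.0; 2.5];

% Map each input row to its output row (one row per unique timestamp).
[stamps_out, ~, row_to] = unique(stamps);

% With a repeated index the last value assigned is kept, so the later
% cycle at timestamp 20 overwrites the earlier one.
values_out = nan(size(stamps_out));
values_out(row_to) = values;
```

    The earlier reading at timestamp 20 (2.0) is discarded rather than merged.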

    All values in the timestamp columns should be valid (not NaN).

  Examples:
    [meta, data] = dbacat(meta_list, data_list, timestamp)
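
    A fuller usage sketch with hand-built inputs. The file names, header
    fields, sensor values and units below are hypothetical stand-ins for what
    DBA2MAT would return (real headers carry more fields), and the calls are
    guarded because they need the toolbox on the path.

```matlab
% Hypothetical inputs mimicking the output of DBA2MAT (all values made up).
meta1.sources = {'segment1.dba'};
meta1.headers = struct('filename', 'segment1');
meta1.sensors = {'m_present_time'; 'm_depth'};
meta1.units = {'timestamp'; 'm'};
meta1.bytes = [8; 4];
data1 = [10 1.0; 20 2.0];

meta2.sources = {'segment2.dba'};
meta2.headers = struct('filename', 'segment2');
meta2.sensors = {'m_present_time'; 'm_pitch'};
meta2.units = {'timestamp'; 'rad'};
meta2.bytes = [8; 4];
data2 = [15 0.1; 30 0.3];

meta_list = {meta1, meta2};
data_list = {data1, data2};

% The calls need DBACAT on the path, so they are guarded here.
if exist('dbacat', 'file') == 2
  % Default options: array output, all sensors, all cycles.
  [meta, data] = dbacat(meta_list, data_list, 'm_present_time');
  % Options given as key-value pairs.
  [meta, data] = dbacat(meta_list, data_list, 'm_present_time', ...
                        'format', 'struct', 'period', [10 20]);
  % The same options given in a struct.
  options = struct('format', 'struct', 'period', [10 20]);
  [meta, data] = dbacat(meta_list, data_list, 'm_present_time', options);
end
```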

  See also:
    XBD2DBA
    DBA2MAT
    DBAMERGE

  Authors:
    Joan Pau Beltran  <joanpau.beltran@socib.cat>

DOWNLOAD

dbacat.m

SOURCE CODE

function [meta, data] = dbacat(meta_list, data_list, timestamp, varargin)
%DBACAT  Combine data from several dba data sets into a single data set.
%
%  Syntax:
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP)
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS)
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)
%
%  Description:
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP) combines data from
%    arrays in cell array DATA_LIST and metadata from structs in cell array
%    META_LIST into a single data set with data in array DATA and metadata in
%    struct array META. Elements in META_LIST and DATA_LIST should have the
%    format returned by function DBA2MAT, but they do not need to have the same
%    set of sensors. Outputs META and DATA have the same format, too.
%    META is a struct array combining the information in elements of META_LIST.
%    It has the following fields:
%      HEADERS: struct array built by concatenating the HEADERS field of all
%        elements in META_LIST.
%      SENSORS: string cell array with the names of the sensors present in the
%        returned data array (in the same column order), built by merging
%        the SENSORS field of all elements in META_LIST.
%      UNITS: string cell array with the units of the sensors present in the
%        returned data array (in the same column order), built by merging
%        the UNITS field of all elements in META_LIST.
%      BYTES: array with the number of bytes of each sensor present in the
%        returned data array (in the same column order), built by merging
%        the BYTES field of all elements in META_LIST.
%      SOURCES: string cell array built by concatenating the SOURCES field
%        of all elements in META_LIST.
%    DATA is a numeric array combining the rows of the arrays in DATA_LIST,
%    with the sensor columns reordered if needed and the resulting rows sorted
%    by the timestamp of the sensor named by string TIMESTAMP.
%
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPTIONS) and
%    [META, DATA] = DBACAT(META_LIST, DATA_LIST, TIMESTAMP, OPT1, VAL1, ...)
%    accept the following options given in key-value pairs OPT1, VAL1...
%    or in a struct OPTIONS with field names as option keys and field values
%    as option values:
%      FORMAT: data output format.
%        String setting the format of the output DATA. Valid values are:
%          'array': DATA is a matrix with sensor readings in the column order
%            specified by the SENSORS metadata field.
%          'struct': DATA is a struct with sensor names as field names
%            and column vectors of sensor readings as field values.
%        Default value: 'array'
%      SENSORS: sensor filtering list.
%        String cell array with the names of the sensors of interest.
%        If given, only sensors present in both the input data sets and this
%        list will be present in output. The string 'all' may also be given,
%        in which case sensor filtering is not performed and all sensors
%        in the input list will be present in output.
%        Default value: 'all' (do not perform sensor filtering).
%      PERIOD: time filtering boundaries.
%        Two element numeric array with the start and the end of the period
%        of interest (seconds since 1970-01-01 00:00:00.00 UTC). If given,
%        only sensor cycles with timestamps within this period will be present
%        in output. The string 'all' may also be given, in which case time
%        filtering is not performed and all sensor cycles in the input list
%        will be present in output.
%        Default value: 'all' (do not perform time filtering).
%
%  Notes:
%    This function should be used to combine data from several navigation files,
%    or from several science files, but not from both navigation and science
%    files (use DBAMERGE instead).
%
%    Since sensor cycles (data rows) with the same timestamp may be present
%    in several data sets (e.g. when combining data from sbd and mbd files),
%    the function checks that data in those sensor cycles are consistent.
%    If the same sensor is present in sensor cycles from different data sets
%    with the same timestamp and different valid values (not NaN), an error is
%    thrown. Otherwise the values are merged into a single sensor cycle.
%
%    However, note that sensor cycles with the same timestamp in the same data
%    set are not merged, and the values in the latest sensor cycle will be used.
%    This may be relevant when binary data files are converted to ASCII format
%    with the option -o (the initial sensor cycle values may be omitted if the
%    timestamp in the following sensor cycle is the same).
%
%    All values in the timestamp columns should be valid (not NaN).
%
%  Examples:
%    [meta, data] = dbacat(meta_list, data_list, timestamp)
%
%  See also:
%    XBD2DBA
%    DBA2MAT
%    DBAMERGE
%
%  Authors:
%    Joan Pau Beltran  <joanpau.beltran@socib.cat>

%  Copyright (C) 2013-2016
%  ICTS SOCIB - Servei d'observacio i prediccio costaner de les Illes Balears
%  <http://www.socib.es>
%
%  This program is free software: you can redistribute it and/or modify
%  it under the terms of the GNU General Public License as published by
%  the Free Software Foundation, either version 3 of the License, or
%  (at your option) any later version.
%
%  This program is distributed in the hope that it will be useful,
%  but WITHOUT ANY WARRANTY; without even the implied warranty of
%  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
%  GNU General Public License for more details.
%
%  You should have received a copy of the GNU General Public License
%  along with this program.  If not, see <http://www.gnu.org/licenses/>.

  % Three required inputs plus at most three option key-value pairs.
  narginchk(3, 9);


  %% Set options and default values.
  options.format = 'array';
  options.sensors = 'all';
  options.period = 'all';


  %% Parse optional arguments.
  % Get option key-value pairs in any accepted call signature.
  argopts = varargin;
  if isscalar(argopts) && isstruct(argopts{1})
    % Options passed as a single option struct argument:
    % field names are option keys and field values are option values.
    opt_key_list = fieldnames(argopts{1});
    opt_val_list = struct2cell(argopts{1});
  elseif mod(numel(argopts), 2) == 0
    % Options passed as key-value argument pairs.
    opt_key_list = argopts(1:2:end);
    opt_val_list = argopts(2:2:end);
  else
    error('glider_toolbox:dbacat:InvalidOptions', ...
          'Invalid optional arguments (neither key-value pairs nor struct).');
  end
  % Overwrite default options with values given in extra arguments.
  for opt_idx = 1:numel(opt_key_list)
    opt = lower(opt_key_list{opt_idx});
    val = opt_val_list{opt_idx};
    if isfield(options, opt)
      options.(opt) = val;
    else
      error('glider_toolbox:dbacat:InvalidOption', ...
            'Invalid option: %s.', opt);
    end
  end


  %% Set option flags and values.
  output_format = lower(options.format);
  sensor_filtering = true;
  sensor_list = cellstr(options.sensors);
  time_filtering = true;
  time_range = options.period;
  if ischar(options.sensors) && strcmp(options.sensors, 'all')
    sensor_filtering = false;
  end
  if ischar(options.period) && strcmp(options.period, 'all')
    time_filtering = false;
  end


  %% Cat data and metadata checking for trivial empty input.
  % Check for trivial empty input.
  if isempty(meta_list)
    sources_cat = cell(0, 1);
    headers_cat = ...
      struct('dbd_label', cell(0, 1), 'encoding_ver', cell(0, 1), ...
             'num_ascii_tags', cell(0, 1), 'all_sensors', cell(0, 1), ...
             'filename', cell(0, 1), 'the8x3_filename', cell(0, 1), ...
             'filename_extension', cell(0, 1), 'filename_label', cell(0, 1), ...
             'mission_name', cell(0, 1), 'fileopen_time', cell(0, 1), ...
             'sensors_per_cycle', cell(0, 1), 'num_label_lines', cell(0, 1), ...
             'num_segments', cell(0, 1), 'segment_filenames', cell(0, 1));
    sensors_cat_list = cell(0, 1);
    units_cat_list = cell(0, 1);
    bytes_cat_list = cell(0, 1);
  else
    meta_struct = [meta_list{:}];
    sources_cat = vertcat(meta_struct.sources);
    headers_cat = vertcat(meta_struct.headers);
    sensors_cat_list = {meta_struct.sensors}';
    units_cat_list = {meta_struct.units}';
    bytes_cat_list = {meta_struct.bytes}';
  end

  % Build list of sensor information for concatenated data and metadata.
  [~, ~, sensors_cat_indices_to] = unique(vertcat(sensors_cat_list{:}));
  sensors_cat = cell(0, 1);
  units_cat = cell(0, 1);
  bytes_cat = zeros(0, 1, 'int32');
  sensors_cat(sensors_cat_indices_to) = vertcat(sensors_cat_list{:});
  units_cat(sensors_cat_indices_to) = vertcat(units_cat_list{:});
  bytes_cat(sensors_cat_indices_to) = vertcat(bytes_cat_list{:});

  % Build list of unique timestamps and map each input row to its output row.
  stamp_cat_list = cellfun(@(d, m) d(:, strcmp(timestamp, m.sensors)), ...
                           data_list(:), meta_list(:), 'UniformOutput', false);
  [~, ~, stamp_cat_indices_to] = unique(vertcat(stamp_cat_list{:}));
  stamp_cat = zeros(0, 1);
  stamp_cat(stamp_cat_indices_to) = vertcat(stamp_cat_list{:});

  % Compute the range of rows and columns that each input data set occupies
  % in the concatenated timestamp and sensor index lists.
  row_num_total = numel(stamp_cat);
  row_num_list = cellfun(@numel, stamp_cat_list(:));
  row_end_list = cumsum(row_num_list);
  row_start_list = 1 + [0; row_end_list(1:end-1)];
  col_num_total = numel(sensors_cat);
  col_num_list = cellfun(@numel, sensors_cat_list(:));
  col_end_list = cumsum(col_num_list);
  col_start_list = 1 + [0; col_end_list(1:end-1)];

  % Set output concatenated data checking for consistency of overlapped data.
  data = nan(row_num_total, col_num_total);
  for data_idx = 1:numel(data_list)
    row_range = row_start_list(data_idx):row_end_list(data_idx);
    row_indices = stamp_cat_indices_to(row_range);
    col_range = col_start_list(data_idx):col_end_list(data_idx);
    col_indices = sensors_cat_indices_to(col_range);
    data_old = data(row_indices, col_indices);
    data_new = data_list{data_idx};
    data_old_valid = ~isnan(data_old);
    data_new_valid = ~isnan(data_new);
    data_inconsistent = ...
      (data_old ~= data_new) & data_old_valid & data_new_valid;
    if any(data_inconsistent(:))
      [row_inconsistent, col_inconsistent] = find(data_inconsistent);
      err_msg_arg_list = cell(4, numel(row_inconsistent));
      err_msg_arg_list(1, :) = sensors_cat(col_indices(col_inconsistent));
      err_msg_arg_list(2, :) = num2cell(stamp_cat(row_indices(row_inconsistent)));
      err_msg_arg_list(3, :) = num2cell(data_old(data_inconsistent));
      err_msg_arg_list(4, :) = num2cell(data_new(data_inconsistent));
      err_msg_fmt = '\nInconsistent value of %s at %12f: %12f %12f';
      error('glider_toolbox:dbacat:InconsistentData', ...
            'Inconsistent data:%s', sprintf(err_msg_fmt, err_msg_arg_list{:}));
    end
    data_old(data_new_valid) = data_new(data_new_valid);
    data(row_indices, col_indices) = data_old;
  end

  % Set metadata fields.
  meta.sources = sources_cat;
  meta.headers = headers_cat;
  meta.sensors = sensors_cat;
  meta.units = units_cat;
  meta.bytes = bytes_cat;


  %% Perform time filtering if needed.
  if time_filtering
    stamp_select = ~(stamp_cat < time_range(1) | stamp_cat > time_range(2));
    data = data(stamp_select, :);
  end


  %% Perform sensor filtering if needed.
  if sensor_filtering
    [sensor_select, ~] = ismember(meta.sensors, sensor_list);
    meta.sensors = meta.sensors(sensor_select);
    meta.units = meta.units(sensor_select);
    meta.bytes = meta.bytes(sensor_select);
    data = data(:, sensor_select);
  end


  %% Convert output data to struct format if needed.
  switch output_format
    case 'array'
      % Nothing to do: data is already a numeric array.
    case 'struct'
      data = cell2struct(num2cell(data, 1), meta.sensors, 2);
    otherwise
      error('glider_toolbox:dbacat:InvalidFormat', ...
            'Invalid output format: %s.', output_format);
  end

end

Generated on Fri 06-Oct-2017 10:47:42 by m2html © 2005