432 Newell Drive
Gainesville, FL 32603
Mostafa Reisi Gahrooei
Georgia Institute of Technology
Abstract: Modeling and Improvement of Complex Systems with High-Dimensional Heterogeneous Data
Complex systems in the manufacturing, energy, and healthcare sectors are equipped with numerous sensors that generate massive amounts of high-dimensional (HD), heterogeneous data. By extracting the right information from such data, efficient models for monitoring, assessing, and improving these systems can be constructed.
In this talk, two topics related to this area are presented. First, a sequential approach for sampling high-accuracy (HA) data based on the information obtained from low-accuracy (LA) data is presented. In several applications, a large amount of LA data can be acquired at a small cost. However, such LA data is not sufficient for generating a high-fidelity model of a system. To adjust and improve the model constructed from LA data, a small sample of HA data, which is expensive to obtain, is usually fused with the LA data. Unfortunately, current data fusion techniques assume that the HA data has already been collected and concentrate on fusion strategies, without providing guidelines on how to sample the HA data. To address this issue, an approach is proposed that uses the information provided by the LA data and the previously selected HA data points to compute an improvement criterion over a design space, and then chooses the next HA data point accordingly. The results of simulation and case studies illustrate the importance of intelligent sampling of HA data in reducing cost and improving model accuracy.
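The sequential idea can be illustrated with a minimal sketch. It is not the criterion used in the talk, which the abstract does not specify. As a stand-in, the sketch assumes a one-dimensional design space, a Gaussian-process prior on the HA-minus-LA discrepancy, and a hypothetical criterion that multiplies the posterior variance of that discrepancy by a weight derived from the slope of the LA response. All function names and parameters are illustrative. Because the posterior variance of a Gaussian process depends only on the sampled locations, the HA responses themselves do not appear in this simplified criterion.

```python
import numpy as np

def rbf_kernel(a, b, length=0.2):
    """Squared-exponential kernel between two 1-D point sets (prior variance 1)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def discrepancy_variance(x_ha, grid, length=0.2, jitter=1e-6):
    """GP posterior variance of the HA-minus-LA discrepancy at each grid point."""
    K = rbf_kernel(x_ha, x_ha, length) + jitter * np.eye(len(x_ha))
    Ks = rbf_kernel(grid, x_ha, length)
    # var(x) = k(x, x) - k(x, X) K^{-1} k(X, x), with k(x, x) = 1
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return np.clip(var, 0.0, None)

def sequential_ha_design(grid, f_la, x_init, n_new, length=0.2):
    """Greedily add HA sampling locations, one at a time.

    grid   : candidate locations in the design space
    f_la   : LA response evaluated on the grid (cheap to obtain)
    x_init : HA locations that have already been sampled
    """
    # Assumption: regions where the LA response changes quickly deserve more HA data.
    slope = np.abs(np.gradient(f_la, grid))
    x_ha = [float(x) for x in x_init]
    for _ in range(n_new):
        var = discrepancy_variance(np.array(x_ha), grid, length)
        criterion = var * (1.0 + slope)  # hypothetical improvement criterion
        x_ha.append(float(grid[np.argmax(criterion)]))
    return np.array(x_ha)
```

Each pass uses both the LA information (through the slope weight) and the HA locations chosen so far (through the posterior variance), which mirrors the structure described above without reproducing the actual criterion.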
The second topic focuses on the problem of estimating a process output, measured by a scalar, curve, image, or structured point cloud, from a set of heterogeneous process variables such as scalar process settings, profile sensor readings, and/or images. To create a unified modeling framework that effectively combines different forms of data while exploiting the correlation structure within each HD data point, a general multiple tensor-on-tensor regression (MTOT) approach is proposed. To avoid overfitting and reduce the number of parameters to be estimated, the model parameters are decomposed using several basis matrices that span the input and output spaces. An efficient optimization algorithm for learning the bases and coefficients is provided. Several simulation and case studies demonstrate the advantage of the proposed method over benchmarks from the literature in terms of mean squared prediction error.
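The effect of decomposing the parameters with input and output bases can be seen in a toy sketch. It is a simplified vector-on-vector analogue, not the MTOT algorithm itself. It assumes a single vectorized input and output, writes the coefficient matrix as B = U C V^T, and takes the bases U and V from singular value decompositions of the data rather than learning them jointly as in the talk. The function names and the ranks `r_in` and `r_out` are illustrative.

```python
import numpy as np

def fit_low_rank(X, Y, r_in, r_out):
    """Fit Y ~ X @ U @ C @ V.T with fixed orthonormal bases U (input) and V (output).

    X : (n, p) inputs, Y : (n, q) outputs.
    Only the small core C of size (r_in, r_out) is estimated by least squares.
    """
    # Assumption: bases are the leading right singular vectors of the data.
    U = np.linalg.svd(X, full_matrices=False)[2][:r_in].T   # (p, r_in)
    V = np.linalg.svd(Y, full_matrices=False)[2][:r_out].T  # (q, r_out)
    Z = X @ U                                               # inputs in the reduced space
    # With orthonormal V, minimizing ||Y - Z C V^T||_F reduces to regressing Y V on Z.
    C, *_ = np.linalg.lstsq(Z, Y @ V, rcond=None)
    return U, C, V

def predict(X, U, C, V):
    """Map new inputs through the reduced model back to the output space."""
    return X @ U @ C @ V.T
```

With p input and q output dimensions, the unrestricted coefficient matrix has p times q entries, whereas the core C has only `r_in * r_out`. That reduction is the sense in which such a decomposition limits the number of parameters to be estimated.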
Department of Industrial and Systems Engineering at the University of Florida