Monitoring Complex Industrial Processes and Bringing Them Back-on-Track After Instabilities Have Been Met

Contributed by: Koen Knapen, Principal Consultant Analytics at SAS.

Process manufacturing is a branch of manufacturing that is associated with formulas and manufacturing recipes, and can be contrasted with discrete manufacturing, which is concerned with discrete units, bills of materials and the assembly of components. Examples of the former include food and beverages, specialty chemicals, bulk-drug pharmaceuticals (pharmaceutical formulations) and biotech products.

Batch production refers to a method in manufacturing in which products are created as specified groups, or amounts, that each go through a series of steps to make the desired product. The entire time frame for producing the finished product(s) can easily be one or two weeks.

This article looks into the monitoring of complex industrial processes in the context of batch production in process manufacturing. The following aspects will be discussed:

  1. Multivariate statistical process control and stability monitoring
  2. Soft sensors (virtual sensors) to provide, at any time, visibility of the probable quality and yield at the end of the batch
  3. Optimization to make informed adjustments (and meet batch objectives as far as possible)

Multivariate statistical process control and stability monitoring

Univariate control charts have been ubiquitous in manufacturing and Statistical Quality Control for almost 100 years (and rightly so). Think Shewhart charts, cumulative sum (CUSUM) charts and exponentially weighted moving average (EWMA) charts.

Due to their univariate nature, these charts may give a false impression of control. Hence, process engineers should look at the consistency (coherence, cross-correlations) of hundreds of process parameters during periods of stable production to build a model. That explanatory, well-calibrated model should then be implemented and deployed on (live) streaming data so that consistency can be monitored over time (by control-room operators and process engineers); deviations from that quantified coherence ultimately lead to the problems one wants to avoid. Models with hundreds of parameters are not uncommon, though keep the curse of dimensionality in mind when creating a model: if you can isolate several dozen process parameters to build the model with, go for the smaller model.

Hotelling’s T² and Squared Prediction Error (SPE), plotted as control charts, are popular measures to detect an out-of-control situation and a significant deviation from “steady state”. Moreover, changes in a previously stable process can be diagnosed (for example with “contribution plots” for individual observations).
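
As an illustration, the sketch below fits a PCA model on data from a stable period and computes Hotelling’s T² and SPE for new observations. The variable names, the number of components and the percentile-based control limits are illustrative assumptions (the textbook F- and chi-square-based limits are the more formal choice).

```python
# Minimal sketch: Hotelling's T^2 and SPE from a PCA model fitted on stable-period data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_stable = rng.normal(size=(500, 20))      # historical data from a stable period
X_new = rng.normal(size=(50, 20))          # new streaming observations to score

scaler = StandardScaler().fit(X_stable)
Z_stable = scaler.transform(X_stable)
Z_new = scaler.transform(X_new)

pca = PCA(n_components=5).fit(Z_stable)    # retain a handful of components

def t2_and_spe(Z, pca):
    """Hotelling's T^2 (score distance) and SPE/Q (residual distance) per observation."""
    scores = pca.transform(Z)
    t2 = np.sum(scores**2 / pca.explained_variance_, axis=1)
    residuals = Z - pca.inverse_transform(scores)
    spe = np.sum(residuals**2, axis=1)
    return t2, spe

# Empirical control limits from the stable period (99th percentile is an assumption)
t2_ref, spe_ref = t2_and_spe(Z_stable, pca)
t2_limit, spe_limit = np.percentile(t2_ref, 99), np.percentile(spe_ref, 99)

t2_new, spe_new = t2_and_spe(Z_new, pca)
out_of_control = (t2_new > t2_limit) | (spe_new > spe_limit)
print(f"{out_of_control.sum()} of {len(X_new)} new observations flagged")
```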

The underlying multivariate statistical techniques to arrive at those charts are Principal Components Analysis (PCA) and, used to an even greater extent, Partial Least Squares regression (PLS). SVDD-based nonparametric multivariate control charts are also increasingly popular. SVDD stands for Support Vector Data Description; inspired by the support vector classifier, it is a one-class classifier used to detect anomalies.
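
The sketch below shows the idea of such a one-class monitor. Scikit-learn’s OneClassSVM (with an RBF kernel, closely related to SVDD) is used here as a stand-in; the nu setting and the data are purely illustrative.

```python
# Sketch of a nonparametric one-class monitor; OneClassSVM stands in for SVDD.
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X_stable = rng.normal(size=(500, 20))                     # stable-period training data
X_new = np.vstack([rng.normal(size=(45, 20)),
                   rng.normal(loc=4.0, size=(5, 20))])    # last 5 rows drift away

scaler = StandardScaler().fit(X_stable)
ocsvm = OneClassSVM(kernel="rbf", nu=0.01, gamma="scale")
ocsvm.fit(scaler.transform(X_stable))

# decision_function < 0 means outside the learned boundary -> potential anomaly
flags = ocsvm.decision_function(scaler.transform(X_new)) < 0
print("anomalous observations:", np.flatnonzero(flags))
```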

One can work even more “intelligently” and build stability monitoring models using time series and forecasting techniques (like transfer function modeling or state space models).

The analytical approach of stability monitoring is based on the assumption that, within a subset of sensors, events, or both (collectively called “signals”) that are related to the system being monitored, there exists a “target” signal whose behavior can be explained by other signals (called “explanatory” signals). The main idea is that in a healthy system (that is, during a period of time where the process is in steady state) there is a robust statistical relationship between the target signal and the explanatory signals. After the relevant statistical model is selected, fitted on historical data from stable periods, and saved, it can then be used for monitoring data that arrive in a new time period. With the new data that contain both target and explanatory signals, the stored model is used to generate a forecast of the target signal from the explanatory signals (this process is called “scoring”). The forecast is then compared to the actual values of the target signal, and, based on a set of user-defined rules, anomalies in the behavior of the monitored system during the new time period can be flagged.
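
The sketch below illustrates that scoring step under simplified assumptions: a ridge regression stands in for the transfer-function or state-space model, and a simple 3-sigma residual rule stands in for the user-defined rules.

```python
# Sketch of the "scoring" step: forecast the target from explanatory signals with a
# stored model, then flag anomalies with a simple residual rule.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(2)
X_hist = rng.normal(size=(1000, 8))                               # explanatory signals (stable period)
y_hist = X_hist @ rng.normal(size=8) + rng.normal(scale=0.1, size=1000)   # target signal

model = Ridge(alpha=1.0).fit(X_hist, y_hist)                      # fit and "save" the model
resid_sigma = np.std(y_hist - model.predict(X_hist))              # residual spread on stable data

def score_new_period(X_new, y_new, model, sigma, k=3.0):
    """Compare actuals to forecasts; flag residuals beyond k*sigma (the rule is an assumption)."""
    forecast = model.predict(X_new)
    residual = y_new - forecast
    return np.abs(residual) > k * sigma

X_new = rng.normal(size=(100, 8))
y_new = X_new @ rng.normal(size=8)                                # the relationship has drifted
print("flagged time points:", np.flatnonzero(score_new_period(X_new, y_new, model, resid_sigma)))
```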

In addition to the process variables, it is recommended to include «controller output» as one of the important variables to monitor for stability (jointly with the process variables). In the absence of controller output, the process may appear stable, but that may merely be a side effect of the controller compensating for disturbances. When this happens, the signal in the process data is very transient, and stability monitoring could fail.

Soft sensors (virtual sensors) to provide, at any time, visibility of the probable quality and yield at the end of the batch

Machine Learning (ML) models are becoming widely used to describe and predict processes’ key metrics. Specifically, for process manufacturing batches, ML models can explain and predict quality as well as yield metrics, using manufacturing settings as regressors.

When a batch takes one or two weeks to complete, it is always advantageous to have, at any time during that period, a forecast (prognosis) of the final yield and quality. That way, adjustments to the process can be made in time if necessary.
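
A minimal soft-sensor sketch follows, assuming each running batch can be summarized by a fixed set of mid-batch features; the feature construction and the gradient-boosting model are illustrative choices, not a prescribed method.

```python
# Minimal soft-sensor sketch: predict end-of-batch yield from mid-batch summary features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
# Each row: mid-batch summary features (means, slopes, cumulative dosing, ...) of one batch
X = rng.normal(size=(300, 12))
yield_final = 80 + X[:, 0] * 2 - X[:, 3] + rng.normal(scale=1.0, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, yield_final, random_state=0)
soft_sensor = GradientBoostingRegressor().fit(X_train, y_train)

# At any point during a running batch, score the current summary features
current_batch_features = X_test[:1]
print("forecast of final yield:", soft_sensor.predict(current_batch_features)[0])
```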

Such ML models can be activated by a time trigger (e.g. every hour from day 2 onwards) or by an event trigger (e.g. after an instability has been overcome).
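
A small sketch of such trigger logic is shown below; the one-hour interval, the two-day delay and the instability_resolved flag are hypothetical names and values.

```python
# Sketch of trigger logic deciding when to rescore the soft sensor.
from datetime import datetime, timedelta

def should_rescore(batch_start, last_scored, now, instability_resolved=False):
    # Time trigger: hourly scoring, but only from day 2 of the batch onwards (assumed values)
    time_trigger = (now - batch_start >= timedelta(days=2)
                    and now - last_scored >= timedelta(hours=1))
    # Event trigger: rescore right after an instability has been overcome
    event_trigger = instability_resolved
    return time_trigger or event_trigger

now = datetime(2024, 5, 3, 14, 0)
print(should_rescore(batch_start=datetime(2024, 5, 1, 8, 0),
                     last_scored=now - timedelta(hours=2),
                     now=now))
```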

Soft sensors are highly relevant, especially in the process industry, because quality determination cannot be done in real time: a sample usually has to be taken to the lab. Infrequent quality determination (such as every 6 hours) can lead to delayed responses in adjusting the process, because between those measurements one effectively sails blind.

Optimization to make informed adjustments (and meet batch objectives as far as possible)

Once one knows how the settings affect the key metrics, one can do «goal programming» using an optimization model. That can be especially useful after a serious instability incident, to try to set up the process so that the negative impact of that instability is erased to the extent possible. The goal of such a model is to find, starting from the present situation, the right combination of manufacturing settings that maximizes yield (or minimizes costs) while keeping the key quality metric within required bounds.
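
A hedged sketch of that optimization step is given below, using fitted linear models for yield and quality and SciPy’s SLSQP solver; the models, bounds and quality window are illustrative assumptions, not a prescribed formulation.

```python
# Sketch: choose manufacturing settings that maximize predicted yield while keeping
# predicted quality within bounds. Models, bounds and the quality window are illustrative.
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(4)
settings_hist = rng.uniform(0, 1, size=(400, 5))                # historical settings
yield_hist = 70 + settings_hist @ np.array([5, 3, -2, 1, 4]) + rng.normal(scale=0.5, size=400)
quality_hist = 95 + settings_hist @ np.array([-1, 2, 1, -3, 0.5]) + rng.normal(scale=0.2, size=400)

yield_model = LinearRegression().fit(settings_hist, yield_hist)
quality_model = LinearRegression().fit(settings_hist, quality_hist)

q_low, q_high = 94.0, 96.0                                      # required quality window
bounds = [(0.0, 1.0)] * 5                                       # physically allowed setting ranges
x0 = settings_hist.mean(axis=0)                                 # start from the present situation

res = minimize(
    fun=lambda x: -yield_model.predict(x.reshape(1, -1))[0],    # maximize predicted yield
    x0=x0,
    bounds=bounds,
    constraints=[
        {"type": "ineq", "fun": lambda x: quality_model.predict(x.reshape(1, -1))[0] - q_low},
        {"type": "ineq", "fun": lambda x: q_high - quality_model.predict(x.reshape(1, -1))[0]},
    ],
    method="SLSQP",
)
print("suggested settings:", np.round(res.x, 3), "predicted yield:", -res.fun)
```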

However, unlike linear regression models, which have a “natural fit” within linear optimization formulations, more sophisticated models such as Neural Nets or Gradient Boosting models make the optimization formulations and solution methodologies considerably more challenging.

A key challenge is that incorporating non-closed-form and nonlinear models (such as Neural Nets or Gradient Boosting models) in the optimization means that traditional, provably convergent algorithms such as branch-and-bound or the simplex method no longer apply.

Fortunately, cutting-edge solvers exist that are getting good at tackling this challenge. They optimize general nonlinear problems by running multiple instances of global and local search algorithms in parallel. Given a suitable number of threads and processors, the run time should be reasonable and the solution quality good to excellent.

There is only one warning: be cautious and careful with that optimization. Since the ML models were built with observational data from the past (several years of production), they deliver less information about the «response surface» and are hence less reliable than DOE models (DOE = Design of Experiments). That is because observational data from the past can be rather “one-sided”: data without a lot of variability in all directions.

One needs to prevent the optimization from suggesting a combination of setpoints that rarely, if ever, occurred in the past. The point estimates for yield and quality may look attractive in that region, but the interval estimates will suggest that it is better not to go there (far too much uncertainty). Therefore, a crucial element is to build boundaries into this optimization step.
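
One simple way to build in such boundaries, sketched below, is to derive per-setting bounds from the historically observed operating range; the 5th/95th percentiles are an assumption, and a convex-hull or density-based constraint would be stricter.

```python
# Sketch: restrict the optimizer to regions well covered by historical data by deriving
# per-setting bounds from percentiles of the historical settings (percentiles are assumed).
import numpy as np

rng = np.random.default_rng(5)
settings_hist = rng.uniform(0, 1, size=(400, 5))   # stand-in for historical settings

def historical_bounds(settings_hist, low_pct=5, high_pct=95):
    """Per-setting bounds covering the bulk of historically observed operation."""
    lows = np.percentile(settings_hist, low_pct, axis=0)
    highs = np.percentile(settings_hist, high_pct, axis=0)
    return list(zip(lows, highs))

bounds = historical_bounds(settings_hist)
print(bounds[:2])   # these bounds would replace the wide [0, 1] ranges in the optimizer above
```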
