DM Answers

Data Mining with STATISTICA

Multivariate SPC

5th of November, 2013

I would like to continue talking about Statistical Process Control, but this time I will focus on the case where multiple correlated metrics are being monitored.  Should each of the metrics be monitored separately, or would it make more sense to consider them together in a multivariate analysis?  I hope to answer this question in today’s blog post.

There is an R package called MSQC that provides some multivariate SPC data with correlated inputs.  The data set I will consider today is called bimetal1.  It consists of five physical characteristics for 28 bimetal thermometers.  The thermometers are constructed by fusing together two different metals that expand and contract at different rates.  The five physical characteristics that are recorded should stay fairly constant, but is this enough evidence to say that each of the thermometers is good?

I will demonstrate in the video below that the metrics are correlated, and I believe this fact answers the question above.  Since the metrics are correlated, we should consider them together in the analysis.  If the metrics are considered separately, we miss out on the additional information contained in the correlations.  A potential special cause could be indicated when a particular sample breaks from the correlation seen in the past, even if each of its individual measurements looks unremarkable on its own.
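To make the idea concrete outside of STATISTICA, here is a minimal sketch in Python of a Hotelling T-squared check for two correlated metrics. It is not the analysis from the video. The numbers are made up for illustration (they are not the bimetal1 data), the function names are my own, and I am assuming a Phase I limit based on the beta distribution, which has a simple closed form when there are exactly two metrics. The separate check uses plain three-sigma limits from the sample standard deviation, which is a simplification of a real individuals chart.

```python
import math

# Hypothetical measurements, NOT the bimetal1 data: ten samples in which
# y tracks x closely, plus a final sample (index 10) that breaks the pattern.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 8]
y = [1.2, 1.9, 3.1, 3.8, 5.2, 5.9, 7.1, 7.8, 9.2, 9.9, 3.0]


def mean(values):
    return sum(values) / len(values)


def univariate_flags(values, k=3.0):
    """Indices outside mean +/- k sample standard deviations (simplified)."""
    mu = mean(values)
    sd = math.sqrt(sum((v - mu) ** 2 for v in values) / (len(values) - 1))
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sd]


def hotelling_t2(x, y, alpha=0.0027):
    """T-squared for each sample of two metrics, plus an assumed Phase I limit."""
    m = len(x)
    mx, my = mean(x), mean(y)
    # Sample variances and covariance (divisor m - 1).
    sxx = sum((a - mx) ** 2 for a in x) / (m - 1)
    syy = sum((b - my) ** 2 for b in y) / (m - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (m - 1)
    det = sxx * syy - sxy ** 2
    t2 = []
    for a, b in zip(x, y):
        dx, dy = a - mx, b - my
        # Quadratic form d' S^-1 d written out for the 2x2 case.
        t2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    # Phase I limit: ((m-1)^2 / m) * Beta quantile with shapes (p/2, (m-p-1)/2).
    # For p = 2 the Beta(1, (m-3)/2) quantile is 1 - alpha**(2 / (m - 3)).
    ucl = (m - 1) ** 2 / m * (1 - alpha ** (2 / (m - 3)))
    return t2, ucl


t2, ucl = hotelling_t2(x, y)
multivariate_flags = [i for i, t in enumerate(t2) if t > ucl]

print("Flagged on x alone:", univariate_flags(x))
print("Flagged on y alone:", univariate_flags(y))
print("Flagged by T-squared:", multivariate_flags)
```

In this made-up example the last sample sits comfortably inside the three-sigma limits of both metrics, so neither separate check flags it, yet it is the only sample above the T-squared limit because it is high on x while low on y. With more than two metrics the same quadratic form applies, but the covariance matrix has to be inverted numerically and the beta quantile no longer has this shortcut.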

Here is the data set in Excel format for those who would like to replicate the analysis shown in the video:


Notice in the following video that I made a mistake when I created the ellipse on the scatter plot.  If I had used the normal ellipse with 0.95 and 0.99 confidence instead of the range ellipse, the conclusions drawn from the scatter plot would have been consistent with the multivariate SPC charts.



As I mentioned in the video, I wish I could have come up with a public domain data set that showed no signals on the individual charts and a signal on the multivariate SPC chart.  I hope data point number 20 was enough to help you see the potential of multivariate SPC charts.  I know from personal experience that these types of charts can be very useful when there are multiple correlated metrics being monitored.  Unfortunately, the data from that experience are proprietary and I cannot share them.

Next time I plan on demonstrating the Predictive SPC charting functionality from the STATISTICA QC Miner package.  Until then, I hope you enjoy discovering Multivariate SPC charting in STATISTICA.  If you find any other errors or if you have any questions, please feel free to make a comment below.
