Abstract
Predictability of a manufacturing process or system is vital in virtual manufacturing. Various data mining techniques are available for developing predictive models. Cross validation is critical in determining the quality of a predictive model and the costs of data collection and data mining. Several cross-validation (CV) techniques are available, including v-fold CV, leave-one-out CV, and bootstrap-type CV. Some past studies have not revealed any statistical advantage of tenfold cross validation over fivefold cross validation. Determining the number of hidden layers is important in predictive modeling with neural networks. This study compares fivefold CV with threefold CV, and one-hidden-layer neural nets with two-hidden-layer neural nets, in predictive modeling of the surface roughness parameters defined in ISO 13565 for turning and honing. Statistical hypothesis tests and different prediction errors are employed to compare the competing models. For the cases under study, the results do not reveal any statistically significant advantage of fivefold CV over threefold CV or of two-hidden-layer neural nets over one-hidden-layer neural nets. Furthermore, the procedure presented here is applicable to comparing competing data modeling or data mining methods.
Keywords: Data Mining; Cross Validation; Neural Networks; Predictive Modeling; Machining Surface Roughness; ISO 13565
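As a rough illustration of the v-fold CV comparison summarized in the abstract, the following minimal sketch computes one held-out error per fold for threefold and fivefold CV. It is not the procedure used in this study: the straight-line least-squares model stands in for the neural networks, the data are synthetic, and the names `v_fold_indices`, `fit_line`, and `cv_errors` are hypothetical helpers introduced only for this example.

```python
import random
import statistics


def v_fold_indices(n, v, seed=0):
    """Shuffle the indices 0..n-1 and split them into v nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::v] for i in range(v)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the real model)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def cv_errors(xs, ys, v, seed=0):
    """Return one held-out root-mean-square error per fold."""
    errors = []
    for fold in v_fold_indices(len(xs), v, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared = [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
        errors.append((sum(squared) / len(squared)) ** 0.5)
    return errors


# Synthetic data, made up purely for illustration (not measured roughness data).
rng = random.Random(1)
xs = [i / 10 for i in range(60)]
ys = [0.5 + 2.0 * x + rng.gauss(0, 0.3) for x in xs]

err3 = cv_errors(xs, ys, v=3)  # three per-fold errors
err5 = cv_errors(xs, ys, v=5)  # five per-fold errors
print("threefold mean RMSE:", statistics.fmean(err3))
print("fivefold mean RMSE:", statistics.fmean(err5))
```

The per-fold errors in `err3` and `err5` are the kind of quantity that a statistical hypothesis test could then compare. Which test and which prediction errors are appropriate depends on the study design and is not fixed by this sketch.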
Introduction
Predictability of a manufacturing process or system is vital in moving toward virtual manufacturing. Without a proper model, it is not possible to predict the outcome of a manufacturing process or system. Various data mining techniques are available for developing predictive models. Finding such a model is difficult, interesting, and sometimes rewarding. A model may originate from introspection or observation (or both) (Gershenfeld 1999). Although developing an analytical model is feasible in some simplified situations, most manufacturing processes are complex. Empirical models, which are less general but more practical and less expensive than analytical models, are therefore of interest.
Box and Draper (1987) and Gershenfeld (1999) classified mathematical models into analytical (or mechanistic) and empirical (or observational). Both regression analysis (RA) and neural networks (NN) have been used for years in empirical modeling and have only recently been termed data mining (Witten and Frank 2000; Groth 1998).
Neural networks possess a number of attractive properties for modeling a complex product system and manufacturing process or system: universal...