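Threefold vs. Fivefold Cross Validation in One-Hidden-Layer and Two-Hidden-Layer Predictive Neural Network Modeling of Machining Surface Roughness Data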

Abstract

Predictability of a manufacturing process or system is vital in virtual manufacturing. Various data mining techniques are available for developing predictive models. Cross validation is critical in determining both the quality of a predictive model and the costs of data collection and data mining. Several cross-validation (CV) techniques are available, including v-fold CV, leave-one-out CV, and bootstrap-type CV. Some past studies have not revealed any statistical advantage of tenfold over fivefold cross validation. Determining the number of hidden layers is important in predictive modeling with neural networks. This study compares fivefold with threefold CV, and one-hidden-layer with two-hidden-layer neural nets, in predictive modeling of the surface roughness parameters defined in ISO 13565 for turning and honing. Statistical hypothesis tests and different prediction errors are employed to compare the competing models. For the cases under study, this study does not reveal any statistically significant advantage of fivefold CV over threefold CV, or of two-hidden-layer neural nets over one-hidden-layer neural nets. Furthermore, the procedure presented here is applicable to comparing competing data modeling or data mining methods.

Keywords: Data Mining; Cross Validation; Neural Networks; Predictive Modeling; Machining Surface Roughness; ISO 13565
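
The v-fold comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the data are synthetic, and a least-squares line stands in for the neural networks so that the example needs only the Python standard library. The names `kfold_indices`, `fit_line`, and `cv_rmse` are hypothetical helpers introduced here for illustration.

```python
import math
import random


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split into k folds whose sizes differ by at most one."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares line (assumption: a stand-in for a neural net)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cv_rmse(xs, ys, k, seed=0):
    """Return one held-out RMSE per fold for v-fold CV with v = k."""
    errors = []
    for held_out in kfold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        sq = [(ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out]
        errors.append(math.sqrt(sum(sq) / len(sq)))
    return errors


# Synthetic data (assumption): a noisy straight line, not roughness data.
rng = random.Random(42)
xs = [i / 10 for i in range(60)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]

rmse3 = cv_rmse(xs, ys, k=3)  # threefold: each model trains on 2/3 of the data
rmse5 = cv_rmse(xs, ys, k=5)  # fivefold: each model trains on 4/5 of the data
print("threefold mean RMSE:", sum(rmse3) / len(rmse3))
print("fivefold mean RMSE:", sum(rmse5) / len(rmse5))
```

Per-fold errors such as `rmse3` and `rmse5` are the kind of quantity a statistical hypothesis test would then compare. Which tests and error measures the study actually uses is not stated in the text shown here.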

Introduction

Predictability of a manufacturing process or system is vital in moving toward virtual manufacturing. Without a proper model, it is not possible to predict the outcome of a manufacturing process or system. Various data mining techniques are available for developing predictive models. Finding such a model is difficult, interesting, and sometimes rewarding. A model may originate from introspection or observation (or both) (Gershenfeld 1999). Although developing an analytical model is feasible in some simplified situations, most manufacturing processes are complex. Empirical models, which are less general but more practical and less expensive than analytical models, are therefore of interest.

Box and Draper (1987) and Gershenfeld (1999) classified mathematical models into analytical (or mechanistic) and empirical (or observational). Both regression analysis (RA) and neural networks (NN) have been used for years in empirical modeling and have only recently been termed data mining (Witten and Frank 2000; Groth 1998).

Neural networks possess a number of attractive properties for modeling a complex product system and manufacturing process or system: universal...
