INTRODUCTION
The first carbon hit rate (FCHR) is the core indicator in the steelmaking process, directly correlating with enterprises' economic efficiency, resource utilisation, and environmental protection. Many factors, such as billet quality, oxygen consumption, and smelting temperature, affect the FCHR. Among them, oxygen consumption is the most critical factor in determining the quality of steel and the efficiency of steel production. Therefore, accurately predicting smelting oxygen consumption and further predicting the FCHR is crucial for guiding steel production. With the adjustment of industrial structure and the development of intelligent technology, steel companies are actively seeking intelligent and automated smelting solutions. Applying machine learning models provides new possibilities for predicting and analysing the steelmaking process. By leveraging these advanced technologies, steel companies can control the steelmaking process more effectively, improve the FCHR, and achieve more efficient and environmentally friendly production.
Currently, there are two main machine learning strategies for predicting the FCHR in steelmaking: feature-based methods and end-to-end prediction. The feature-based approach focuses on extracting critical features from the steelmaking process, such as the chemical composition, temperature, and oxygen consumption of the molten steel. These features are then fed into machine learning models for learning and prediction. In particular, deep learning algorithms have demonstrated superior performance in this domain, as they can automatically extract features from massive amounts of data and accurately predict the FCHR. The end-to-end prediction method, on the other hand, is more direct: it does not require explicit feature extraction but instead takes the various parameters of the steelmaking process as raw input. Utilising the powerful fitting and global information capture abilities of deep learning models, it learns to output FCHR predictions directly from the raw data. It is worth mentioning that Wang Shuanglong proposed an innovative dynamic control technology based on furnace gas analysis, which successfully increased the FCHR to 82%, opening new possibilities for efficiency improvement and quality control in the steelmaking industry [1].
In the realm of oxygen consumption prediction research, prediction models primarily fall into three categories: mechanistic models, statistical models, and intelligent models. Guanzhou et al. made a groundbreaking contribution by developing a data model tailored to the unique mechanistic characteristics of the converter steelmaking process [2]. This model, which incorporates interval constraints and extreme gradient boosting, enhances the precision and accuracy of oxygen consumption prediction in steelmaking. It builds upon the efficiency of the established XGBoost model, offering a refined approach to predicting oxygen consumption. Meanwhile, some organisations, such as Haiyue Yufeng Iron and Steel Co., Ltd., favour statistical methods. They have uncovered latent patterns through a rigorous analysis of vast amounts of data, establishing a linear regression equation between oxygen consumption and various characteristics. This approach provides robust support for predicting oxygen consumption, underscoring the importance of statistical methods in this field. Guangjun et al. have further expanded on this statistical approach [3]. Using statistical regression analysis methods, they successfully constructed and optimised a multiple linear regression model between static blowing oxygen quantity and scrap steel quantity. This advancement demonstrates the utility of statistical models in capturing complex relationships and predicting oxygen consumption accurately.
Among intelligent models, a prediction model combining a grey system with a back propagation (BP) neural network optimised by genetic algorithms has been proposed [4]. This model leverages the neural network's exceptional non-linear fitting capabilities to predict oxygen consumption effectively, albeit with limited interpretability. Lizhong et al. have integrated mechanistic methods with BP neural networks, establishing a static model that significantly enhances the hit rate of endpoint predictions [5]. Building upon these foundations, Yanmin [6] and Shicun et al. [7] have employed dynamic data-driven approaches and mechanistic analysis methods, integrating the mechanistic analysis of dynamic data into neural network models and thereby significantly enhancing their prediction performance. These advancements highlight the potential of intelligent models in oxygen consumption prediction and their role in optimising steelmaking processes. A novel approach that utilises a Bayesian-algorithm-optimised Light Gradient Boosting Machine model has also been introduced to enhance prediction outcomes [8]. This model integrates deep learning techniques with traditional machine learning methodologies, specifically Bayesian algorithms, yielding a refined model with superior prediction accuracy, stability, and generalisation capability.
Regardless of the predictive model established, data quality is crucial. Noise is an unavoidable problem, particularly in data collected in the field, so data cleansing is the first step. Yongqing Jiang et al. [9] discussed the advantages and drawbacks of data cleansing methods, covering both traditional statistical approaches (such as median, mean, and mode filling) and recent machine learning and deep learning methods. Siyuan Ma et al. [10] introduced attention-mechanism-based deep learning models for air quality data cleansing; Jizhe Lu et al. [11] used clustering and long short-term memory deep learning models for anomaly filling in electricity usage data; and Yibo Guo et al. [12] adopted generative adversarial networks to fill missing values in flight fuel data, achieving commendable results. In addition to deep learning models, numerous studies have employed machine learning-based regression methods for data imputation. For instance, Jian Tang et al. [13] utilised regression random forests and gradient-boosting trees to handle missing values, while Xiaoyan Liu [14] employed the least squares method for regression prediction. Beyond regression techniques, Zheng Gao and Zhen Xu [15] utilised the k-nearest neighbour method to fill in missing data.
Despite current research progress in predicting the FCHR and oxygen consumption in steelmaking, several challenges persist. Firstly, most studies focus on predicting and optimising steel quality, raw material addition, and final composition, with relatively little emphasis on predicting whether steelmaking can achieve the first carbon hit or on oxygen consumption. Secondly, the complexity and diversity of data in the steel production process demand high model complexity, yet the amount of data available from the production process is often insufficient for effective training, leading to sub-optimal data cleansing outcomes. Additionally, the generalisation capability of models is a crucial concern: due to differences in steelmaking processes and equipment, machine learning models must be trained and validated in various scenarios to ensure adaptability. Lastly, the interpretability of models is a critical consideration. The black-box nature of machine learning models can make their predictions challenging to interpret and understand, limiting their application in industrial production to some extent.
There is very little literature on predicting the FCHR in steelmaking, and few prediction methods have been explored. This paper proposes a hybrid prediction model that balances prediction accuracy and convergence efficiency, addressing this gap. For oxygen consumption prediction, a data imputation strategy combined with regression models also achieves good results, demonstrating the effectiveness of both the imputation strategy and the regression models. Firstly, Section 1 employs a Stacking ensemble learning model for data preprocessing to handle outliers in the data. This model effectively fills in the outliers, providing a more accurate data foundation for subsequent analysis. Secondly, this paper normalises the data to eliminate the impact of dimensional differences on model performance, which helps enhance the generalisation ability and prediction accuracy of the model. Section 2 adopts a classification neural network model to predict the FCHR. Through training and optimisation of the neural network, we construct a model that accurately predicts the FCHR in steelmaking, providing strong support for optimising the smelting process. Furthermore, this paper conducts regression prediction research to predict a critical factor in the smelting process: oxygen consumption. Section 3 employs a random forest regression model to handle outliers, and a Stacking ensemble method combining XGBoost and random forest models for prediction. This approach improves the accuracy of oxygen consumption prediction and provides a more reliable basis for optimising the smelting process.
In conclusion, the proposed method combining ensemble learning and neural network models for predicting the FCHR and oxygen consumption in steelmaking provides effective support for smelting process optimisation. By accurately predicting the FCHR and oxygen consumption, we aim to guide the smelting process efficiently, improving efficiency and productivity.
DATA CLEANSING
Steelmaking is exceedingly complex, and collecting complete and accurate data from the smelting process is challenging. There are numerous factors during steelmaking that can lead to data anomalies. For example, variations in steelmaking raw materials such as hot metal, scrap steel, and ore can cause anomalies due to quality changes or batch differences. Fluctuations in process parameters, like temperature, pressure, and oxygen content during steelmaking, might result from equipment malfunctions or operational errors, leading to anomalous values. Sensor malfunctions at the steelmaking site can also result in abnormal data collection. Environmental factors such as temperature, humidity, and air quality can also impact the steelmaking process and cause data anomalies.
Accurate prediction of the FCHR in steelmaking is a complex task that requires meticulous attention to outlier management. Before applying neural network models to classify input feature data, addressing and filling in any missing or aberrant values is imperative; failure to do so can significantly compromise the model's learning capabilities. The presence of outliers, particularly when they constitute a significant portion of the dataset, can increase bias and variance, resulting in unstable predictions. Furthermore, if outliers are concentrated in specific data points, they can introduce imbalances that skew the model's accuracy. Consequently, effective outlier management is a critical prerequisite for accurate FCHR prediction in steelmaking, underscoring the importance of rigorous data preprocessing.
Adopting an effective strategy for imputing anomalous values can further enhance the predictive capabilities of neural network models. Many effective anomaly imputation strategies have emerged with the rapid development of machine learning technologies in recent years. In particular, the Stacking ensemble learning model has shown excellent performance in the regression domain. It improves performance by feeding the predictions of multiple models into a meta-learner: different models have varying strengths and weaknesses on the same problem, and Stacking automatically amalgamates their strengths.
This paper carefully designed two sets of comparative experiments to further explore the superiority of Stacking ensemble learning and neural networks in predicting the FCHR. The main difference between the two experiments lies in the handling of outliers. In the first experiment, we adopted the traditional mean-filling method to handle the outliers in the selected dataset and then used a classification neural network model to predict whether the steelmaking process achieves the first carbon hit. Although this method is simple and direct, it may fail to fully capture the complex relationships and latent information behind the outliers. In the second experiment, we adopted a more sophisticated outlier handling strategy: when the proportion of outliers in a feature is less than 10%, we still use mean filling; for feature columns with outlier proportions exceeding 10%, we combined Stacking ensemble learning, using XGBoost and random forests to establish regression models that predict and fill these outliers (a minimal sketch of this column-wise strategy follows). This strategy not only accounts for differences between feature columns but also exploits the advantages of ensemble learning to enhance the robustness of the model. After the outlier handling, we again used a classification neural network model to predict the FCHR. The neural network models used in the two experiments are identical, ensuring the fairness and comparability of the results. Through this experimental design, we can comprehensively evaluate the performance of Stacking ensemble learning and neural networks in predicting the FCHR and provide strong support for quality control and efficiency improvement in production.
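A minimal sketch of the 10% threshold strategy, assuming pandas; the anomaly rule (null or non-positive values) and the helper names are illustrative assumptions, not the paper's exact implementation:

```python
import pandas as pd

def split_by_anomaly_ratio(df: pd.DataFrame, threshold: float = 0.10):
    """Separate columns into mean-fill and model-fill groups by anomaly ratio."""
    mean_fill, model_fill = [], []
    for col in df.columns:
        anomalous = df[col].isna() | (df[col] <= 0)  # assumed anomaly rule
        if anomalous.mean() > threshold:
            model_fill.append(col)   # imputed later by the Stacking regressor
        else:
            mean_fill.append(col)    # simple mean imputation suffices
    return mean_fill, model_fill

def mean_fill_columns(df: pd.DataFrame, cols: list) -> pd.DataFrame:
    """Replace anomalous entries with the mean of the valid entries."""
    for col in cols:
        anomalous = df[col].isna() | (df[col] <= 0)
        df.loc[anomalous, col] = df.loc[~anomalous, col].mean()
    return df
```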
The principle of the XGBoost model
XGBoost is a machine-learning algorithm based on Gradient Boosting Decision Trees [16]. It combines the advantages of tree models and linear regression, iteratively merging multiple weak classifiers into a robust classifier to enhance predictive capabilities.
XGBoost uses decision trees as its base model. Each node in these decision trees corresponds to a feature and a threshold value. In building each tree, XGBoost employs a greedy algorithm to find the optimal feature and threshold to purify the resulting subsets, concentrating similar samples and separating dissimilar ones. Specifically, XGBoost uses the gain of each candidate split as the splitting criterion. The formula for this gain is as follows:
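In the standard XGBoost formulation [16], the gain from splitting a node into left (L) and right (R) children is written in terms of gradient statistics:
$$\mathrm{Gain} = \frac{1}{2}\left[\frac{G_L^2}{H_L+\lambda} + \frac{G_R^2}{H_R+\lambda} - \frac{(G_L+G_R)^2}{H_L+H_R+\lambda}\right] - \gamma$$
where $G_L$, $G_R$ and $H_L$, $H_R$ are the sums of the first- and second-order gradients of the loss over the samples in each child, $\lambda$ is the L2 regularisation coefficient, and $\gamma$ penalises each additional leaf.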
During the iterative process of XGBoost, multiple weak classifiers are progressively merged into a robust classifier. In each iteration, XGBoost calculates the residuals of the current model and then constructs a new decision tree based on these residuals. This process makes minor adjustments to the original data to approximate the true value gradually. As the iterations proceed, residuals decrease, and the model's performance improves.
Model training can lead to overfitting. To prevent this, XGBoost incorporates regularisation terms to control model complexity. Specifically, XGBoost adds L2 and L1 regularisation terms to the loss function to penalise the complexity of the model. The L2 regularisation term primarily prevents overfitting, while the L1 regularisation term promotes model sparsity, enabling some leaf nodes to be zero, thereby compressing the model. The L1 and L2 regularisation terms are as follows:
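In the usual notation [16], for a tree $f$ with $T$ leaves and leaf weights $w_j$, the combined regularisation term is
$$\Omega(f) = \gamma T + \frac{1}{2}\lambda\sum_{j=1}^{T} w_j^2 + \alpha\sum_{j=1}^{T}\lvert w_j\rvert$$
where $\lambda$ controls the L2 penalty, $\alpha$ the L1 penalty, and $\gamma$ the cost of each additional leaf.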
The model's training is a continuous process of seeking the optimal solution to the objective function. The objective function of XGBoost comprises two parts: the loss function and the regularisation term. The loss function measures the discrepancy between the model's predictions and the actual values, while the regularisation term controls the complexity of the model. By minimising the objective function, XGBoost can find the optimal model parameters [17]. The formula for the objective function is as follows:
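In standard form, for $n$ samples and $K$ trees,
$$\mathrm{Obj} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K}\Omega(f_k)$$
where $l(y_i, \hat{y}_i)$ measures the discrepancy between the actual value $y_i$ and the prediction $\hat{y}_i$, and $\Omega(f_k)$ is the regularisation term defined above for the $k$-th tree.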
Principle of the random forest model
Random Forest is an ensemble learning algorithm that bases its predictions on multiple decision trees, enhancing the model's accuracy and stability by aggregating the results of these trees [18].
Each decision tree within a Random Forest model is constructed by randomly selecting samples and features [19]. Specifically, for each decision tree, a subset of samples (subsamples) is randomly chosen from the training dataset. A subset of features is then randomly selected from these subsamples to build the decision tree.
When constructing the decision tree models, we employed pruning techniques to prevent overfitting. Pruning is an essential model optimisation strategy that enhances the generalisation ability of decision trees, preventing them from performing well on training data but poorly on unseen data. We reserved a portion of the data as a validation set, which was not used in constructing the decision tree but rather for assessing model performance. During construction, we continuously adjusted the tree's structure, including node splitting and merging, to find the optimal tree structure, basing the adjustments on the error rate on the validation set. When the error rate increased, indicating the onset of overfitting, we stopped adjusting the tree structure, completing the pruning process [20].
The Random Forest model is formed by constructing multiple decision trees and integrating them. To make a prediction, a new sample is run through each decision tree, and the final prediction is the average of the individual tree predictions. Specifically, if there are T decision trees in the Random Forest, the final prediction is given by the following:
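In standard form, the forest's prediction for a sample $x$ is
$$\hat{y}(x) = \frac{1}{T}\sum_{t=1}^{T} h_t(x)$$
where $h_t(x)$ is the prediction of the $t$-th decision tree.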
The Random Forest model boasts advantages, such as fast learning speed, high prediction accuracy, ease in handling imbalanced data, robustness, and effectiveness in processing high-dimensional data and non-linear relationships. It is a powerful machine-learning model suitable for handling the non-linear data associated with steelmaking.
Imputing anomalous values with the stacking ensemble learning model
Stacking is an ensemble learning method that combines the predictions of multiple base learners (such as XGBoost and Random Forest) through a meta-learner to improve the accuracy and stability of the model. The following outlines the process of establishing a Stacking ensemble of XGBoost and Random Forest models:
Step 1: Data Preparation
Selecting data features is a complex process, requiring the analysis of steelmaking mechanisms and correlations to determine the final features.
Steelmaking mechanism analysis involves studying the process under high temperatures where impurities in pig iron are oxidised to a certain extent using various sources of oxygen. Simultaneously, harmful elements and compounds are removed through slag making and other methods, controlling chemical compositions and adjusting temperatures to produce iron-carbon alloy (steel) with specific components and properties.
For steelmaking mechanism analysis, the extremely complex chemical processes occurring during steelmaking are initially examined to preliminarily screen for features important for predicting the FCHR of steelmaking. These features include total oxygen consumption, primary blowing duration, secondary blowing duration, white ash, duration before slag discharge, smelting cycle, total argon consumption, aluminium iron, steel ladle weight, equalising duration, fresh white, total nitrogen consumption, scrap quality, hot metal quality, fine white ash, tapping temperature, calcined dolomite, and over 20 other features.
However, relying solely on features derived from mechanism analysis to build a model is insufficient, because strong correlations exist among these features, for example, among total oxygen consumption, primary blowing duration, secondary blowing duration, and duration before slag discharge. Here, the Pearson correlation coefficient is used to quantify the strength of correlation between features. Figure 1 shows the correlation graph:
[IMAGE OMITTED. SEE PDF]
Through visualisation, the correlations between features, or their ‘influence factors’, are easily discernible: the darker the colour, or the closer the value to 1 at the intersection of the horizontal and vertical axes, the stronger the relationship. It can be inferred that total oxygen consumption directly influences steelmaking, while the other three features are indirect factors positively correlated with total oxygen consumption. Therefore, total oxygen consumption is retained among these four features, and the other three are excluded. Applying this correlation analysis to all features, 17 feature dimensions are finally obtained. These include total oxygen consumption, white ash, smelting cycle, fresh white, total argon consumption, total nitrogen consumption, aluminium iron, hot metal quality, scrap quality, hot metal P, hot metal S, carbon content, carbon content at temperature measurement and sampling (TSC), oxygen content at temperature measurement and sampling (TSO), and others, with the label column indicating whether the FCHR is achieved. Before proceeding with anomaly data imputation, the FCHR label column is removed to prevent interference with the effectiveness of the classification neural network model.
Step 2: Training of Random Forest and XGBoost
A few of the selected features exhibit a significant proportion of anomalies (negative values, null values, abnormal zero values, etc.). The features with a higher proportion of anomalies are hot metal S, TSO temperature, and TSC temperature. For these columns, separate regression models must be established. The dataset consists of all rows without anomalies in the feature columns, excluding hot metal S, TSO temperature, and TSC temperature, giving 14 input features in total. The labels are the features to be predicted and filled (hot metal S, TSO temperature, and TSC temperature). For each prediction feature, the dataset was divided into a training set and a test set in a 9:1 ratio. XGBoost and Random Forest were trained separately through grid search and cross-validation to minimise loss and select the best hyperparameters.
Key hyperparameters for grid search in Random Forest include: the number of decision trees (n_estimators), maximum depth of the trees (max_depth), minimum number of samples required to split a node (min_samples_split), minimum number of samples required at a leaf node (min_samples_leaf), the number of features to consider when looking for the best split (max_features), and whether bootstrap sampling is performed (bootstrap).
For XGBoost, the primary parameters adjusted during grid search are: the number of decision trees (n_estimators), the learning rate (learning_rate; a higher learning rate converges faster but can lead to overfitting), the maximum depth of the trees (max_depth), the minimum number of samples required to split a node (min_samples_split), and the minimum number of samples required at a leaf node (min_samples_leaf).
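A minimal sketch of this tuning step, assuming scikit-learn's GridSearchCV and the xgboost scikit-learn wrapper; the parameter grids and synthetic data are placeholders rather than the paper's actual values, and for XGBoost only parameters its API exposes are tuned here:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBRegressor

# Placeholder data standing in for the 14 anomaly-free feature columns
# and one high-anomaly target column (e.g. TSC temperature).
X, y = make_regression(n_samples=1000, n_features=14, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

rf_search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={
        "n_estimators": [100, 300],
        "max_depth": [None, 10, 20],
        "min_samples_split": [2, 5],
        "min_samples_leaf": [1, 2],
        "max_features": ["sqrt", 1.0],
        "bootstrap": [True, False],
    },
    scoring="neg_mean_squared_error",
    cv=5,
).fit(X_train, y_train)

xgb_search = GridSearchCV(
    XGBRegressor(objective="reg:squarederror", random_state=0),
    param_grid={
        "n_estimators": [100, 300],
        "learning_rate": [0.01, 0.05, 0.1],
        "max_depth": [3, 6, 9],
    },
    scoring="neg_mean_squared_error",
    cv=5,
).fit(X_train, y_train)

print(rf_search.best_params_, xgb_search.best_params_)
```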
This process is repeated until regression models are established for all features with a high proportion of anomalies, such as hot metal S, TSO temperature, and TSC temperature.
Step 3: Stacking Learner Model
The first layer consists of Random Forest and XGBoost, and the second layer is linear regression. The model structure is shown in Figure 2.
Step 4: Workflow of the Stacking Learner
[IMAGE OMITTED. SEE PDF]
The input training data is first processed through both Random Forest and XGBoost models for prediction, yielding two datasets. These data are then fed into the second-layer linear regression model to obtain the final prediction.
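This two-layer structure can be sketched with scikit-learn's StackingRegressor, which trains the second-layer linear regression on out-of-fold predictions from the first layer; the hyperparameters and synthetic data below are placeholders:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

X, y = make_regression(n_samples=2000, n_features=14, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)

stack = StackingRegressor(
    estimators=[  # first layer: Random Forest and XGBoost
        ("rf", RandomForestRegressor(n_estimators=300, random_state=0)),
        ("xgb", XGBRegressor(objective="reg:squarederror", random_state=0)),
    ],
    final_estimator=LinearRegression(),  # second layer: linear regression
    cv=5,  # out-of-fold base predictions train the meta-learner
)
stack.fit(X_train, y_train)
print("Test R^2:", stack.score(X_test, y_test))
```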
Step 5: Training of the Stacking Learner
The dataset, comprising all rows without anomalies in the feature columns, was divided into a training set and a test set in a 9:1 ratio, and the Stacking model was trained on the training set. Since multiple features have a significant proportion of anomalies, a Stacking ensemble learning model must be established for each such column. Specifically, for each data column with a high proportion of anomalies, the training dataset was divided into K subsets, each used to train a Stacking learner. For the ith Stacking learner, the prediction results from the ith Random Forest and XGBoost learners are used as input features, and the corresponding actual values serve as the output labels for training the second-layer linear regression. The process is shown in Figure 3.
Step 6: Model Evaluation and Optimisation
[IMAGE OMITTED. SEE PDF]
The Stacking model was evaluated using the test dataset. The evaluation metric used is the Mean Squared Error (MSE), calculated using the following formula, where yi represents the actual value and Yi the predicted value.
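In standard form, over $n$ test samples,
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - Y_i\right)^2$$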
The final Stacking model was trained by continually reducing the MSE loss and improving the score.
Step 7: Model Prediction
The trained model was then used to predict on the dataset with anomalies.
PREDICTION OF THE FIRST CARBON HIT RATE
Neural networks are computational models formed by interconnections of multiple neurons, designed to perform specific artificial intelligence tasks through learning and training [21]. Neural networks can be categorised into feedforward and feedback networks, with feedforward networks being more common. In a feedforward network, information enters through the input layer, is processed by multiple hidden layers, and finally reaches the output layer. Each neuron has an activation function that transforms the input signal into an output signal. The connection weights between neurons can be learnt and adjusted, allowing the neural network to adapt better to the input data and achieve the desired output target.
A classification neural network is a feedforward neural network that can categorise its prediction results into different classes based on the features of the input data. It consists of an input layer, one or more hidden layers, and an output layer. Each layer is made up of multiple neurons interconnected by weights. Classification neural networks learn to extract features from input data and use them for predicting output categories. The basic structure of a neural network is shown in Figure 4:
[IMAGE OMITTED. SEE PDF]
Classification neural networks possess a potent non-linear mapping capability for handling non-linear classification problems. They also have strong self-learning and adaptive abilities, allowing them to process complex and vast datasets, and they exhibit high accuracy and robustness. These characteristics make them well-suited for handling the large and complex data generated in steel mills [22].
Dataset
The original data comes from the steel smelting records of Baotou Iron and Steel Group from 2022 to 2023, covering the entire smelting process.
The study uses two versions of the dataset: one in which columns with a high proportion of anomalies were imputed using the Stacking ensemble model (with mean imputation for the remaining columns), and one in which all anomalies were filled with mean values. A balancing procedure was first applied to address the imbalance between instances of achieving and not achieving the FCHR, yielding an equal distribution of 3000 instances per category. Moreover, as the range of values varies across feature columns, their influence on the model can differ substantially. Data normalisation was undertaken to mitigate the dominance of certain features, scaling the values into the [0,1] range. The formula utilised for normalisation is as follows:
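Min-max scaling in its standard form is
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum of the corresponding feature column.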
The normalised data was then divided into a training set and a test set in a 9:1 ratio for this study.
Neural network model
Seventeen features, including oxygen consumption, smelting cycle, white ash, hot metal weight, scrap endpoint, and others, exhibit a robust non-linear relationship with the FCHR in steelmaking. The BP neural network, known for its ability to approximate any non-linear mapping relationship and its simple structure coupled with strong generalisation capabilities, was employed to establish a classification prediction model for determining the likelihood of achieving the FCHR in steelmaking.
The constructed neural network model consists of an input layer with 20 nodes, followed by three hidden layers. The first two hidden layers are linear, each succeeded by a non-linear activation function layer. The final hidden layer is a Softmax classification layer, and the model culminates in an output layer containing a single node.
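A minimal PyTorch sketch of this network is given below; the hidden-layer sizes are assumptions and, following common practice, the softmax is folded into nn.CrossEntropyLoss over two output logits rather than modelled as a separate layer:

```python
import torch
import torch.nn as nn

class FCHRClassifier(nn.Module):
    def __init__(self, n_features: int = 20, h1: int = 64, h2: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, h1), nn.ReLU(),   # hidden layer 1 + activation
            nn.Linear(h1, h2), nn.ReLU(),           # hidden layer 2 + activation
            nn.Linear(h2, 2),                       # logits for hit / no-hit
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = FCHRClassifier()
criterion = nn.CrossEntropyLoss()                    # cross-entropy loss, as below
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Dummy batch standing in for the normalised steelmaking features.
X = torch.rand(256, 20)
y = torch.randint(0, 2, (256,))
for epoch in range(20):                              # 20 epochs, as in the paper
    optimizer.zero_grad()
    loss = criterion(model(X), y)
    loss.backward()
    optimizer.step()
```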
Since the model addresses a binary classification problem, the cross-entropy loss function is utilised, expressed as follows:
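For $N$ samples with true labels $y_i \in \{0, 1\}$ and predicted hit probabilities $\hat{y}_i$, the binary cross-entropy in its standard form is
$$L = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i\log \hat{y}_i + (1-y_i)\log\left(1-\hat{y}_i\right)\right]$$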
The Adam optimiser, known for its adaptive learning rate, was selected for optimisation. The model was trained over 20 epochs. Using data imputed with mean values, the training loss converged to 0.31, with a training accuracy of 88% and a test accuracy of 82%. The loss curve and the curves for changes in training and test accuracies are depicted in Figures 5 and 6.
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
When utilising data with anomalies predicted and filled by the Stacking model, the training loss converged to 0.05, with a training accuracy of 94.5% and a test accuracy of 90.5%. The corresponding loss and accuracy curves are depicted in Figures 7 and 8.
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
The above charts and data comparison indicate that first using the Stacking ensemble learning model to predict and impute anomalies, then employing a neural network model for prediction, yields better results. Relative to training with data imputed using mean values, the loss is reduced by 0.26, training accuracy is increased by 6.5 percentage points, and test accuracy is improved by 8.5 percentage points.
OXYGEN CONSUMPTION PREDICTION
For predicting oxygen consumption, selecting features that significantly impact oxygen usage is crucial, as this directly affects the model's accuracy. Following the feature selection method used for FCHR prediction, based on steelmaking mechanism analysis and correlation analysis, nine variables were ultimately chosen as input features: hot metal quality, scrap quality, hot metal temperature, hot metal Si, hot metal Mn, hot metal P, hot metal S, endpoint temperature, and endpoint C. The label value is oxygen consumption. The obtained feature data is neither fully complete nor anomaly-free, necessitating the imputation of anomalies.
The strategy for imputing anomalies is similar to that used for FCHR prediction. For features where the proportion of anomalies does not exceed 10%, mean value imputation was used. Only two features have anomaly proportions exceeding 10%: hot metal Mn and hot metal S. A regression random forest was established to predict and impute these anomalies. The following outlines the process of establishing the regression random forest:
Step 1: Data Preparation
At this stage, the data had already undergone mean value imputation, except for two features: hot metal Mn and hot metal S. The complete data from the remaining seven columns was used as input features, with hot metal Mn and hot metal S serving as prediction labels for separate model constructions. The datasets are thus composed of the complete data from the seven features, with hot metal Mn and hot metal S serving as labels, creating two distinct datasets. Once the datasets were obtained, they were divided into training and test sets in a 9:1 ratio.
Step 2: Model Construction
The training set was fed into the regression random forest model. Grid search was used to adjust and select the optimal parameters for the regression random forest model, aiming to achieve the highest possible score for the regression model. The primary parameters adjusted include: the number of decision trees (n_estimators), maximum depth of the trees (max_depth), minimum number of samples required to split a node (min_samples_split), minimum number of samples required at a leaf node (min_samples_leaf), the number of features to consider when looking for the best split (max_features), and whether bootstrap sampling is performed (bootstrap). Eventually, regression random forest models for predicting the content of hot metal Mn and hot metal S were developed, achieving scores of 76% and 72%, respectively.
Step 3: Model Evaluation
After parameter optimisation, the trained models were evaluated on the test set. It is observed that the scores on the test set are slightly lower than those on the training set, at 73% and 70% respectively, indicating that the models performed well in both training and prediction, and they are suitable for regression imputation of anomalous values.
Step 4: Anomaly Imputation
Rows with missing or anomalous values for hot metal Mn and hot metal S were imputed using the respective regression random forest models. This process resulted in a complete dataset comprising all nine features.
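A minimal sketch of this imputation step is shown below; the column names, the anomaly rule (null or non-positive), and the forest size are illustrative assumptions:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def impute_column(df: pd.DataFrame, target: str, feature_cols: list) -> pd.DataFrame:
    """Train on rows where `target` is valid, then predict it for anomalous rows."""
    bad = df[target].isna() | (df[target] <= 0)  # assumed anomaly rule
    model = RandomForestRegressor(n_estimators=300, random_state=0)
    model.fit(df.loc[~bad, feature_cols], df.loc[~bad, target])
    df.loc[bad, target] = model.predict(df.loc[bad, feature_cols])
    return df

# Illustrative usage with hypothetical column names:
rng = np.random.default_rng(0)
cols = ["hot_metal_quality", "scrap_quality", "hot_metal_temp", "hot_metal_Si",
        "hot_metal_P", "endpoint_temp", "endpoint_C", "hot_metal_Mn"]
df = pd.DataFrame(rng.random((500, len(cols))), columns=cols)
df.loc[rng.choice(500, 60, replace=False), "hot_metal_Mn"] = np.nan  # inject anomalies
df = impute_column(df, "hot_metal_Mn", [c for c in cols if c != "hot_metal_Mn"])
```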
After the anomaly imputation, nine complete data feature columns were obtained. The data was normalised to prevent certain column features from dominating the model, scaling the values to fall within the range [0,1]. The formula for normalisation is as follows:
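The same min-max scaling as before is applied:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$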
At this stage, the prediction of oxygen consumption employed a Stacking model that integrates XGBoost and Random Forest, similar in structure and workflow to the Stacking model used for imputing anomalies in predicting the FCHR in steelmaking. The primary difference lies in the training data input for the XGBoost and Random Forest models. The model construction process is as follows:
Training of the Stacking Learner: With 11,200 data entries post-imputation, the complete dataset was divided into a training set and a test set in a 9:1 ratio. The Stacking model was then trained using the training dataset. The training set was initially fed separately into the XGBoost and Random Forest models to obtain the optimally trained models. Subsequently, these two models were integrated into the first layer of the Stacking model, with the second layer being a linear regression model. Finally, the training set was fed back into the entire Stacking model for training to reduce the MSE loss and improve the score to determine the optimal parameters of the Stacking model.
Evaluation of the Stacking Model: The test set was input into the trained model, resulting in a score slightly lower than that obtained on the training set, indicating the model's good predictive performance.
The model was then applied to predict across the entire dataset of 11,200 entries. The relative prediction error was generally within 3%, and the hit rates for relative errors in oxygen consumption prediction within 1%, 2%, 3%, 4%, and 5% are 64%, 85.3%, 92.8%, 96.2%, and 97.9%, respectively. The formula for calculating the relative prediction error is as follows:
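Consistent with the notation used earlier ($y_i$ actual, $Y_i$ predicted), the relative prediction error in its standard form is
$$\delta_i = \frac{\lvert Y_i - y_i\rvert}{y_i} \times 100\%$$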
The scatter plot of the relative prediction error and the curve of the hit rate are illustrated in Figures 9 and 10.
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
The scatter plot of relative prediction error and the hit rate curve indicate that the model performs well in the static prediction of oxygen consumption.
CONCLUSION
This paper introduces an advanced integrated learning method using Stacking with random forest algorithms to precisely predict the first carbon hit in steelmaking. To enhance accuracy, we apply Stacking regression to predict outliers in high-anomaly columns and fill columns with <10% abnormal values using mean imputation. This refined dataset is then used to train a neural network model, achieving 94.5% training accuracy and 90.5% test accuracy for FCHR prediction, surpassing traditional methods. Moreover, the framework is also applied to predict oxygen consumption in steelmaking, achieving a 92.8% hit rate at <3% relative error and 97.9% at <5%, validating the model's strength. In conclusion, this model offers scientific insights for FCHR assessment and oxygen consumption prediction, enhancing steelmaking efficiency and cost-effectiveness. Its potential for widespread industrial application is promising.
AUTHOR CONTRIBUTIONS
Lingyun Yang: Conceptualisation; Formal analysis; Funding acquisition; Writing - original draft; Writing - review & editing. Qianchuan Zhao: Conceptualisation; Investigation; Methodology; Supervision. Tan Li: Data curation; Investigation; Software; Visualisation; Writing - original draft. Mu Gu: Data curation; Formal analysis; Investigation; Resources. Kaiwu Yang: Investigation; Software; Visualisation. Weining Song: Data curation; Methodology; Software.
ACKNOWLEDGEMENTS
This work was supported by the National Key R&D Program of China (Grant No. 2022YFE0197600).
CONFLICT OF INTEREST STATEMENT
No conflict of interest exists in the submission of this manuscript, and it is approved by all authors for publication.
DATA AVAILABILITY STATEMENT
Authors elect to not share data.
REFERENCES
1. Wang, S.L.: Application Research on Dynamic Control Model of Converter Smelting in No. 2 Steelmaking Plant of Baotou Steel. University of Science and Technology Beijing, Beijing (2009)
2. Ke, K., et al.: Prediction model of oxygen supply in converter steelmaking based on ensemble learning. Steelmaking 39(1), 47–52 (2023)
3. Zhu, G.J., Liang, B.C.: Optimum model of static control on BOF steelmaking process. Steelmaking (04), 25–28 (1999)
4. Wang, H.J., et al.: The converter oxygen consumption forecast based on optimization combination model. J. Henan Polytech. Univ. (Nat. Sci.) 36(02), 94–98 (2017)
5. Chang, L.Z., Li, Z.B.: Study on BP neural net based converter static control model. Steelmaking (06), 41–44 (2016)
6. Yang, S.C., et al.: Study on oxygen consumption model based on data drive in a converter. Steelmaking (2022). https://doi.org/10.27007/d.cnki.gdbeu.2018.002082
7. Zheng, Z., Fan, J.P., Jiang, S.L.: A prediction method of converter oxygen consumption based on mechanism analysis and data driving. Steelmaking 38(4), 7–13 (2018)
8. Gao, Y.M.: The Research on Intelligent Algorithm of Steel-Making and its Application in 100T Converter. Northeastern University (2018)
9. Jiang, Y.Q., Yang, Y.H., Yang, H.: Influence of data cleaning on the quality of information retrieval and cleaning methods. J. China Soc. Indexers 10(1), 16–20 (2012)
10. Ma, S.Y., et al.: Missing value filling for multi-variable urban air quality data based on attention mechanism. Comput. Sci. Eng. 45(8), 1354–1364 (2023)
11. Lu, J.Z., et al.: Missing value treatment for minute freezing data of electricity based on clustering and LSTM. Control Eng. China 29(4), 611–616 (2022)
12. Guo, Y.B., et al.: An aircraft fuel data missing value filling method with generative adversarial network. J. Zhejiang Univ. (Sci. Ed.) 48(4), 402–409 (2021)
13. Tang, J., et al.: Filling method of missing data for municipal solid waste incineration processes with its application. J. Beijing Univ. Technol. 49(4), 435–448 (2023)
14. Liu, X.Y.: Regression Analysis Based on Missing Values for Compositional Data. Shanxi University (2019)
15. Gao, Z., Xu, Z.: Filling method of missing data in oilfield based on multiple regression KNN. Inf. Technol. 44(4), 9–83 (2020)
16. Chen, T., Guestrin, C.: XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794. ACM (2016)
17. Su, W., et al.: An XGBoost-based knowledge tracing model. Int. J. Comput. Intell. Syst. 16(1), 13 (2023). https://doi.org/10.1007/s44196-023-00192-y
18. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
19. Quinlan, J.R.: Induction of decision trees. Mach. Learn. 1(1), 81–106 (1986). https://doi.org/10.1007/bf00116251
20. Breiman, L., et al.: Classification and Regression Trees. Wadsworth, Belmont, CA (1984)
21. Mishra, M., Srivastava, M.: A view of artificial neural network. In: 2014 International Conference on Advances in Engineering & Technology Research (ICAETR-2014). IEEE (2015)
22. Liu, Y., et al.: Survey on robustness verification of feedforward neural networks and recurrent neural networks. Ruan Jian Xue Bao/J. Softw. 34(7), 3134–3166 (2023)
Abstract
First carbon hit rate (FCHR) is an essential indicator of steel converter smelting, reflecting the proportion of steel tapping completed without additional oxygen blowing. However, significant data loss has occurred due to equipment ageing and worker operations, making analysis of the FCHR difficult. This paper uses mechanism analysis and feature screening to determine the model input, predicts and fills in abnormal data through ensemble learning, and then optimises it through data transformation. Finally, a classification neural network trained on the Stacking-imputed data predicts the FCHR, with a training accuracy of up to 94.5% and a test set accuracy of 90.5%. In addition, the authors also conducted a predictive study on oxygen consumption, where the hit rate performed well under different error thresholds, reaching a maximum of 97.9%. These results provide powerful decision support for steel production and effectively overcome the challenges of data missingness.
AFFILIATIONS
1 Department of Automation, Tsinghua University, Beijing, China, Beijing Aerospace Intelligent Manufacturing Technology Development Co., Ltd., Beijing, China
2 Department of Automation, Tsinghua University, Beijing, China
3 Department of Automation, School of Information Engineering, Nanchang University, Nanchang City, Jiangxi Province, China
4 Beijing Aerospace Intelligent Manufacturing Technology Development Co., Ltd., Beijing, China
5 Department of Computer Science, School of Mathematics and Computer Science, Nanchang University, Nanchang City, Jiangxi Province, China
6 Department of Network Engineering, School of Software, East China University of Technology, Nanchang City, Jiangxi Province, China